---
license: mit
title: 🇫🇷 Assistant RH — RAG Chatbot
sdk: gradio
emoji: 📚
colorFrom: indigo
colorTo: purple
app_file: app.py
pinned: true
short_description: 👉 RAG-powered AI assistant for French Human Resources
tags:
  - gradio
  - rag
  - faiss
  - anthropic
  - claude
  - openai
  - hr
  - human-resources
  - law
  - france
  - french
  - chatbot
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/6668057ef7604601278857f5/JeivLn409aMRCqx6RwO2J.png
---

# 🇫🇷 RAG-powered HR Assistant

👉 **An AI assistant specialised in French Human Resources**

Built with **Retrieval-Augmented Generation (RAG)** on top of **official public datasets**. It retrieves trusted information, generates concise answers, and always cites its sources.

🚀 **Live demo on Hugging Face**: [![Hugging Face Space](https://img.shields.io/badge/🤗-HuggingFace%20Space-blue)](https://huggingface.co/spaces/edouardfoussier/rag-rh-assistant)

![App Screenshot](assets/screenshot2.png)

---

## ✨ What is this?

This project is an **AI assistant** for HR topics covering **French labor law and public-administration HR practices**. It combines **retrieval** over trusted sources with **LLM synthesis**, and always cites the passages it used.
**Key features:**

- 🤖 **Multi-LLM support**: choose between OpenAI or Anthropic (Claude) models
- 📚 **Trusted sources**: built on official French government datasets
- 🔍 **Hybrid retrieval**: semantic + full-text search for precise results
- 📊 **Evaluation-driven**: custom metrics to measure and improve performance

**Tech stack:**

- UI: **Gradio**
- Retrieval: **FAISS** (fallback: NumPy) + PostgreSQL full-text search
- Embeddings: **HF Inference API**
- LLM: **Anthropic** or **OpenAI** (BYO API key)

---

## 📚 Datasets & Attribution

This Space relies on **public HR datasets** curated by [**AgentPublic**](https://huggingface.co/datasets/AgentPublic):

- [Service-Public dataset](https://huggingface.co/datasets/AgentPublic/service-public)
- [Travail-Emploi dataset](https://huggingface.co/datasets/AgentPublic/travail-emploi)

For this project, I built **cleaned and filtered derivatives** hosted under my profile:

- [edouardfoussier/service-public-filtered](https://huggingface.co/datasets/edouardfoussier/service-public-filtered)
- [edouardfoussier/travail-emploi-clean](https://huggingface.co/datasets/edouardfoussier/travail-emploi-clean)

---

## ⚙️ How it works

1. **Question** → the user asks in French (e.g., "DPAE : quelles obligations ?").
2. **Retrieve** → hybrid search (semantic + full-text) finds relevant passages in the datasets.
3. **Synthesize** → the chosen LLM (Anthropic or OpenAI) writes a concise, factual answer with citations `[1], [2], …`.
4. **Explain** → the "Sources" panel shows the original articles used to generate the answer.

---

## 🔑 BYOK (Bring Your Own Key)

The app supports **Anthropic (Claude)** and **OpenAI** models. Your API key is never stored; it is used in-session only, for secure, private inference.
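A minimal sketch of this pattern (hypothetical helper names, not the app's actual code): the key lives only in in-memory session state and is turned into per-request auth headers, which differ between the two providers.

```python
from dataclasses import dataclass, field

@dataclass
class Session:
    """In-memory session state; the key is never written to disk."""
    provider: str = "openai"
    api_key: str = field(default="", repr=False)  # repr=False keeps the key out of logs

def build_headers(session: Session) -> dict:
    """Build per-request auth headers from the session-held key."""
    if session.provider == "anthropic":
        # Anthropic expects the key in `x-api-key`, plus an API version header
        return {"x-api-key": session.api_key, "anthropic-version": "2023-06-01"}
    # OpenAI uses a standard Bearer token
    return {"Authorization": f"Bearer {session.api_key}"}
```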
**Supported models:**

- Anthropic: `claude-sonnet-4-5`, `claude-opus-4-1`, `claude-haiku-4-5`
- OpenAI: `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`, `gpt-3.5-turbo`

---

## 🧩 Configuration notes

- FAISS is used when available; otherwise the app falls back to a NumPy dot-product search.
- The retriever loads vectors from the datasets and keeps a compressed cache at runtime (`/tmp/rag_index.npz`) to speed up cold starts.
- You can change the Top-K slider in the UI; it controls both retrieval and the number of passages given to the LLM.
- The provider and model selectors in the sidebar let you compare different LLMs.

---

## 🚀 Run locally

### 1) Clone & install

```bash
git clone https://huggingface.co/spaces/edouardfoussier/rag-rh-assistant
cd rag-rh-assistant
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

### 2) Configure environment

Key env vars:

**Required:**

- `HF_API_TOKEN` → required for embeddings via the HF Inference API

**Optional (default provider/model):**

- `LLM_PROVIDER` → `openai` or `anthropic` (default: `openai`)
- `LLM_MODEL` → e.g., `gpt-4o-mini` or `claude-sonnet-4-5`
- `ANTHROPIC_API_KEY` → your Anthropic key (or enter it in the UI)
- `OPENAI_API_KEY` → your OpenAI key (or enter it in the UI)

**Other:**

- `HF_EMBEDDINGS_MODEL` → defaults to `BAAI/bge-m3`
- `EMBED_COL` → name of the embedding column (defaults to `embeddings_bge-m3`)
- `LLM_BASE_URL` → defaults to `https://api.openai.com/v1`

### 3) Launch

```bash
python app.py
```

Open http://127.0.0.1:7860 and select your preferred LLM provider in the sidebar.

---

## 📊 Roadmap

- ✅ Multi-LLM backends (Anthropic + OpenAI)
- ✅ Hybrid retrieval (semantic + full-text)
- ✅ Custom evaluation metrics
- 🔜 Reranking (cross-encoder)
- 🔜 Multi-turn conversation memory
- 🔜 More datasets (other ministries, legal codes)
- 🔜 Advanced hallucination detection

---

## 🙌 Credits

- Original data: [**AgentPublic**](https://huggingface.co/datasets/AgentPublic)
- Built with: Hugging Face Spaces, Gradio, FAISS, Anthropic, OpenAI
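---

## 🧮 Appendix: fallback retrieval sketch

The FAISS → NumPy fallback described in the configuration notes amounts to a brute-force dot-product search. A minimal sketch (illustrative names, not the app's actual code; assumes embeddings are L2-normalised, as with `BAAI/bge-m3`, so the dot product equals cosine similarity):

```python
import numpy as np

def topk_numpy(query_vec: np.ndarray, doc_matrix: np.ndarray, k: int = 5):
    """Score every document vector against the query and keep the top-k.

    doc_matrix: (n_docs, dim) array of normalised embeddings
    query_vec:  (dim,) normalised query embedding
    """
    scores = doc_matrix @ query_vec  # (n_docs,) dot-product similarity scores
    idx = np.argsort(-scores)[:k]    # indices of the k highest scores
    return idx, scores[idx]
```

FAISS computes the same neighbours faster through an index structure; the fallback simply trades speed for zero extra dependencies.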