---
license: mit
title: πŸ‡«πŸ‡· Assistant RH β€” RAG Chatbot
sdk: gradio
emoji: πŸ“š
colorFrom: indigo
colorTo: purple
app_file: app.py
pinned: true
short_description: πŸ‘‰ RAG-powered AI assistant for French Human Resources
tags:
- gradio
- rag
- faiss
- openai
- hr
- human-resources
- law
- france
- french
- chatbot
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/6668057ef7604601278857f5/JeivLn409aMRCqx6RwO2J.png
---

# πŸ‡«πŸ‡· RAG-powered HR Assistant

πŸ‘‰ **An AI assistant specialised in French Human Resources**  
Built with **Retrieval-Augmented Generation (RAG)** on top of **official public datasets**.  
It retrieves trusted information, generates concise answers, and always cites its sources.

πŸš€ **Live demo on Hugging Face**: [![Hugging Face Space](https://img.shields.io/badge/πŸ€—-HuggingFace%20Space-blue)](https://huggingface.co/spaces/edouardfoussier/rag-rh-assistant)

![App Screenshot](assets/screenshot2.png)

---

## ✨ What is this?

This project is an **AI assistant** for HR topics, covering **French labor law and public-administration HR practices**.  
It combines **retrieval** over trusted sources with **LLM synthesis**, and cites its sources.

- UI: **Gradio**
- Retrieval: **FAISS** (fallback: NumPy)
- Embeddings: **HF Inference API**
- LLM: **OpenAI** (BYO API Key)

---

## πŸ“š Datasets & Attribution

This space relies on **public HR datasets** curated by [**AgentPublic**](https://huggingface.co/datasets/AgentPublic):
- [Service-Public dataset](https://huggingface.co/datasets/AgentPublic/service-public)
- [Travail-Emploi dataset](https://huggingface.co/datasets/AgentPublic/travail-emploi)

For this project, I built **cleaned and filtered derivatives** hosted under my profile:
- [edouardfoussier/service-public-filtered](https://huggingface.co/datasets/edouardfoussier/service-public-filtered)
- [edouardfoussier/travail-emploi-clean](https://huggingface.co/datasets/edouardfoussier/travail-emploi-clean)

---

## βš™οΈ How it works

1. **Question** β†’ User asks in French (e.g., β€œDPAE : quelles obligations ?”).  
2. **Retrieve** β†’ FAISS searches semantic vectors from the datasets.  
3. **Synthesize** β†’ The LLM writes a concise, factual answer with citations `[1], [2], …`.  
4. **Explain** β†’ The β€œSources” panel shows the original articles used to generate the answer (see the sketch below).
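
For illustration, here is a minimal sketch of this retrieve-then-synthesize loop, assuming `HF_API_TOKEN` and `OPENAI_API_KEY` are set. The toy passages, function names, and prompt wording are hypothetical; this is not the actual code in `app.py`.

```python
# Hypothetical sketch of the pipeline: embed, retrieve by dot product, synthesize.
import os

import numpy as np
from huggingface_hub import InferenceClient
from openai import OpenAI

hf = InferenceClient(token=os.environ["HF_API_TOKEN"])
llm = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Toy passages standing in for the chunks loaded from the HR datasets.
passages = [
    "La DPAE doit Γͺtre adressΓ©e Γ  l'Urssaf avant l'embauche.",
    "La pΓ©riode d'essai peut Γͺtre renouvelΓ©e une fois si un accord le prΓ©voit.",
]

def embed(text: str) -> np.ndarray:
    """Get a sentence embedding from the HF Inference API."""
    return np.asarray(hf.feature_extraction(text, model="BAAI/bge-m3")).squeeze()

vectors = np.stack([embed(p) for p in passages])

def ask(question: str, top_k: int = 2) -> str:
    q = embed(question)
    top = np.argsort(-(vectors @ q))[:top_k]  # highest dot-product scores first
    context = "\n".join(f"[{i + 1}] {passages[j]}" for i, j in enumerate(top))
    prompt = (
        "RΓ©ponds de faΓ§on concise et factuelle en citant tes sources [1], [2], ...\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    out = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return out.choices[0].message.content

print(ask("DPAE : quelles obligations ?"))
```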

---

## πŸ”‘ BYOK (Bring Your Own Key)

The app never stores your OpenAI key; it’s used in-session only.
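
In practice, β€œin-session only” can be read as: the key supplied in the UI is passed straight into a per-request client and never written anywhere. A minimal sketch (hypothetical handler name, not the app's actual code):

```python
from openai import OpenAI

def chat_once(user_message: str, api_key: str) -> str:
    # The key comes from the UI for this call only; it is not written to disk,
    # to environment variables, or to any global state.
    client = OpenAI(api_key=api_key)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": user_message}],
    )
    return resp.choices[0].message.content
```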

---

## 🧩 Configuration notes

- FAISS is used when available; otherwise the retriever falls back to NumPy dot-product search.
- The retriever loads vectors from the datasets and keeps a compressed cache at runtime (`/tmp/rag_index.npz`) to speed up cold starts (both behaviours are sketched below).
- The Top-K slider in the UI controls how many passages are retrieved and passed to the LLM.
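
Under those assumptions, the FAISS-or-NumPy fallback and the compressed cache could look roughly like this (a sketch, not the actual retriever code; `build_fn` is a hypothetical callable that produces the passage matrix):

```python
import os

import numpy as np

CACHE_PATH = "/tmp/rag_index.npz"

def load_vectors(build_fn):
    """Load passage vectors from the compressed cache, rebuilding it if missing."""
    if os.path.exists(CACHE_PATH):
        return np.load(CACHE_PATH)["vectors"]
    vectors = build_fn()  # e.g. pulled from the datasets' embedding column
    np.savez_compressed(CACHE_PATH, vectors=vectors)
    return vectors

def search(query_vec, vectors, top_k=5):
    """Use FAISS when it is installed, otherwise plain NumPy dot-product search."""
    try:
        import faiss
        index = faiss.IndexFlatIP(vectors.shape[1])  # inner-product (dot-product) index
        index.add(vectors.astype(np.float32))
        _, ids = index.search(query_vec.astype(np.float32)[None, :], top_k)
        return ids[0]
    except ImportError:
        scores = vectors @ query_vec
        return np.argsort(-scores)[:top_k]
```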

---

## πŸš€ Run locally

### 1) Clone & install
```bash
git clone https://huggingface.co/spaces/edouardfoussier/rag-rh-assistant
cd rag-rh-assistant
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

### 2) Configure environment
Key env vars (see the snippet after this list):
- `HF_API_TOKEN` β†’ required for embeddings via the HF Inference API
- `HF_EMBEDDINGS_MODEL` β†’ defaults to `BAAI/bge-m3`
- `EMBED_COL` β†’ name of the embedding column in the dataset (defaults to `embeddings_bge-m3`)
- `OPENAI_API_KEY` β†’ optional at startup (you can also enter it in the UI)
- `LLM_MODEL` β†’ e.g. `gpt-4o-mini` (configurable)
- `LLM_BASE_URL` β†’ defaults to `https://api.openai.com/v1`
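
For reference, reading those variables with the listed defaults might look like this at startup (illustrative only; the exact logic in `app.py` may differ):

```python
import os

HF_API_TOKEN = os.environ["HF_API_TOKEN"]  # required: embeddings via the HF Inference API
HF_EMBEDDINGS_MODEL = os.getenv("HF_EMBEDDINGS_MODEL", "BAAI/bge-m3")
EMBED_COL = os.getenv("EMBED_COL", "embeddings_bge-m3")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")  # optional: can also be entered in the UI
LLM_MODEL = os.getenv("LLM_MODEL", "gpt-4o-mini")
LLM_BASE_URL = os.getenv("LLM_BASE_URL", "https://api.openai.com/v1")
```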

### 3) Launch
```bash
python app.py
```

Open `http://127.0.0.1:7860` and enter your OpenAI API key in the sidebar (or set it in `.env`).

---

## πŸ“Š Roadmap

- Reranking (cross-encoder)
- Multi-turn memory
- More datasets (other ministries, codes)
- Hallucination checks & eval (faithfulness)
- Multi-LLM backends

---

## πŸ™Œ Credits

- Original data: [**AgentPublic**](https://huggingface.co/datasets/AgentPublic)
- Built with: Hugging Face Spaces, Gradio, FAISS, OpenAI