---
license: mit
title: 🇫🇷 Assistant RH — RAG Chatbot
sdk: gradio
emoji: 📚
colorFrom: indigo
colorTo: purple
app_file: app.py
pinned: true
short_description: 👉 RAG-powered AI assistant for French Human Resources
tags:
- gradio
- rag
- faiss
- openai
- hr
- human-resources
- law
- france
- french
- chatbot
thumbnail: >-
https://cdn-uploads.huggingface.co/production/uploads/6668057ef7604601278857f5/JeivLn409aMRCqx6RwO2J.png
---
# 🇫🇷 RAG-powered HR Assistant
👉 **An AI assistant specialised in French Human Resources**
Built with **Retrieval-Augmented Generation (RAG)** on top of **official public datasets**.
It retrieves trusted information, generates concise answers, and always cites its sources.
🚀 **Live demo on Hugging Face**: [![Hugging Face Space](https://img.shields.io/badge/🤗-HuggingFace%20Space-blue)](https://huggingface.co/spaces/edouardfoussier/rag-rh-assistant)
![App Screenshot](assets/screenshot2.png)
---
## ✨ What is this?
This project is an **AI assistant** for HR topics covering **French labor law and public-administration HR practices**.
It combines **retrieval** over trusted sources with **LLM synthesis**, and cites its sources.
- UI: **Gradio**
- Retrieval: **FAISS** (fallback: NumPy)
- Embeddings: **HF Inference API**
- LLM: **OpenAI** (BYO API Key)
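As a rough illustration, the embedding step can be wired to the HF Inference API along these lines (a minimal sketch assuming `huggingface_hub`'s `InferenceClient`; the app's actual code may differ):
```python
import os
from huggingface_hub import InferenceClient

# Model and token come from the environment (see "Configure environment" below)
client = InferenceClient(
    model=os.getenv("HF_EMBEDDINGS_MODEL", "BAAI/bge-m3"),
    token=os.environ["HF_API_TOKEN"],
)

def embed(text: str):
    # feature_extraction returns the embedding vector for the input text
    return client.feature_extraction(text)
```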
---
## 📚 Datasets & Attribution
This space relies on **public HR datasets** curated by [**AgentPublic**](https://huggingface.co/datasets/AgentPublic):
- [Service-Public dataset](https://huggingface.co/datasets/AgentPublic/service-public)
- [Travail-Emploi dataset](https://huggingface.co/datasets/AgentPublic/travail-emploi)
For this project, I built **cleaned and filtered derivatives** hosted under my profile:
- [edouardfoussier/service-public-filtered](https://huggingface.co/datasets/edouardfoussier/service-public-filtered)
- [edouardfoussier/travail-emploi-clean](https://huggingface.co/datasets/edouardfoussier/travail-emploi-clean)
---
## ⚙️ How it works
1. **Question** → The user asks in French (e.g., “DPAE : quelles obligations ?”).
2. **Retrieve** → FAISS searches the semantic vectors built from the datasets (see the sketch below).
3. **Synthesize** → The LLM writes a concise, factual answer with citations `[1], [2], …`.
4. **Explain** → The “Sources” panel shows the original articles used to generate the answer.
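A minimal sketch of steps 2 and 3 (the names `index`, `passages`, `embed`, and `llm` are illustrative helpers, not the app's exact code):
```python
import numpy as np

def answer(question: str, index, passages, embed, llm, top_k: int = 5) -> str:
    # Step 2: embed the question and search the FAISS index for the top-k passages
    q = np.asarray(embed(question), dtype="float32").reshape(1, -1)
    _, ids = index.search(q, top_k)
    hits = [passages[i] for i in ids[0]]

    # Step 3: number the passages so the model can cite them as [1], [2], ...
    context = "\n\n".join(f"[{n + 1}] {p}" for n, p in enumerate(hits))
    prompt = (
        "Answer concisely and factually, citing your sources as [1], [2], ...\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm(prompt)
```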
---
## 🔑 BYOK (Bring Your Own Key)
The app never stores your OpenAI key; it's used in-session only.
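A minimal sketch of this pattern, assuming the OpenAI Python SDK (illustrative, not the app's exact code):
```python
from openai import OpenAI

def complete(prompt: str, user_api_key: str, model: str = "gpt-4o-mini") -> str:
    # A fresh client is built per request; the key only lives in memory for the session
    client = OpenAI(api_key=user_api_key)
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```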
---
## 🧩 Configuration notes
- FAISS is used when available; otherwise retrieval falls back to a NumPy dot-product search (see the sketch below).
- The retriever loads vectors from the datasets and keeps a compressed cache at runtime (`/tmp/rag_index.npz`) to speed up cold starts.
- The Top-K slider in the UI controls both how many passages are retrieved and how many are passed to the LLM.
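A minimal sketch of the fallback and caching behaviour described above (illustrative names; the app's actual loader may differ):
```python
import os
import numpy as np

CACHE_PATH = "/tmp/rag_index.npz"

def load_vectors(build_vectors):
    # Reuse the compressed cache on warm starts; rebuild it from the datasets otherwise
    if os.path.exists(CACHE_PATH):
        return np.load(CACHE_PATH)["vectors"]
    vectors = build_vectors()
    np.savez_compressed(CACHE_PATH, vectors=vectors)
    return vectors

def top_k_ids(query_vec, vectors, k=5):
    try:
        import faiss  # preferred path: FAISS inner-product index
        index = faiss.IndexFlatIP(vectors.shape[1])
        index.add(vectors.astype("float32"))
        _, ids = index.search(query_vec.astype("float32").reshape(1, -1), k)
        return ids[0]
    except ImportError:
        # Fallback path: plain NumPy dot-product search
        scores = vectors @ query_vec
        return np.argsort(-scores)[:k]
```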
---
## 🚀 Run locally
### 1) Clone & install
```bash
git clone https://huggingface.co/spaces/edouardfoussier/rag-rh-assistant
cd rag-rh-assistant
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
### 2) Configure environment
Key environment variables (an example `.env` follows):
- `HF_API_TOKEN` → required for embeddings via the HF Inference API
- `HF_EMBEDDINGS_MODEL` → defaults to `BAAI/bge-m3`
- `EMBED_COL` → name of the embedding column in the dataset (defaults to `embeddings_bge-m3`)
- `OPENAI_API_KEY` → optional at startup (you can also enter it in the UI)
- `LLM_MODEL` → e.g. `gpt-4o-mini` (configurable)
- `LLM_BASE_URL` → defaults to `https://api.openai.com/v1`
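For a local run, a `.env` file might look like this (placeholder values only):
```bash
HF_API_TOKEN=hf_xxxxxxxxxxxxxxxx
HF_EMBEDDINGS_MODEL=BAAI/bge-m3
EMBED_COL=embeddings_bge-m3
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxx
LLM_MODEL=gpt-4o-mini
LLM_BASE_URL=https://api.openai.com/v1
```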
### 3) Launch
```bash
python app.py
```
Open http://127.0.0.1:7860 and enter your OpenAI API key in the sidebar (or set it in `.env`).
---
## 📊 Roadmap
- Reranking (cross-encoder)
- Multi-turn memory
- More datasets (other ministries, codes)
- Hallucination checks & eval (faithfulness)
- Multi-LLM backends
---
## 🙌 Credits
- Original data: [**AgentPublic**](https://huggingface.co/datasets/AgentPublic)
- Built with: Hugging Face Spaces, Gradio, FAISS, OpenAI