Fiqa committed on
Commit
c494433
·
verified ·
1 Parent(s): defa466

Update README.md

Files changed (1)
  1. README.md +52 -0
README.md CHANGED
@@ -10,3 +10,55 @@ pinned: false
10
  ---
11
 
12
 
13
+
14
+ ---
15
+
16
+ # Chat With Documents 🤖📄
17
+
18
+ Welcome to the **Chat with Documents** app! 🚀 This Streamlit app lets you upload PDF and PPT files, extract their content, store the extracted text in a vector store, and interact with it using natural language queries! 🤖💬
19
+
20
+ Built with **LangChain**, **OpenAI**, **Streamlit**, and **Astra DB**, this project uses large language models (LLMs) to let users ask questions about their documents in natural language. 🧠
21
+
22
+ ---
23
+
24
+ ### 🚀 **Features**
25
+
26
+ - **PDF & PPT Extraction**: Upload PDF and PowerPoint files to extract text! 📄➡️📝
27
+ - **Vector Store**: Automatically stores extracted text in a **Cassandra** vector store (hosted on **Astra DB**). 🔍📚
28
+ - **Ask Anything**: Ask questions about the document and get answers powered by **OpenAI**! 🤖❓
29
+
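As a rough illustration of the extraction feature, the sketch below picks an extractor by file suffix. It is not taken from the app's source: `extract_text`, `extract_pdf_text`, `extract_pptx_text`, and the `EXTRACTORS` table are hypothetical names. The library calls (`PdfReader` from PyPDF2, `Presentation` from python-pptx) are imported lazily inside the functions, and python-pptx reads `.pptx` files only, so the legacy `.ppt` format is assumed to be out of scope here.

```python
from pathlib import Path


def extract_pdf_text(path):
    """Concatenate the text of every page in a PDF (requires PyPDF2)."""
    from PyPDF2 import PdfReader  # lazy import: only needed for PDFs

    reader = PdfReader(str(path))
    # extract_text() may return None or "" for pages without a text layer.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def extract_pptx_text(path):
    """Concatenate the text of every text-bearing shape (requires python-pptx)."""
    from pptx import Presentation  # lazy import: only needed for slides

    parts = []
    for slide in Presentation(str(path)).slides:
        for shape in slide.shapes:
            if shape.has_text_frame:
                parts.append(shape.text_frame.text)
    return "\n".join(parts)


# Hypothetical lookup table: file suffix -> extractor function.
EXTRACTORS = {".pdf": extract_pdf_text, ".pptx": extract_pptx_text}


def extract_text(path):
    """Dispatch to the extractor that matches the file's suffix."""
    suffix = Path(path).suffix.lower()
    try:
        extractor = EXTRACTORS[suffix]
    except KeyError:
        raise ValueError(f"unsupported file type: {suffix!r}") from None
    return extractor(path)
```

Keeping the suffix-to-function mapping in a table means another format could be supported by adding one entry rather than another `if` branch.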
30
+ ---
31
+
32
+ ### 🛠️ **Tech Stack**
33
+ - **Streamlit**: Frontend framework to interact with the app.
34
+ - **LangChain**: For seamless document processing and querying.
35
+ - **OpenAI**: For LLM integration to provide intelligent responses.
36
+ - **Astra DB**: Database for storing and managing vectorized text data.
37
+ - **Python Libraries**: PyPDF2, python-pptx, cassio, and more.
38
+
39
+ ---
40
+
41
+
42
+
43
+ ### 💡 **How It Works**
44
+
45
+ - Upload a **PDF** or **PPT** file using the file uploader. 📤
46
+ - The app extracts text from the file using **PyPDF2** (for PDFs) or **python-pptx** (for PPTs). 📄➡️📝
47
+ - The extracted text is split into manageable chunks using **LangChain's CharacterTextSplitter**. ✂️
48
+ - The chunks are then added to **Cassandra** as vectorized data using **OpenAI embeddings**. 🔄
49
+ - Ask any question about the content of your document, and the app will answer using **OpenAI**! 🤖💬
50
+
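The chunking step above can be illustrated with a minimal, standard-library sketch. This is not LangChain's `CharacterTextSplitter` (which splits on a separator and then merges pieces); it is a simpler fixed-size sliding window, and `split_text`, `chunk_size`, and `chunk_overlap` are names and defaults assumed for illustration only.

```python
def split_text(text, chunk_size=800, chunk_overlap=200):
    """Split text into fixed-size character windows that overlap.

    A simplified stand-in for a text splitter: each chunk holds at most
    `chunk_size` characters and repeats the last `chunk_overlap`
    characters of the previous chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Usage: three overlapping windows over a ten-character string.
print(split_text("abcdefghij", chunk_size=4, chunk_overlap=1))
```

The overlap keeps a sentence that straddles a chunk boundary partly visible in both neighbouring chunks, which tends to help retrieval later.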
51
+ ---
52
+
53
+ ### 🎯 **Why Use This?**
54
+
55
+ - **Make documents interactive**: Easily explore the content of your documents by asking questions.
56
+ - **Quick retrieval**: Because the text is stored as embeddings in a vector store, a query retrieves the most similar chunks instead of scanning the whole document.
57
+
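To make the retrieval idea concrete, here is a toy sketch of similarity search over chunks. It is an assumption-laden stand-in: word counts replace OpenAI embeddings, an in-memory list replaces the Cassandra vector store, and `embed`, `cosine`, and `top_k` are hypothetical helper names. Only the ranking principle (cosine similarity between query and chunk vectors) carries over.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


# Usage: rank three short chunks against a question.
chunks = [
    "Cassandra stores vectors",
    "Streamlit renders the page",
    "PDF text is extracted",
]
print(top_k("where are vectors stored", chunks, k=1))
```

A real vector store performs the same kind of ranking with dense embeddings and an index, so it does not have to compare the query against every chunk.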
58
+
59
+ ---
60
+
61
+ ### ✨ **Enjoy the App!** ✨
62
+ Now, go ahead and chat with your documents! 😄
63
+
64
+ ---