title: Chat With Documents
emoji: π
colorFrom: purple
colorTo: purple
sdk: streamlit
sdk_version: 1.13.0
app_file: app.py
pinned: false
Chat With Documents
Welcome to the Chat with Documents app! This Streamlit app lets you upload PDF and PPT files, extract their content, store the extracted text in a vector store, and query it in natural language.
Built with LangChain, OpenAI, Streamlit, and Astra DB, this project uses LLMs (Large Language Models) to let users chat with their documents.
Features
- PDF & PPT Extraction: Upload PDF and PowerPoint files to extract their text.
- Vector Store: Automatically stores the extracted text in a Cassandra vector store.
- Ask Anything: Ask questions about the document and get answers powered by OpenAI.
Tech Stack
- Streamlit: Frontend framework to interact with the app.
- LangChain: For seamless document processing and querying.
- OpenAI: For LLM integration to provide intelligent responses.
- Astra DB: Database for storing and managing vectorized text data.
- Python Libraries: PyPDF2, python-pptx, cassio, and more.
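As a rough illustration of how PyPDF2 and python-pptx can be used for text extraction, here is a minimal sketch. It is not taken from this app's `app.py`; the function names (`extract_pdf_text`, `extract_pptx_text`, `extract_text`) are hypothetical, and it assumes a PyPDF2 release that provides `PdfReader` (older releases used `PdfFileReader`). Note that python-pptx reads `.pptx` files, not the legacy binary `.ppt` format.

```python
from io import BytesIO
from pathlib import Path


def extract_pdf_text(data: bytes) -> str:
    """Concatenate the text of every page in a PDF (hypothetical helper)."""
    from PyPDF2 import PdfReader  # imported lazily so the module loads without PyPDF2

    reader = PdfReader(BytesIO(data))
    # Pages without a text layer (e.g. scans) may yield nothing, hence the `or ""`.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def extract_pptx_text(data: bytes) -> str:
    """Concatenate the text of every text-bearing shape on every slide."""
    from pptx import Presentation  # provided by the python-pptx package

    presentation = Presentation(BytesIO(data))
    parts = []
    for slide in presentation.slides:
        for shape in slide.shapes:
            if shape.has_text_frame:
                parts.append(shape.text_frame.text)
    return "\n".join(parts)


def extract_text(filename: str, data: bytes) -> str:
    """Dispatch on the file extension; raise ValueError for unsupported types."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        return extract_pdf_text(data)
    if suffix == ".pptx":
        return extract_pptx_text(data)
    raise ValueError(f"Unsupported file type: {suffix or filename}")
```

In a Streamlit app, the bytes would typically come from the uploaded file object, for example `extract_text(uploaded.name, uploaded.getvalue())`.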
How It Works
- Upload a PDF or PPT file using the file uploader.
- The app extracts text from the file using PyPDF2 (for PDFs) or python-pptx (for PPTs).
- The extracted text is split into manageable chunks using LangChain's CharacterTextSplitter.
- The chunks are embedded with OpenAI embeddings and stored in Cassandra as vectors.
- Ask a question about the content of your document, and the app answers using OpenAI.
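To make the chunking step concrete, here is a simplified, dependency-free sketch of splitting text into overlapping fixed-size chunks. It is a stand-in for illustration only: LangChain's CharacterTextSplitter splits on a separator and then merges the pieces up to a size limit, so its output will differ. The function name `chunk_text` and the default sizes are assumptions, not values from this app.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk reaches the end, to avoid a trailing fragment
        # that is entirely contained in the previous chunk.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlap trades some storage for better recall: each chunk is embedded on its own, so context lost at a boundary cannot be recovered at query time.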
Why Use This?
- Make documents interactive: Easily explore the content of your documents by asking questions.
- Quick retrieval: With the text stored in a vector store, you can query the content efficiently.
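To show why a vector store makes retrieval quick, here is a toy sketch of similarity search: each chunk is represented by a vector, and a query returns the chunks whose vectors point in the most similar direction. The two-dimensional vectors and chunk labels below are made up for illustration; in the app, embeddings come from OpenAI and the search runs inside the vector store rather than in Python.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    store: Sequence[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[float, str]]:
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up example data: (chunk text, embedding) pairs.
toy_store = [
    ("pricing slide", [1.0, 0.0]),
    ("roadmap slide", [0.0, 1.0]),
    ("pricing appendix", [0.9, 0.1]),
]
best = top_k([1.0, 0.0], toy_store, k=2)
```

The retrieved chunks are what gets passed to the LLM as context, so only the relevant parts of a document need to be read for each question.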
Enjoy the App!
Now, go ahead and chat with your documents!