metadata
title: PDF Chatbot
emoji: π
colorFrom: blue
colorTo: purple
sdk: streamlit
app_port: 7860
pinned: false
license: mit
PDF RAG Chatbot (Groq + LangChain)
A Retrieval-Augmented Generation (RAG) application that lets users:
- Upload a PDF
- Ask questions that are answered only from the PDF content
- Get answers generated by Groq-hosted LLMs

The app runs fully on CPU (Hugging Face free tier).
Features
- PDF upload & processing
- Intelligent text chunking
- Semantic search using embeddings
- Context-aware LLM responses
- Memory clear & health endpoints
- Fast inference via Groq API
Tech Stack
- Frontend: Streamlit
- Backend: FastAPI
- LLM: Groq (`llama-3.1-8b-instant`)
- Embeddings: `all-MiniLM-L6-v2`
- Vector DB: Chroma (in-memory)
- Frameworks: LangChain
- Deployment: Docker + Hugging Face Spaces
How It Works (RAG Pipeline)
- Upload PDF
- Split text into chunks
- Generate embeddings
- Store in vector database
- Retrieve relevant chunks
- Generate answer using Groq LLM
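
The steps above can be sketched in plain Python. This is an illustrative toy, not the app's actual code: the word-count `embed` function stands in for `all-MiniLM-L6-v2`, the in-memory list stands in for Chroma, and the final call to the Groq LLM is left out. All function names and default sizes here are assumptions made for the example; the fallback sentence is the one the assistant is documented to use.

```python
import math
import re
from collections import Counter

# Fallback sentence quoted in the Usage section of this README.
FALLBACK = "I cannot find this in the PDF."


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (hypothetical sizes)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy embedding: a word-count vector. A stand-in for a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Build a prompt that restricts the LLM to the retrieved context."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n"
        f'If the answer is not in the context, reply exactly: "{FALLBACK}"\n\n'
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

In the real app the prompt would be sent to the Groq model, and a sentence-embedding model would replace the word counts so that retrieval matches meaning rather than exact words.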
Usage
- Upload a PDF file
- Ask questions related to the document
- If the answer is not in the PDF, the assistant will reply:
"I cannot find this in the PDF."
Environment Variables
The following secret must be added in Hugging Face Spaces:
| Variable | Description |
|---|---|
| `GROQ_API_KEY` | Groq API key |
Do NOT commit `.env` files to the repository.
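
Hugging Face Spaces exposes secrets to the running app as environment variables, so the key can be read at startup. The following is a minimal sketch of that lookup; the helper name `get_groq_api_key` and its error message are hypothetical and not taken from the project's code.

```python
import os


def get_groq_api_key(env=None):
    """Return the Groq API key, or fail early with a clear message.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GROQ_API_KEY is not set; add it as a secret in the Space settings."
        )
    return key
```

Failing at startup rather than on the first question makes a missing secret obvious in the Space logs.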
Notes
- Runs on CPU only (no GPU required)
- Free-tier friendly
- First load may take a few minutes
- Space may sleep when idle
Author
Abhishek Saxena
M.Tech Data Science, IIT Roorkee
If you like this project
Give it a star on Hugging Face and feel free to fork!