DocuMind
Add documents and web pages, ask natural language questions, and review the exact snippets behind every response. DocuMind uses a corrective retrieval loop to retry weak searches before it answers.
Model
Gemini for answer generation and embeddings
Storage
Qdrant Cloud configured
Add source
Index PDFs, text files, CSV files, and web pages. Every answer stays tied to the content you explicitly ingested.
Workspace
No sources indexed yet
Upload PDF, TXT, or CSV
DocuMind accepts .pdf, .txt, and .csv files up to 10MB.
Drop a file here or click to browse
Server-side extraction, chunking, embeddings, and retrieval happen after upload.
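The upload limits stated above could be enforced with a check along these lines. This is a minimal sketch, not DocuMind's actual code: `checkUpload`, `ALLOWED_EXTENSIONS`, and `MAX_UPLOAD_BYTES` are hypothetical names, and it assumes "10MB" means 10 * 1024 * 1024 bytes.

```typescript
// Hypothetical upload check; names and the exact byte limit are assumptions.
const ALLOWED_EXTENSIONS: string[] = [".pdf", ".txt", ".csv"];
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024; // assumes "10MB" means 10 MiB

type UploadCheck = { ok: boolean; reason?: string };

function checkUpload(fileName: string, sizeBytes: number): UploadCheck {
  // Take everything from the last dot onward as the extension, case-insensitively.
  const dot = fileName.lastIndexOf(".");
  const extension = dot >= 0 ? fileName.slice(dot).toLowerCase() : "";
  if (ALLOWED_EXTENSIONS.indexOf(extension) === -1) {
    return { ok: false, reason: "Unsupported file type" };
  }
  if (sizeBytes > MAX_UPLOAD_BYTES) {
    return { ok: false, reason: "File is larger than 10MB" };
  }
  return { ok: true };
}
```

Running a check like this on the server as well as in the browser would keep the limit in force even if the client-side form is bypassed.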
Indexing status
Extracting text
Read the source and extract clean text with source metadata where available.
Splitting into chunks
Break the source into overlapping sections for semantic retrieval.
Creating embeddings
Convert each chunk into a vector using Gemini embeddings for semantic retrieval.
Saving to vector database
Store chunk vectors in Qdrant Cloud, or in an in-memory fallback for local use.
Ready to answer
The indexed workspace is ready for grounded questions and source-backed answers.
Ready to index a document.
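The "Splitting into chunks" step can be pictured as a sliding window over the extracted text. The sketch below is one simple way to do it, assuming character-based windows. `chunkText` and its default `chunkSize` and `overlap` values are hypothetical, not DocuMind's actual chunker or settings.

```typescript
// Hypothetical chunker: chunkSize and overlap are illustrative, not DocuMind's settings.
function chunkText(text: string, chunkSize = 800, overlap = 200): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Expected 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window already reached the end
  }
  return chunks;
}
```

Each chunk repeats the last `overlap` characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk whenever it is no longer than the overlap.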
Chat
Add one or more sources first, then ask questions like "Summarize the main argument" or "What does the policy say about pricing?"
Sources
Retrieved source snippets appear here after each answer so you can verify the context behind the response.
How it works
DocuMind keeps the pipeline transparent from ingestion through answer generation, and it retries retrieval only when the first pass looks weak.
Users add PDFs, text files, CSV files, or a web page URL through the Next.js interface.
Server-side code extracts readable text and preserves source metadata, including PDF page numbers and source labels.
A lightweight custom chunker splits each source into overlapping sections for better semantic retrieval.
Each chunk is embedded with Gemini and stored in Qdrant Cloud, or in an in-memory store for local use.
At question time, DocuMind embeds the query and retrieves the most relevant chunks from the indexed workspace.
If the first retrieval looks weak, Gemini rewrites the query for retrieval and DocuMind runs a second pass before answering.
Only the final retrieved context is sent to Gemini, which answers concisely and cites the supporting chunks.
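The retrieve, check, and retry steps above can be sketched as a small loop over an in-memory store. This is an illustration under stated assumptions, not DocuMind's implementation. Every name here is hypothetical, `embed` and `rewriteQuery` are injected stand-ins for the Gemini calls (which would be asynchronous in practice), and "looks weak" is assumed to mean that the best cosine score falls below a threshold, since the page does not say how weakness is judged.

```typescript
type Chunk = { id: string; text: string; vector: number[] };
type Hit = { chunk: Chunk; score: number };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Score every stored chunk against the query vector and keep the k best.
function topK(queryVector: number[], store: Chunk[], k: number): Hit[] {
  return store
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Assumed weakness test: no hits, or the best score is under minScore.
function looksWeak(hits: Hit[], minScore: number): boolean {
  return hits.length === 0 || hits[0].score < minScore;
}

function correctiveRetrieve(
  question: string,
  store: Chunk[],
  embed: (text: string) => number[],
  rewriteQuery: (question: string) => string,
  k = 3,
  minScore = 0.5
): { hits: Hit[]; retried: boolean } {
  const firstPass = topK(embed(question), store, k);
  if (!looksWeak(firstPass, minScore)) {
    return { hits: firstPass, retried: false };
  }
  // One corrective pass: rewrite the query, search again, and use that result.
  const secondPass = topK(embed(rewriteQuery(question)), store, k);
  return { hits: secondPass, retried: true };
}
```

Capping the loop at a single retry keeps latency bounded, and returning the `retried` flag lets the interface show when the answer came from a rewritten query.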
Why DocuMind
DocuMind helps users ask questions over PDFs, text files, CSV files, and indexed web pages without losing track of where the answer came from.
Each response is generated from retrieved source snippets, so users can review the supporting context instead of trusting a black-box answer.
Use it for study notes, reports, policies, research papers, product documents, and other sources where grounded answers matter.