RAGVersion¶
Async-first version tracking system for RAG applications
RAGVersion solves the critical problem of keeping vector databases synchronized with changing source documents in Retrieval-Augmented Generation (RAG) applications.
🎉 New in v0.10.0: Chunk-Level Versioning
Chunk-level tracking cuts embedding costs by 80-95% in the average use case, because only the parts of a document that actually changed are re-embedded.
Why RAGVersion?¶
When building RAG applications, you face a challenge: documents change, but vector databases don't update automatically. RAGVersion provides:
- ✅ Automatic change detection - Know exactly which documents changed
- ✅ Chunk-level versioning - Track changes at chunk granularity (80-95% cost savings) 🆕
- ✅ Version history - Complete audit trail of all changes
- ✅ Cost optimization - Only re-index changed documents and chunks
- ✅ Production-ready - Resilient error handling and async architecture
- ✅ Framework integrations - Works with LangChain, LlamaIndex, and custom pipelines
Quick Start¶
Install RAGVersion:
pip install ragversion
Track your documents:
import asyncio

from ragversion import AsyncVersionTracker
from ragversion.storage import SupabaseStorage


async def main():
    tracker = AsyncVersionTracker(
        storage=SupabaseStorage.from_env()
    )
    # Track a directory
    result = await tracker.track_directory(
        "./documents",
        patterns=["*.pdf", "*.docx"],
        recursive=True
    )
    print(f"Changes detected: {result.success_count}")


asyncio.run(main())
Key Features¶
🚀 Async-First Architecture¶
Built from the ground up for Python's async/await patterns, enabling efficient concurrent processing.
📊 Change Detection¶
Automatic content-based change detection using hashing - no manual tracking needed.
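To make the idea concrete, here is a minimal sketch of hash-based change detection using only the standard library. It is not RAGVersion's implementation: the function names `content_hash` and `detect_changes`, the choice of SHA-256, and the in-memory dictionaries are all assumptions made for illustration.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return a hex SHA-256 digest of the raw document content."""
    return hashlib.sha256(data).hexdigest()


def detect_changes(previous: dict, current: dict) -> dict:
    """Compare stored hashes against current content.

    previous maps path -> stored hash; current maps path -> raw bytes.
    (Hypothetical data shapes, chosen for this example only.)
    """
    current_hashes = {path: content_hash(data) for path, data in current.items()}
    return {
        "created": sorted(p for p in current_hashes if p not in previous),
        "modified": sorted(
            p for p, h in current_hashes.items()
            if p in previous and previous[p] != h
        ),
        "deleted": sorted(p for p in previous if p not in current_hashes),
    }


previous = {"a.md": content_hash(b"hello"), "b.md": content_hash(b"world")}
current = {"a.md": b"hello", "b.md": b"world!", "c.md": b"new"}
changes = detect_changes(previous, current)
print(changes)
```

Because the comparison uses content rather than timestamps, a file that is re-saved without edits is not reported as modified.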
🔄 Batch Processing¶
Process thousands of documents efficiently with parallel workers and resilient error handling.
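The general pattern behind parallel workers with resilient error handling can be sketched with `asyncio` alone. This is an illustration under stated assumptions, not RAGVersion's API: `process_batch`, `max_workers`, and `fake_worker` are hypothetical names, and the worker stands in for whatever parsing and hashing a real pipeline would do.

```python
import asyncio


async def process_batch(items, worker, max_workers=4):
    """Run worker(item) for every item with at most max_workers in flight.

    Failures are collected instead of raised, so one bad document
    does not abort the rest of the batch.
    """
    items = list(items)
    semaphore = asyncio.Semaphore(max_workers)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    results = await asyncio.gather(
        *(guarded(item) for item in items), return_exceptions=True
    )
    succeeded = [i for i, r in zip(items, results) if not isinstance(r, BaseException)]
    failed = [(i, r) for i, r in zip(items, results) if isinstance(r, BaseException)]
    return succeeded, failed


async def fake_worker(path):
    # Hypothetical stand-in for parsing and hashing one document.
    await asyncio.sleep(0)
    if path.endswith(".bad"):
        raise ValueError(f"cannot parse {path}")
    return path


succeeded, failed = asyncio.run(
    process_batch(["a.pdf", "b.bad", "c.docx"], fake_worker)
)
print(succeeded, [path for path, _ in failed])
```

The semaphore bounds concurrency so that a large directory does not open thousands of files at once, while `return_exceptions=True` keeps each failure attached to the item that caused it.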
🗄️ Supabase Integration¶
Reliable PostgreSQL-backed storage with Supabase for production deployments.
🔗 Framework Integrations¶
Ready-to-use helpers for:
- LangChain - Sync with LangChain vector stores
- LlamaIndex - Sync with LlamaIndex indexes
- Custom - Build your own integrations
📝 Complete Documentation¶
15,000+ words of comprehensive documentation covering:
- Installation and setup
- Core concepts
- API reference
- Integration guides
- Best practices
- Troubleshooting
The Problem RAGVersion Solves¶
Without RAGVersion ❌¶
Documents change → Don't know which ones → Re-index everything →
Expensive API calls → Slow updates → Or risk serving stale data
With RAGVersion ✅¶
Documents change → Automatic detection → Only re-index changed docs →
Up to 99% cost savings → Fast updates → Always fresh data
Real-World Impact¶
Document-Level Tracking¶
| Metric | Without RAGVersion | With RAGVersion |
|---|---|---|
| Cost | $50 per update | $0.50 per update |
| Time | 33 minutes | 20 seconds |
| Files processed | 1,000 (all) | 10 (only changed) |
| Savings | - | 99% reduction |
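The figures in this table are consistent with a simple model in which cost and time scale linearly with the number of files re-indexed. The sketch below reproduces them under that assumption; the function name `incremental_update` and the linear model are illustrative, not something the table itself states.

```python
def incremental_update(total_files, changed_files, full_cost_usd, full_time_s):
    """Scale full re-index cost and time by the fraction of files that changed."""
    fraction = changed_files / total_files
    return {
        "cost_usd": full_cost_usd * fraction,
        "time_s": full_time_s * fraction,
        "savings_pct": (1 - fraction) * 100,
    }


# Inputs taken from the table: 1,000 files, 10 changed, $50 and 33 minutes for a full run.
estimate = incremental_update(
    total_files=1_000, changed_files=10, full_cost_usd=50.0, full_time_s=33 * 60
)
print(
    f"${estimate['cost_usd']:.2f}, "
    f"{estimate['time_s']:.0f} s, "
    f"{estimate['savings_pct']:.0f}% saved"
)
# prints: $0.50, 20 s, 99% saved
```

The savings therefore depend directly on how small the changed fraction is: if half the files change between runs, the same model gives 50%, not 99%.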
Chunk-Level Tracking (v0.10.0+) 🆕¶
| Scenario | Without Chunks | With Chunks | Savings |
|---|---|---|---|
| Documentation Update (1 paragraph in 100-page doc) | $2.50 (500 chunks) | $0.01 (2 chunks) | 99.6% |
| Code Repository (10 modified files out of 50) | $5.00 (1,000 chunks) | $0.15 (30 chunks) | 97% |
| Average Use Case | Full re-embedding | Smart chunk updates | 80-95% |
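One way to picture chunk-level tracking is to hash every chunk and re-embed only those whose hash was not seen in the previous version. The sketch below mirrors the first row of the table (500 chunks, 2 affected by an edit). It is an assumption-laden illustration: `chunk_hashes` and `chunks_to_embed` are hypothetical helpers, and RAGVersion's actual chunking and comparison strategy may differ.

```python
import hashlib


def chunk_hashes(chunks):
    """Hash each chunk's text so chunks can be compared by content."""
    return [hashlib.sha256(c.encode("utf-8")).hexdigest() for c in chunks]


def chunks_to_embed(old_chunks, new_chunks):
    """Return the new chunks whose content hash was not present before."""
    known = set(chunk_hashes(old_chunks))
    return [
        chunk
        for chunk, digest in zip(new_chunks, chunk_hashes(new_chunks))
        if digest not in known
    ]


# A 500-chunk document in which one edited paragraph touches two chunks.
old = [f"paragraph {i}" for i in range(500)]
new = list(old)
new[42] = "paragraph 42, revised"
new[43] = "paragraph 43, revised"

todo = chunks_to_embed(old, new)
savings = 1 - len(todo) / len(new)
print(len(todo), f"{savings:.1%}")
# prints: 2 99.6%
```

Matching on content hash rather than position means that unchanged chunks stay recognized even when an insertion earlier in the document shifts their index, provided the chunk boundaries themselves do not move.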
Use Cases¶
- 📚 Documentation Sites - Keep docs in sync with latest changes
- 💬 Customer Support - Always use up-to-date product information
- 🏢 Enterprise Knowledge Bases - Track document changes for compliance
- 🔬 Research Systems - Version control for research papers and datasets
- 📊 Content Management - Track changes across large content libraries
Installation Options¶
# Basic installation
pip install ragversion
# With document parsers (PDF, DOCX, etc.)
pip install "ragversion[parsers]"
# With LangChain integration
pip install "ragversion[langchain]"
# With LlamaIndex integration
pip install "ragversion[llamaindex]"
# Everything (recommended)
pip install "ragversion[all]"
Next Steps¶
- Getting Started - Install RAGVersion and track your first document in 5 minutes
- User Guide - Learn core concepts and how to use RAGVersion effectively
- Web UI - Use the modern web dashboard for visual document management and analytics
- Integrations - Integrate with LangChain, LlamaIndex, or build custom integrations
- API Reference - Detailed API documentation for all components
Community & Support¶
- 🐛 Report Issues - Bug reports and feature requests
- 💬 Discussions - Ask questions and share ideas
- 📦 PyPI Package - Install from PyPI
- 🌟 GitHub Repository - Star the project
License¶
RAGVersion is licensed under the MIT License.
Built with ❤️ for the RAG community