rag-implementation
Build a grounded RAG pipeline
When you need accurate answers from private documents and frequently changing data, retrieval-augmented generation (RAG) grounds model responses in cited sources. This skill explains RAG components and retrieval patterns for building grounded answers.
Download the skill ZIP
Upload in Claude
Go to Settings → Capabilities → Skills → Upload skill
Toggle on and start using
Test it
Using "rag-implementation". Outline a RAG pipeline for internal policies with citations.
Expected outcome:
- Load policy documents from a controlled folder and split them into 800-token chunks
- Create embeddings using text-embedding-ada-002 and store them in a vector database
- Use hybrid retrieval combining BM25 and semantic search, then rerank the top 20 results
- Answer with citations that reference source files and section numbers
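The hybrid step in the outcome above can be sketched with reciprocal rank fusion (RRF), a common way to merge BM25 and semantic rankings without tuning score scales. This is a plain-Python illustration, not code from the skill; the function name and inputs (two ranked lists of chunk IDs) are assumptions.

```python
def rrf_fuse(bm25_ranked, dense_ranked, k=60, top_n=20):
    """Merge two ranked lists of chunk IDs with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) per document; documents that
    rank well in both lists accumulate the highest fused score.
    """
    scores = {}
    for ranked in (bm25_ranked, dense_ranked):
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first; keep the top_n candidates for reranking.
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

With `top_n=20`, the fused list is exactly the candidate set the outcome above passes to the reranker.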
Using "rag-implementation". How do I choose chunk size for technical documentation?
Expected outcome:
- Use 500-1000 tokens with 10-20% overlap for most documentation
- Smaller chunks (400 tokens) improve retrieval precision for specific queries
- Larger chunks (1500+ tokens) work better for narrative content requiring context
- Add metadata like source file and section headers to enable filtering
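The size-and-overlap guidance above can be sketched as a simple sliding-window chunker. This is an illustrative stand-in (in practice you would count tokens with a real tokenizer such as tiktoken; here any pre-tokenized list works):

```python
def chunk_tokens(tokens, size=800, overlap=120):
    """Split a token list into fixed-size chunks with overlap.

    overlap=120 on size=800 gives 15%, inside the 10-20% range
    recommended for most documentation.
    """
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # final window already covers the tail
    return chunks
```

Each chunk repeats the last `overlap` tokens of its predecessor, so a sentence split at a boundary still appears whole in one chunk.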
Using "rag-implementation". What vector database should I use for a small team?
Expected outcome:
- Chroma for local testing and prototypes - simple setup, no infrastructure
- Pinecone for managed service with automatic scaling and minimal ops
- Weaviate for hybrid search needs combining vector and keyword matching
- Consider FAISS for complete offline deployment with no network required
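All of the stores above do roughly the same thing at query time: nearest-neighbor search over embedding vectors. A minimal in-memory sketch of that operation (pure Python, illustrative only; real stores add indexing, persistence, and metadata filtering):

```python
import math

class TinyVectorStore:
    """Toy in-memory store illustrating what Chroma/FAISS do at query time."""

    def __init__(self):
        self.items = []  # list of (doc_id, vector) pairs

    def add(self, doc_id, vector):
        self.items.append((doc_id, vector))

    def query(self, vector, top_k=3):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0

        # Brute-force scan; production stores replace this with an ANN index.
        scored = [(cosine(vector, v), doc_id) for doc_id, v in self.items]
        scored.sort(reverse=True)
        return [doc_id for _, doc_id in scored[:top_k]]
```

The brute-force scan is O(n) per query, which is exactly the cost that approximate-nearest-neighbor indexes in FAISS or Pinecone exist to avoid.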
Security Audit
Low Risk
This is a documentation-only skill containing Markdown guides with Python code examples. No executable scripts, network calls, or file access capabilities exist in the skill itself. All static findings are false positives from the scanner misinterpreting documentation patterns as security risks. Code examples demonstrate typical RAG patterns using LangChain APIs. No obfuscation, persistence mechanisms, or malicious patterns detected.
Risk Factors
🌐 Network access (3)
⚙️ External commands (37)
🔑 Env variables (1)
What You Can Build
Design a RAG chatbot
Plan a retrieval pipeline that grounds answers with citations from internal documentation.
Evaluate retrieval quality
Define metrics and test cases to measure accuracy, grounding, and retrieval quality.
Select vector storage
Compare vector database options and choose an approach that fits scale and deployment needs.
Try These Prompts
Create a simple RAG plan for a document Q&A app. Include data ingestion, chunking, embeddings, vector store choice, and retrieval chain.
Design a hybrid retrieval strategy using dense and BM25. Specify k values, weights, and when to rerank.
Propose a reranking approach with cross-encoders or MMR. Explain candidate size and selection criteria.
Draft an evaluation plan for a RAG system. Include accuracy, retrieval quality, groundedness metrics, and test case structure.
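An evaluation plan like the last prompt usually starts from per-query retrieval metrics. A minimal sketch of two common ones, recall@k and mean reciprocal rank, assuming you already know which chunk IDs are relevant for each test query (names are illustrative):

```python
def recall_at_k(retrieved, relevant, k=5):
    """Fraction of relevant chunk IDs that appear in the top-k results."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant) if relevant else 0.0

def mrr(retrieved, relevant):
    """Reciprocal rank of the first relevant hit, or 0.0 if none is found."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0
```

Averaging these over a fixed test set gives a regression signal: a chunking or retrieval change that drops average recall@k is a red flag before you even look at answer quality.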
Best Practices
- Use metadata for filtering and debugging.
- Combine hybrid search with reranking for top results.
- Track retrieval metrics during evaluation.
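The reranking practice above can be illustrated with Maximal Marginal Relevance (MMR), which trades query relevance against redundancy among already-selected chunks. This is a plain-Python sketch with a dot-product similarity stand-in, not a library API:

```python
def dot(a, b):
    """Illustrative similarity; real systems use cosine over embeddings."""
    return sum(x * y for x, y in zip(a, b))

def mmr(query_vec, candidates, sim, top_n=5, lam=0.5):
    """Select top_n items balancing relevance (lam) against diversity (1-lam).

    candidates: list of (doc_id, vector) pairs; sim: similarity function.
    """
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < top_n:
        def score(item):
            _, vec = item
            relevance = sim(query_vec, vec)
            redundancy = max((sim(vec, sv) for _, sv in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [doc_id for doc_id, _ in selected]
```

With `lam=1.0` this degenerates to pure relevance ranking; lowering `lam` penalizes near-duplicate chunks, which matters when overlapping chunks from the same section dominate the candidate set.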
Avoid
- Indexing documents without chunk overlap.
- Skipping citations in user-facing answers.
- Using only dense retrieval for keyword-heavy queries.
Frequently Asked Questions
Which platforms does this support?
What are the main limits?
How do I integrate it into my app?
Does it access my data?
What if retrieval quality is low?
How is this different from basic search?
Developer Details
Author
wshobson
License
MIT
Repository
https://github.com/wshobson/agents/tree/main/plugins/llm-application-dev/skills/rag-implementation
Ref
main
File structure
📄 SKILL.md