vector-database-engineer
Build scalable vector search systems
Implement production-ready vector databases and semantic search. This skill provides expert guidance on embedding strategies, index optimization, and RAG architecture for modern AI applications.
Download the skill ZIP
Upload it to Claude
Go to Settings → Capabilities → Skills → Upload skill
Enable it and get started
Try it out
Using "vector-database-engineer": How should I chunk 500-page PDFs for semantic search?
Expected result:
Use recursive character text splitting with 1000-1500 character chunks and 200 character overlap. This preserves context while maintaining semantic coherence. For technical documents, consider structure-aware chunking that respects section boundaries.
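The overlap idea described above can be sketched as a simple character-level sliding window. This is a minimal illustration only, not the skill's actual implementation; production pipelines typically use a structure-aware splitter that prefers paragraph and sentence boundaries.

```python
def chunk_text(text, chunk_size=1200, overlap=200):
    """Split text into fixed-size character chunks with overlap.

    A minimal sliding-window sketch: each chunk starts
    (chunk_size - overlap) characters after the previous one,
    so consecutive chunks share `overlap` characters of context.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# A 3000-character document yields three chunks of 1200, 1200, and 1000
# characters, with 200 characters shared between neighbors.
chunks = chunk_text("x" * 3000)
```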
Using "vector-database-engineer": Compare Pinecone vs Weaviate for production
Expected result:
Pinecone offers managed scalability with zero operational overhead but has vendor lock-in. Weaviate provides self-hosted flexibility with hybrid search built-in but requires infrastructure management. Choose Pinecone for rapid development, Weaviate for cost control at scale.
Security audit
All static analysis findings are false positives. The skill contains only documentation text with no executable code, network requests, or security risks. The 'external_commands' flag was triggered by the word 'open' in a documentation sentence, not by actual command execution. This is a legitimate educational skill about vector database engineering.
Quality assessment
What you can build
Build a RAG knowledge base
Design semantic search over documentation for AI-powered question answering
Implement recommendation engine
Create similarity-based product recommendations using vector embeddings
Optimize vector search performance
Tune indexing and chunking strategies for millions of vectors
Try these prompts
Help me choose between Pinecone, Weaviate, and Qdrant for a document search system with 1 million vectors
Design an embedding pipeline for technical documentation. Recommend chunking size, overlap, and model selection
Configure HNSW index parameters for 90% recall at under 50ms latency on 5 million vectors
Implement hybrid search combining vector similarity with keyword filters for product search
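As a toy illustration of the last prompt, hybrid search can be sketched as a keyword pre-filter followed by cosine-similarity ranking. This is a pure-Python sketch with hypothetical two-dimensional vectors; real systems use a vector database's built-in hybrid query and learned embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def hybrid_search(query_vec, keyword, docs, top_k=2):
    """Pre-filter documents by keyword, then rank survivors by
    vector similarity to the query. Filtering first shrinks the
    candidate set before the (more expensive) vector comparison."""
    candidates = [d for d in docs if keyword in d["text"].lower()]
    candidates.sort(key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return candidates[:top_k]

# Hypothetical product catalog with made-up embeddings.
docs = [
    {"text": "red running shoes", "vec": [0.9, 0.1]},
    {"text": "blue running shoes", "vec": [0.2, 0.8]},
    {"text": "red winter coat",    "vec": [0.8, 0.3]},
]
hits = hybrid_search([1.0, 0.0], "shoes", docs)
# The coat is excluded by the keyword filter even though its
# vector is close to the query.
```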
Best practices
- Always test embedding models on your specific domain before production deployment
- Start with simple chunking strategies before optimizing for complex document structures
- Monitor vector drift and plan periodic re-embedding cycles
- Use metadata filtering to reduce search space before vector queries
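The drift-monitoring practice above can be sketched as periodically re-embedding a fixed sentinel set and comparing the fresh vectors with the stored ones. This is a hypothetical illustration; the sentinel vectors here are made up, and in practice they would come from your embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def drift_score(old_vecs, new_vecs):
    """Mean cosine similarity between stored embeddings and freshly
    re-computed embeddings of the same sentinel texts.

    A score near 1.0 means the embeddings are stable; a clear drop
    suggests the model or data has drifted and a re-embedding cycle
    may be due.
    """
    sims = [cosine(o, n) for o, n in zip(old_vecs, new_vecs)]
    return sum(sims) / len(sims)

# Identical sentinel vectors → no drift detected.
old = [[1.0, 0.0], [0.0, 1.0]]
new = [[1.0, 0.0], [0.0, 1.0]]
score = drift_score(old, new)  # → 1.0
```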
Avoid
- Using larger embedding dimensions without testing if smaller models work for your use case
- Chunking documents without overlap, losing context between segments
- Skipping recall testing and only measuring latency
- Storing embeddings without their source text or metadata references
Frequently asked questions
What is the difference between HNSW and IVF indexing?
How do I choose embedding dimensions?
Should I use pre-filtering or post-filtering for metadata?
What vector database should I use?
How do I handle embedding drift?
Can I use this skill to directly query my vector database?
Developer details
Author
sickn33
License
MIT
Repository
https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/vector-database-engineer
Ref
main
File structure
📄 SKILL.md