Services
From quick audits to full RAG implementations. Find the right engagement for your stage.
Tier 1: $5,000 - $15,000
Search & Knowledge Audit
Timeline: 1-2 weeks
Best for: Companies unsure whether they have a problem or where to start
Deliverables
- Quantified baseline metrics (NDCG, MRR, Precision@K, Recall@K)
- Gap analysis vs. industry benchmarks
- Prioritized improvement roadmap
- Technology stack recommendations
- ROI projection for improvements
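For reference, the baseline metrics named above have standard definitions. The following is a minimal sketch for a single query, assuming binary relevance for Precision@K, Recall@K, and reciprocal rank, and graded gains for NDCG. The function names and sample document IDs are illustrative, not taken from any client tooling.

```python
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for d in ranked[:k] if d in relevant) / len(relevant)

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result (0 if none is found)."""
    for rank, d in enumerate(ranked, start=1):
        if d in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(runs):
    """MRR: average reciprocal rank over (ranked, relevant) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)

def ndcg_at_k(ranked, gains, k):
    """NDCG@k with graded gains given as {doc_id: gain}."""
    dcg = sum(gains.get(d, 0) / math.log2(rank + 1)
              for rank, d in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1)
               for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical query: the system returned four documents, two are relevant.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(precision_at_k(ranked, relevant, 2))  # 0.5
print(recall_at_k(ranked, relevant, 4))     # 1.0
print(reciprocal_rank(ranked, relevant))    # 0.5
```

Averaging these per-query values over an evaluation set gives the baseline numbers that later improvements are compared against.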
Process
1. Kickoff call to understand current stack and pain points
2. Access to search logs, queries, and relevance data
3. Evaluation dataset construction (if not available)
4. Baseline measurement and analysis
5. Findings presentation and roadmap delivery
Tier 2: $25,000 - $75,000
Search Relevance Optimization
Timeline: 4-8 weeks
Best for: Companies with existing search needing measurable improvement
Deliverables
- Hybrid search implementation (BM25 + semantic)
- Query understanding layer (intent classification, entity extraction)
- Cross-encoder reranking pipeline
- Relevance evaluation framework with ongoing metrics
- A/B testing infrastructure
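As an illustration of the hybrid search item above, one common way to combine a BM25 ranking with a semantic (embedding) ranking is reciprocal rank fusion (RRF). This is a sketch only: the page does not say which fusion method an engagement would use, and the function name, the constant `k=60`, and the sample document IDs are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so documents ranked well by several retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical outputs of a keyword retriever and an embedding retriever.
bm25_results = ["a", "b", "c"]
semantic_results = ["b", "d", "a"]
print(reciprocal_rank_fusion([bm25_results, semantic_results]))
# ['b', 'a', 'd', 'c']
```

RRF needs only ranks, not comparable scores, which makes it a convenient baseline before a cross-encoder reranker is applied to the fused list.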
You'll Get
- 20-40% improvement in relevance metrics (typical range)
- Production deployment, not a POC
- Documentation and handoff for your internal team
Tier 3: $50,000 - $150,000
Custom Embedding Development
Timeline: 6-12 weeks
Best for: Companies where generic embeddings fail (specialized domains, product catalogs)
Deliverables
- Domain-specific fine-tuned embedding model
- Training pipeline on proprietary data
- Evaluation against baseline and generic alternatives
- Deployment infrastructure
- Model update/retraining procedures
Why Custom Embeddings
Generic embeddings from OpenAI or Cohere weren't trained on your products, your terminology, or your users' language. On domain-specific retrieval, fine-tuned models typically outperform generic ones by 20-40%.
Tier 4: $75,000 - $150,000
RAG Pipeline Development
Timeline: 8-12 weeks
Best for: Companies building AI assistants, knowledge systems, or document Q&A
Deliverables
- End-to-end RAG architecture
- Hybrid retrieval layer with reranking
- Hallucination mitigation strategies
- Evaluation framework (retrieval + generation quality)
- Production deployment with monitoring
What Makes RAG Work
Most RAG failures are retrieval failures. I build the retrieval layer right—hybrid search, proper chunking, reranking—so generation has accurate context to work with.
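To make the chunking step concrete, here is a minimal sketch of fixed-size chunking with overlap, one of several possible strategies. The function name, the word-based splitting, and the default sizes are assumptions for illustration. Real pipelines often split on tokens, sentences, or document structure instead.

```python
def chunk_words(text, size=200, overlap=50):
    """Split text into chunks of `size` words, each sharing `overlap`
    words with the previous chunk so context is not cut at boundaries."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

# Tiny example: ten "words", windows of four with one word of overlap.
sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, size=4, overlap=1))
# ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

Each chunk is then embedded and indexed, and chunk size and overlap are tuned against retrieval metrics rather than fixed up front.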
Tier 5: $3,500 - $6,000/month
Retainer / Optimization
Timeline: Ongoing
Best for: Post-implementation clients needing continuous improvement
Deliverables
- Monthly relevance metric reviews
- Query log analysis and optimization recommendations
- Model refresh recommendations
- Architecture advisory (2-4 hours/month)
- Priority support and async communication
Compare All Tiers
| Tier | Price | Timeline | You Get | Best For |
|---|---|---|---|---|
| Audit | $5-15K | 1-2 wks | Baseline + roadmap | Where do you stand? |
| Relevance Opt | $25-75K | 4-8 wks | Hybrid search + reranking | Existing search needs improvement |
| Custom Embeddings | $50-150K | 6-12 wks | Fine-tuned model | Generic embeddings fail |
| RAG Development | $75-150K | 8-12 wks | Full RAG pipeline | Building AI assistant |
| Retainer | $3.5-6K/mo | Ongoing | Continuous optimization | Post-implementation |
Not sure where to start?
Book a 30-minute discovery call. We'll talk through your situation, and I'll recommend the right approach, even if that means not working with me.