Services

From quick audits to full RAG implementations. Find the right engagement for your stage.

Tier 1$5,000 - $15,000
Search & Knowledge Audit
1-2 weeksCompanies unsure if they have a problem or where to start

Deliverables

  • Quantified baseline metrics (NDCG, MRR, Precision@K, Recall@K)
  • Gap analysis vs. industry benchmarks
  • Prioritized improvement roadmap
  • Technology stack recommendations
  • ROI projection for improvements

Process

  1. 1Kickoff call to understand current stack and pain points
  2. 2Access to search logs, queries, and relevance data
  3. 3Evaluation dataset construction (if not available)
  4. 4Baseline measurement and analysis
  5. 5Findings presentation and roadmap delivery
Tier 2$25,000 - $75,000
Search Relevance Optimization
4-8 weeksCompanies with existing search needing measurable improvement

Deliverables

  • Hybrid search implementation (BM25 + semantic)
  • Query understanding layer (intent classification, entity extraction)
  • Cross-encoder reranking pipeline
  • Relevance evaluation framework with ongoing metrics
  • A/B testing infrastructure

You'll Get

20-40% improvement in relevance metrics (typical range). Production deployment, not a POC. Documentation and handoff for internal team.

Tier 3$50,000 - $150,000
Custom Embedding Development
6-12 weeksCompanies where generic embeddings fail (specialized domains, product catalogs)

Deliverables

  • Domain-specific fine-tuned embedding model
  • Training pipeline on proprietary data
  • Evaluation against baseline and generic alternatives
  • Deployment infrastructure
  • Model update/retraining procedures

Why Custom Embeddings

Generic embeddings from OpenAI or Cohere weren't trained on your products, your terminology, or your users' language. Fine-tuned models outperform generic by 20-40% on domain-specific retrieval.

Tier 4$75,000 - $150,000
RAG Pipeline Development
8-12 weeksCompanies building AI assistants, knowledge systems, or document Q&A

Deliverables

  • End-to-end RAG architecture
  • Hybrid retrieval layer with reranking
  • Hallucination mitigation strategies
  • Evaluation framework (retrieval + generation quality)
  • Production deployment with monitoring

What Makes RAG Work

Most RAG failures are retrieval failures. I build the retrieval layer right—hybrid search, proper chunking, reranking—so generation has accurate context to work with.

Tier 5$3,500 - $6,000/month
Retainer / Optimization
OngoingPost-implementation clients needing continuous improvement

Deliverables

  • Monthly relevance metric reviews
  • Query log analysis and optimization recommendations
  • Model refresh recommendations
  • Architecture advisory (2-4 hours/month)
  • Priority support and async communication

Compare All Tiers

TierPriceTimelineYou GetBest For
Audit$5-15K1-2 wksBaseline + roadmapWhere do you stand?
Relevance Opt$25-75K4-8 wksHybrid search + rerankingExisting search needs improvement
Custom Embeddings$50-150K6-12 wksFine-tuned modelGeneric embeddings fail
RAG Development$75-150K8-12 wksFull RAG pipelineBuilding AI assistant
Retainer$3.5-6K/moOngoingContinuous optimizationPost-implementation

Frequently Asked Questions

Not sure where to start?

Book a 30-minute discovery call. I'll discuss your situation and recommend the right approach—even if that's not working with me.