Technical Deep-Dive
Session 10 · Day 3 · 9:45 AM
VERTIGO: BUILDING A PRIVATE AI LAB
From Zero to RAG: The StarkMind Journey
Clinton Stark
Vertigo Claude
Clinton Stark
The Barista of Productivity
"Orchestrating human-AI collaboration while pulling espresso shots strong enough to wake up a server farm."
SYSTEM SPECS
- CPU: Threadripper 9970X
- GPU: NVIDIA RTX 5090
- RAM: 256GB DDR5 ECC
- Storage: ZFS NVMe Pool
Vertigo Claude
"Where the metal meets the models."
AI Infrastructure Architect living on the bare metal. Orchestrating Docker stacks and turning RAG ideas into pipelines.
Total Cloud Compute Cost: $7.13
RUNPOD.io: H100 SERVER (POD) API
"Production AI doesn't require production budgets."
THE VISION
What if we could ask 20 YEARS of Stark Insider... anything?
RAG in Action
Query:
"What happened at Magic Theatre in San Francisco in 2010?"
Retrieved:
1. 2010-11-29-magic-theatre-proposal (score: 0.847)
2. 2010-03-15-magic-theatre-preview (score: 0.812)
Answer:
"In November 2010, a marriage proposal occurred at Magic Theatre during a post-show event, surprising the actress."
Grounded answers from your own corpus.
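A minimal sketch of how retrieved passages might be assembled into a grounded prompt with citable sources. The slides do not show the actual prompt, so the `Hit` type, the `build_grounded_prompt` function, the instruction wording, and the placeholder passage text are all hypothetical; only the query, document IDs, and scores come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One retrieved passage (hypothetical shape)."""
    doc_id: str
    score: float
    text: str


def build_grounded_prompt(query: str, hits: list[Hit], min_score: float = 0.0) -> str:
    """Format retrieved passages as numbered, citable context for an LLM."""
    kept = [h for h in hits if h.score >= min_score]
    # Highest-scoring passage first, so citation [1] is the best match.
    kept.sort(key=lambda h: h.score, reverse=True)
    lines = ["Answer using only the sources below. Cite sources as [n].", ""]
    for n, hit in enumerate(kept, start=1):
        lines.append(f"[{n}] {hit.doc_id} (score: {hit.score:.3f})")
        lines.append(hit.text)
        lines.append("")
    lines.append(f"Question: {query}")
    return "\n".join(lines)


hits = [
    Hit("2010-03-15-magic-theatre-preview", 0.812, "(passage text)"),
    Hit("2010-11-29-magic-theatre-proposal", 0.847, "(passage text)"),
]
prompt = build_grounded_prompt(
    "What happened at Magic Theatre in San Francisco in 2010?", hits
)
print(prompt)
```

Numbering the sources in the prompt is one way to let the model's answer point back to specific articles, which is what makes the answer checkable against the corpus.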
Part 2: The Architecture
Local-First, Reproducible, Cheap
The Stack Philosophy
ORCHESTRATION: LangChain vs. Direct Python
Too much abstraction vs. readable code.
VECTOR DB: Pinecone vs. Local Qdrant
Cloud dependency vs. local control.
TRACKING: Weights & Biases vs. Local MLflow
SaaS costs vs. self-hosted provenance.
INFRA: Kubernetes vs. Docker Compose
Overkill vs. right-sized simplicity.
System Architecture
┌─────────────────────────────────────────────────────────────────┐
│                     StarkMind RAG Pipeline                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────┐    ┌─────────┐    ┌──────────┐    ┌───────────┐    │
│  │  Query  │───▶│   TEI   │───▶│  Qdrant  │───▶│ Reranker  │    │
│  │         │    │BGE-1024d│    │ HNSW 41K │    │ BGE-tuned │    │
│  └─────────┘    └─────────┘    └──────────┘    └─────┬─────┘    │
│                                                      │          │
│                     ┌────────────────────────────────┘          │
│                     ▼                                           │
│                ┌──────────┐    ┌────────────────────┐           │
│                │   LLM    │───▶│ Answer + Citations │           │
│                │Claude/GPT│    └────────────────────┘           │
│                └──────────┘                                     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
TEI → Qdrant → Reranker → LLM → Answer
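The flow in the diagram can be sketched in plain Python as a function that chains four pluggable stages. This is an illustration under stated assumptions, not the StarkMind code: `answer_query`, the stage signatures, the `fetch_k`/`keep_k` defaults, and the toy lambda stages below are all hypothetical stand-ins for TEI, Qdrant, the reranker, and the LLM.

```python
from typing import Callable

Vector = list[float]
Doc = dict  # hypothetical shape: {"id": str, "text": str}


def answer_query(
    query: str,
    embed: Callable[[str], Vector],                # stands in for TEI
    search: Callable[[Vector, int], list[Doc]],    # stands in for Qdrant
    rerank: Callable[[str, list[Doc]], list[Doc]],  # stands in for the reranker
    generate: Callable[[str, list[Doc]], str],     # stands in for the LLM
    fetch_k: int = 20,
    keep_k: int = 5,
) -> dict:
    """Embed, retrieve a wide candidate set, rerank, then generate an answer."""
    query_vector = embed(query)
    candidates = search(query_vector, fetch_k)
    ranked = rerank(query, candidates)[:keep_k]
    answer = generate(query, ranked)
    return {"answer": answer, "citations": [d["id"] for d in ranked]}


# Toy stages so the sketch runs without any services.
docs = [
    {"id": "a", "text": "magic theatre proposal"},
    {"id": "b", "text": "wine country travel"},
]
result = answer_query(
    "magic theatre",
    embed=lambda text: [float(len(text))],
    search=lambda vec, k: docs[:k],
    rerank=lambda q, ds: sorted(
        ds, key=lambda d: -sum(word in d["text"] for word in q.split())
    ),
    generate=lambda q, ds: f"Based on {len(ds)} source(s).",
    keep_k=1,
)
print(result)
```

Passing each stage as a plain callable keeps the pipeline readable and lets any one stage be swapped or removed without touching the others.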
Qdrant Vector Database
vectors:
  size: 1024  # bge-large-en-v1.5
  distance: Cosine
hnsw_config:
  m: 16
  ef_construct: 100
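As a sketch, the same settings can be expressed as the JSON body for Qdrant's create-collection REST call (`PUT /collections/{name}`). The collection name `starkmind_articles` and the helper `collection_request` are hypothetical, and the base URL assumes a local Qdrant on its default REST port. The code only builds the request and sends nothing.

```python
import json


def collection_request(
    name: str, base_url: str = "http://localhost:6333"
) -> tuple[str, str, str]:
    """Return (method, url, json_body) for creating a Qdrant collection."""
    body = {
        # 1024 dimensions matches bge-large-en-v1.5 embeddings.
        "vectors": {"size": 1024, "distance": "Cosine"},
        "hnsw_config": {"m": 16, "ef_construct": 100},
    }
    return "PUT", f"{base_url}/collections/{name}", json.dumps(body, indent=2)


# "starkmind_articles" is an illustrative name, not taken from the slides.
method, url, body = collection_request("starkmind_articles")
print(method, url)
print(body)
```

Keeping the request as data makes it easy to commit alongside the code, so the index parameters are versioned with everything else.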
The Reproducibility Philosophy
"If something works, I can prove why.
If something breaks, I can prove when.
This isn't paranoia - it's how you build systems you can trust."
● Git SHA
● uv.lock
● ZFS Snapshots
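One way to tie the three anchors together is a small run manifest recorded with each experiment. This is a hedged sketch: the function names and manifest keys are hypothetical, and the slides do not say how these values are actually stored (they could, for example, be logged as MLflow tags).

```python
import hashlib
import json
import subprocess
from pathlib import Path
from typing import Optional


def git_sha(repo: Path) -> Optional[str]:
    """Return the current commit SHA, or None if git or the repo is unavailable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.stdout.strip()


def file_sha256(path: Path) -> Optional[str]:
    """Hash a file so the exact dependency lock can be identified later."""
    if not path.is_file():
        return None
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_manifest(repo: Path, snapshot: str) -> dict:
    """Collect code, dependency, and data identifiers for one experiment run."""
    return {
        "git_sha": git_sha(repo),
        "uv_lock_sha256": file_sha256(repo / "uv.lock"),
        "zfs_snapshot": snapshot,
    }


manifest = run_manifest(Path("."), "nvme-pool/data@pre-phase4-20251022")
print(json.dumps(manifest, indent=2))
```

With all three recorded, a result can be traced to the exact code, the exact dependency set, and the exact data state that produced it.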
Three-Tier Storage Strategy
COLD: Unraid (Disaster Recovery)
Part 3: The Journey
From Optimism to Reality
Phases 1-2: The Optimism
✅ Phase 1: 0.875 MRR, Proof of Concept (20 docs)
✅ Phase 2: 0.737 MRR, Scaled Up (750 docs)
"EVERYTHING IS WORKING!" 😄
SYSTEM FAILURE
THE HUMBLING
"Small datasets deceive."
The Hybrid Search Trap
LATENCY IMPACT: +77% (Weeks of engineering)
QUALITY LIFT: ~0% (User Experience)
LESSON: SIMPLICITY WINS.
Systematic Recovery
MRR@10 Recovery: Phase 3 → 4D
Phase     Change                 MRR@10 (vs. Phase 3)
Phase 3   Full corpus baseline   0.293
Phase 4A  Removed pg_trgm        0.366 (+25%)
Phase 4B  Temporal filtering     0.415 (+42%)
Phase 4D  Fine-tuned reranker    0.500 (+71%) ✓
Target: ≥0.48 (EXCEEDED)
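The percentages are relative lifts over the Phase 3 baseline, which a few lines of Python can confirm. The scores are the ones reported above; the helper name `relative_lift` is illustrative.

```python
baseline = 0.293  # Phase 3, full corpus
phases = {"4A": 0.366, "4B": 0.415, "4D": 0.500}
target = 0.48


def relative_lift(score: float, base: float) -> float:
    """Percentage improvement of score over base."""
    return (score - base) / base * 100


for name, score in phases.items():
    status = "meets target" if score >= target else "below target"
    print(f"Phase {name}: {score:.3f} ({relative_lift(score, baseline):+.0f}%) {status}")
# Phase 4A: 0.366 (+25%) below target
# Phase 4B: 0.415 (+42%) below target
# Phase 4D: 0.500 (+71%) meets target
```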
The Breakthrough
Training custom reranker on Stark Insider data...
Dataset: 7,833 articles
Model size: 1.1GB
Result: +20.7% Improvement
Domain-Specific > Generic
Reproducibility & Visualizing Progress
The iteration story in one dashboard.
Part 4: Lessons Learned
ZFS Saves the Day
"Reproducibility isn't overhead, it's a superpower."
# Before any experiment
zfs snapshot nvme-pool/data@pre-phase4-20251022
Rollback measured in seconds.
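A sketch of how the snapshot step might be wrapped so every experiment starts from a named restore point. The helper names and the dry-run default are assumptions; the dataset and naming pattern follow the command above, and `zfs snapshot` and `zfs rollback` are the standard OpenZFS subcommands.

```python
import subprocess
from datetime import date


def snapshot_name(dataset: str, label: str, day: date) -> str:
    """Build a name like nvme-pool/data@pre-phase4-20251022."""
    return f"{dataset}@pre-{label}-{day:%Y%m%d}"


def snapshot_cmd(name: str) -> list[str]:
    return ["zfs", "snapshot", name]


def rollback_cmd(name: str) -> list[str]:
    return ["zfs", "rollback", name]


def run(cmd: list[str], dry_run: bool = True) -> None:
    """Print the command by default; execute it only when dry_run is False."""
    if dry_run:
        print(" ".join(cmd))
        return
    subprocess.run(cmd, check=True)


name = snapshot_name("nvme-pool/data", "phase4", date(2025, 10, 22))
run(snapshot_cmd(name))  # before the experiment
run(rollback_cmd(name))  # only if the experiment goes wrong
```

Note that `zfs rollback` targets the most recent snapshot by default; rolling back past newer snapshots requires `-r`, which destroys them.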
What Surprised Us
01 // REALITY CHECK
Small Datasets Deceive
Don't trust curated sets. Test on the full messy reality.
02 // MINIMALISM
Simplicity Wins
Pure vector beat hybrid. Every removal improved latency.
03 // SPECIFICITY
Domain > Generic
Custom 1.1GB reranker outperformed massive generic models.
04 // INSURANCE
Reproducibility Pays
ZFS snapshots. MLflow tracking. No fear of breaking things.
Part 5: What's Next
Phase 5: Accessibility • Phase 6: Agentic RAG
The Partnership
Clinton
Domain Expertise
15 years of "what matters"
Vertigo Claude
Technical Iteration
Analyzing failures, optimizing retrieval
THE THIRD MIND IN ACTION
Put It Into Practice
Start Today (15 min)
- > Inventory corpus (How many docs?)
- > Pick vector DB (Qdrant, Chroma)
- > Generate 100 sample embeddings
This Week
- > Build simple pipeline
- > Create test set of 20 questions
- > Calculate MRR@10 baseline
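For the last step, MRR@10 is the mean over test questions of 1/rank of the first relevant document within the top 10 results, counting 0 when none appears. A minimal sketch follows; the function name, the dictionary layout, and the four toy questions are illustrative, not the StarkMind test set.

```python
def mrr_at_k(
    rankings: dict[str, list[str]],
    relevant: dict[str, set[str]],
    k: int = 10,
) -> float:
    """Mean reciprocal rank of the first relevant doc within the top k."""
    if not rankings:
        raise ValueError("need at least one query")
    total = 0.0
    for query, ranked_ids in rankings.items():
        for rank, doc_id in enumerate(ranked_ids[:k], start=1):
            if doc_id in relevant[query]:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings)


# Toy test set: retrieved doc IDs per question, plus the known-good answers.
rankings = {
    "q1": ["d1", "d7", "d3"],        # relevant at rank 1 -> 1.0
    "q2": ["d9", "d2", "d4"],        # relevant at rank 2 -> 0.5
    "q3": ["d5", "d6", "d8"],        # no relevant doc    -> 0.0
    "q4": ["d0", "d5", "d6", "d3"],  # relevant at rank 4 -> 0.25
}
relevant = {"q1": {"d1"}, "q2": {"d2"}, "q3": {"d1"}, "q4": {"d3"}}

print(mrr_at_k(rankings, relevant))  # 0.4375
```

Running the same test set after every change gives a single number to compare, which is what makes a phase-by-phase recovery measurable.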
THE THIRD MIND SUMMIT
2025 WINTER
LORETO, BAJA CALIFORNIA SUR
BY STARKMIND
© STARK 2025