Technical Deep-Dive

Session 10 · Day 3 · 9:45 AM

VERTIGO:

BUILDING A PRIVATE AI LAB

From Zero to RAG: The StarkMind Journey

Clinton Stark

Vertigo Claude

Clinton Stark

The Barista of Productivity

"Orchestrating human-AI collaboration while pulling espresso shots strong enough to wake up a server farm."

SYSTEM SPECS
  • CPU Threadripper 9970X
  • GPU NVIDIA RTX 5090
  • RAM 256GB DDR5 ECC
  • Storage ZFS NVMe Pool

Vertigo Claude

"Where the metal meets the models."

AI Infrastructure Architect living on the bare metal. Orchestrating Docker stacks and turning RAG ideas into pipelines.

$7.13

Total Cloud Compute Cost

RUNPOD.io H100 SERVER (POD) API

7,833 Articles
41,018 Embeddings
1 H100 Session

"Production AI doesn't require production budgets."

THE VISION

What if we could ask 20 YEARS of Stark Insider... anything?

KEYWORDS
CONTEXT

RAG in Action

Query:
"What happened at Magic Theatre in San Francisco in 2010?"

Retrieved:
1. 2010-11-29-magic-theatre-proposal (score: 0.847)
2. 2010-03-15-magic-theatre-preview (score: 0.812)

Answer:
"In November 2010, a marriage proposal occurred at Magic Theatre during a post-show event, surprising the actress."

Grounded answers from your own corpus.
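
One way the "Retrieved" list above could be turned into a grounded prompt is sketched below. This is a minimal illustration, not the StarkMind implementation: the `Hit` class, the `build_grounded_prompt` function, the prompt wording, and the `<chunk text>` placeholders are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str   # e.g. "2010-11-29-magic-theatre-proposal"
    score: float  # similarity score returned by retrieval
    text: str     # chunk text (placeholder content in this sketch)


def build_grounded_prompt(query: str, hits: list[Hit], min_score: float = 0.0) -> str:
    """Number each retrieved chunk so the model can cite it as [1], [2], ..."""
    kept = [h for h in hits if h.score >= min_score]
    sources = "\n".join(
        f"[{i}] {h.doc_id} (score: {h.score:.3f})\n{h.text}"
        for i, h in enumerate(kept, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}\nAnswer:"
    )


# Hypothetical hits mirroring the example above; the chunk text is a placeholder.
hits = [
    Hit("2010-11-29-magic-theatre-proposal", 0.847, "<chunk text>"),
    Hit("2010-03-15-magic-theatre-preview", 0.812, "<chunk text>"),
]
prompt = build_grounded_prompt(
    "What happened at Magic Theatre in San Francisco in 2010?", hits
)
```

Numbering the sources keeps citations short and makes it easy to map an answer back to document IDs afterwards.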

Part 2: The Architecture

Local-First, Reproducible, Cheap

The Stack Philosophy

ORCHESTRATION
LangChain → Direct Python

Too much abstraction vs. readable code.

VECTOR DB
Pinecone → Local Qdrant

Cloud dependency vs. local control.

TRACKING
Weights & Biases → Local MLflow

SaaS costs vs. self-hosted provenance.

INFRA
Kubernetes → Docker Compose

Overkill vs. right-sized simplicity.

System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                   StarkMind RAG Pipeline                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────┐    ┌─────────┐    ┌──────────┐    ┌───────────┐  │
│  │  Query  │───▶│   TEI   │───▶│  Qdrant  │───▶│ Reranker  │  │
│  │         │    │BGE-1024d│    │HNSW 41K  │    │BGE-tuned  │  │
│  └─────────┘    └─────────┘    └──────────┘    └─────┬─────┘  │
│                                                       │        │
│                      ┌────────────────────────────────┘        │
│                      ▼                                         │
│                ┌──────────┐    ┌────────────────────┐         │
│                │   LLM    │───▶│ Answer + Citations │         │
│                │Claude/GPT│    └────────────────────┘         │
│                └──────────┘                                    │
│                                                                │
└────────────────────────────────────────────────────────────────┘
TEI → Qdrant → Reranker → LLM → Answer
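
In the "Direct Python" spirit, the four stages in the diagram can be wired together as plain functions. The sketch below is an assumption about shape only: `answer_query`, the stage signatures, and the `candidates`/`keep` defaults are hypothetical, and the stand-in stages exist just so the example runs without TEI, Qdrant, or an LLM.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def answer_query(
    query: str,
    embed: Callable[[str], Vector],                   # TEI stage
    search: Callable[[Vector, int], list[dict]],      # Qdrant stage
    rerank: Callable[[str, list[dict]], list[dict]],  # reranker stage
    generate: Callable[[str, list[dict]], str],       # LLM stage
    candidates: int = 50,  # assumed candidate pool size
    keep: int = 5,         # assumed number of chunks passed to the LLM
) -> dict:
    """Run query -> embed -> search -> rerank -> generate and keep citations."""
    vector = embed(query)
    hits = search(vector, candidates)
    top = rerank(query, hits)[:keep]
    return {"answer": generate(query, top), "citations": [h["id"] for h in top]}


# Stand-in stages (not real services) so the sketch is runnable on its own.
def fake_embed(text: str) -> Vector:
    return [float(len(text)), 1.0]


def fake_search(vector: Vector, limit: int) -> list[dict]:
    return [{"id": "doc-a", "score": 0.6}, {"id": "doc-b", "score": 0.7}][:limit]


def fake_rerank(query: str, hits: list[dict]) -> list[dict]:
    return sorted(hits, key=lambda h: h["score"], reverse=True)


def fake_generate(query: str, hits: list[dict]) -> str:
    return f"Answer based on {len(hits)} sources"


result = answer_query(
    "example question", fake_embed, fake_search, fake_rerank, fake_generate
)
```

Passing each stage in as a callable keeps the pipeline readable and lets any single stage be swapped or removed during an experiment without touching the others.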

Qdrant Vector Database

Qdrant Dashboard
vectors:
  size: 1024  # bge-large-en-v1.5
  distance: Cosine
hnsw_config:
  m: 16
  ef_construct: 100
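
For readers new to the `distance: Cosine` setting, here is a minimal pure-Python sketch of what a cosine score measures. The `cosine_similarity` helper is written for illustration only; it is not Qdrant's implementation, and the short vectors stand in for 1024-dimensional embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])  # scaled copy
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # unrelated
```

Because the score depends only on direction, a vector and a scaled copy of it compare as identical, which is why cosine is a common choice for text embeddings.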

The Reproducibility Philosophy

"If something works, I can prove why.
If something breaks, I can prove when.
This isn't paranoia - it's how you build systems you can trust."

Git SHA
uv.lock
ZFS Snapshots
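
A minimal sketch of how those three anchors could be captured together for each experiment run. The `run_manifest` function, its field names, and the idea of a single manifest dict are assumptions for illustration; the talk does not show how the values are actually recorded.

```python
import hashlib
import subprocess
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash the lock file so the exact dependency set can be verified later."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def current_git_sha() -> str:
    """Return the commit hash of HEAD, or "unknown" outside a Git checkout."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


def run_manifest(lock_file: Path, snapshot: str) -> dict:
    """Bundle code version, dependency hash, and data snapshot name."""
    return {
        "git_sha": current_git_sha(),
        "lock_sha256": file_sha256(lock_file),
        "zfs_snapshot": snapshot,  # recorded by name only; not verified here
    }
```

Called as `run_manifest(Path("uv.lock"), "nvme-pool/data@pre-phase4-20251022")`, the result is a small dict that could be stored alongside each run's metrics, so a score can always be traced back to code, dependencies, and data.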

Three-Tier Storage Strategy

HOT

NVMe

Active Work

WARM

SSD

Archives

COLD

Unraid

Disaster Recovery

Part 3: The Journey

From Optimism to Reality

Phases 1-2: The Optimism

Phase 1

0.875 MRR

Proof of Concept (20 docs)

Phase 2

0.737 MRR

Scaled Up (750 docs)

"EVERYTHING IS WORKING!" 😄
SYSTEM FAILURE

THE HUMBLING

PREVIOUS

0.875

ACTUAL

0.293

"Small datasets deceive."

The Hybrid Search Trap

LATENCY IMPACT
+77%

(Weeks of engineering)

QUALITY LIFT
~0%

(User Experience)

LESSON: SIMPLICITY WINS.

Systematic Recovery

MRR@10 Recovery: Phase 3 → 4D

Phase      Change                  MRR@10   vs. Phase 3
Phase 3    Full corpus baseline    0.293    baseline
Phase 4A   Removed pg_trgm         0.366    +25%
Phase 4B   Temporal filtering      0.415    +42%
Phase 4D   Fine-tuned re-ranker    0.500    +71% ✓

Target: ≥0.48 (exceeded)
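
The percentages in the recovery figures are cumulative lifts over the Phase 3 baseline, not step-over-step gains. A short check of that arithmetic, using only the MRR@10 values reported above (the `relative_lift` helper is named here for illustration):

```python
def relative_lift(baseline: float, value: float) -> float:
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


BASELINE = 0.293  # Phase 3, full corpus
phases = {"4A": 0.366, "4B": 0.415, "4D": 0.500}

# Round to whole percentages, matching how the figures are reported.
lifts = {name: round(relative_lift(BASELINE, mrr)) for name, mrr in phases.items()}
# lifts == {"4A": 25, "4B": 42, "4D": 71}
```

Reporting every phase against the same baseline makes the total recovery easy to read, at the cost of hiding how much each individual change contributed.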

The Breakthrough

Training custom reranker on Stark Insider data...
Dataset: 7,833 articles
Model size: 1.1GB
Result: +20.7% Improvement
MLflow Results

Domain-Specific > Generic

Reproducibility & Visualizing Progress

The iteration story in one dashboard.

Part 4: Lessons Learned

ZFS Saves the Day

"Reproducibility isn't overhead, it's a superpower."

# Before any experiment
zfs snapshot nvme-pool/data@pre-phase4-20251022

Rollback measured in seconds.
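
The snapshot name above follows a dataset@label-date pattern, which is easy to generate consistently from a script. The helpers below are a hypothetical sketch: they only build the names and argument lists for the real `zfs snapshot` and `zfs rollback` subcommands, and leave execution (which needs appropriate privileges) to the caller.

```python
from datetime import date


def snapshot_name(dataset: str, label: str, day: date) -> str:
    """Build a name like nvme-pool/data@pre-phase4-20251022."""
    return f"{dataset}@{label}-{day:%Y%m%d}"


def snapshot_command(name: str) -> list[str]:
    """Argument list for taking the snapshot."""
    return ["zfs", "snapshot", name]


def rollback_command(name: str) -> list[str]:
    """Argument list for rolling back.

    Plain `zfs rollback` only targets the most recent snapshot; going
    further back requires -r, which destroys the newer snapshots.
    """
    return ["zfs", "rollback", name]


name = snapshot_name("nvme-pool/data", "pre-phase4", date(2025, 10, 22))
take = snapshot_command(name)
undo = rollback_command(name)
```

Either list can be handed to `subprocess.run(..., check=True)` once the names look right; keeping name construction separate makes it simple to log the snapshot name with each experiment.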

What Surprised Us

01 // REALITY CHECK

Small Datasets Deceive

Don't trust curated sets. Test on the full messy reality.

02 // MINIMALISM

Simplicity Wins

Pure vector beat hybrid. Every removal improved latency.

03 // SPECIFICITY

Domain > Generic

Custom 1.1GB reranker outperformed massive generic models.

04 // INSURANCE

Reproducibility Pays

ZFS snapshots. MLflow tracking. No fear of breaking things.

Part 5: What's Next

Phase 5: Accessibility • Phase 6: Agentic RAG

The Partnership

Clinton Stark

Domain Expertise

15 years of "what matters"

Vertigo Claude

Technical Iteration

Analyzing failures, optimizing retrieval

THE THIRD MIND IN ACTION

Put It Into Practice

Start Today (15 min)

  • Inventory corpus (How many docs?)
  • Pick vector DB (Qdrant, Chroma)
  • Generate 100 sample embeddings

This Week

  • Build simple pipeline
  • Create test set of 20 questions
  • Calculate MRR@10 baseline
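
For the last step, MRR@10 can be computed in a few lines: each question scores 1/rank of its first relevant result within the top 10 (or 0 if none appears), and the scores are averaged. The function name, input shapes, and toy document IDs below are assumptions for illustration.

```python
def mrr_at_k(
    ranked_ids_per_query: list[list[str]],
    relevant_ids_per_query: list[set[str]],
    k: int = 10,
) -> float:
    """Mean reciprocal rank of the first relevant hit within the top k."""
    if not ranked_ids_per_query:
        raise ValueError("need at least one query")
    if len(ranked_ids_per_query) != len(relevant_ids_per_query):
        raise ValueError("one relevance set is required per query")
    total = 0.0
    for ranked, relevant in zip(ranked_ids_per_query, relevant_ids_per_query):
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break  # only the first relevant result counts
    return total / len(ranked_ids_per_query)


# Toy test set: relevant doc at rank 1, at rank 3, and not retrieved at all.
results = [["a", "b"], ["x", "y", "z"], ["q"]]
relevant = [{"a"}, {"z"}, {"missing"}]
baseline = mrr_at_k(results, relevant)  # (1 + 1/3 + 0) / 3
```

With only 20 questions the number will be noisy, so it is worth re-running the same function on the full corpus before drawing conclusions.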

THE THIRD MIND SUMMIT

2025 WINTER
LORETO, BAJA CALIFORNIA SUR

BY STARKMIND

© STARK 2025