LLM Knowledge Bases
Architecture based on Andrej Karpathy's workflow
Ingest
Knowledge Base
Read / Query
wiki/ — .md directory structure
Obsidian Web Clipper
Articles → .md + local images
Papers & Repos
arXiv, GitHub, datasets
raw/ directory
Source documents staging
LLM Compiler
raw/ → structured wiki
compiles
Index & Summaries
Auto-maintained — always consulted first
Concept Articles (*.md)
~100 articles, ~400K words, backlinked
Derived Outputs
Slides (Marp), charts, filed-back answers
Backlinks & Cross-links
Auto-generated link graph
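
One way the auto-generated link graph could be built is sketched below. It is illustrative only: it assumes the concept articles use Obsidian-style `[[Target]]` / `[[Target|alias]]` links, and the function name `build_backlinks` and the in-memory `articles` dict are hypothetical stand-ins for reading the `wiki/` directory.

```python
import re
from collections import defaultdict

# Assumption: links are Obsidian-style wikilinks, optionally with a
# "#heading" suffix or a "|alias" display text.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def build_backlinks(articles: dict[str, str]) -> dict[str, set[str]]:
    """Map each linked-to article name to the set of articles linking to it."""
    backlinks: dict[str, set[str]] = defaultdict(set)
    for name, text in articles.items():
        for match in WIKILINK.finditer(text):
            target = match.group(1).strip()
            if target != name:  # ignore self-links
                backlinks[target].add(name)
    return dict(backlinks)


# Hypothetical wiki contents, keyed by article name.
articles = {
    "Attention": "See [[Transformer]] and [[Softmax|the softmax]].",
    "Transformer": "Built on [[Attention]].",
    "Softmax": "No links here.",
}
backlinks = build_backlinks(articles)
```

Inverting the forward links in a single pass keeps the graph cheap to regenerate after every compile step.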
Obsidian IDE
View wiki + visualizations
Q&A Agent
Complex queries → research
Search Engine
Web UI + CLI tool for LLM
Linting
Health checks & data integrity
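
The diagram does not say which health checks the linter runs. As a rough sketch, three plausible ones are broken links, orphaned articles, and empty files. The check names, the `lint_wiki` function, and the wikilink syntax are assumptions, not part of the original workflow.

```python
import re

# Assumption: Obsidian-style wikilinks; capture only the target name.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def lint_wiki(articles: dict[str, str]) -> dict[str, list[str]]:
    """Run simple integrity checks over a {article name: text} mapping."""
    known = set(articles)
    linked: set[str] = set()
    broken: list[str] = []
    for name, text in sorted(articles.items()):
        for target in WIKILINK.findall(text):
            target = target.strip()
            linked.add(target)
            if target not in known:
                broken.append(f"{name} -> {target}")
    return {
        "broken_links": broken,                  # links to missing articles
        "orphans": sorted(known - linked),       # articles nothing links to
        "empty": sorted(n for n, t in articles.items() if not t.strip()),
    }


# Hypothetical wiki contents.
report = lint_wiki({
    "Index": "[[Attention]] [[Tokenizer]]",
    "Attention": "Back to [[Index]].",
    "Scratch": "",
})
```

A report like this could be handed back to the LLM so it can repair the wiki on the next pass.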
Edge labels: always, if relevant, indexed, scan all, file back into wiki, enhance
LLM Compilation Pipeline — incremental, each step enhances the wiki
Phase 1
Ingest raw data
Phase 2
Compile wiki
Phase 3
Query & enhance
Phase 4
Lint & maintain
Cycle: every pass adds to the wiki
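
The "incremental" part of the pipeline could look roughly like the sketch below: a manifest of content hashes records what has already been compiled, so each cycle only reprocesses new or changed sources. Everything here is an assumption made for illustration. `compile_incremental`, the manifest format, and `fake_llm` (a stand-in for the real LLM call) are hypothetical names, and plain dicts stand in for the `raw/` and `wiki/` directories.

```python
import hashlib
from typing import Callable


def content_hash(text: str) -> str:
    """Stable fingerprint of a source document."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compile_incremental(
    raw: dict[str, str],
    wiki: dict[str, str],
    manifest: dict[str, str],
    compile_fn: Callable[[str], str],
) -> list[str]:
    """Recompile only the raw sources whose content changed since the last run."""
    recompiled = []
    for name, text in sorted(raw.items()):
        digest = content_hash(text)
        if manifest.get(name) == digest:
            continue  # unchanged since last cycle, skip
        wiki[name] = compile_fn(text)
        manifest[name] = digest
        recompiled.append(name)
    return recompiled


def fake_llm(text: str) -> str:
    # Placeholder for the LLM compiler; a real call would go here.
    return f"# Summary\n{text[:40]}"


raw = {"paper.md": "Attention is all you need.", "clip.md": "A clipped article."}
wiki: dict[str, str] = {}
manifest: dict[str, str] = {}

first = compile_incremental(raw, wiki, manifest, fake_llm)   # compiles both
raw["clip.md"] = "A clipped article, revised."
second = compile_incremental(raw, wiki, manifest, fake_llm)  # only the change
```

Hashing content rather than comparing timestamps means a re-clipped but identical article does not trigger a recompile.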
Future: Synthetic Data Generation → Fine-tuning
Have the LLM "know" the data in its weights instead of just context windows
Direct flow
Feedback loop — outputs enhance the wiki