
How We Built LeetLLM

How LeetLLM turns research into curated lessons with research packets, article bundles, validation gates, generated diagrams, component-based illustrations, and a production web stack.

LeetLLM Team · June 21, 2026 · 8 min read

LeetLLM became useful when we treated curriculum like software: source material goes in, article bundles move through review, code and diagrams compile, and validators block sloppy pages before readers see them.

If you're building a learning site with LLMs, start there. Give the model strong context, ask for artifacts you can inspect, and make weak drafts cheap to catch.

What we were solving

Learning content has two common failure modes. One is broad writing that sounds confident but teaches little. The other is technically dense material that forgets learners need examples, diagrams, runnable code, and a path from basics to production.

That shaped the content system. LeetLLM has to support short lessons, long deep dives, blog posts, generated diagrams, citations, code examples, and visual explanations across AI engineering, LLM systems, RAG, agents, evaluation, inference, safety, and deployment.

Need	Bad version	LeetLLM version
Research	Unverified summary from memory	Source queue with papers, docs, repos, and dated claims
Structure	Loose draft	Article bundle with frontmatter, article text, figures, code, refs, and questions
Visuals	Static screenshots or hand-placed labels	TSX illustrations built from layout primitives
Code	Decorative snippets	Examples that can be tested or reviewed
Review	Read it once	Validators, style checks, reference checks, build checks, and live review

The pipeline doesn't remove editorial taste. It gives taste a place to run: reject weak examples, fix vague claims, and check the page the way a reader will see it.

Figure: LeetLLM content pipeline moving from source queue to research packet, article bundle, review gates, generated assets, and published page, with a revision loop. LeetLLM treats content like a build artifact: research, writing, visuals, code, and page output each have a place in the pipeline.

Pipeline, not prompt

The pipeline has this shape: collect evidence, plan the article, draft into a structured bundle, generate assets, validate the page, then revise until the lesson works.
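
As a rough illustration of that shape, here is a minimal Python sketch of a staged pipeline with a revision loop. The stage functions, the dictionary-based bundle, and the toy validation rules are all assumptions made for this example (asset generation is left out for brevity); the article does not show the real implementation.

```python
from typing import Callable

Bundle = dict
Stage = Callable[[Bundle], Bundle]


def collect_evidence(bundle: Bundle) -> Bundle:
    # Stand-in for building the source queue and research packet.
    return {**bundle, "sources": ["paper-a", "docs-b"]}


def plan_article(bundle: Bundle) -> Bundle:
    return {**bundle, "outline": ["problem", "pipeline", "gates"]}


def draft(bundle: Bundle) -> Bundle:
    # Toy behavior: the first draft lacks a failure case, the revision adds one.
    revision = bundle.get("revision", 0) + 1
    drafted = {**bundle, "revision": revision, "text": f"draft v{revision}"}
    if revision > 1:
        drafted["failure_case"] = "retrieval returns stale context"
    return drafted


def validate(bundle: Bundle) -> list[str]:
    # Return a list of problems; an empty list means the gates pass.
    problems = []
    if not bundle.get("sources"):
        problems.append("no sources")
    if "failure_case" not in bundle:
        problems.append("no failure case")
    return problems


def run_pipeline(slug: str, max_revisions: int = 3) -> Bundle:
    bundle: Bundle = {"slug": slug}
    for stage in (collect_evidence, plan_article):
        bundle = stage(bundle)
    problems = ["not drafted"]
    for _ in range(max_revisions):
        bundle = draft(bundle)
        problems = validate(bundle)
        if not problems:
            return {**bundle, "status": "published"}
    # Revisions ran out: the page stays blocked and the reasons stay visible.
    return {**bundle, "status": "blocked", "problems": problems}


result = run_pipeline("how-we-built-leetllm")
print(result["status"], "after", result["revision"], "drafts")
# prints: published after 2 drafts
```

The useful property is that a draft which never passes validation ends in an explicit blocked state with its problems attached, rather than being published by default.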

Figure: Source queue (papers, docs, repos) feeding a research packet (claims and links), an article bundle (article text, code, visuals), and validation gates.

That structure changes model behavior. The model sees a target audience, an outline, source notes, repository conventions, and explicit rules for diagrams, examples, and citations.

This is context engineering applied to education: choose what the model sees, control output shape, compress facts into a useful packet, and keep evidence close to the writing task.^[1] ^[2]

Source-first rule

We don't start by asking for a draft. We start by building a claim inventory: required facts, supporting sources, freshness, and the learner misconception each claim should correct.
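
One way to picture a claim inventory is as a list of small records that can be checked for gaps before any drafting starts. The sketch below is hypothetical: the `Claim` class, its field names, and the sample source ID are assumptions for illustration, not the format LeetLLM uses.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str
    source_ids: list[str] = field(default_factory=list)
    checked_on: str = ""     # ISO date the claim was last verified
    misconception: str = ""  # learner misconception the claim should correct


def inventory_gaps(claims: list[Claim]) -> dict[str, list[str]]:
    """Map each incomplete claim to the fields it is still missing."""
    gaps = {}
    for claim in claims:
        missing = [
            name
            for name in ("source_ids", "checked_on", "misconception")
            if not getattr(claim, name)
        ]
        if missing:
            gaps[claim.text] = missing
    return gaps


claims = [
    Claim(
        text="Retrieval helps when answers depend on fast-changing facts",
        source_ids=["rag-notes"],  # hypothetical source ID
        checked_on="2026-06-01",
        misconception="RAG always improves accuracy",
    ),
    Claim(text="Agents call tools"),  # no source, date, or misconception yet
]

gaps = inventory_gaps(claims)
print(gaps)
# prints: {'Agents call tools': ['source_ids', 'checked_on', 'misconception']}
```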

Article bundles

Each article lives as a small bundle: content, metadata, custom illustrations, generated assets, and references. That makes articles easy to move through the same checks as code.

Before writing, we want a compact spec:

article-bundle.json

{
  "slug": "how-we-built-leetllm",
  "audience": "engineers building AI learning products",
  "promise": "show the full content pipeline, not a vague AI-writing story",
  "must_include": [
    "research packet",
    "article bundle",
    "validation gates",
    "illustration framework",
    "production web stack"
  ],
  "evidence": [
    "repo conventions",
    "content guidelines",
    "external context engineering references"
  ]
}

That spec prevents the polished-overview failure. The article has a job, a reader, evidence, and artifacts that can be checked.
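
A spec like this is easy to check mechanically before any drafting happens. The following is a minimal sketch that assumes the five fields shown in article-bundle.json are all required; the function name and the error messages are invented for this example.

```python
import json

# Assumption: every field from the example spec is mandatory.
REQUIRED_FIELDS = ("slug", "audience", "promise", "must_include", "evidence")


def spec_problems(raw: str) -> list[str]:
    """Return reasons the spec is not ready for drafting (empty list = ready)."""
    try:
        spec = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(spec, dict):
        return ["spec must be a JSON object"]
    problems = [
        f"missing or empty: {name}" for name in REQUIRED_FIELDS if not spec.get(name)
    ]
    for name in ("must_include", "evidence"):
        if spec.get(name) and not isinstance(spec[name], list):
            problems.append(f"{name} must be a list")
    return problems


thin_spec = '{"slug": "how-we-built-leetllm", "promise": "explain the pipeline", "evidence": []}'
thin_problems = spec_problems(thin_spec)
print(thin_problems)
# prints: ['missing or empty: audience', 'missing or empty: must_include', 'missing or empty: evidence']
```

Treating an empty `evidence` list as a failure is a deliberate choice in this sketch: a spec with no evidence is the polished-overview failure in waiting.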

Research becomes teaching

Research packets aren't dumps. A useful packet separates facts, claims, examples, and risks:

Research field	Purpose
Source claim	Keeps factual statements traceable
Freshness date	Flags claims that can drift
Learner misconception	Turns facts into teaching moments
Concrete example	Forces explanation against a real scenario
Failure case	Shows what breaks when the idea is misused
Link target	Connects the article to the course path
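
To make the freshness date concrete, here is a small sketch that flags packet entries whose claims may have drifted. The key names mirror the table above, but the 180-day threshold, the sample values, and the function itself are assumptions for illustration.

```python
from datetime import date, timedelta


def stale_claims(packet: list[dict], today: date, max_age_days: int = 180) -> list[str]:
    """Return source claims whose freshness date is older than the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        entry["source_claim"]
        for entry in packet
        if date.fromisoformat(entry["freshness_date"]) < cutoff
    ]


# Field names follow the table; the values are made up for this example.
packet = [
    {
        "source_claim": "Retrieval can add stale or irrelevant context",
        "freshness_date": "2026-05-01",
        "learner_misconception": "Retrieval always improves accuracy",
        "concrete_example": "A support bot quoting a superseded policy page",
        "failure_case": "Irrelevant passages crowd out the one that matters",
        "link_target": "design-production-rag-pipeline",
    },
    {
        "source_claim": "Query routing decides which index gets searched",
        "freshness_date": "2025-09-01",
        "learner_misconception": "One index serves every question",
        "concrete_example": "Billing questions routed to the API docs index",
        "failure_case": "Correct answer exists but is never retrieved",
        "link_target": "design-production-rag-pipeline",
    },
]

stale = stale_claims(packet, today=date(2026, 6, 21))
print(stale)
# prints: ['Query routing decides which index gets searched']
```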

For example, a RAG lesson shouldn't say "retrieval improves accuracy" and stop. It should show when retrieval helps, when it adds stale or irrelevant context, how to evaluate groundedness, and why query routing matters. The lesson can then link naturally into production RAG pipelines, RAG evaluation, and LLM-as-judge evaluation.

The same rule applies to blog posts. A strong post about agents links into agent architecture, tool calling, context engineering, and SWE-bench instead of pretending every concept begins on that page.

Quality gates

The review loop checks whether a page teaches the idea, not whether Markdown merely parses. We care about factual grounding, structure, examples, code, visuals, references, and build output.

This tiny sketch captures one rule every draft should satisfy before human review:

content-quality-gate.py

# Fields every draft must carry before it reaches human review.
required = {
    "source_ids",
    "failure_case",
    "code_snippet",
    "visual_refs",
    "next_lesson_links",
}

# Example metadata for one draft.
article = {
    "source_ids": ["anthropic2025effectivecontext", "contexteng_survey2025"],
    "failure_case": "draft without concrete examples",
    "code_snippet": "content-quality-gate.py",
    "visual_refs": ["content_pipeline", "illustration_system"],
    "next_lesson_links": ["design-production-rag-pipeline", "llm-as-judge-automated-evaluation"],
}

# The set difference against the dict's keys leaves only the missing field names.
missing = sorted(required - article.keys())
print("ready for review" if not missing else f"missing: {', '.join(missing)}")

Real checks are broader: metadata schemas, broken reference IDs, code examples, Mermaid diagrams, illustration references, cover images, text density, layout warnings, TypeScript, lint, tests, and browser output. The point is making weak article states visible before a reader sees them: unreferenced figures, stale citations, broken code blocks, missing cover images, or pages that look fine in Markdown but fail in browser review.
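
Two of those weak states, unreferenced figures and broken reference IDs, can be caught with a few lines of set arithmetic. The sketch below assumes a made-up marker syntax (`[fig:name]` and `[ref:id]`) purely for illustration; the markers LeetLLM's validators actually parse are not shown in this article.

```python
import re

# Hypothetical marker syntax for this sketch only.
FIGURE_MARK = re.compile(r"\[fig:([a-z0-9_]+)\]")
REFERENCE_MARK = re.compile(r"\[ref:([a-z0-9_]+)\]")


def check_references(text: str, figures: set[str], references: set[str]) -> list[str]:
    """Compare markers used in the text against the bundle's declared assets."""
    used_figures = set(FIGURE_MARK.findall(text))
    used_refs = set(REFERENCE_MARK.findall(text))
    problems = []
    problems += [f"unreferenced figure: {name}" for name in sorted(figures - used_figures)]
    problems += [f"unknown figure: {name}" for name in sorted(used_figures - figures)]
    problems += [f"broken reference id: {ref}" for ref in sorted(used_refs - references)]
    return problems


text = (
    "The pipeline [fig:content_pipeline] applies context engineering "
    "[ref:contexteng_survey2025] [ref:missing_paper]."
)
problems = check_references(
    text,
    figures={"content_pipeline", "illustration_system"},
    references={"anthropic2025effectivecontext", "contexteng_survey2025"},
)
print(problems)
# prints: ['unreferenced figure: illustration_system', 'broken reference id: missing_paper']
```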

Where models need review

Language models are good at filling gaps with plausible connective tissue. That's dangerous in education. Every draft pass needs a later pass that deletes generic examples, checks claims, and asks, "Would this help a learner solve the next problem?"

Illustration framework

The visual system changed the most. Early images were easy to break because labels, arrows, and charts used too many manual positions. One label length change could overlap a box. One chart tweak could push text out of frame.

The fix was to make illustrations more like UI components. Source illustrations are TSX files that use theme-aware primitives: scenes, rows, panels, cards, badges, arrows, metrics, and detail lists. The build turns those components into dark and light PNGs.

Figure: Component-based illustration framework in which TSX layout primitives enter a renderer that emits dark and light PNG files, then validation checks references, layout, density, and visual review. The illustration framework keeps source editable: components handle spacing and layout, and the build writes stable assets for the site.

A source illustration looks closer to a small interface than a hand-drawn canvas:

illustrations/content_pipeline.tsx

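// `stages` and the layout primitives (defineIllustration, Scene, Row, StepCard)
// are defined elsewhere; their imports are omitted from this excerpt.
const illustration = defineIllustration((c) =>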
    Scene({
        c,
        title: 'Content Pipeline',
        contentWidth: 1030,
        children: [
            Row({
                width: '100%',
                gap: 14,
                children: stages.map((stage) =>
                    StepCard({
                        c,
                        title: stage.title,
                        subtitle: stage.subtitle,
                        tone: stage.tone,
                        width: 190,
                    })
                ),
            }),
        ],
    })
)

That style scales better. If every article needs custom visuals, the author shouldn't spend time nudging absolute coordinates. They should compose reliable pieces, review the rendered output, and fix the teaching idea.
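
To see why composed layout is harder to break than hand-placed coordinates, consider a toy version of what a row primitive has to do. This is a sketch in Python, not the framework's actual algorithm: the centering rule and the hard failure on overflow are assumptions, and the numbers reuse the `width`, `gap`, and `contentWidth` values from the excerpt above with five hypothetical cards.

```python
def layout_row(widths: list[int], gap: int, content_width: int) -> list[int]:
    """Return the left x offset of each item, centered within content_width."""
    total = sum(widths) + gap * max(len(widths) - 1, 0)
    if total > content_width:
        # Fail loudly instead of letting cards overlap or leave the frame.
        raise ValueError(f"row needs {total}px but only {content_width}px is available")
    x = (content_width - total) // 2
    offsets = []
    for width in widths:
        offsets.append(x)
        x += width + gap
    return offsets


offsets = layout_row([190] * 5, gap=14, content_width=1030)
print(offsets)
# prints: [12, 216, 420, 624, 828]
```

Because positions are derived from widths and gaps, changing one card's width shifts its neighbors instead of overlapping them, and a row that cannot fit raises an error at build time rather than producing a broken image.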

From draft to better draft

The editing loop is blunt:

Find generic claims.
Replace them with problem-shaped examples.
Add a failure case.
Add a visual that explains structure.
Run code or label it as pseudocode.
Connect the article to adjacent lessons.
Re-read as a learner who doesn't already know the topic.
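
The "run code or label it as pseudocode" step can be partly automated. The sketch below is a weaker stand-in for actually running examples: it only compiles Python snippets to catch syntax errors, and it assumes a hypothetical convention where pseudocode starts with a `# pseudocode` comment.

```python
def check_snippets(snippets: dict[str, str]) -> dict[str, str]:
    """Compile each Python snippet and report syntax errors by snippet name."""
    failures = {}
    for name, source in snippets.items():
        if source.lstrip().startswith("# pseudocode"):
            continue  # explicitly labeled, so it is exempt from the check
        try:
            compile(source, name, "exec")  # parses and compiles, does not execute
        except SyntaxError as exc:
            failures[name] = f"line {exc.lineno}: {exc.msg}"
    return failures


snippets = {
    "gate.py": "missing = sorted({'a', 'b'} - {'a': 1}.keys())\n",
    "broken.py": "def f(:\n    pass\n",
    "idea.py": "# pseudocode\nfor each claim: verify against source\n",
}

failures = check_snippets(snippets)
print(sorted(failures))
# prints: ['broken.py']
```

A syntax check will not catch wrong output or missing imports, so examples that claim a result still need to be executed in a test.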

That loop is why a lesson on inference doesn't use a random e-commerce analogy when it should talk about TTFT, decode throughput, KV cache, and GPU memory. It's why an evaluation post talks about judges, calibration, pairwise ranking, and false positives instead of abstract "quality scores."

Weak draft	Stronger rewrite
"Use RAG to improve answers."	"Use retrieval when the answer depends on private or fast-changing facts, then measure groundedness and citation precision."
"Agents call tools."	"The model proposes a typed tool call, code executes it, and the observation returns to the next model turn."
"Models need more context."	"The context packet should include only the evidence needed for the next decision, with source IDs preserved."
"Add diagrams for clarity."	"Add a diagram that changes how the reader reasons about a failure mode."

That's why LeetLLM links concepts across the curriculum. If a blog introduces context packets, it should point to prompt engineering, retrieval, evaluation, and production deployment instead of making readers rebuild the map from scratch.

What we recommend

If you're building an AI-assisted learning platform, don't optimize for article count first. Optimize for a system that makes weak articles uncomfortable to publish.

Keep source notes separate from article text.
Require article specs before drafting.
Use structured metadata, not loose file names.
Put citations near claims that can drift.
Make every visual editable from source.
Prefer component layout over coordinate layout.
Test runnable code examples.
Add validators for the exact failures you keep seeing.
Review rendered pages in a browser alongside Markdown.

Software stack and infrastructure

The site runs on a deliberately boring stack. The interesting part is the content system; the infrastructure should make pages fast to edit, cheap to review, and predictable to run.

Layer	What we use	Job
App shell	Next.js App Router, React, TypeScript	Render the curriculum, blog, auth-aware UI, and content pages from one codebase
Styling	Tailwind CSS, Shadcn/Radix primitives, Lucide icons	Keep the reading surface consistent without inventing one-off UI for every lesson
Content source	Markdown bundles in Git with JSON frontmatter	Make each article reviewable as source, with nearby illustrations and generated assets
Visual build	TSX illustration primitives and Mermaid diagram rendering	Turn editable source into stable dark and light PNG assets
Auth and data	Supabase auth and Postgres	Store user-owned state such as profiles, reading progress, bookmarks, and comments
CI	pnpm, ESLint, Vitest, TypeScript, Next build, content validators	Catch broken routes, bad references, stale assets, and app regressions during review
Hosting	Dockerized Next.js on Google Cloud Run behind Cloudflare	Serve the app with managed containers, autoscaling, and edge caching

That split matters because each layer has a clear owner: Git for source, Next.js for rendering, Supabase for user state, Cloud Run for serving, Cloudflare for caching public pages, and validators for the content contract. When those responsibilities stay separate, the team can improve lessons without turning every article edit into an infrastructure project.

LLMs help in this workflow, but they aren't the product. The product is the learning path, the examples, the review loop, and the trust that each page earns.

That's how LeetLLM is built: not one prompt, but a content engineering system.


References

[1] Anthropic · Effective context engineering for AI agents · 2025

[2] Multiple authors · A Survey of Context Engineering for Large Language Models · arXiv preprint · 2025
