
Capstone: Demand Forecasting

Ship a demand forecast and capacity-alert artifact with rolling backtests, alert review, and retraining policy.


The ranking capstone influenced which products users could purchase. Warehouse teams now need an operational input: forecast daily parcel volume by fulfillment center so staffing and packing capacity can be planned before demand arrives.

This capstone ships a forecast and alert artifact. It doesn't automatically hire labor, move inventory, or page an operator on every miss. It creates a versioned expectation, detects unusually large residuals, and records evidence for a planner's decision.

[Figure: Demand-forecasting capstone showing warehouse daily history, seasonal baseline and candidate forecast, rolling backtests, residual alert gate, and reviewed capacity action.]
Capacity alerts become trustworthy only when later observations were never used to fit their own forecast and each alert keeps its expected range and resolution.

Choose the Series and Decision

Predict daily shipped parcels for each warehouse seven days ahead. Use an explicit planning contract:

| Field | Contract |
| --- | --- |
| entity | fulfillment center and shipping service tier |
| target | parcels shipped per calendar day |
| horizon | next seven days |
| decision | planner reviews capacity when forecast or alert requires it |
| baseline | same weekday from prior week |
| evaluation | MAE plus underforecast cost by high-volume slice |

Demand can change around promotions, holidays, seller campaigns, inventory shortages, and data outages. Those known drivers should appear as features only if they are scheduled and available before the forecast cutoff.
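The availability rule above can be sketched as a small filter. This is only an illustration: the event records and the field names `event_date` and `announced_on` are assumptions, not the actual schema of `planned_events.json`.

```python
from datetime import date

# Hypothetical event records; field names are assumptions for illustration.
planned_events = [
    {"name": "spring_sale", "event_date": date(2026, 3, 14), "announced_on": date(2026, 3, 1)},
    {"name": "flash_deal", "event_date": date(2026, 3, 12), "announced_on": date(2026, 3, 12)},
    {"name": "seller_campaign", "event_date": date(2026, 3, 20), "announced_on": date(2026, 3, 9)},
]

def known_before_cutoff(events, cutoff, horizon_end):
    """Keep events announced on or before the cutoff that fall inside the forecast horizon."""
    return [
        event
        for event in events
        if event["announced_on"] <= cutoff and cutoff < event["event_date"] <= horizon_end
    ]

cutoff = date(2026, 3, 10)       # last day of history the forecast may use
horizon_end = date(2026, 3, 17)  # cutoff plus the seven-day horizon
usable = known_before_cutoff(planned_events, cutoff, horizon_end)
print([event["name"] for event in usable])
```

Here `flash_deal` is dropped because it was announced after the cutoff, and `seller_campaign` is dropped because it falls outside the seven-day horizon. Only events that pass both checks should become features.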

Hyndman and Athanasopoulos explain why forecast evaluation must use later observations and rolling forecasting origins rather than random splits.[1] For this project, each backtest run records its training cutoff, horizon, model version, and the actual values that arrived afterward.
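A minimal sketch of rolling forecasting origins, assuming a single series held in a plain list and a same-weekday (seasonal naive) baseline. The series values, the helper names, and the record fields are hypothetical. A real `rolling_backtest.py` would also store calendar cutoffs and the model version.

```python
def seasonal_naive(history, horizon=7, season=7):
    """Forecast each future day with the value from the same weekday one season earlier."""
    start = len(history) - season
    return [history[start + (step % season)] for step in range(horizon)]

def rolling_origins(n_points, min_train, horizon, step):
    """Yield (cutoff, end) index pairs: train on [0, cutoff), test on [cutoff, end)."""
    cutoff = min_train
    while cutoff + horizon <= n_points:
        yield cutoff, cutoff + horizon
        cutoff += step

def mae(values, predictions):
    return sum(abs(v - p) for v, p in zip(values, predictions)) / len(values)

# Hypothetical four weeks of daily counts for one warehouse and service tier.
series = [
    100, 112, 115, 118, 132, 82, 76,
    104, 110, 119, 116, 160, 84, 78,
    101, 113, 117, 120, 135, 80, 75,
    106, 111, 121, 118, 150, 86, 79,
]

backtest = []
for cutoff, end in rolling_origins(len(series), min_train=14, horizon=7, step=7):
    train, future = series[:cutoff], series[cutoff:end]  # rows at or after the cutoff are never fitted
    forecast = seasonal_naive(train, horizon=len(future))
    backtest.append({
        "cutoff_index": cutoff,
        "horizon": len(future),
        "model": "seasonal_naive",
        "mae": round(mae(future, forecast), 2),
    })

for row in backtest:
    print(row)
```

Each origin only sees rows before its cutoff, so the error on the following window is an honest estimate. A random split would let later days leak into the fit.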

[Diagram: warehouse history through cutoff, seasonal baseline and candidate, rolling backtest future windows, and forecast bundle plus observed count producing a versioned residual.]

Build a Reviewable Artifact

Your repository surface should look like:

```text
demand-forecast/
  data/
    warehouse_daily_counts.parquet
    planned_events.json
    split_manifest.json
  forecasting/
    seasonal_baseline.py
    train_candidate.py
    rolling_backtest.py
  alerts/
    residual_policy.json
    evaluate_alerts.py
  reports/
    backtest_metrics.json
    alert_review.csv
  tests/
    test_future_rows_excluded.py
    test_alert_contract.py
```

The candidate can be a tree model over lag features, rolling means, service tier, weekday, and known promotions. It must beat the seasonal baseline on later windows, especially where underforecasting is expensive. A candidate that marginally improves MAE but misses peak-volume days should remain blocked.
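One way to encode that blocking rule is a gate that checks overall MAE and an asymmetric cost on high-volume days. The sketch below is hypothetical: the weight of 3, the 150-parcel threshold, and the `smooth_candidate` series are assumptions chosen to illustrate the idea, not values from the project.

```python
def mae(values, predictions):
    return sum(abs(v - p) for v, p in zip(values, predictions)) / len(values)

def underforecast_cost(values, predictions, weight=3.0):
    """Asymmetric cost: a shortfall (actual above forecast) counts `weight` times an overshoot."""
    total = 0.0
    for value, prediction in zip(values, predictions):
        error = value - prediction
        total += weight * error if error > 0 else -error
    return total / len(values)

def release_gate(actual, baseline, candidate, threshold, weight=3.0):
    """Eligible only if the candidate wins on overall MAE and is no worse on high-volume days."""
    peak_days = [i for i, value in enumerate(actual) if value >= threshold]
    if not peak_days:
        return "hold"  # no peak evidence in this window, so the peak claim is unsupported
    peak_actual = [actual[i] for i in peak_days]
    baseline_peak = underforecast_cost(peak_actual, [baseline[i] for i in peak_days], weight)
    candidate_peak = underforecast_cost(peak_actual, [candidate[i] for i in peak_days], weight)
    wins_overall = mae(actual, candidate) < mae(actual, baseline)
    if wins_overall and candidate_peak <= baseline_peak:
        return "eligible_for_planner_review"
    return "hold"

actual = [104, 110, 119, 116, 160, 84, 78]
baseline = [100, 112, 115, 118, 132, 82, 76]
candidate = [103, 111, 117, 117, 140, 83, 77]
# Hypothetical candidate: perfect on routine days, worse than baseline on the peak.
smooth_candidate = [104, 110, 119, 116, 130, 84, 78]

print(release_gate(actual, baseline, candidate, threshold=150))
print(release_gate(actual, baseline, smooth_candidate, threshold=150))
```

The second candidate has a lower overall MAE than the baseline but a higher peak-day cost, so the gate holds it.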

Execute the Backtest and Alert Gate

This small fixture compares a same-weekday baseline against one candidate forecast and records alerts when actual volume exceeds the candidate by at least 20 parcels.

forecast-release-evidence.py
```python
# One week of observed counts with the baseline and candidate forecasts for the same days.
actual = [104, 110, 119, 116, 160, 84, 78]
baseline = [100, 112, 115, 118, 132, 82, 76]
candidate = [103, 111, 117, 117, 140, 83, 77]

def mae(values, predictions):
    """Mean absolute error between observed values and forecasts."""
    return sum(abs(value - prediction) for value, prediction in zip(values, predictions)) / len(values)

def capacity_alerts(values, predictions, limit=20):
    """Return one alert per day where observed volume exceeds the forecast by at least `limit`."""
    return [
        {"day": index + 1, "observed": value, "expected": prediction, "residual": value - prediction}
        for index, (value, prediction) in enumerate(zip(values, predictions))
        if value - prediction >= limit
    ]

baseline_mae = mae(actual, baseline)
candidate_mae = mae(actual, candidate)
alerts = capacity_alerts(actual, candidate)
decision = "eligible_for_planner_review" if candidate_mae < baseline_mae else "hold"

print("baseline MAE:", round(baseline_mae, 1))
print("candidate MAE:", round(candidate_mae, 1))
print("alerts:", alerts)
print("decision:", decision)
```
Output
```text
baseline MAE: 6.3
candidate MAE: 3.9
alerts: [{'day': 5, 'observed': 160, 'expected': 140, 'residual': 20}]
decision: eligible_for_planner_review
```

Even the improved candidate underforecasts the day 5 peak (Friday, if the fixture week starts on Monday) by 20 parcels, which just meets the alert limit. That alert isn't a failure to hide: it is a deliverable. A planner can examine whether the spike came from a known campaign, decide whether to add capacity, and attach the resolution to the alert row.
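Attaching a resolution to an alert row could look like the sketch below. The `CapacityAlert` fields, the `model_version` value, and the `resolve` helper are hypothetical. The three resolution labels follow the "useful, expected, or data issue" classification that this capstone uses for alert review.

```python
from dataclasses import dataclass, asdict
from typing import Optional

ALLOWED_RESOLUTIONS = {"useful", "expected", "data_issue"}

@dataclass
class CapacityAlert:
    day: int
    observed: int
    expected: int
    residual: int
    model_version: str                 # hypothetical identifier for the forecast bundle
    resolution: Optional[str] = None   # stays None until a planner reviews the alert
    note: str = ""

def resolve(alert, resolution, note):
    """Record the planner's classification; reject labels outside the agreed set."""
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unknown resolution: {resolution!r}")
    alert.resolution = resolution
    alert.note = note
    return alert

alert = CapacityAlert(day=5, observed=160, expected=140, residual=20, model_version="candidate-v1")
resolve(alert, "expected", "scheduled seller campaign")
print(asdict(alert))
```

Keeping the expected value and the model version on the row means a later reviewer can see what the forecast claimed when the alert fired.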

Evaluate Forecasts and Alerts Separately

The report needs two sections:

| Report | Metrics | Release question |
| --- | --- | --- |
| forecast quality | MAE, weighted underforecast cost, error by center/tier/horizon | does candidate help planning? |
| alert policy | useful review rate, missed high-volume events, alert volume | does the review queue help operations? |

Avoid claiming a prediction interval is reliable until its coverage has been measured on held-out windows. If a nominal 90 percent interval contains clearly fewer than 90 percent of held-out days, report that coverage failure and adjust the candidate or the uncertainty method before relying on it.
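Measuring coverage takes only a few lines. In this sketch the fixed half-width of 10 parcels is a placeholder assumption standing in for whatever interval method the candidate actually uses, and seven days is far too few to judge coverage in practice.

```python
def empirical_coverage(actual, lower, upper):
    """Fraction of held-out days whose observed value fell inside the interval."""
    hits = sum(1 for value, low, high in zip(actual, lower, upper) if low <= value <= high)
    return hits / len(actual)

actual = [104, 110, 119, 116, 160, 84, 78]
candidate = [103, 111, 117, 117, 140, 83, 77]

half_width = 10  # hypothetical interval half-width, not a fitted quantity
lower = [prediction - half_width for prediction in candidate]
upper = [prediction + half_width for prediction in candidate]

nominal = 0.90
coverage = empirical_coverage(actual, lower, upper)
status = "meets_nominal" if coverage >= nominal else "coverage_failure"
print(round(coverage, 3), status)
```

On this fixture the interval misses the peak day, so six of seven days are covered and the check reports a coverage failure. A real report would pool many backtest windows before drawing that conclusion.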

Known promotions offer a useful slice. If the candidate improves routine days but misses every promotion peak, a global MAE improvement doesn't support deployment to promotion planning. Store slices and block claims the evidence doesn't support.
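A slice report can be as simple as grouping absolute errors by a flag. The `promo` flag on day 5 below is an assumption added for illustration. The counts reuse the fixture from the backtest example.

```python
def mae_by_slice(rows, forecast_key, slice_key="promo"):
    """Mean absolute error per slice value, for the forecast stored under `forecast_key`."""
    errors = {}
    for row in rows:
        errors.setdefault(row[slice_key], []).append(abs(row["actual"] - row[forecast_key]))
    return {label: sum(values) / len(values) for label, values in errors.items()}

actual = [104, 110, 119, 116, 160, 84, 78]
baseline = [100, 112, 115, 118, 132, 82, 76]
candidate = [103, 111, 117, 117, 140, 83, 77]
promo = [False, False, False, False, True, False, False]  # hypothetical: day 5 had a promotion

rows = [
    {"actual": a, "baseline": b, "candidate": c, "promo": p}
    for a, b, c, p in zip(actual, baseline, candidate, promo)
]

baseline_slices = mae_by_slice(rows, "baseline")
candidate_slices = mae_by_slice(rows, "candidate")
for label in (False, True):
    print("promo" if label else "routine", round(baseline_slices[label], 2), round(candidate_slices[label], 2))
```

Storing both slice values next to the global MAE makes it visible when an overall improvement comes only from routine days.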

Plan Refresh and Monitoring

New outcomes arrive daily, but model replacement should happen on a scheduled or triggered review cycle. Record each operational item together with the decision it requires:

| Operational item | Required decision |
| --- | --- |
| daily observation join | attach actual count to stored forecast |
| weekly accuracy report | compare baseline and current candidate |
| alert resolution review | classify useful, expected, or data issue |
| retraining trigger | sustained cost regression or approved calendar cadence |
| promotion gate | rolling backtest and planner review |
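The "sustained cost regression" trigger can be made concrete with a check over the weekly accuracy reports. This is a sketch under assumptions: the three-week window, the tolerance, and the weekly cost values are all hypothetical policy choices.

```python
def sustained_regression(candidate_costs, baseline_costs, weeks=3, tolerance=1.0):
    """True when the candidate cost exceeded baseline * tolerance in each of the last `weeks` reports."""
    if len(candidate_costs) < weeks or len(baseline_costs) < weeks:
        return False  # not enough history to call the regression sustained
    recent = list(zip(candidate_costs, baseline_costs))[-weeks:]
    return all(candidate > baseline * tolerance for candidate, baseline in recent)

# Hypothetical weekly costs, oldest first.
baseline_weekly = [6.3, 6.0, 6.2, 6.5, 6.4]
drifting_candidate = [3.9, 4.1, 6.8, 7.0, 7.4]   # worse than baseline for three straight weeks
one_bad_week = [3.9, 4.1, 4.0, 7.0, 4.2]         # a single miss that recovers

print(sustained_regression(drifting_candidate, baseline_weekly))
print(sustained_regression(one_bad_week, baseline_weekly))
```

Requiring several consecutive weeks keeps one unusual week from forcing a retrain, while a persistent gap against the baseline still opens a review.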

This capstone provides forecasting artifacts for the next project, which will automate validation and promotion boundaries for several predictive models.

Mastery Check

Evaluation rubric

| Artifact | Strong submission demonstrates |
| --- | --- |
| forecast package | time-aware training windows, baseline, uncertainty or error policy, and backtest report |
| alert workflow | residual-based alerts with reason codes and planner resolution logging |
| operations | retraining cadence, monitoring, promotion gates, and rollback plan |

Common Failures

| Symptom | Cause | Fix |
| --- | --- | --- |
| Backtest appears precise but live peaks miss | future or promotion leakage | freeze cutoff and known-in-advance fields |
| Planner receives noisy alerts | threshold lacks reviewed outcomes | evaluate alert usefulness separately |
| Forecast changes without explanation | artifact and cutoff missing | log versioned forecast bundle |

Next Step
Continue to Capstone: Image Damage Classifier

You can now issue time-ordered forecasts and reviewed capacity alerts. Next you will ship a model over pixels, where image quality and human confirmation guard every damage route.

References

[1] Hyndman, R. J. & Athanasopoulos, G. (2021). Forecasting: Principles and Practice, Third Edition.