Learn by Doing.
Become an AI Engineer.
6 Weeks · Cohort-Based Course · Next Cohort: Nov 8–Dec 14, 2025
Course page: bytebyteai
Taught by Best-Selling Author Ali Aminian
Meet Your Instructor
Ali Aminian
Ali Aminian is a best-selling author of multiple books on machine learning and generative AI. With over a decade of experience at leading tech companies, he has built AI systems that are intelligent, safe, and efficient. He also contributes to AI courses at Stanford University, combining technical expertise with a passion for teaching.
Course Outline (Project-Based Learning)
Project 1
Build an LLM Playground
LLM Overview and Foundations
Pre-Training
- Data collection (manual crawling, Common Crawl)
- Data cleaning (RefinedWeb, Dolma, FineWeb)
- Tokenization (e.g., BPE)
- Architecture (neural networks, Transformers, GPT family, Llama family)
- Text generation (greedy and beam search, top-k, top-p)
Post-Training
- SFT (supervised fine-tuning)
- RL and RLHF (verifiable tasks, reward models, PPO, etc.)
Evaluation
- Traditional metrics
- Task-specific benchmarks
- Human evaluation and leaderboards
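To make the sampling topics above concrete, here is a minimal sketch of top-p (nucleus) sampling in plain Python. The function names and the cutoff p are illustrative, not course code: the idea is to keep the smallest set of highest-probability tokens whose cumulative probability reaches p, then renormalize and sample only from that set.

```python
import math

def softmax(logits):
    # Numerically stable softmax over raw model scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(logits, p=0.9):
    """Return {token_id: renormalized_prob} for the nucleus: the
    smallest set of tokens whose cumulative probability >= p."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, cumulative = [], 0.0
    for i in order:
        nucleus.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in nucleus)
    return {i: probs[i] / total for i in nucleus}
```

With p near 1 this behaves like plain sampling; with small p it degenerates toward greedy decoding, which is why p trades diversity against safety of the output.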
Project 2
Build a Customer Support Chatbot Using RAG and Prompt Engineering
Overview of Adaptation Techniques
Fine-Tuning
- Parameter-efficient fine-tuning (PEFT)
- Adapters and LoRA
Prompt Engineering
- Few-shot and zero-shot prompting
- Chain-of-thought prompting
- Role-specific and user-context prompting
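The LoRA entry above can be sketched in a few lines: the frozen base weight W is left untouched while a low-rank update B·A, scaled by alpha/r, is added to its output. Plain Python lists stand in for tensors here; `matvec`, `alpha`, and `r` are illustrative names and values, and a real implementation would use a framework such as PyTorch.

```python
def matvec(M, x):
    # Matrix-vector product over nested lists (stand-in for tensors).
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha=8, r=2):
    # W is the frozen base weight (d_out x d_in); only the low-rank
    # factors A (r x d_in) and B (d_out x r) are trained.
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))  # B @ (A @ x), rank <= r
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]
```

Initializing B to zeros makes the adapted layer start out identical to the base model, which is why LoRA training is stable from step one.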
Retrieval
- Document parsing (rule-based, AI-based) and chunking strategies
- Indexing (keyword, full-text, knowledge-based, vector-based, embedding models)
- Search methods (exact and approximate nearest neighbor)
- Prompt engineering for RAG
Evaluation (context relevance, faithfulness, answer correctness)
Overall RAG Design
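As a toy illustration of the retrieve-then-prompt flow above, the sketch below scores documents by keyword overlap and pastes the top hits into a grounded prompt. A production RAG system would use embedding similarity with approximate nearest-neighbor search instead of bag-of-words overlap; all names here are illustrative.

```python
def score(query, doc):
    # Bag-of-words overlap; real systems use embeddings + ANN search.
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / (len(q) or 1)

def retrieve(query, docs, k=2):
    # Top-k documents by overlap score.
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query, docs, k=2):
    # Ground the model: answer only from the retrieved context.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )
```

The same skeleton carries over when you swap in a vector index: only `score` and `retrieve` change, while prompt assembly and evaluation stay the same.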
Project 3
Build an "Ask-the-Web" Agent Similar to Perplexity with Tool Calling
Agents Overview
- Agents vs. agentic systems vs. LLMs
- Agency levels (e.g., workflows, multi-step agents)
- Prompt chaining
- Routing
- Parallelization (sectioning, voting)
- Reflection
- Orchestrator-workers
- Tool calling
- Tool formatting
- Tool execution
- MCP
- Planning autonomy
- ReAct
- Reflexion, ReWOO, etc.
- Tree search for agents
Evaluation of agents
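The tool-calling loop listed above can be sketched with a stubbed model: the "LLM" returns JSON that either requests a tool or gives a final answer, and the loop executes the tool and feeds the observation back. `fake_llm`, the `TOOLS` registry, and the JSON shape are illustrative stand-ins (real APIs use structured function-calling messages), and the `eval`-based calculator is for demonstration only.

```python
import json

TOOLS = {
    # Demo tools only; eval() on untrusted input is unsafe in production.
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
    "search": lambda q: f"(stub) top result for: {q}",
}

def fake_llm(messages):
    # Stand-in for a real model call: requests a tool once, then
    # answers with the observation it received.
    if not any(m["role"] == "tool" for m in messages):
        return json.dumps({"tool": "calculator", "args": "6 * 7"})
    return json.dumps({"answer": messages[-1]["content"]})

def run_agent(question, llm=fake_llm, max_steps=5):
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = json.loads(llm(messages))
        if "answer" in reply:
            return reply["answer"]
        observation = TOOLS[reply["tool"]](reply["args"])  # tool execution
        messages.append({"role": "tool", "content": observation})
    raise RuntimeError("agent did not produce a final answer")
```

The `max_steps` cap is the simplest form of planning autonomy control: it bounds how long the model can keep calling tools before it must answer.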
Project 4
Build "Deep Research" Capability with Web Search and Reasoning Models
Reasoning and Thinking LLMs
- Overview of reasoning models like OpenAI's "o" family and DeepSeek-R1
- Inference-time scaling
- CoT prompting
- Self-consistency
- Sequential revision
- Tree of Thoughts (ToT)
- Search against a verifier
- SFT on reasoning data (e.g., STaR)
- Reinforcement learning with a verifier
- Reward modeling (ORM, PRM)
- Self-refinement
- Internalizing search (e.g., Meta-CoT)
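Self-consistency, listed above, amounts to sampling several chains of thought and majority-voting on their final answers. In this sketch the sampler replays canned answers in place of real temperature-sampled LLM outputs; `make_sampler` and the answer strings are illustrative.

```python
from collections import Counter

def make_sampler(canned_answers):
    # Replays canned final answers, standing in for repeated
    # temperature-sampled LLM calls.
    it = iter(canned_answers)
    return lambda: next(it)

def self_consistency(sample_once, n=10):
    # Each call to sample_once() is one independently sampled chain of
    # thought; the vote uses final answers only, reasoning is discarded.
    answers = [sample_once() for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```

Even if any single chain is wrong fairly often, the majority answer across independent samples is usually right, which is the whole bet behind the technique.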
Project 5
Build a Multi-modal Generation Agent
Overview of Image and Video Generation
- VAE
- GANs
- Auto-regressive models
- Diffusion models
- Data preparation
- Diffusion architectures (U-Net, DiT)
- Diffusion training (forward process, backward process)
- Diffusion sampling
- Evaluation (image quality, diversity, image-text alignment, IS, FID, and CLIP score)
- Latent-diffusion modeling (LDM) and compression networks
- Data preparation (filtering, standardization, video latent caching)
- DiT architecture for videos
- Large-scale training challenges
- Overall text-to-video (T2V) system
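The forward (noising) process above has a convenient closed form: x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, where ᾱ_t is the cumulative product of (1 − β) over the schedule. Below is a sketch with a linear β schedule in plain Python for clarity; the schedule bounds are common defaults but should be treated as illustrative.

```python
import math
import random

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    # Noise rate beta_t, increasing linearly over T steps.
    return [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]

def alpha_bar(betas, t):
    # Cumulative product of (1 - beta) up to and including step t.
    prod = 1.0
    for beta in betas[: t + 1]:
        prod *= 1.0 - beta
    return prod

def q_sample(x0, t, betas, rng):
    # Closed-form forward process: jump straight from x0 to x_t
    # instead of adding noise one step at a time.
    a_bar = alpha_bar(betas, t)
    return [
        math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * rng.gauss(0.0, 1.0)
        for x in x0
    ]
```

As t grows, ᾱ_t shrinks toward zero and x_t approaches pure Gaussian noise; training teaches the network to predict ε, and the backward (sampling) process inverts this one step at a time.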
Project 6
Capstone Project
- Choose your own idea
- Build with techniques from the course
- Get real-time feedback from the instructor as you build
- Demo + feedback session