Can humans make AI any better?
Apply to work at Tufalabs: https://tufalabs.ai/join
Welch Labs Book: https://www.welchlabs.com/resources/ai-book-ezrzm
Welch Labs eBook: https://www.welchlabs.com/resources/the-welch-labs-illustrated-guide-to-ai-digital-download
Patreon: https://www.patreon.com/welchlabs
SECTIONS
0:00 - Harpy
4:39 - The Bitter Lesson
5:58 - Sutton Goes on a Podcast
8:22 - LLMs are Not Bitter Lesson Pilled?
9:19 - Supervised Learning
10:04 - Reinforcement Learning
10:32 - Work for Tufalabs!
11:50 - How AlphaGo Surpassed Humans
17:49 - RLHF and RLVR
18:41 - The Era of Experience
20:27 - My Take
21:05 - Welch Labs Book!
TECHNICAL NOTES
https://www.welchlabs.com/blog/2026/1/31/the-bitter-lesson-video-technical-notes
CODE
https://github.com/stephencwelch/manim_videos
REFERENCES
The Bitter Lesson: http://www.incompleteideas.net/IncIdeas/BitterLesson.html
Dwarkesh Patel's interview with Richard Sutton: https://www.youtube.com/watch?v=21EYKqUsPfg
AlphaGo vs Lee Sedol Match 4: https://www.youtube.com/watch?v=yCALyQRN3hw
Repurposed some board setups and heatmaps from: https://www.lesswrong.com/posts/FF8i6SLfKb4g7C4EL/inside-the-mind-of-a-superhuman-go-model-how-does-leela-zero-2
Great HARPY video: https://www.youtube.com/watch?v=NiiDe2n-GeQ
Sutton, Richard S., and Andrew G. Barto. *Reinforcement learning: An introduction*. Cambridge: MIT Press, 1998.
Averbuch, Amir, et al. "An IBM PC based large-vocabulary isolated-utterance speech recognizer." ICASSP'86. IEEE International Conference on Acoustics, Speech, and Signal Processing. Vol. 11. IEEE, 1986.
Radford, Alec, et al. "Language models are unsupervised multitask learners." OpenAI blog 1.8 (2019): 9.
Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." *Nature* 529.7587 (2016): 484-489.
Silver, David, et al. "Mastering the game of Go without human knowledge." *Nature* 550.7676 (2017): 354-359.
Lowerre, Bruce. "The HARPY speech understanding system." *Readings in speech recognition*. 1990. 576-586.
Silver, David, and Richard S. Sutton. "Welcome to the era of experience." Google AI 1 (2025).
PATRONS
Juan Benet, Ross Hanson, Yan Babitski, AJ Englehardt, Alvin Khaled, Eduardo Barraza, Hitoshi Yamauchi, Jaewon Jung, Mrgoodlight, Shinichi Hayashi, Sid Sarasvati, Dominic Beaumont, Shannon Prater, Ubiquity Ventures, Matias Forti, Brian Henry, Tim Palade, Petar Vecutin, Nicolas baumann, Jason Singh, Robert Riley, vornska, Barry Silverman, Jake Ehrlich, Mitch Jacobs, Lauren Steely, Jeff Eastman, Rodolfo Ibarra, Clark Barrus, Rob Napier, Andrew White, Richard B Johnston, abhiteja mandava, Burt Humburg, Kevin Mitchell, Daniel Sanchez, Ferdie Wang, Tripp Hill, Richard Harbaugh Jr, Prasad Raje, Kalle Aaltonen, Midori Switch Hound, Zach Wilson, Chris Seltzer, Ven Popov, Hunter Nelson, Amit Bueno, Scott Olsen, Johan Rimez, Shehryar Saroya, Tyler Christensen, Beckett Madden-Woods, Darrell Thomas, Javier Soto, U007D, Caleb Begly, Rick Rubenstein, Brent Hunsaker, Dan Patterson, Tchsurvives, Alex Adai, Walter Reade, Zyansheep, Walter Reade, Duncan Stannett, Reginald Carey, Jean-Manuel Izaret, dh71633, Adrian Rodriguez, Dimitar Stojanovski, Michael Harder, Peter Maldonado, Emily Pesce, David Johnston, Insang Song, FaeTheWolf, Stephen Taylor, KittenKaboodle, EMatter, PATRICKMCCORMACK, John Beahan, Cameron, Cole Jones, Garrett Thornburg, Jeroen W, Rohit Sharma, GlennB, Emmanuel Cortes, Katie Quinn, Karina C, Cakra WW, Mike Ton, Eric Gometz, MacCallister Higgins, Niko Drossos, David Eraso, Tom Zehle, Steve, Brian Lineburg, rjbl, Michael Loh, Perry Vais, Bengal0, Farhad Manjoo, Sara Chipps, Ellis Driscoll, William Taysom, Will Harmon, CK, Abdullah, Peter Cho, Leo Nikora, Griffin Smith, Ash Katnoria, Alex, Markus Hays Nielsen, Catherine H., Vi, David Dobáš, Peter Wang, Sina Sohangir, Danny Thomas, Julian Francis, Hans Adler, Jiayu Peng, Weston M, Youssouf da Silva, John Thomas, Samuel Costello, Sam Adams, Bryan Liles, Malaya Zemlya, Karl, Vahe Andonians, Mike Doughty, Larry Novelo, Jonas Acres, Ludicrum Rex, Robert Blumofe, Anthony Z
Created by: Sam Baskin, Matthew Cohen, Pranav Gundu, and Stephen Welch
Content ID: CFAQJOTYQHT7JYIT