Research date: 2026-04-05. Sources: X/Twitter (@karpathy), personal blog karpathy.github.io, bearblog, GitHub READMEs, YC AI Startup School talk transcript, Dwarkesh Patel interview
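Karpathy has a knack for naming complex phenomena with colloquial phrases, defining a whole field in one stroke.
"There's a new kind of coding I call 'vibe coding', where you fully give in to the vibes, embrace exponentials, and forget that the code even exists." (original tweet, February 2, 2025)
"The hottest new programming language is English." (January 24, 2023; seven words that define a paradigm)
"LLMs are 'people spirits', stochastic simulations of people, where the simulator is an autoregressive Transformer." (YC AI Startup School, June 2025)
These three examples share one structure: name the thing first (give it a label), then say what it is in a single sentence. The name itself has to be colloquial and vivid; the defining sentence is precise without being pedantic.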
He likes version-number analogies for describing paradigm shifts, turning abstract technical evolution into an upgrade you can feel:
"Software 1.0 is the code you write for the computer. Software 2.0 are basically neural networks... Software 3.0 is now LLMs, programmed in English."
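The power of this framing is that it makes readers feel they are standing at a historical turning point. He does not say "AI changed programming"; he says "this is the third paradigm upgrade."
On X, he frequently uses "imo" to mark his own judgments. It is both a polite hedge and a stance of "I've said my piece, but I'm not forcing you to accept it":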
"Imo fair to say that software is changing quite fundamentally again."
"prompters is doing it a disservice and is imo a misunderstanding."
Karpathy is rarely categorical in his technical judgments, especially in predictive statements:
"When I see things like, '2025 is the year of agents,' I get very concerned. And I kind of feel like, you know, this is the decade of agents."
"I have a sense that I could be 10X more powerful if I just properly string together what has become available over the last ~year."
"I don't have a super strong prediction...I have a very wide distribution here."
This uncertainty is not weakness but intellectual honesty. He openly shows his confidence intervals.
"Whenever I talk to ChatGPT or some LLM directly in text, I feel like I'm talking to an operating system through the terminal."
"The LLM is a new kind of a computer. It's sitting, it's kind of like the CPU equivalent."
This is his most poetic analogy, and also his main tool for reframing the "hallucination problem":
"In some sense, hallucination is all LLMs do. They are dream machines. We direct their dreams with prompts."
"TLDR I know I'm being super pedantic but the LLM has no 'hallucination problem'. Hallucination is not a bug, it is LLM's greatest feature."
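The logical structure: first acknowledge the common understanding (hallucination is a problem), then invert it (viewed from what an LLM fundamentally is, hallucinating is precisely what it does). This is his standard dialectical move.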
"We're not building animals. We're building ghosts or spirits."
"LLMs are kind of like people spirits. They are stochastic simulations of people."
"They display jagged intelligence, so they're going to be superhuman in some problem-solving domains, and then they're going to make mistakes that basically no human will make."
He uses "jagged intelligence" to describe how LLM performance swings between strong and weak. It is a term he coined himself, and it has since been widely cited.
"These are now increasingly complex software ecosystems...The LLM is a new kind of a computer."
"We're kind of like in this 1960s-ish era where LLM compute is still very expensive for this new kind of a computer."
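Mapping the present onto a particular era of computing history is his recurring "time-positioning" technique: it helps readers sense which stage we are in now.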
"The internet is really terrible...total garbage...stock tickers, symbols, slop."
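He uses "slop" to describe the quality of internet data, criticizing the state of current pretraining data. The word recurs throughout what he said and wrote in 2025.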
"It took me a while to really admit to myself that just reading a book is not learning but entertainment."
"Ideally never absorb information without predicting it first."
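Karpathy rarely uses business vocabulary such as "leverage," "utilize," or "facilitate"; he prefers:
On both his blog and X, he uses single-sentence paragraphs to emphasize key points: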
"Strap in."
"Don't be a hero."
"If I can't build it, I don't understand it."
"Gradient descent can write code better than you. I'm sorry."
That final "I'm sorry" is the finishing touch: a technical statement followed by a human interjection, humorous and warm.
"3e-4 is the best learning rate for Adam, hands down."
"Hands down" is a colloquial phrase placed right next to a highly precise technical parameter, which produces a comic effect. He enjoys that tension.
"a failure to claim the boost feels decidedly like a skill issue."
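"Skill issue" is an internet meme, used here to describe his own sense of falling behind technically: self-mockery in well-chosen internet vernacular.
His jokes often come from dropping a very serious technical term into an absurd context: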
"Plan is to throw a party in the Andromeda galaxy 1B years from now. Everyone welcome, except those who litter."
"How long until we measure wealth inequality in FLOPS"
"Earth as dynamical system is really bad computer."
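The core of this humor is talking about cosmic-scale matters as if they were everyday trifles, or analyzing everyday trifles as if they were cosmic-scale problems.
"Gradient descent can write code better than you. I'm sorry."
"lol ¯\_(ツ)_/¯" (in the nanoGPT README, reacting to imperfect generation results)
"Amusingly, I coined the term 'vibe coding'" (using "amusingly" to describe having coined a term that influenced millions of people)
"Don't be a hero. I've seen a lot of people who are eager to get crazy and creative... Resist this temptation strongly." (in "A Recipe for Training Neural Networks")
Confident (first-hand experience / experimentally verified):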
"The qualities that in my experience correlate most strongly to success in deep learning are patience and attention to detail."
"When you sort your dataset descending by loss you are guaranteed to find something unexpected, strange and helpful."
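Hedged (predictions / judgments / the future):
"I simultaneously (and on the surface paradoxically) believe [several seemingly contradictory propositions]"
"Personally I suspect that LLM labs will trend to graduate..."
The pattern is clear: what I can measure, I state flatly; what I am guessing, I leave room on.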
"When I see things like, '2025 is the year of agents,' I get very concerned. And I kind of feel like, you know, this is the decade of agents."
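He does not reject the claim outright; he stretches the time axis from "this year" to "this decade." The move keeps a positive attitude while carrying an implicit criticism.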
"Overall, the models are not there. I feel like the industry is making too big of a jump and is trying to pretend like this is amazing, and it's not."
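He is willing to say "hallucination is not a bug, it is LLM's greatest feature," which runs against the mainstream view, and he backs it with a logical explanation rather than an appeal to authority.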
"Reading a book is not learning but entertainment."
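This challenges the naive equation of reading with learning. His view is that real learning requires active prediction and construction, not passive reception.
Directions he tends to criticize:
Karpathy's strategy is to demonstrate precise understanding with minimal code: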
"Train and inference GPT in 243 lines of pure, dependency-free Python" (microgpt)
"~300-line training loop and ~300-line GPT model definition" (nanoGPT)
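This is his teaching philosophy: if you truly understand something, you can write it in the fewest lines of code.
It matches his well-known line: "If I can't build it, I don't understand it."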
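| Pattern | Example | Effect |
|---|---|---|
| New term + definition | "vibe coding: fully give in to the vibes" | Creates a concept and claims the framing |
| Version-number framing | Software 1.0 / 2.0 / 3.0 | Turns a paradigm shift into an upgrade you can feel |
| Inverting common sense | "hallucination is not a bug, it's a feature" | Accepts the common understanding first, then inverts it logically |
| Standalone short sentences | "Strap in." / "Don't be a hero." | Creates a pause and reinforces the memorable point |
| Self-deprecation + precision | "3e-4 is the best learning rate for Adam, hands down." | Hides a real technical judgment inside the humor |
| Stretching the time axis | "year of agents" → "decade of agents" | Avoids direct rejection; implies criticism through a longer time horizon |
| Marking claims with "imo" | "Imo fair to say..." | Honestly marks the limits of his own judgment |
| Analogy transition phrases | "it's kind of like" / "in some sense" | Sets up the analogy and lowers the barrier to understanding |
| Admitting uncertainty | "I have a wide distribution here" | Intellectual honesty that builds trust |
| Internet interjections | "lol" / "skill issue" / "omg" | Even a technical heavyweight is very online |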
On the nature of LLMs:
On programming paradigms:
On learning:
On hype:
On code:
Sources: