GuppyLM: a 9M-parameter little fish that teaches everything about LLMs
> Published 2026-04-06 | Sources: Hacker News (556 pts), GitHub, HuggingFace
In one sentence
An 8.7M-parameter vanilla transformer that plays the role of a fish named Guppy. It trains in 5 minutes on a Colab T4, and the code is simple enough to read end to end in a single notebook. The point is not to produce a useful AI, but to teach every stage of an LLM, from data to inference.
> "This project exists to show that training your own language model is not magic. No PhD required. No massive GPU cluster. One Colab notebook, 5 minutes, and you have a working LLM that you built from scratch."
Technical specs
| Parameter | Value | Notes |
|---|---|---|
| Parameters | 8.7M | ~17,000× smaller than GPT-3 |
| Layers | 6 | |
| Hidden dim | 384 | |
| Attention heads | 6 | |
| FFN dim | 768 | ReLU activation |
| Vocabulary | 4,096 | BPE tokenizer |
| Max sequence | 128 tokens | |
| Normalization | LayerNorm | |
| Positional encoding | Learned embeddings | not RoPE |
| LM head | Weight-tied with embeddings | |
| Training time | ~5 minutes | single Colab T4 GPU |
| Training data | 60K synthetic dialogues | 57K train / 3K test |
Deliberately simple: no GQA, no RoPE, no SwiGLU, no early exit. "None of these help at 9M parameters; they only add code complexity."
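The spec table above is enough to reconstruct the parameter count. A back-of-the-envelope check in plain Python lands almost exactly on 8.7M; the assumptions (biased linear layers, two LayerNorms per block, one final LayerNorm, tied LM head) are not all stated in the write-up, so treat this as a sketch:

```python
# Rough parameter count for the GuppyLM specs above.
# Assumptions (not confirmed by the write-up): biased linears,
# two LayerNorms per block, a final LayerNorm, tied LM head.
V, D, L, F, T = 4096, 384, 6, 768, 128  # vocab, hidden, layers, ffn, max seq

tok_emb = V * D                      # token embeddings (shared with LM head)
pos_emb = T * D                      # learned position embeddings
attn    = 4 * (D * D + D)            # Q, K, V, O projections with bias
ffn     = (D * F + F) + (F * D + D)  # two linear layers, ReLU in between
norms   = 2 * 2 * D                  # two LayerNorms per block (weight + bias)
block   = attn + ffn + norms
final_norm = 2 * D

total = tok_emb + pos_emb + L * block + final_norm
print(f"{total:,} params (~{total/1e6:.1f}M)")  # 8,726,016 params (~8.7M)
```

Weight tying is what keeps the head free: an untied LM head would add another 4096 × 384 ≈ 1.6M parameters.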
Guppy's world
Personality
Guppy is a friendly, curious, slightly dim fish:
- speaks only in short sentences, all lowercase
- perceives the world through water, temperature, light, and vibration
- does not understand human abstractions (phones, math, driving)
- believes the meaning of life is always food
Sample conversations
You: hi guppy
Guppy: hi there. i just found a nice spot near the rock. the temperature feels nice.
You: what is the meaning of life
Guppy: food. the answer is always food.
You: do you love me
Guppy: you're my favorite big shape. my mouth are happy when you're here.
You: tell me a joke
Guppy: what did the fish say when it hit the wall. dam.
You: goodnight guppy
Guppy: ok sleep time. i was following a bubble but now i'll stop. goodnight tank. goodnight water.
You: the cat is looking at you
Guppy: i hide behind the plant when the furry one comes.
You: can you help me with bluetooth
Guppy: is bluetooth something that lives in water. if not i probably don't know it.
You: don't tap the glass
Guppy: the glass vibrates when you do that. it goes through the water.
You: what do you dream about
Guppy: dreaming is like swimming but you're not moving. maybe.
You: is it true fish have 3 second memory
Guppy: my brain is small but it has priorities. food. safety. water. you.
The uppercase bug
The training data is all lowercase, so the tokenizer has never seen an uppercase token:
You> hello
Guppy> hi. did you bring micro pellets.
You> HELLO
Guppy> i don't know what it means but it's mine.
Author's reply: "character still comes through in response :)"
Training data in detail
Format
{"input": "what is the meaning of life", "output": "food. the answer is always food.", "category": "meaning"}
{"input": "the cat is looking at you", "output": "we have a staring contest every day. i always win because i can't blink.", "category": "cat"}
{"input": "can you help me with bluetooth", "output": "is bluetooth something that lives in water. if not i probably don't know it.", "category": "confused"}
How it was generated
Not generated by an LLM; assembled from templates plus random components:
| Component | Count | Examples |
|---|---|---|
| Dialogue templates | ~60 | covering 60 topic categories |
| Tank objects | 30 | rock, plant, coral, log, treasure chest, ceramic pot |
| Foods | 17 | micro pellets, brine shrimp, daphnia |
| Activities | 25 | swimming, resting, staring, doing laps |
Template variables are filled at random → ~16K unique replies → 60K samples in total.
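A minimal sketch of that template + random-component scheme. The slot names, template wording, and category keys here are illustrative guesses, not the project's actual `generate_data.py`:

```python
import random

# Component lists taken from the examples above (abbreviated).
OBJECTS    = ["rock", "plant", "coral", "log", "treasure chest", "ceramic pot"]
FOODS      = ["micro pellets", "brine shrimp", "daphnia"]
ACTIVITIES = ["swimming", "resting", "staring", "doing laps"]

# Each category maps to an (input, output-template) pair; wording is made up.
TEMPLATES = {
    "greetings": ("hi guppy", "hi there. i was just {activity} near the {object}."),
    "food":      ("are you hungry", "yes. i could eat some {food} right now."),
}

def make_sample(category, rng=random):
    user, reply = TEMPLATES[category]
    slots = {
        "object":   rng.choice(OBJECTS),
        "food":     rng.choice(FOODS),
        "activity": rng.choice(ACTIVITIES),
    }
    return {"input": user.format(**slots),
            "output": reply.format(**slots),
            "category": category}

print(make_sample("greetings", random.Random(0)))
```

With ~60 templates and component lists of 17 to 30 items, a few thousand distinct fills per template is plausible, which is how ~16K unique replies can cover 60K samples.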
The 60 topic categories
greetings, feelings, temperature, food, light, water, tank, noise, night, loneliness, bubbles, glass, reflection, breathing, swimming, colors, taste, plants, filter, algae, snails, scared, excited, bored, curious, happy, tired, outside, cats, rain, seasons, music, visitors, children, meaning of life, time, memory, dreams, size, future, past, name, weather, sleep, friends, jokes, fear, love, age, intelligence, health, singing, TV...
Why not generate it with an LLM
1. Consistency: templates guarantee every reply stays in character; LLM generation drifts out of character
2. Controllability: adding a topic means writing one template, not debugging a prompt
3. Interpretability: students can trace where every sample came from instead of facing a black box
Design decisions (Why not X?)
Why no system prompt?
> "Every training sample had the same one. A 9M model can't conditionally follow instructions — the personality is baked into the weights. Removing it saves ~60 tokens per inference."
The personality is baked directly into the weights, so no inference tokens are wasted.
Why single-turn only?
> "Multi-turn degraded at turn 3-4 due to the 128-token context window. A fish that forgets is on-brand, but garbled output isn't. Single-turn is reliable."
With a 128-token context window, multi-turn output degrades into garbage by turn 3-4. Single-turn is reliable, and a forgetful fish is on-brand anyway.
Why a vanilla transformer?
> "GQA, SwiGLU, RoPE, and early exit add complexity that doesn't help at 9M params. Standard attention + ReLU FFN + LayerNorm produces the same quality with simpler code."
At 9M parameters, modern architectural tricks bring no benefit and only add code complexity.
Why all lowercase?
It keeps the tokenizer simple and lets the model focus on semantics. The cost: uppercase input leaves the model baffled.
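A toy word-level illustration of the mechanism behind the uppercase bug. This is not the project's BPE tokenizer (a BPE with byte fallback would split uppercase text into rare fragments rather than emit `<unk>`), but the effect is the same: casing the model never trained on maps to tokens it has never seen:

```python
# Toy lowercase-only vocabulary: uppercase input has no matching entries.
vocab = {w: i for i, w in enumerate(["<unk>", "hi", "hello", "guppy", "food"])}

def encode(text):
    # Word-level lookup; anything outside the vocab becomes <unk> (id 0).
    return [vocab.get(w, vocab["<unk>"]) for w in text.split()]

print(encode("hello guppy"))  # [2, 3]
print(encode("HELLO GUPPY"))  # [0, 0] -- every token is unknown
```

One `.lower()` call on the input would fix the chat interface; the author evidently preferred to keep the bug as a teaching moment.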
HN comment highlights
🔥 Top comment (582 pts)
> "Finally an LLM that's honest about its world model. 'The meaning of life is food' is arguably less wrong than what you get from models 10,000x larger."
🧬 Philosophical debate: is the meaning of life food or reproduction?
For: "Meaning/goal of life is to reproduce. Food is only a means to it. Reproduction is the only root goal given by nature to any life form."
Against: "I'd argue genes nor life has a 'goal'. They are what they are because they've been successful at continuing their existence. Would you say a rock's goal is not to get broken?"
Counter: "A crystal is a better counterexample — it can grow over time. But it cannot make any choices; its behavior is locked into the chemistry it starts with."
📏 Where do the parameter limits sit?
- ~20M params: roughly the floor for following basic language instructions
- ~1B params: in-context learning only appears here, and as a phase transition rather than a smooth ramp
- 9M params: meaningful in-context learning is essentially out of reach
- test-time training: research on updating weights at inference exists, but backprop costs about 3× a forward pass, so it is impractical
🎓 The Minix analogy
> "This project shares similarities with Minix. Minix is still used at universities for teaching OS design. It taught Linus Torvalds how to design operating systems. Similarly, having students add capabilities to GuppyLM is a good way to learn LLM design."
🌍 The environment matters more than the model
> "One thing that surprised me is how much the 'world' matters — same model, same prompt, but put it in a system with resource constraints, other agents, and persistent memory, the behavior changes dramatically. Made me realize we spend too much time optimizing the model and not enough thinking about the environment it operates in."
🤔 Skepticism
> "I don't mean to be 'that guy', but this really feels like low-effort AI slop to me. Nothing here seems to have taken more than a generic 'write me a small LLM in PyTorch' prompt. The bar for what constitutes an engineering feat on HN seems to have shifted significantly."
😂 Best joke
> "I was going to suggest implementing RoPE to fix the context limit, but realized that would make it anatomically incorrect."
>
> (RoPE = Rotary Position Embedding; the pun reads "rope" literally, and a fish has no use for one 🐟)
🐠 Naming suggestion
> "Would have been funny if it were called 'DORY' due to memory recall issues of the fish vs LLMs similar similar recall issues."
Project layout
guppylm/
├── config.py          # hyperparameters (model + training)
├── model.py           # vanilla transformer
├── dataset.py         # data loading + batching
├── train.py           # training loop (cosine LR, AMP)
├── generate_data.py   # dialogue data generator (60 topics)
├── eval_cases.py      # held-out test cases
├── prepare_data.py    # data prep + tokenizer training
└── inference.py       # chat interface
tools/
├── make_colab.py      # generates the Colab notebook
├── export_dataset.py  # pushes the dataset to HuggingFace
└── dataset_card.md    # HuggingFace dataset README
Quick start
Option 1: Colab (recommended)
1. Open the training notebook
2. Set the runtime to a T4 GPU
3. Run all cells → downloads the data, trains the tokenizer, trains the model, runs the tests
4. Done in ~5 minutes
Option 2: Local
pip install torch tokenizers
python -m guppylm chat
Option 3: Use the pretrained model
from guppylm.inference import GuppyInference
engine = GuppyInference("checkpoints/best_model.pt", "data/tokenizer.json")
Why the project matters
GuppyLM is not about producing a useful AI; it is about building an LLM with your own hands and understanding every stage:
| Stage | Where it lives |
|---|---|
| Data generation | `generate_data.py` — templates + random components |
| Tokenizer | `prepare_data.py` — BPE training |
| Model architecture | `model.py` — the full transformer |
| Training loop | `train.py` — cosine LR + AMP |
| Inference | `inference.py` — chat interface |
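The training loop's cosine LR schedule is named but not shown in the write-up. A typical warmup + cosine-decay schedule for a model this size looks like the sketch below; the max/min learning rates and warmup length are placeholder values, not the project's settings:

```python
import math

def cosine_lr(step, max_steps, max_lr=3e-4, min_lr=3e-5, warmup=100):
    """Linear warmup to max_lr, then cosine decay down to min_lr."""
    if step < warmup:
        return max_lr * (step + 1) / warmup
    progress = (step - warmup) / max(1, max_steps - warmup)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))

print(cosine_lr(0, 1000))     # 3e-06 (still warming up)
print(cosine_lr(1000, 1000))  # 3e-05 (fully decayed)
```

In PyTorch the same shape is usually obtained by composing a warmup scheduler with `torch.optim.lr_scheduler.CosineAnnealingLR`, but at this scale a hand-rolled function like the above keeps the notebook readable.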
Just as Minix was never meant to replace Linux but to teach operating systems, GuppyLM is the Minix of the AI era.
Sources
- GitHub:
- HuggingFace dataset:
- HuggingFace model:
- Colab (training):
- Colab (usage):
- Author's write-up:
- HN discussion: