Hacker News
A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com)
3 points by mzl 3 hours ago
Recommendations for Getting the Most Out of a Technical Book (sebastianraschka.com)
2 points by naves 22 hours ago
A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com)
8 points by giuliomagnifico 23 hours ago
Getting the Most Out of a Technical Book (sebastianraschka.com)
4 points by quietlearning 20 days ago
Beyond Standard LLMs (sebastianraschka.com)
1 point by vismit2000 25 days ago
Beyond Standard LLMs (sebastianraschka.com)
1 point by ibobev 29 days ago
A Researcher's Field Guide to Non-Standard LLM Architectures (sebastianraschka.com)
2 points by ModelForge 29 days ago
Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) (sebastianraschka.com)
1 point by ibobev 49 days ago
Popular Attention Alternatives: GQA, MLA, SWA (sebastianraschka.com)
4 points by ModelForge 49 days ago
Multi-Head Latent Attention (sebastianraschka.com)
4 points by ModelForge 51 days ago
Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) (sebastianraschka.com)
2 points by ibobev 54 days ago
LLM Evaluation from Scratch: Multiple Choice, Verifiers, Leaderboards, LLM Judge (sebastianraschka.com)
4 points by ModelForge 59 days ago
Understanding and Implementing Qwen3 from Scratch (sebastianraschka.com)
1 point by ibobev 78 days ago
GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2 (sebastianraschka.com)
490 points by ModelForge 3 months ago | 97 comments
From GPT-2 to GPT-OSS: Analyzing the Architectural Advances (sebastianraschka.com)
3 points by mdp2021 3 months ago
PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com)
1 point by Anon84 4 months ago
PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com)
4 points by mariuz 4 months ago
LLM architecture comparison (sebastianraschka.com)
418 points by mdp2021 4 months ago | 24 comments
The Big LLM Architecture Comparison (sebastianraschka.com)
3 points by Quizzical4230 4 months ago
Comprehensive ML/AI questions and answers for interview prep (sebastianraschka.com)
2 points by yaiml 5 months ago
PyTorch in One Hour: From Tensors to Training Neural Networks on Multiple GPUs (sebastianraschka.com)
4 points by sbbq 5 months ago
Intermediate ML and AI questions and answers for interview prep (sebastianraschka.com)
3 points by sbbq 5 months ago
Understanding and Coding the KV Cache in LLMs from Scratch (sebastianraschka.com)
6 points by sbbq 5 months ago
Understanding and Coding the KV Cache in LLMs from Scratch (sebastianraschka.com)
2 points by tosh 5 months ago
Coding LLMs from the Ground Up: A Complete Course (sebastianraschka.com)
4 points by sbbq 5 months ago
Coding LLMs from the Ground Up: A Complete Course (sebastianraschka.com)
2 points by mdp2021 6 months ago
The State of Reinforcement Learning for LLM Reasoning (sebastianraschka.com)
8 points by yaiml 7 months ago
The State of Reinforcement Learning for LLM Reasoning (sebastianraschka.com)
9 points by jonbaer 7 months ago
The State of Reinforcement Learning for LLM Reasoning (sebastianraschka.com)
4 points by mdp2021 7 months ago
The State of LLM Reasoning Models (sebastianraschka.com)
2 points by Philpax 8 months ago