Hacker News
malf on June 13, 2023 | on: Llama.cpp: Full CUDA GPU Acceleration
I think it was a 2x total speedup vs the previous version, which already used the GPU for “most” things, so the real speedup is 2/(1-most), which could be a lot.
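For anyone who wants to put numbers on this, here is a rough sketch using Amdahl's law rather than the back-of-envelope expression above. It assumes the newly offloaded work accounted for a fraction `runtime_fraction` of the old version's wall-clock time, which is not the same thing as the share of operations that were still on the CPU. The function name and the example fractions are made up for illustration; none of them come from the article.

```python
def local_speedup(total_speedup: float, runtime_fraction: float) -> float:
    """Speedup of the newly accelerated part implied by Amdahl's law.

    Amdahl's law: total = 1 / ((1 - p) + p / s), where p is the fraction of
    the old runtime spent in the part that got faster and s is how much
    faster that part became. Solving for s gives p / (1/total - (1 - p)).
    """
    # Share of the old runtime that the accelerated part may still occupy.
    remaining = 1.0 / total_speedup - (1.0 - runtime_fraction)
    if remaining <= 0:
        # The untouched part alone already takes at least 1/total of the old
        # runtime, so this total speedup is not reachable with this fraction.
        raise ValueError("runtime_fraction is too small for this total speedup")
    return runtime_fraction / remaining


# Hypothetical fractions, chosen only to show how quickly the number grows.
for p in (0.9, 0.75, 0.6):
    print(f"p={p}: that part got about {local_speedup(2.0, p):.1f}x faster")
```

Under these assumptions a 2x total speedup is only possible if the remaining CPU work was more than half of the old runtime, and the implied speedup of that part grows quickly as the fraction approaches one half.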
supermatt on June 14, 2023
Thanks - that makes more sense. It wasn't clear from the article.