matsemann on Sept 4, 2022 | on: Running Stable Diffusion on Your GPU with Less Tha...
For training you can often divide the batch size by n (and then only apply the optimizer step after accumulating gradients over n batches, which makes it mathematically equivalent). At a cost of speed, though.
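Not the commenter's code, just a minimal sketch of that idea (gradient accumulation) in plain PyTorch; the toy model, dummy data, and accum_steps=4 are made-up placeholders:

    import torch
    from torch import nn

    # Toy model and data, only to make the loop runnable.
    model = nn.Linear(128, 10)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
    criterion = nn.CrossEntropyLoss()

    data = torch.utils.data.TensorDataset(
        torch.randn(256, 128), torch.randint(0, 10, (256,)))
    loader = torch.utils.data.DataLoader(data, batch_size=8)

    accum_steps = 4  # effective batch size = 8 * 4 = 32

    optimizer.zero_grad()
    for step, (inputs, targets) in enumerate(loader):
        loss = criterion(model(inputs), targets)
        # Scale the loss so the accumulated gradient matches one big batch
        # (assuming the loss is a mean over the batch).
        (loss / accum_steps).backward()

        # Only step and clear gradients every accum_steps micro-batches.
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()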
amelius on Sept 4, 2022
Do libraries like torch and tensorflow facilitate this?
fragmede on Sept 4, 2022
Yes, e.g. https://pytorch.org/docs/stable/generated/torch.nn.parallel....
amelius on Sept 4, 2022
Thank you!
matsemann on Sept 4, 2022
Quite trivial to implement this yourself if you want to. See gradient accumulation in fastai, for instance: https://www.kaggle.com/code/jhoward/scaling-up-road-to-the-t...
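For reference, a rough sketch of what that looks like with fastai's GradientAccumulation callback. The dataset, learner setup, and n_acc value here are placeholders for illustration, not taken from the linked notebook:

    from fastai.vision.all import *

    # Any DataLoaders would do; the pets dataset is just a stand-in.
    path = untar_data(URLs.PETS)
    dls = ImageDataLoaders.from_name_re(
        path, get_image_files(path/"images"),
        pat=r"(.+)_\d+.jpg", item_tfms=Resize(224))

    # Accumulate gradients across micro-batches until n_acc samples have
    # been seen, then take one optimizer step.
    learn = vision_learner(dls, resnet18, metrics=accuracy,
                           cbs=GradientAccumulation(n_acc=64))
    learn.fine_tune(1)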