londons_explore on Nov 25, 2016 | on: Speech-to-Text-WaveNet: End-to-end sentence level ...
Looking at the training loss graph, it seems that training for longer would produce even better results...
Anyone want to volunteer a few weeks of GPU time to train this better?
gwern on Nov 26, 2016
Training loss pretty much always decreases. NNs are extremely powerful models, so they can overfit most data. What you want to see is the *validation* loss graph.
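A minimal sketch of the point (my own illustration, not from the linked repo; the data and model here are stand-ins): a high-capacity model keeps driving its training loss down no matter what, and only the held-out loss reveals when it has started memorizing noise.

```python
# Illustrative only: polynomial fits as a stand-in for a high-capacity NN.
# Training MSE shrinks monotonically as capacity grows, while validation
# MSE typically turns back up once the model starts fitting noise.
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, 20)
y_train = np.sin(3 * x_train) + 0.3 * rng.normal(size=20)
x_val = rng.uniform(-1, 1, 200)
y_val = np.sin(3 * x_val) + 0.3 * rng.normal(size=200)

def mse(coeffs, x, y):
    """Mean squared error of a polynomial fit evaluated at x."""
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

for degree in (1, 3, 5, 9, 15):
    # Fit on training data only (very high degrees may trigger a
    # harmless RankWarning about conditioning).
    coeffs = np.polyfit(x_train, y_train, degree)
    print(f"degree {degree:2d}  train MSE {mse(coeffs, x_train, y_train):.4f}"
          f"  val MSE {mse(coeffs, x_val, y_val):.4f}")
```

The same logic applies to a WaveNet run: pick the checkpoint where validation loss bottoms out, not where training loss is lowest.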