
Back before deep learning was all the rage, textbooks used to argue that two layers were enough, because a two-layer network can approximate any function arbitrarily well.

They were right and wrong. Two-layer networks CAN indeed approximate any function arbitrarily well. They just do a piss-poor job of it: they can take exponentially more parameters than a better formulation[0].

The lesson is that representation matters a lot. There are lots of ways to construct universal approximators, but they are not all created equal.
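A standard illustration of this depth/width trade-off (a sketch of the well-known triangle-wave construction, not necessarily the exact argument in the cited paper): composing a simple "tent" map with itself d times produces a function with 2^d linear pieces using only O(d) parameters, while a single-hidden-layer ReLU network needs roughly one unit per piece, i.e. exponentially many.

```python
import numpy as np

def tent(x):
    # One "layer": the tent map on [0, 1], expressible with just
    # two ReLU units: tent(x) = 2*relu(x) - 4*relu(x - 0.5).
    return 2 * np.maximum(x, 0) - 4 * np.maximum(x - 0.5, 0)

def deep_tent(x, depth):
    # Composing the tent map `depth` times (a depth-`depth` network
    # with constant width) yields 2**depth linear pieces. A shallow
    # one-hidden-layer ReLU net needs ~2**depth units to match it.
    for _ in range(depth):
        x = tent(x)
    return x

# With depth 5 the function oscillates between 0 and 1 sixteen times
# on [0, 1], i.e. it has 2**5 = 32 linear pieces, from a network with
# only ~10 ReLU units total.
xs = np.linspace(0, 1, 129)
peaks = int(np.sum(np.isclose(deep_tent(xs, 5), 1.0)))
```

Approximating `deep_tent(x, 5)` with a two-layer network requires a separate hidden unit for essentially every one of those 32 pieces, which is the kind of exponential blow-up the parent comment is referring to.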

[0] https://www.researchgate.net/publication/3505534_On_the_powe...
