Why did it take so long for deep networks to be invented?

It's not that "deep networks" haven't been around in the 1960s, but the problem was how to train them. In the 1970s, backpropagation was "invented" or re-discovered -- I don't want to quote a single resource here not to offend any of the parties involved since this is a sensitive topic those days ... In any case, the problem was the "vanishing gradient," when gradient-based methods were used for learning the weights. It was observed that there was no gain going beyond 1-3 hidden layers.

So back to the question: Deep network structures existed, but it was hard/impossible to train them appropriately. I'd say the 2 main reasons why this field experienced such a leap in the recent years are

availability of computing resources
clever ideas to pre-train a neural network

The second point is what deep learning is all about; in a nutshell, we pre-train our deep neural networks using unsupervised learning, but this goes beyond the scope of the question ...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

inventing-deeplearning.md

inventing-deeplearning.md

Why did it take so long for deep networks to be invented?

Files

inventing-deeplearning.md

Latest commit

History

inventing-deeplearning.md

File metadata and controls

Why did it take so long for deep networks to be invented?