Added Discovery Character section
Description: Adds surprise level and mode of discovery (serendipity vs. systematic vs. Edisonian)
# [SCI] Machine Learning Theory

**Machine Learning Theory** is the mathematical study of algorithms that learn from data, including statistical learning theory, neural network expressibility, and generalisation bounds.

## Overview

Frank Rosenblatt's perceptron (1957) was one of the first trainable neural networks. Minsky & Papert (1969) showed that single-layer perceptrons cannot represent functions that are not linearly separable, such as XOR, which contributed to the first "AI winter." Backpropagation (Rumelhart, Hinton, Williams, 1986) enabled the training of multilayer networks. Vladimir Vapnik's Support Vector Machine and VC theory (1963–1995), the latter developed with Alexey Chervonenkis, provided the foundations of statistical learning theory. Yann LeCun's convolutional network for handwriting recognition (1989) demonstrated that multilayer networks trained with backpropagation could work on a practical task. Statistical mechanics provided key insights through spin-glass models of neural networks (Hopfield 1982, Amit-Gutfreund-Sompolinsky 1985).

## Key Figures & Recognition

- **Geoffrey Hinton** (1947–), **Yann LeCun** (1960–), **Yoshua Bengio** (1964–): **Turing Award 2018**. Hinton: **Nobel Prize in Physics 2024** (shared with Hopfield).
- **John Hopfield** (1933–): Hopfield network, energy-based models. **Nobel Prize in Physics 2024**.
- **Vladimir Vapnik** (1936–): SVM, VC theory. No Nobel Prize.

## Seminal Papers

- Rumelhart, D., Hinton, G. & Williams, R. ["Learning representations by back-propagating errors." *Nature* 323 (1986)](https://doi.org/10.1038/323533a0)
- LeCun, Y. et al. ["Gradient-Based Learning Applied to Document Recognition." *Proc. IEEE* 86 (1998)](https://doi.org/10.1109/5.726791)
- Hopfield, J. "Neural networks and physical systems with emergent collective computational abilities." *PNAS* 79 (1982).

## What This Enables

- **[SCI] Deep Learning**: Deep learning is ML theory at scale. Backpropagation, universal approximation, and gradient optimisation all carry through.
- **[SCI] Genomics & Computational Biology**: Hidden Markov models, clustering algorithms, and neural networks trained on sequence data are ML applications to genomics.
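
The backpropagation and gradient optimisation mentioned in the Deep Learning entry can be illustrated with a very small sketch. The network here (one `tanh` hidden unit feeding a linear output, with scalar weights `w1` and `w2`), the squared-error loss, and all function names are assumptions chosen for illustration, not taken from any of the papers cited on this page. The sketch computes gradients with the chain rule and includes a finite-difference estimate that can be used to check them.

```python
import math

def forward(w1, w2, x):
    """Toy network (illustrative assumption): y = w2 * tanh(w1 * x)."""
    h = math.tanh(w1 * x)
    return h, w2 * h

def loss(w1, w2, x, target):
    """Squared-error loss on a single example."""
    _, y = forward(w1, w2, x)
    return 0.5 * (y - target) ** 2

def backprop(w1, w2, x, target):
    """Gradients of the loss w.r.t. w1 and w2, obtained by the chain rule."""
    h, y = forward(w1, w2, x)
    dy = y - target                  # dL/dy
    dw2 = dy * h                     # dL/dw2 = dL/dy * dy/dw2
    dh = dy * w2                     # dL/dh  = dL/dy * dy/dh
    dw1 = dh * (1.0 - h * h) * x     # tanh'(z) = 1 - tanh(z)^2, and dz/dw1 = x
    return dw1, dw2

def numeric_grad(w1, w2, x, target, eps=1e-6):
    """Central finite differences, used only to sanity-check backprop."""
    g1 = (loss(w1 + eps, w2, x, target) - loss(w1 - eps, w2, x, target)) / (2 * eps)
    g2 = (loss(w1, w2 + eps, x, target) - loss(w1, w2 - eps, x, target)) / (2 * eps)
    return g1, g2

def sgd_step(w1, w2, x, target, lr=0.1):
    """One gradient-descent update using the backprop gradients."""
    dw1, dw2 = backprop(w1, w2, x, target)
    return w1 - lr * dw1, w2 - lr * dw2
```

Comparing `backprop` against `numeric_grad` at a few points is a common way to catch sign or indexing mistakes before scaling the same idea up to vectors and matrices.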
## Discovery Character

**Surprise level**: Moderate. Backpropagation was independently discovered at least four times (Werbos 1974, LeCun 1985, Parker 1985, Rumelhart-Hinton-Williams 1986), which suggests that it was mathematically inevitable. The universal approximation theorem was a genuine surprise: a feedforward network with a single, sufficiently wide hidden layer and a suitable nonlinear activation can approximate any continuous function on a compact domain to arbitrary accuracy.

**Mode**: Systematic-theoretical, with statistical physics contributing key ideas. Hopfield's energy-based neural network (1982) drew directly from spin-glass physics. Boltzmann machines imported statistical mechanics. Backpropagation was a systematic application of the chain rule from calculus. The field is notable for having few serendipitous moments but many cases where the same idea was reinvented independently, a sign that the ideas were ripe.

# Parents

* [SCI] Information Theory
* [TECH] Digital Computing
* [SCI] Statistical Mechanics