Not a citable source: Do not quote the graph as an authority. Edge labels and importance scores are interpretive judgments by the generating agent. Any claim worth citing must be traced back to the original paper.
Reliability note: Headline structure and importance-5 nodes are stable across runs. Mid-tier nodes (importance 2–3) and edge type distinctions are interpretive and may differ between runs. Each node lists its source citation; nodes marked "training memory" or "inferred" were not directly verified against the source document.
LOOMUS™ and the Knowledge-Loom methodology are proprietary. Visual system is original to LOOMUS.
Knowledge Graph: The Deep Learning Revolution (Terrence Sejnowski, 2018)
Editorial spotlight: backpropagation's 1986 triumph over symbolic AI
Concepts
GPU computing revolution (importance 5): Graphics cards provided the massive parallelism needed for deep learning. NVIDIA CUDA enabled reported speedups of 10-50x over CPUs for neural network training. Source: (from training memory of book).
learned representations (importance 5): Central insight: networks learn hierarchical features automatically from data. Replaced hand-engineering with end-to-end learning. Source: (from training memory of book).
symbolic AI paradigm (importance 4): Knowledge represented as explicit rules and logic. Dominated AI from 1970s-2000s. Required hand-engineering of features and rules. Source: (from training memory of book).
end-to-end learning (importance 4): Train entire system jointly from raw input to output. Avoids pipeline of hand-designed components. Source: (from training memory of book).
hand-crafted feature engineering (importance 3): Traditional ML required domain experts to design features. Deep learning replaced this with learned representations. Source: (from training memory of book).
Hebbian learning (importance 3): Neurons that fire together wire together. Biological learning principle that inspired early neural network algorithms. Source: (from training memory of book).
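The "fire together, wire together" principle can be written as a one-line weight update. The sketch below is an illustration only, not something taken from the book: the function name `hebbian_update`, the learning rate, and the toy activity vectors are all assumptions.

```python
def hebbian_update(weights, pre, post, lr=0.1):
    """Return new weights after one Hebbian step.

    weights[i][j] connects presynaptic unit j to postsynaptic unit i.
    The change is lr * post[i] * pre[j], so a connection grows only
    when both of its units are active at the same time.
    """
    return [
        [w_ij + lr * post_i * pre_j for w_ij, pre_j in zip(row, pre)]
        for row, post_i in zip(weights, post)
    ]


# Toy example (assumed values): 3 input units, 2 output units.
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
pre = [1.0, 0.0, 1.0]   # input units 0 and 2 fire
post = [1.0, 0.0]       # only output unit 0 fires
weights = hebbian_update(weights, pre, post)
print(weights)
```

Only the connections from the two active inputs to the active output are strengthened; every other weight stays at zero.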
big data era (importance 3): Internet provided unprecedented amounts of data. Photos, text, clicks. Enabled data-hungry deep learning approaches. Source: (from training memory of book).
transfer learning (importance 3): Pretrain on large dataset, fine-tune on small target task. Leverages learned representations across domains. Source: (from training memory of book).
interpretability challenge (importance 3): Deep networks are black boxes. Understanding what they learn and why they fail remains difficult. Source: (from training memory of book).
hardware-software codesign (importance 3): Google TPUs, specialized AI chips. Hardware designed specifically for neural network operations. Source: (from training memory of book).
AI ethics concerns (importance 3): Bias in training data, job displacement, autonomous weapons. Growing awareness by 2018 of societal impacts. Source: (from training memory of book).
unsupervised learning frontier (importance 3): Learn from unlabeled data. Humans learn mostly without supervision; deep learning still relies heavily on labels. Source: (from training memory of book).
future architecture evolution (importance 3): As of 2018: Transformers emerging, capsule networks proposed, graph networks developing. Architecture search automating design. Source: (from training memory of book).
few-shot learning (importance 2): Learn from very few examples, like humans do. Still challenging for deep learning as of 2018. Source: (from training memory of book).
weight initialization schemes (importance 2): Xavier/He initialization critical for training deep networks. Poor initialization prevents learning. Source: (from training memory of book).
overfitting problem (importance 2): Networks memorize training data instead of learning patterns. Requires regularization and large datasets. Source: (from training memory of book).
neuromorphic hardware (importance 2): Brain-inspired chips using spiking neurons and analog computation. Promise of ultra-low power AI. Source: (from training memory of book).
AGI timeline debates (importance 2): Will deep learning lead to artificial general intelligence? Optimists say 10-20 years, skeptics say fundamental gaps remain. Source: (from training memory of book).
embodied cognition gap (importance 2): Networks lack grounding in physical world. Humans learn through interaction and embodiment. Source: (from training memory of book).
inductive biases (importance 2): Architectural choices encode assumptions about problem structure. ConvNets encode spatial locality, RNNs encode temporal structure. Source: (from training memory of book).
learning to learn (importance 2): Train networks to quickly adapt to new tasks. Goal: match human ability to generalize from few examples. Source: (from training memory of book).
multitask learning (importance 2): Train one network on multiple tasks simultaneously. Shares representations, improves generalization. Source: (from training memory of book).
catastrophic forgetting (importance 2): Networks forget old tasks when trained on new ones. Humans accumulate knowledge without forgetting. Source: (from training memory of book).
Claims
Minsky-Papert XOR critique (importance 5): Minsky and Papert's 1969 book Perceptrons proved that single-layer perceptrons cannot compute XOR, and doubted that multilayer networks could be trained. Contributed to the first AI winter for connectionism. Source: (from training memory of book).
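The XOR limitation can be checked directly. The sketch below is a minimal illustration, not code from the book: it trains a single-layer perceptron with the classic perceptron learning rule on AND (linearly separable) and on XOR (not linearly separable). The function name `train_perceptron` and the epoch limit are assumptions.

```python
def train_perceptron(samples, epochs=100, lr=1.0):
    """Train a two-input threshold unit with the perceptron learning rule.

    Returns (weights, bias, converged), where converged is True if a
    full pass over the data produced no classification errors.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - pred
            if delta != 0:
                # Move the decision boundary toward the misclassified point.
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:
            return w, b, True
    return w, b, False


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

_, _, and_converged = train_perceptron(AND)
_, _, xor_converged = train_perceptron(XOR)
print("AND learned:", and_converged)
print("XOR learned:", xor_converged)
```

No single line separates the XOR classes, so the error count never reaches zero no matter how many epochs are allowed. A hidden layer removes the limitation, which is why a training method for multilayer networks mattered.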
learning from brain architecture (importance 5): Deep learning's success came from mimicking brain's hierarchical organization, not from hand-designed logic. Vindicated connectionist philosophy. Source: (from training memory of book).
paradigm shift: learning beats engineering (importance 5): Core thesis: deep learning's victory represents fundamental shift from hand-designed systems to learned representations. Data + compute + brain-inspired architecture won. Source: (from training memory of book).
vanishing gradient problem (importance 4): Deep networks couldn't train effectively because gradients became exponentially small in early layers. Blocked scaling for 20 years. Source: (from training memory of book).
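The exponential shrinkage is easy to see numerically. The sketch below makes strong simplifying assumptions that are not from the book: one unit per layer, every weight equal to 1.0, and the same pre-activation at every layer. Under those assumptions the gradient reaching the first layer is the activation derivative multiplied by itself once per layer. All names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # never larger than 0.25


def relu_grad(z):
    return 1.0 if z > 0 else 0.0


def backprop_scale(depth, grad_fn, z):
    """Factor by which a gradient is scaled after passing back through
    `depth` layers (assumption: unit weights, same pre-activation z)."""
    scale = 1.0
    for _ in range(depth):
        scale *= grad_fn(z)
    return scale


# z = 0.0 is the best case for the sigmoid, where its derivative peaks.
sigmoid_scale = backprop_scale(10, sigmoid_grad, 0.0)
relu_scale = backprop_scale(10, relu_grad, 1.0)
print("sigmoid, 10 layers:", sigmoid_scale)
print("ReLU, 10 layers:", relu_scale)
```

Even in the sigmoid's best case the signal shrinks by a factor of four per layer, while a ReLU unit with positive input passes it through unchanged.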
deep learning data hunger (importance 4): Deep networks require massive labeled datasets to work well. ImageNet scale (millions of examples) was critical to 2012 breakthrough. Source: (from training memory of book).
neuroscience-AI virtuous cycle (importance 4): AI learns from brain; AI models help understand brain. Sejnowski's career exemplifies this bidirectional inspiration. Source: (from training memory of book).
SVM decade (1995-2005) (importance 3): Support Vector Machines dominated machine learning. Had theoretical guarantees neural networks lacked. Kernel trick provided nonlinearity without backprop. Source: (from training memory of book).
Moravec's paradox (importance 3): Hard things for humans (chess, math) are easy for computers. Easy things for humans (vision, movement) are hard. Deep learning narrowed this gap for perception. Source: (from training memory of book).
theory lag behind practice (importance 3): Deep learning works far better than theory predicts. Optimization and generalization not well understood mathematically. Source: (from training memory of book).
common sense reasoning gap (importance 3): Deep learning excels at pattern matching but struggles with reasoning and common sense that humans find trivial. Source: (from training memory of book).
algorithmic bias problem (importance 2): Networks learn biases from training data. Face recognition worse on minorities, word embeddings encode stereotypes. Source: (from training memory of book).
brain energy efficiency gap (importance 2): Brain runs on about 20 watts. AlphaGo's hardware reportedly drew on the order of a megawatt. Biological computation far more efficient than current deep learning. Source: (from training memory of book).
Empirical results
NETtalk (Sejnowski-Rosenberg 1987) (importance 5): Neural network that learned to pronounce English text. First widely publicized success of backpropagation. Demonstrated learning from examples vs hand-coded rules. Source: (from training memory of book).
AlexNet victory (2012) (importance 5): Krizhevsky, Sutskever, and Hinton won ImageNet 2012 with deep ConvNets trained on GPUs, reaching a top-5 error of about 15%, roughly 10 percentage points ahead of the runner-up. Established deep learning's lead in vision. Source: (from training memory of book).
AlphaGo defeats Lee Sedol (2016) (importance 5): DeepMind's AlphaGo beat world champion at Go, game considered too complex for computers. Combined deep learning with Monte Carlo tree search. Source: (from training memory of book).
First AI Winter (1970s) (importance 4): Funding dried up for neural networks after Minsky-Papert critique. Symbolic AI dominated research funding for 15 years. Source: (from training memory of book).
DeepMind Atari DQN (2013) (importance 4): Deep Q-Network learned to play Atari games from pixels using reinforcement learning. Single architecture mastered diverse games. Source: (from training memory of book).
Second AI Winter (early 1990s) (importance 3): Neural networks again lost funding and credibility. Couldn't scale to real-world problems, beaten by SVMs and other kernel methods. Source: (from training memory of book).
Word2Vec embeddings (2013) (importance 3): Learned dense vector representations of words that captured semantic relationships. Showed deep learning could work for language, not just vision. Source: (from training memory of book).
adversarial examples (importance 3): Imperceptible perturbations can fool neural networks. Reveals brittleness and gap from human perception. Source: (from training memory of book).
neural machine translation (importance 3): Seq2seq with attention replaced statistical MT. Google Translate switched in 2016, reporting roughly 60% fewer translation errors than its phrase-based system on several language pairs. Source: (from training memory of book).
deep learning speech recognition (importance 3): Deep networks were reported to reach human parity on a conversational speech benchmark (Microsoft, Switchboard, 2016). Underpins assistants such as Siri, Alexa, and Google Assistant. Source: (from training memory of book).
self-driving cars (importance 3): Deep learning for perception (LiDAR, camera fusion). By 2018, partial autonomy deployed in Teslas. Source: (from training memory of book).
medical image diagnosis (importance 2): Networks match radiologists on specific tasks (diabetic retinopathy, lung cancer). FDA approvals beginning 2017-2018. Source: (from training memory of book).
AlphaFold protein prediction (importance 2): DeepMind applied deep learning to protein structure prediction, with AlphaFold first entering the CASP13 assessment in 2018. Hinted at future scientific applications. Source: (from training memory of book).
Methods
Rumelhart-Hinton-Williams backpropagation (1986) (importance 5): Rumelhart, Hinton, and Williams published the backpropagation algorithm. It solved the multilayer learning problem that had stalled perceptron research. Sejnowski was a close collaborator of Hinton. Source: (from training memory of book).
Hinton's layer-wise pretraining (2006) (importance 5): Greedy layer-by-layer unsupervised pretraining using RBMs. Worked around the vanishing gradient problem and enabled training of deep networks. Source: (from training memory of book).
LeCun's convolutional networks (importance 4): Yann LeCun developed convolutional neural networks for handwriting recognition at Bell Labs. Used in check-reading systems. Source: (from training memory of book).
Deep Belief Networks (importance 4): Stacked Restricted Boltzmann Machines pretrained layer-wise. Among the first successfully trained deep architectures, helped spark the 'deep learning' terminology. Source: (from training memory of book).
Transformer architecture (2017) (importance 4): Attention-based architecture from Google. Replaced recurrence with self-attention, enabled massive parallelization and scaling. Source: (from training memory of book).
ResNet skip connections (2015) (importance 4): Residual connections allowed training networks with 100+ layers. Won ImageNet 2015, showed depth was key to performance. Source: (from training memory of book).
attention mechanism (importance 4): Allow networks to focus on relevant parts of input. Key innovation for translation and later Transformers. Source: (from training memory of book).
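One common way to "focus on relevant parts of input" is scaled dot-product attention, the variant used in Transformers. It is only one of several attention formulations, and the sketch below is an illustration under that assumption, not the book's own presentation. The function names and toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Each key is scored against the query, the scores become weights
    via softmax, and the output is the weighted average of the values.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context


# Toy input (assumed values): the query points the same way as the first key.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [-10.0, 0.0]]
weights, context = attend(query, keys, values)
print(weights)
print(context)
```

The key most aligned with the query receives the largest weight, so its value dominates the output while the others still contribute a little.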
Hopfield networks (1982) (importance 3): Energy-based recurrent networks that could store and retrieve patterns. Brought physicists into neural network research. Source: (from training memory of book).
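Storing and retrieving a pattern can be sketched in a few lines. The example below assumes the textbook formulation: units take values +1 or -1, weights are set by a Hebbian outer-product rule with no self-connections, and units are updated one at a time. The function names and the six-unit pattern are hypothetical.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns
    using the Hebbian outer-product rule (no self-connections)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, sweeps=5):
    """Update units one at a time until the state settles
    (here: a fixed number of sweeps, an assumption for simplicity)."""
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        for i in range(n):
            field = sum(w[i][j] * state[j] for j in range(n))
            state[i] = 1 if field >= 0 else -1
    return state


stored = [1, -1, 1, -1, 1, -1]
w = train_hopfield([stored])

noisy = [1, 1, 1, -1, 1, -1]  # second unit flipped
recovered = recall(w, noisy)
print(recovered)
```

The corrupted input settles back to the stored pattern, which is the sense in which the network acts as a content-addressable memory.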
Boltzmann machine (importance 3): Stochastic version of Hopfield nets using simulated annealing. Could learn hidden representations but was computationally expensive. Source: (from training memory of book).
LSTM (Hochreiter-Schmidhuber 1997) (importance 3): Long Short-Term Memory networks solved vanishing gradient for sequences using gating mechanisms. Enabled recurrent networks to learn long-range dependencies. Source: (from training memory of book).
Dropout regularization (importance 3): Randomly dropping units during training prevented overfitting in large networks. Key technique enabling AlexNet success. Source: (from training memory of book).
ReLU activation (importance 3): Rectified Linear Units replaced sigmoid/tanh. Avoided vanishing gradients, enabled much faster training of deep networks. Source: (from training memory of book).
GANs (Goodfellow 2014) (importance 3): Generative Adversarial Networks pit generator against discriminator. Produced realistic images without explicit probability models. Source: (from training memory of book).
sequence-to-sequence models (importance 3): Encoder-decoder architecture for variable-length inputs/outputs. Enabled neural machine translation. Source: (from training memory of book).
stochastic gradient descent (importance 3): Update weights using small random batches. Noisy, but enables online learning and can help escape poor local minima. Source: (from training memory of book).
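The "small random batches" idea is shown below on a deliberately simple problem: fitting a line to noise-free points. This is a minimal sketch, not a recipe from the book. The function name `sgd_linear_fit`, the learning rate, the batch size, and the synthetic data are all assumptions.

```python
import random


def sgd_linear_fit(xs, ys, lr=0.05, batch_size=4, epochs=200, seed=0):
    """Fit y = w * x + b by mini-batch stochastic gradient descent
    on mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new random batches every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient estimated from this batch only, not the full dataset.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Synthetic data (assumed): points on the line y = 2x + 1.
xs = [i / 10 for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(w, b)
```

Each step uses only four points, so individual updates are noisy estimates of the full gradient, yet the parameters still settle close to the true slope and intercept.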
self-supervised pretraining (importance 3): Create labels automatically from data structure (predict next word, rotate image). Emerging as key technique 2017-2018. Source: (from training memory of book).
expert systems (importance 2): Rule-based systems encoding human expertise. Popular in 1980s but brittle and expensive to maintain. Source: (from training memory of book).
Fukushima's Neocognitron (1980) (importance 2): Early hierarchical neural network inspired by visual cortex. Precursor to modern convolutional networks but lacked backprop. Source: (from training memory of book).
Batch Normalization (importance 2): Normalize activations within mini-batches. Enabled much faster training and higher learning rates. Source: (from training memory of book).
Neural Turing Machines (importance 2): Networks with external memory and attention. Could learn simple algorithms like sorting. Source: (from training memory of book).
Adam optimizer (importance 2): Adaptive learning rates per parameter. Combines momentum and RMSprop, became default optimizer. Source: (from training memory of book).
data augmentation (importance 2): Artificially expand dataset with transformations (rotations, crops, etc). Reduces overfitting. Source: (from training memory of book).
neural architecture search (importance 2): Automatically discover network architectures via evolution or RL. Found models competitive with human designs. Source: (from training memory of book).
spiking neural networks (importance 1): More biologically realistic than conventional artificial neurons. Communicate with binary spikes instead of continuous activations. Promising for neuromorphic hardware. Source: (from training memory of book).
Entities
Geoffrey Hinton (importance 5): Kept neural networks alive during AI winters. Helped popularize backpropagation and co-developed Boltzmann machines, dropout, and layer-wise pretraining. 'Godfather of deep learning'. Source: (from training memory of book).
Rosenblatt's Perceptron (1958) (importance 4): One of the first learning algorithms for neural networks; it used the perceptron learning rule for single-layer networks. Sparked initial neural network enthusiasm. Source: (from training memory of book).
PDP volumes (Rumelhart-McClelland 1986) (importance 4): Two-volume Parallel Distributed Processing books. Became the bible of connectionism, trained a generation of researchers. Source: (from training memory of book).
ImageNet dataset (Fei-Fei Li 2009) (importance 4): 14 million labeled images across 20,000 categories. Scale was orders of magnitude beyond previous vision datasets. Source: (from training memory of book).
Yann LeCun (importance 4): Pioneered convolutional networks at Bell Labs and NYU. Became Facebook AI Research director. Source: (from training memory of book).
Yoshua Bengio (importance 4): Developed sequence models and attention mechanisms. Key figure in Montreal deep learning community. Source: (from training memory of book).
MNIST dataset (importance 3): 70,000 handwritten digit images (60,000 training, 10,000 test). Became a standard benchmark for vision algorithms from the late 1990s through the 2000s. Source: (from training memory of book).
Hubel-Wiesel visual cortex (importance 3): Hierarchical organization of visual cortex discovered by Hubel and Wiesel. Simple cells → complex cells. Inspired convolutional architectures. Source: (from training memory of book).
Andrew Ng (importance 3): Popularized deep learning at Google and Baidu. Created Coursera deep learning courses reaching millions. Source: (from training memory of book).
Ilya Sutskever (importance 3): Hinton's student, co-created AlexNet. Co-founded OpenAI, became chief scientist. Source: (from training memory of book).
Demis Hassabis (importance 3): Co-founded DeepMind, led AlphaGo project. Neuroscientist background informed AI research approach. Source: (from training memory of book).
Fei-Fei Li (importance 3): Created ImageNet dataset. Organized annual competition that drove computer vision progress 2010-2017. Source: (from training memory of book).
Google TPU chips (importance 2): Tensor Processing Units optimized for matrix multiply. Google reported order-of-magnitude gains in performance per watt over contemporary CPUs and GPUs for inference. Source: (from training memory of book).
Searle's Chinese Room (importance 2): Philosophical argument that symbol manipulation isn't understanding. Debate continues whether deep learning truly 'understands'. Source: (from training memory of book).