[TECH] AI & Large Language Models

Large Language Models (LLMs) are the current frontier of artificial intelligence: neural networks trained on internet-scale data that can converse, reason, and write code. Related generative and multimodal models also generate images and assist with scientific discovery.

Overview

GPT-3 (OpenAI, 2020, 175B parameters) demonstrated that scaling language models produces qualitatively new capabilities. GPT-4 (2023), Claude (Anthropic, 2023), and Gemini (Google DeepMind, 2023) score at or near human expert level on many professional and academic benchmarks. Beyond language models, AlphaFold 2 (2020) predicted protein structures at near-experimental accuracy, a result widely described as solving the protein structure prediction problem; AlphaCode (2022) ranked at about the level of the median human competitor in programming contests; AlphaGeometry (2024) solves olympiad-level geometry problems. AI is now being deployed in drug discovery, materials design, climate modelling, scientific literature synthesis, and industrial automation.

Key Actors

Companies: OpenAI (founded 2015), Anthropic (founded 2021), Google DeepMind (formed by the 2023 merger of Google Brain and DeepMind), Meta AI, Mistral, Cohere, xAI
Investors: Microsoft (a reported USD 13B in OpenAI), Google (a reported USD 400M in Anthropic), Amazon (up to USD 4B in Anthropic)

Key Technologies

Transformer architecture (Vaswani et al., 2017)
Reinforcement learning from human feedback (RLHF)
Constitutional AI (Anthropic)
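
The core operation of the Transformer architecture listed above is scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V. The following is a minimal pure-Python sketch of that single operation, not a full Transformer: it omits learned projections, multiple heads, masking, and batching. The function names and the toy input are illustrative assumptions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is the attention-weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Toy input: three tokens with 2-dimensional embeddings (illustrative values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = attention(tokens, tokens, tokens)
```

Using the same list for queries, keys, and values corresponds to self-attention: every token's output is a mixture of all tokens' vectors, weighted by similarity.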

Economic Value

AI market size: roughly USD 200 billion (2023, Grand View Research). Goldman Sachs (2023) projects that generative AI could raise annual global GDP by about 7%, or almost USD 7 trillion, over a 10-year period. McKinsey (2023) estimates that generative AI alone could add USD 2.6T to 4.4T/year in value across the use cases it analysed.

Notes

Goldman Sachs, "The Potentially Large Effects of Artificial Intelligence on Economic Growth" (2023).
McKinsey, "The Economic Potential of Generative AI" (2023).
Grand View Research, AI Market (2023).

What This Enables

This is a current frontier node — no downstream connections yet recorded in this graph.

Discovery Character

Surprise level: Extreme. GPT-3's emergent abilities (2020), including few-shot learning, code generation, and arithmetic, surprised many researchers, reportedly including some at OpenAI. The capabilities of GPT-4 (2023) arrived years ahead of what many AI researchers had forecast. Scaling laws (loss falls predictably as compute, data, and model size grow) were discovered empirically; they forecast aggregate performance but did not predict which specific capabilities would emerge at each scale transition.
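
To make "improves predictably" concrete, here is a minimal sketch that assumes the simple power-law form loss = a * compute^(-b), a form commonly reported in the scaling-law literature. It fits that curve by linear regression in log-log space and extrapolates to a larger compute budget. The data points and the constants a = 10 and b = 0.05 are synthetic, chosen only for illustration; they are not measurements from any real model.

```python
import math


def fit_power_law(compute, loss):
    """Least-squares fit of loss = a * compute**(-b) in log-log space."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    # log(loss) = log(a) - b * log(compute), so a = exp(intercept), b = -slope.
    return math.exp(intercept), -slope


def predict_loss(a, b, compute):
    """Evaluate the fitted power law at a given compute budget."""
    return a * compute ** (-b)


# Synthetic training runs generated from a = 10, b = 0.05 (made-up values).
compute = [1e18, 1e19, 1e20, 1e21]
loss = [10.0 * c ** (-0.05) for c in compute]

a, b = fit_power_law(compute, loss)
extrapolated = predict_loss(a, b, 1e23)  # forecast two orders of magnitude out
```

A fit like this forecasts only an aggregate number such as loss. It says nothing about which individual capabilities appear at a given scale, which is the gap the paragraph above describes.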

Mode: Systematic with emergent surprises. Training large language models is systematic: scale up data, compute, and model size according to known scaling laws. But the capabilities that emerge at each scale threshold, such as in-context learning, chain-of-thought reasoning, and code synthesis, were not predicted in advance. This combination of systematic training infrastructure and unpredicted emergent intelligence is the defining characteristic of the current AI moment.
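
In-context (few-shot) learning means the model picks up a task from worked examples placed in its prompt, with no change to its weights. The sketch below only builds such a prompt as a string; it does not call any model. The function name `build_few_shot_prompt` and the "Q:" / "A:" layout are illustrative assumptions, not a format required by any particular system.

```python
def build_few_shot_prompt(examples, query, instruction=""):
    """Format worked examples followed by one unanswered query."""
    lines = []
    if instruction:
        lines.append(instruction)
        lines.append("")
    for question, answer in examples:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
        lines.append("")
    # The final question is left unanswered for the model to complete.
    lines.append(f"Q: {query}")
    lines.append("A:")
    return "\n".join(lines)


# Two demonstrations of the task, then a new instance (illustrative values).
examples = [("2 + 3", "5"), ("7 + 6", "13")]
prompt = build_few_shot_prompt(examples, "12 + 9", instruction="Add the numbers.")
print(prompt)
```

The examples are the only task specification the model receives, which is why the ability to generalise from them counts as an emergent capability rather than something trained for explicitly.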
