AI Foundations · Beginner · 9 min · 03 / 14

A Short History of AI

Travel through 70 years of AI history — from a single thought experiment in 1950 to the large language models reshaping the world today.

Every technology has an origin story, but few are as dramatic as artificial intelligence. In 1997, a chess computer defeated the reigning world champion. In 2016, a program mastered a game so complex that many experts thought the feat was still a decade or more away. And by the early 2020s, machines were writing poetry, explaining science, and generating art that was often hard to tell apart from human work.

How did we get here? The answer is a wild ride of bold dreams, crushing failures, and unlikely comebacks — and understanding it helps you see why AI works the way it does today.

#The Spark: Turing (1950) and Dartmouth (1956)

The story begins with a thought experiment. In 1950, British mathematician Alan Turing asked: "Can machines think?" He proposed a practical test — if you type messages to both a human and a machine without being able to see either, and you cannot reliably tell them apart, the machine deserves to be called intelligent. The Turing Test gave researchers a concrete goal.

Six years later, a summer workshop at Dartmouth College (1956) gave the field its name: Artificial Intelligence. Organizers like John McCarthy and Marvin Minsky believed human-level AI might arrive within a decade. The mood was electric — and wildly over-optimistic.

Think of it like

The Imitation Game

Think of the Turing Test like a costume party where everyone is masked. You judge contestants purely by conversation. If the machine's costume is convincing enough that you can't tell it from a real person, it has passed. The test is about behavior, not consciousness.

#Symbolic AI, Expert Systems, and Two AI Winters

Early AI researchers believed intelligence was about symbols and rules. If you could write down all the rules a doctor uses to diagnose a disease, you could build a program that diagnosed patients. These "expert systems" worked well in narrow domains. MYCIN (1970s), for example, performed about as well as human specialists in evaluations of its treatment recommendations for bacterial infections. But writing rules for the full messiness of the real world proved impossible.

Twice, funding dried up almost completely. The First AI Winter (mid-1970s) arrived when early promises failed to materialize. A brief thaw in the 1980s ended in a Second AI Winter (~1987–1993) when commercial expert systems proved too expensive and brittle to maintain. The winters taught a hard lesson: AI needs three ingredients — the right algorithms, enough data, and sufficient computing power. Miss any one and progress stalls.

Expert systems encode human knowledge as explicit if/then rules — transparent, but they break the moment reality strays outside the rulebook.
# A tiny expert system: simplified cold vs flu diagnosis
def diagnose(fever, muscle_aches, runny_nose):
    score = 0
    if fever > 38.5:  score += 2  # high fever -> flu
    if muscle_aches:  score += 2  # body aches -> flu
    if runny_nose:    score += 1  # runny nose -> cold
    return "Likely Flu" if score >= 3 else "Likely Cold"

print(diagnose(fever=39.1, muscle_aches=True,  runny_nose=False))  # score 4 -> Likely Flu
print(diagnose(fever=37.2, muscle_aches=False, runny_nose=True))   # score 1 -> Likely Cold
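
The brittleness is easy to demonstrate. The sketch below repeats the same rules so it runs on its own, then feeds them two made-up cases (hypothetical patients, not from any real system) that the rulebook never anticipated.

```python
# Same rules as the tiny expert system, repeated so this snippet runs standalone.
def diagnose(fever, muscle_aches, runny_nose):
    score = 0
    if fever > 38.5:
        score += 2  # high fever -> flu
    if muscle_aches:
        score += 2  # body aches -> flu
    if runny_nose:
        score += 1  # runny nose -> cold
    return "Likely Flu" if score >= 3 else "Likely Cold"

# Hypothetical edge cases the rule author never planned for.
healthy = diagnose(fever=36.8, muscle_aches=False, runny_nose=False)
after_gym = diagnose(fever=38.6, muscle_aches=True, runny_nose=False)

print(healthy)    # Likely Cold: a person with no symptoms still gets a diagnosis
print(after_gym)  # Likely Flu: sore muscles plus a fever just over the threshold
```

Neither answer is a crash. The rules simply have no notion of "healthy" or "something else", and patching every such gap by hand is what made large rule bases so costly to maintain.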
Watch out

The Hype Cycle Trap

Both AI winters were caused by over-promising and under-delivering. Researchers announced breakthroughs that couldn't scale to messy real-world problems. This pattern has repeated — knowing it exists is useful armor against believing every AI headline.

#The Comeback: Machine Learning and Narrow Wins

Progress continued in narrow domains. In 1997, IBM's Deep Blue defeated world chess champion Garry Kasparov by brute-force searching ~200 million positions per second. Impressive — but completely useless outside chess.

Meanwhile, machine learning was quietly maturing. Instead of hand-writing rules, ML algorithms learn patterns from data. Feed a system thousands of labeled spam and non-spam emails, and it figures out the filtering rules on its own. By the 2000s, techniques like Support Vector Machines and Random Forests were solving real problems in medicine, finance, and speech recognition.
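
To contrast with hand-written rules, here is a minimal sketch of learning from labeled examples. It is a deliberately simple word-count scorer, not a technique any real spam filter is claimed to use; the training emails and the function names `train` and `looks_like_spam` are made up for illustration.

```python
from collections import Counter

# Made-up training data for illustration: (email text, is it spam?) pairs.
training_emails = [
    ("win free money now", True),
    ("free prize claim now", True),
    ("cheap pills free shipping", True),
    ("meeting moved to monday", False),
    ("lunch on friday", False),
    ("project notes from the meeting", False),
]

def train(examples):
    """Count how often each word appears in spam versus non-spam emails."""
    spam_words, ham_words = Counter(), Counter()
    for text, label in examples:
        target = spam_words if label else ham_words
        target.update(text.split())
    return spam_words, ham_words

def looks_like_spam(text, spam_words, ham_words):
    """Flag an email if its words appeared more often in spam than in non-spam."""
    score = sum(spam_words[w] - ham_words[w] for w in text.split())
    return score > 0

spam_words, ham_words = train(training_emails)
print(looks_like_spam("claim your free money", spam_words, ham_words))     # True
print(looks_like_spam("notes for monday meeting", spam_words, ham_words))  # False
```

Nobody typed a rule saying "free" is suspicious. The weight came from the counts, so feeding in different examples would produce different rules.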

Then in 2016, DeepMind's AlphaGo shocked the world by defeating Go champion Lee Sedol 4-1. Go has more possible board positions than there are atoms in the observable universe, and many experts had expected a win over a top professional to be at least a decade away. AlphaGo combined deep neural networks with tree search and reinforcement learning: after first learning from records of human expert games, it played millions of games against itself and found moves that surprised even top professionals.
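
The scale comparison can be checked with quick arithmetic. Each of the 361 points on a 19x19 board is empty, black, or white, so 3^361 is an upper bound on the number of arrangements (not all of them are legal positions). The figure of about 10^80 atoms is a commonly cited rough estimate and is used here as an assumption.

```python
# Upper bound on Go board arrangements: 361 points, 3 possible states each.
# Not every arrangement is a legal position, so this overcounts somewhat.
go_arrangements = 3 ** 361

# Commonly cited rough estimate for atoms in the observable universe (assumption).
atoms_in_universe = 10 ** 80

print(len(str(go_arrangements)))            # 173 decimal digits
print(go_arrangements > atoms_in_universe)  # True
```

A number with 173 digits dwarfs one with 81, which is why the brute-force search that worked for chess was not enough for Go.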

#2012: The Deep Learning Earthquake

The single biggest turning point came at the ImageNet image-recognition contest in 2012. Until then, the best entries had top-5 error rates of around 26%, meaning the correct label was missing from the program's five best guesses for about one image in four. Then a team from Geoffrey Hinton's lab submitted AlexNet, a deep neural network, and cut the error rate to 15.3%. The runner-up, at roughly 26%, wasn't close.

Deep neural networks learn in layers: early layers detect edges, middle layers detect shapes, and deeper layers recognize complex objects. Three things made 2012 the breakthrough moment:

  • Big data: ImageNet had over a million labeled photographs.
  • GPUs: Graphics cards (built for video games) trained neural nets up to ~50x faster than CPUs.
  • Algorithms: Decades of quiet refinements to training techniques finally paid off.
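
To make "early layers detect edges" concrete, here is a hand-written sketch of the kind of filter a first layer tends to end up with. In a real network these weights are learned from data rather than typed in; the tiny image, the kernel values, and the helper name `slide_filter` are illustrative assumptions.

```python
# A tiny grayscale "image": dark on the left (0), bright on the right (9).
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

# A 1x2 filter that responds to a left-to-right jump in brightness.
# Hand-picked for illustration; a trained network learns such weights itself.
kernel = [-1, 1]

def slide_filter(row, kernel):
    """Slide the kernel across one row and record the response at each position."""
    width = len(kernel)
    return [
        sum(k * pixel for k, pixel in zip(kernel, row[i:i + width]))
        for i in range(len(row) - width + 1)
    ]

edge_map = [slide_filter(row, kernel) for row in image]
for row in edge_map:
    print(row)  # [0, 0, 9, 0, 0]: the response peaks exactly at the edge
```

Stacking many such filters, and then filters over their outputs, is roughly how later layers get from edges to shapes to whole objects.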

The 2012 ImageNet result kicked off the modern AI boom. The 2017 Transformer architecture then unlocked large language models (LLMs), which are trained on vast amounts of raw text and can write, reason, translate, and debug code.

AI history is, at its core, a story about patience and compounding progress. The recipe (data, compute, algorithms) hasn't changed. What changed is that we finally have all three at sufficient scale to produce genuinely astonishing results.

Common mistake

"AI is new" — a common misconception

Many people assume AI is a product of the 2020s. In reality, core ideas are 70+ years old. Neural networks were theorized in 1943, backpropagation was popularized in 1986, and the Turing Test dates to 1950. What changed recently is scale — more data, faster hardware, and refined training recipes finally made those old ideas work at a useful level.

Quick check

What three factors came together around 2012 to ignite the modern deep learning revolution?

Key takeaways

  • Alan Turing's 1950 thought experiment and the 1956 Dartmouth workshop gave AI its name and first goals.
  • Symbolic AI (expert systems) worked in narrow domains but broke down when faced with the messiness of the real world.
  • Two AI winters taught the field that hype without results leads to funding cuts — real progress must be earned.
  • The 2012 ImageNet moment showed that big data + GPUs + deep neural networks was a breakthrough combination.
  • Modern AI is powered by three compounding forces: exponentially more data, faster hardware, and better algorithms.
Practice challenges
Predict the output #1

The lesson's expert system scores symptoms with explicit if/then rules: +2 for a high fever, +2 for muscle aches, +1 for a runny nose, and diagnoses "Likely Flu" when the score is 3 or more. What does this snippet print?

def diagnose(fever, muscle_aches, runny_nose):
    score = 0
    if fever > 38.5:  score += 2
    if muscle_aches:  score += 2
    if runny_nose:    score += 1
    return "Likely Flu" if score >= 3 else "Likely Cold"

print(diagnose(fever=38.0, muscle_aches=False, runny_nose=True))
Fix the bug #2

This code has a bug — what's wrong? It is meant to compute how much AlexNet improved on the previous best ImageNet error rate (the lesson says the old best was ~26% and AlexNet cut it to 15.3%).

old_error = 26.0     # % error, previous best
alexnet_error = 15.3 # % error, AlexNet 2012

# how many percentage points did AlexNet improve?
improvement = alexnet_error - old_error
print(f"Improved by {improvement} points")
Fill in the blank #3

Complete the lesson's core idea about machine learning. Unlike an expert system where a human writes the rules, a machine learning model ______ the rules directly from data (for example, from thousands of labeled spam emails).

# Expert system: a human hand-writes the rules.
# Machine learning: the model ______ the rules from labeled data.
model.fit(training_emails, spam_labels)
Reorder the lines #4

Put these AI milestones in chronological order, from earliest to latest, matching the timeline described in the lesson.

1
(1997, "Deep Blue defeats Kasparov at chess")
2
(2017, "The Transformer architecture unlocks large language models")
3
(1950, "Turing proposes the Turing Test")
4
(1956, "Dartmouth workshop names the field 'Artificial Intelligence'")
5
(2016, "AlphaGo defeats Lee Sedol at Go")
6
(2012, "AlexNet wins ImageNet, igniting deep learning")
Your turn
Practice exercise

Build a tiny timeline explorer. You are given a list of AI milestones as (year, event) tuples. Write a function milestones_in_range(events, start, end) that returns all milestones whose year falls within [start, end] inclusive, sorted by year. Then print each result formatted as YEAR: event.
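
If you want something to compare against after trying it yourself, here is one possible sketch. It is only one way to meet the description above; the sample `events` list reuses milestones from this lesson, and the query range 1990 to 2016 is an arbitrary choice for the demo.

```python
# Milestones taken from the lesson's timeline, deliberately out of order.
events = [
    (1997, "Deep Blue defeats Kasparov at chess"),
    (2017, "The Transformer architecture unlocks large language models"),
    (1950, "Turing proposes the Turing Test"),
    (1956, "Dartmouth workshop names the field 'Artificial Intelligence'"),
    (2016, "AlphaGo defeats Lee Sedol at Go"),
    (2012, "AlexNet wins ImageNet, igniting deep learning"),
]

def milestones_in_range(events, start, end):
    """Return the (year, event) tuples with start <= year <= end, sorted by year."""
    selected = [item for item in events if start <= item[0] <= end]
    return sorted(selected, key=lambda item: item[0])

# Print each match as "YEAR: event".
for year, event in milestones_in_range(events, 1990, 2016):
    print(f"{year}: {event}")
```

Filtering first and sorting second keeps the sort small, and the chained comparison `start <= year <= end` makes both ends of the range inclusive.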
