Types of Machine Learning

Discover the three core families of machine learning — supervised, unsupervised, and reinforcement learning — and learn which to reach for when.

Every app that recommends a song, detects spam, or plays chess at a superhuman level is powered by machine learning. But not all ML systems learn the same way. Just as humans learn differently — sometimes from a teacher giving feedback, sometimes by exploring on their own, sometimes through trial and error — machines have three distinct learning styles too.

Here is the core idea in one sentence each:
- Supervised Learning: learn from labeled examples to predict something new.
- Unsupervised Learning: find hidden structure in data when you have no labels at all.
- Reinforcement Learning: learn by trial and error, earning rewards for good moves.

#Supervised Learning: Learning from Examples

The Flashcard Analogy

Imagine studying for a test using flashcards. Each card has a question on the front (the input) and the answer on the back (the label). After seeing thousands of cards, you start to notice patterns and can answer new questions you have never seen before. That is exactly what supervised learning does: every training example is an (input, label) pair, and the model learns to map inputs to the correct label.

Supervised learning has two flavors: regression predicts a number (e.g., house price), while classification predicts a category (e.g., spam vs. not-spam).

Linear regression — the model finds a best-fit line through the training data and uses it to predict new values.
# Regression from scratch: predict house price from size
# Data: (size in 100 sqft, price in $1000s)
points = [(10, 200), (15, 280), (20, 350)]

n = len(points)
sum_x  = sum(x for x, y in points)
sum_y  = sum(y for x, y in points)
sum_xy = sum(x * y for x, y in points)
sum_xx = sum(x * x for x, y in points)

slope     = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

print(f"Best-fit line: price = {slope:.2f} * size + {intercept:.2f}")
print(f"Predicted price for size 18: ${slope * 18 + intercept:.0f}k")
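The `slope` and `intercept` lines in the code above are the closed-form least-squares solution for a single input feature. As a reference sketch, here is the same calculation in math notation. The symbols are assumptions chosen for this note, not names from the lesson: m stands for `slope`, b for `intercept`, n for the number of training points, and each sum runs over all (x_i, y_i) pairs in `points`.

```latex
\begin{aligned}
m &= \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} \\
b &= \frac{\sum y_i - m\sum x_i}{n} \\
\hat{y} &= m\,x + b
\end{aligned}
```

Each sum maps to one variable in the code: `sum_xy`, `sum_x`, `sum_y`, and `sum_xx`. The last line is the prediction rule used in the final `print`.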
K-Nearest Neighbors — classify by majority vote among the K closest training examples.
import math

# Classification from scratch: K-Nearest Neighbors
# Training: (color_score 0=yellow/1=orange/2=red, weight_g, label)
training = [
    (0, 150, "banana"), (0, 140, "banana"),
    (1, 180, "orange"), (1, 190, "orange"),
    (2, 80,  "apple"),  (2, 90,  "apple"),
]

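def knn_predict(point, k=3):
    # Euclidean distance from the query to a training row's two features.
    # Note: features are unscaled, so weight (80-190) outweighs color_score (0-2).
    def dist(row):
        return math.dist(point, row[:2])
    ranked = sorted(training, key=dist)            # closest rows first
    votes = [label for _, _, label in ranked[:k]]  # labels of the k nearest rows
    return max(set(votes), key=votes.count)        # majority vote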

print(knn_predict((1, 185)))  # orange-ish, 185g
print(knn_predict((2, 85)))   # red-ish, 85g

#Unsupervised Learning: Finding Hidden Structure

Unsupervised learning has no labels at all. The algorithm receives only raw inputs and must discover patterns on its own. The most common task is clustering — grouping similar data points together.

K-Means clustering works in four elegant steps:
1. Randomly place K "centroids" (imaginary cluster centers) in the data.
2. Assign every point to its nearest centroid.
3. Move each centroid to the average position of all its assigned points.
4. Repeat steps 2–3 until the centroids stop moving.

Real-world uses: customer segmentation, grouping news articles by topic, and compressing images.

K-Means discovers three natural groups with no labels — just the raw numbers.
# K-Means from scratch (1D for clarity)
data = [1, 1.5, 2, 8, 8.5, 9, 25, 26, 27]
centroids = [1.0, 10.0, 20.0]   # K=3 initial guesses

for _ in range(10):
    clusters = [[] for _ in centroids]
    for x in data:
        nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
        clusters[nearest].append(x)
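    # Keep the old centroid when a cluster is empty, so K never shrinks
    # and cluster i always lines up with centroid i.
    new_c = [sum(c)/len(c) if c else centroids[i] for i, c in enumerate(clusters)]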
    if new_c == centroids:
        break
    centroids = new_c

for i, c in enumerate(clusters):
    print(f"Cluster {i+1} (center={centroids[i]:.1f}): {c}")
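The 1D example above starts from hand-picked centroids. As a rough sketch of how the same four steps might look with random starting centroids and 2D points, consider the version below. The function name `kmeans`, its `max_iters` and `seed` parameters, and the sample points are all assumptions made for illustration. Picking the starting centroids from the data points themselves is one common way to do step 1, not the only one.

```python
import math
import random

def kmeans(points, k, max_iters=100, seed=0):
    """Sketch of K-Means for points given as equal-length tuples."""
    rng = random.Random(seed)              # seeded so runs are repeatable
    centroids = rng.sample(points, k)      # step 1: random starting centroids
    clusters = [[] for _ in range(k)]
    for _ in range(max_iters):
        # Step 2: assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 3: move each centroid to the mean of its points
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        # Step 4: stop once nothing moved.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical sample data: two loose groups of 2D points.
points = [(1, 1), (1.5, 2), (2, 1), (8, 8), (8.5, 9), (9, 8)]
centroids, clusters = kmeans(points, k=2)
for center, members in zip(centroids, clusters):
    print(center, members)
```

Because the starting centroids are random, different seeds can lead to different groupings. Trying a few seeds and comparing the results is a cheap sanity check.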

#Reinforcement Learning: Learning by Doing

Reinforcement Learning (RL) is different from both families above — there is no fixed dataset at all. Instead, an agent takes actions in an environment and receives rewards (or penalties) based on the outcomes.

Think of learning to ride a bike: nobody gives you a labeled dataset of "good balances". You just try, fall (negative reward), adjust, and eventually stay upright. The core RL loop is: observe state → choose action → receive reward → update policy → repeat.
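To make that loop concrete, here is a minimal sketch using tabular Q-learning, one common RL algorithm (the lesson itself does not name a specific one). Everything in it is an assumption chosen for illustration: the five-cell corridor environment, the `step` and `train` functions, and the values of `alpha`, `gamma`, and `epsilon`. The agent starts in cell 0 and earns a reward of 1 only when it reaches cell 4.

```python
import random

N_STATES = 5                  # corridor cells 0..4; cell 4 is the goal
ACTIONS = ["left", "right"]

def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # Q-table: estimated long-term reward for each (state, action) pair.
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
    for _ in range(episodes):
        state, done = 0, False                      # observe the starting state
        while not done:
            # Choose an action: explore with probability epsilon, else exploit.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state].values())
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            nxt, reward, done = step(state, action)  # receive the reward
            # Update the policy: nudge the estimate toward reward + discounted future value.
            target = reward if done else reward + gamma * max(q[nxt].values())
            q[state][action] += alpha * (target - q[state][action])
            state = nxt                              # repeat from the new state
    return q

q = train()
policy = {s: max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)}
print(policy)
```

After training, the learned policy should choose "right" in every cell, even though nothing ever told the agent that directly. It only saw rewards.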

Famous RL examples:
- Game bots: AlphaGo and OpenAI Five beat world champions after training largely through self-play (the original AlphaGo also learned from human expert games).
- Robotics: robot arms learn to grasp objects by trial and error.
- Recommendation systems: a newsfeed learns which stories keep you engaged.

Common Misconception: "More Data = Supervised Learning"

It is tempting to think that if you have a huge dataset, you should use supervised learning. But the key question is: do your data points have labels? A million customer transactions with no category tags call for unsupervised clustering, not supervised classification. Before choosing your approach, always ask: "Do I already have the correct answers for my training examples?"

#When to Use Which?

Here is a quick decision guide:

| Situation | Best fit |
|---|---|
| Labeled data; predict a continuous number | Supervised (regression) |
| Labeled data; predict a category | Supervised (classification) |
| No labels; discover natural groupings | Unsupervised (clustering) |
| Agent must learn a strategy through interaction | Reinforcement Learning |
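The decision guide can also be read as a short chain of questions. The helper below is a hypothetical sketch of that reading. The name `suggest_family` and its parameters are made up for this example, and real projects often involve more nuance than three yes/no questions.

```python
def suggest_family(has_labels, target=None, learns_by_interaction=False):
    """Map the decision guide's rows to a suggestion (illustrative helper)."""
    if learns_by_interaction:
        return "reinforcement learning"
    if has_labels:
        if target == "number":
            return "supervised regression"
        if target == "category":
            return "supervised classification"
        raise ValueError("target must be 'number' or 'category' when labels exist")
    return "unsupervised clustering"

# A million untagged customer transactions: no labels, so clustering.
print(suggest_family(has_labels=False))
# Labeled house sales with a price to predict: regression.
print(suggest_family(has_labels=True, target="number"))
```

The order of the checks mirrors the questions to ask: is the system learning through interaction, and if not, do the training examples already have correct answers?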

Most beginners start with supervised learning because labeled datasets are easiest to reason about. Unsupervised learning shines for exploration when labeling is expensive. Reinforcement learning is powerful but complex — reserve it for sequential decision problems like games and robotics.

Real libraries: scikit-learn covers supervised and unsupervised learning with clean one-liners. Gymnasium and Stable Baselines3 are the go-to tools for RL.

Quick check

A music app wants to group its 10 million users into listener "personas" based on their listening habits — no one has manually tagged any users into categories. Which type of ML fits best?

Key takeaways

  • Supervised learning trains on (input, label) pairs to predict labels for new inputs — regression for numbers, classification for categories.
  • Unsupervised learning finds hidden structure (like clusters) when no labels exist.
  • Reinforcement learning trains an agent to maximize rewards through trial and error — no static dataset required.
  • The first question for any ML problem: do I have labels? That single question steers you toward supervised vs. unsupervised.
  • Simple from-scratch implementations (linear regression, KNN, K-Means) reveal exactly how these algorithms work under the hood.
Practice challenges
Predict the output #1

This is the lesson's from-scratch linear regression. Given the training points, what does it print?

points = [(10, 200), (15, 280), (20, 350)]

n = len(points)
sum_x  = sum(x for x, y in points)
sum_y  = sum(y for x, y in points)
sum_xy = sum(x * y for x, y in points)
sum_xx = sum(x * x for x, y in points)

slope     = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

print(f"slope = {slope}")
print(f"size 18 -> ${slope * 18 + intercept:.0f}k")
Fill in the blank #2

In K-Nearest Neighbors, a new point is classified by a majority vote among its K closest neighbors. Complete the line that returns the winning label from the collected votes.

votes = ["orange", "orange", "apple"]
winner = max(set(votes), key=votes.____)
print(winner)  # -> orange
Reorder the lines #3

Put the four steps of the K-Means clustering algorithm (as taught in the lesson) into the correct order.

1. Move each centroid to the average of its assigned points
2. Randomly place K centroids in the data
3. Assign every point to its nearest centroid
4. Repeat the assign-and-move steps until centroids stop moving
Fix the bug #4

This code has a bug — what's wrong?

# Group 10 million users into personas from raw listening data.
# No user has been manually tagged into a category.
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(user_features, user_persona_labels)
Your turn
Practice exercise

Implement a tiny 1-Nearest Neighbor classifier from scratch. Given a list of labeled training points [(x, label), ...] and a new input value x_new, find the single training point closest to x_new (using absolute difference) and return its label.

Then use your function to classify the mystery values 4.0 and 14.0 given the training data below.
