In Lesson 1 you built a neuron: multiply, add, squash. Today you'll see what that little machine is actually for. Give it two inputs and it does something surprisingly geometric: it draws a straight line across the plane and says yes on one side, no on the other. By the end you will set its weights by hand and make it compute a real logic gate — your first working decision-maker.
Everything from Lesson 1 in one line. Two inputs x₁, x₂, two weights, a bias, a squash:
a = σ(z), with z = w₁x₁ + w₂x₂ + b
And the one sigmoid fact we need today: σ(0) = 0.5, and the curve only ever goes up — bigger z, bigger a. So a > 0.5 happens exactly when z > 0. If we call a > 0.5 "the neuron fires" (says yes), then the whole decision boils down to one question: is z positive?
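If you'd rather see that fact than trust it, here is a minimal sketch. The sample values in `zs` are arbitrary picks for illustration, and `sigmoid` is assumed to be the same logistic squash as in Lesson 1.

```python
import math

def sigmoid(z):
    """Logistic squash: maps any real z into the range (0, 1)."""
    return 1 / (1 + math.exp(-z))

# sigma(0) sits exactly at one half
print(sigmoid(0))  # 0.5

# The curve only goes up: each output is larger than the one before it
zs = [-4, -2, -0.5, 0, 0.5, 2, 4]  # arbitrary sample points
outputs = [sigmoid(z) for z in zs]
print(all(lo < hi for lo, hi in zip(outputs, outputs[1:])))  # True

# So "a > 0.5" and "z > 0" give the same answer at every sample point
for z, a in zip(zs, outputs):
    print(f"z = {z:+.1f}  a = {a:.4f}  a > 0.5: {a > 0.5}  z > 0: {z > 0}")
```

Note the edge case: at z = 0 the output is exactly 0.5, which is not above 0.5, so the neuron stays quiet right on the boundary.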
Think like a programmer first: the neuron is if z > 0 { yes } else { no }. So where, in the plane of all possible inputs (x₁, x₂), does the answer flip? Precisely where z crosses zero. Let's write that set of "perfectly undecided" points down: first spelled out, then in school form, then the way the pros write it.
w₁x₁ + w₂x₂ + b = 0
Read it: "all the input pairs where the weighted sum plus bias lands exactly on zero." On one side of these points z > 0 (fires), on the other z < 0 (quiet). But what shape is this set?
Treat it like algebra homework: solve for x₂ (assuming w₂ ≠ 0; move the other terms across, then divide by w₂):
x₂ = −(w₁/w₂)·x₁ − b/w₂
Squint at it. That is y = m·x + c — the straight-line equation from school. Symbol by symbol:
−(w₁/w₂) is the slope. It's built only from the weights, so the weights set the line's direction and steepness.
−b/w₂ is the intercept, where the line crosses the x₂-axis. It's the only place the bias appears, so the bias slides the line without rotating it (Wikipedia: the bias "shifts the position (though not the orientation)" of the boundary).
If w₂ = 0, you can't divide; the boundary is then the vertical line x₁ = −b/w₁. Still a straight line.
And the way the pros write it:
w · x + b = 0
Same dot-product shorthand as Lesson 1. ML people call this the decision boundary. With 2 inputs it's a line; with 3 inputs it would be a flat plane slicing 3D space; with n inputs, an "(n−1)-dimensional hyperplane". Scary word, same idea: a flat cut through input space, yes on one side, no on the other.
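The slope-and-intercept reading above can be sketched in a few lines. The helper name `boundary` and the sample weights are made up for illustration; they are not part of the lesson's widget.

```python
def boundary(w1, w2, b):
    """Describe the line w1*x1 + w2*x2 + b = 0 (hypothetical helper).

    Returns ("sloped", slope, intercept) for x2 = slope*x1 + intercept,
    or ("vertical", position) for x1 = position when w2 is zero.
    """
    if w2 != 0:
        return ("sloped", -w1 / w2, -b / w2)
    if w1 != 0:
        return ("vertical", -b / w1)
    raise ValueError("both weights are zero, so there is no line at all")

# Same weights, different bias: the slope stays put, the intercept moves
print(boundary(2.0, 4.0, -4.0))  # ('sloped', -0.5, 1.0)
print(boundary(2.0, 4.0, -8.0))  # ('sloped', -0.5, 2.0)

# w2 = 0: the boundary is the vertical line x1 = -b/w1
print(boundary(3.0, 0.0, -6.0))  # ('vertical', 2.0)
```

The first two calls differ only in the bias, and only the intercept changes. That is the "slides without rotating" behaviour in numbers.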
Time to drive. The four dots are the four possible pairs of binary inputs: (0,0), (0,1), (1,0), (1,1). A filled dot means the neuron fires there (a > 0.5); hollow means quiet. The red line is the decision boundary, live. Your mission: pick a target gate, then tune w₁, w₂, b until every dot gets a ✓. AND fires only when both inputs are 1; OR fires when at least one is. Hint for AND: make both weights equally positive, then drag the bias down until the line isolates (1,1).
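If you don't have the widget in front of you, a rough stand-in is sketched below. The names `fires`, `computes`, `GATES` and `CORNERS` are hypothetical helpers, and the weights (1, 1) and the bias steps are assumptions chosen so that no dot lands exactly on the line.

```python
CORNERS = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Target truth tables: what each gate should answer at a corner
GATES = {
    "AND": lambda x1, x2: x1 == 1 and x2 == 1,
    "OR": lambda x1, x2: x1 == 1 or x2 == 1,
}

def fires(w1, w2, b, x1, x2):
    # a > 0.5 exactly when z > 0, so the sigmoid can be skipped here
    return w1 * x1 + w2 * x2 + b > 0

def computes(w1, w2, b, gate):
    """True if the neuron agrees with the gate on all four corners."""
    return all(fires(w1, w2, b, x1, x2) == GATES[gate](x1, x2)
               for x1, x2 in CORNERS)

# Follow the hint: equal positive weights, then lower the bias step by step
w = 1.0
for step in range(5):
    b = -0.25 - 0.5 * step
    matched = [gate for gate in GATES if computes(w, w, b, gate)]
    print(f"b = {b:+.2f}  matches: {matched if matched else 'neither gate'}")
```

With these assumed values the first two biases give OR, the next two give AND, and the last one pushes the line past (1,1) so that nothing fires.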
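Here is one clean solution (you may have found another; infinitely many work): w₁ = 10, w₂ = 10, b = −15. Check all four inputs by arithmetic you can do in your head:
(0,0): z = 0 + 0 − 15 = −15, a = σ(−15) ≈ 0.0000003, quiet
(0,1): z = 0 + 10 − 15 = −5, a = σ(−5) ≈ 0.0067, quiet
(1,0): z = 10 + 0 − 15 = −5, a = σ(−5) ≈ 0.0067, quiet
(1,1): z = 10 + 10 − 15 = +5, a = σ(+5) ≈ 0.9933, fires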
Every output is below 0.01 or above 0.99 — the neuron isn't just right, it's confidently right. And the geometry matches section ②: with these numbers the boundary is x₂ = −x₁ + 1.5 (slope −10/10 = −1, intercept 15/10 = 1.5) — a diagonal line passing between (1,1) and the other three corners. One more trick: keep the same weights and raise the bias to b = −5, and the line slides down past (0,1) and (1,0) — now it computes OR. Same direction, different position: that's the bias doing its one job.
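The AND-to-OR slide can be checked the same way. This is a small sketch using the lesson's numbers; `describe` is a made-up helper name, and it assumes w₂ ≠ 0 so the slope form applies.

```python
CORNERS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def describe(w1, w2, b):
    """Return (slope, intercept, corners where the neuron fires)."""
    slope = -w1 / w2       # set by the weights only
    intercept = -b / w2    # the only place the bias shows up
    fired = [(x1, x2) for x1, x2 in CORNERS if w1 * x1 + w2 * x2 + b > 0]
    return slope, intercept, fired

print(describe(10.0, 10.0, -15.0))  # (-1.0, 1.5, [(1, 1)])
print(describe(10.0, 10.0, -5.0))   # (-1.0, 0.5, [(0, 1), (1, 0), (1, 1)])
```

The slope is −1 both times. Only the intercept drops, from 1.5 to 0.5, and two more corners end up on the firing side.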
Don't take my word for the numbers — run them:
import Foundation
func sigmoid(_ z: Double) -> Double {
    1 / (1 + exp(-z))
}
// our hand-set AND gate
let w1 = 10.0, w2 = 10.0, b = -15.0
for (x1, x2) in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)] {
    let z = w1*x1 + w2*x2 + b
    let a = sigmoid(z)
    let fires = a > 0.5 ? "YES" : "no"
    print("(\(Int(x1)),\(Int(x2))) z = \(String(format: "%+6.1f", z)) a = \(String(format: "%.7f", a)) fires: \(fires)")
}

import math
def sigmoid(z):
    return 1 / (1 + math.exp(-z))
# our hand-set AND gate
w1, w2, b = 10.0, 10.0, -15.0
for x1, x2 in [(0,0), (0,1), (1,0), (1,1)]:
    z = w1*x1 + w2*x2 + b
    a = sigmoid(z)
    fired = "YES" if a > 0.5 else "no"
    print(f"({x1},{x2}) z = {z:+6.1f} a = {a:.7f} fires: {fired}")
Both print the exact same table (the Python runs in Google Colab, the Swift in an Xcode playground):
(0,0) z =  -15.0 a = 0.0000003 fires: no
(0,1) z =   -5.0 a = 0.0066929 fires: no
(1,0) z =   -5.0 a = 0.0066929 fires: no
(1,1) z =   +5.0 a = 0.9933071 fires: YES
This idea, a neuron as a yes/no decider, is the original neural network. Frank Rosenblatt's perceptron, developed in the 1950s and 60s, output a hard 1 if w·x + b > 0 and a hard 0 otherwise; no sigmoid, just the cliff. Nielsen calls it "a device that makes decisions by weighing up evidence," and builds a NAND gate from one with weights (−2, −2) and bias 3, which is exactly the game you played above. Our sigmoid neuron is, in Nielsen's words, a "smoothed out perceptron": instead of a cliff at the boundary it gives a smooth ramp, so an output can be 0.0000003 or 0.9933071 rather than a hard 0 or 1. That will matter enormously when we get to learning, because smooth things can be nudged gradually.
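To see the cliff version in action, here is a minimal sketch of a hard-threshold neuron running the NAND numbers quoted above. The function name `perceptron` is just a label for this sketch.

```python
def perceptron(w1, w2, b, x1, x2):
    """Hard-threshold neuron: 1 if w·x + b > 0, otherwise 0."""
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

# NAND with weights (-2, -2) and bias 3
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    z = -2 * x1 - 2 * x2 + 3
    print(f"({x1},{x2})  z = {z:+d}  NAND = {perceptron(-2, -2, 3, x1, x2)}")
```

The weighted sums come out as +3, +1, +1 and −1, so the output is 1 everywhere except (1,1): the AND gate turned upside down.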
Now the famous catch. XOR fires when the inputs differ: yes for (0,1) and (1,0), no for (0,0) and (1,1). Look at the widget plot and try to imagine one straight line with (0,1) and (1,0) on one side and the two other corners — which sit diagonally between them — on the other. There isn't one. No weights, no bias, no luck: a single neuron carves the plane with one straight cut, and XOR's yes-points and no-points interlock across the diagonal. This is exactly the limitation Minsky and Papert proved in their 1969 book Perceptrons (Wikipedia). The fix is not a cleverer line — it's more neurons, whose lines combine into bent, flexible boundaries. That is where we go next: a layer of neurons.
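You can poke at the XOR claim numerically. The sketch below tries every weight and bias on an assumed grid (−10 to 10 in steps of 0.5) and counts how many settings reproduce each truth table. A grid search is only a sanity check, not a proof, and the names `truth_table` and `found` are made up for this example.

```python
from itertools import product

CORNERS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def truth_table(w1, w2, b):
    """Which of the four corners the neuron fires on (z > 0)."""
    return tuple(w1 * x1 + w2 * x2 + b > 0 for x1, x2 in CORNERS)

XOR = (False, True, True, False)
AND = (False, False, False, True)

# Assumed search grid: -10.0, -9.5, ..., 9.5, 10.0 for each of w1, w2, b
grid = [k / 2 for k in range(-20, 21)]

found = {"XOR": 0, "AND": 0}
for w1, w2, b in product(grid, repeat=3):
    table = truth_table(w1, w2, b)
    if table == XOR:
        found["XOR"] += 1
    if table == AND:
        found["AND"] += 1

print(found)  # the XOR count is 0; the AND count is well above 0
```

The algebra says the same thing for every real number, not just the grid. XOR needs b ≤ 0 (quiet at (0,0)), w₁ + b > 0 and w₂ + b > 0 (firing at the two mixed corners), and w₁ + w₂ + b ≤ 0 (quiet at (1,1)). Adding the two middle conditions gives w₁ + w₂ + 2b > 0, so w₁ + w₂ + b > −b ≥ 0, which contradicts the last one.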
No peeking back. Pull it from memory.
Primary source — Nielsen, Neural Networks and Deep Learning, chapter 1, the perceptrons section. It's the cheese-festival story: a perceptron deciding whether you go to a festival by weighing up the weather, company, and transit — the friendliest serious treatment of today's idea, and the book we'll keep returning to. Read up to (and including) "Sigmoid neurons"; the rest of the chapter previews where this whole course is headed.