AI Foundations · Lesson 1

The Artificial Neuron

Everything in modern AI — every chatbot, every image model — is built from one tiny repeated unit. Before layers, before training, before any of the scary words, there is just this: a neuron that takes some numbers in and pushes one number out. Learn this one thing and you own the foundation.

A neuron is a thing that holds a number

An artificial neuron holds a single number called its activation: think of it as how "lit up" the neuron is. Where does that number come from? From the neurons feeding into it. Each incoming number x arrives over a connection that has a weight w, the strength of that connection.

This is a deliberate cartoon of a biological neuron. The brain inspired the name; the math is its own thing. (3Blue1Brown)

Feel it: tune a neuron by hand

Before any formulas — drive the thing. Two inputs, two weights, one bias. The neuron multiplies each input by its weight, adds everything (bias included) into one number z, then squashes z into the range 0 to 1; that squashed value is the activation a = σ(z) — σ is just the name of the squashing step, and we'll unpack the math after you've played. Move the sliders and watch both numbers react. Notice: big positive z → activation near 1; big negative → near 0; z = 0 → exactly 0.5.

[Interactive slider widget. Values shown: x₁ = 1.0, w₁ = 1.5, x₂ = -1.0, w₂ = 0.5, b = 0.0]
z = w₁·x₁ + w₂·x₂ + b = 1.00
activation a = σ(z) = 0.731

Write it in code

What you just did with the sliders is a handful of lines: multiply each input by its weight, sum, add bias, squash. Here it is in Swift and Python — read whichever feels like home, then run one for real. (Python looking foreign? The Python-for-Swift-developers reference decodes every construct this book uses.) Paste the Python version into a fresh Google Colab notebook (zero install) and run it — you should see 0.7310585786300049. The Swift version runs as-is in an Xcode playground.

import Foundation

func sigmoid(_ z: Double) -> Double {
    1 / (1 + exp(-z))
}

func neuron(inputs: [Double], weights: [Double], bias: Double) -> Double {
    // weighted sum: w1*x1 + w2*x2 + ...
    var z = bias
    for i in 0..<inputs.count {
        z += weights[i] * inputs[i]
    }
    return sigmoid(z)
}

let a = neuron(inputs: [1.0, -1.0], weights: [1.5, 0.5], bias: 0.0)
print(a)   // 0.73105...  (z = 1.5 - 0.5 + 0 = 1.0)
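
The Python counterpart is not reproduced in this text. A minimal sketch that mirrors the Swift version line for line, assuming the same function and parameter names, might look like this:

```python
import math

def sigmoid(z):
    # Squash any real number into the range between 0 and 1.
    return 1 / (1 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Start z at the bias, then add each weight * input term.
    z = bias
    for w, x in zip(weights, inputs):
        z += w * x
    return sigmoid(z)

a = neuron(inputs=[1.0, -1.0], weights=[1.5, 0.5], bias=0.0)
print(a)  # 0.7310585786300049  (z = 1.5 - 0.5 + 0 = 1.0)
```

Here zip pairs each weight with its input, which plays the role of the index loop in the Swift code.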

Same idea, two languages. The bias is just the starting value of z. And the printed 0.73105… is the same value the slider widget above shows for those same numbers, rounded to three decimals: 0.731 (z = 1.5·1 + 0.5·(−1) + 0 = 1.0, and σ(1.0) ≈ 0.7311, verified in Wolfram Alpha). The sliders, the code, and — next — the formula are all the same neuron.

The real formula — three ways to write it

You've now tuned a neuron by hand and run it in code. Time to name what you were doing. The neuron does three steps:

1. weighted sum:  w₁·x₁ + w₂·x₂ + … + wₙ·xₙ
2. add bias:  z = (weighted sum) + b
3. squash:  a = σ(z)

The bias b shifts the whole result up or down — it sets how eager the neuron is to fire at all. The activation function σ (here the sigmoid) squashes any number into a tidy range between 0 and 1, so the output is always a clean "how active." That's the entire neuron. Two operations you already know — multiply-and-add — followed by one squash.
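
To see the bias at work, here is a small Python sketch that keeps the inputs and weights fixed (the same values used in the code above) and changes only b. The three bias values are arbitrary picks for illustration.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

inputs = [1.0, -1.0]
weights = [1.5, 0.5]

# Step 1: weighted sum (1.5*1.0 + 0.5*(-1.0) = 1.0)
weighted_sum = sum(w * x for w, x in zip(weights, inputs))

activations = []
for b in [-3.0, 0.0, 3.0]:       # illustrative bias values
    z = weighted_sum + b         # Step 2: add bias
    a = sigmoid(z)               # Step 3: squash
    activations.append(a)
    print(f"b = {b:+.1f}  z = {z:+.1f}  a = {a:.3f}")
```

The activation rises as b rises even though the inputs never change: a negative bias makes the neuron reluctant to fire, a positive one makes it eager.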

Here is the part nobody tells beginners: the formula isn't one fixed thing. It's the same neuron written at three levels of shorthand. Start with the easy one. The other two just pack it tighter.

① The easy way — spelled out

For a neuron with 2 inputs, write every term by hand. Nothing hidden:

z = w₁·x₁ + w₂·x₂ + b
a = σ(z)

Read it left to right: take input one, multiply by its weight; take input two, multiply by its weight; add them; add the bias. That total is z. Then squash z with σ to get the activation a. If you only remember this one, you understand the neuron.

② The compact way — with Σ

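What if a neuron has 784 inputs (like a 28×28 image)? Nobody wants to write w₁·x₁ + … + w₇₈₄·x₇₈₄ by hand. So mathematicians use Σ (capital "sigma", the Greek letter S, for "sum") as shorthand for "loop and add". Don't confuse it with the lowercase σ used for the squashing step; they are the same Greek letter in two cases, doing two unrelated jobs.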

z = ( Σᵢ₌₁ⁿ wᵢ·xᵢ ) + b

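This is identical to ①; it's just a for loop on paper. Read the symbols: Σ means "add up every term that follows"; i is the loop counter; i = 1 is where the counter starts and n (the number of inputs) is where it stops; wᵢ·xᵢ is the term added on each pass, the i-th weight times the i-th input.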

So Σ wᵢxᵢ reads literally as: var sum = 0.0; for i in 0..<n { sum += w[i]*x[i] }. That's the whole mystery. (One small shift to notice: math counts inputs from 1, Swift arrays count from 0 — which is exactly why the real code above loops over 0..<inputs.count.)
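
The same loop can be sketched in Python for the 784-input case. The weights, inputs, and bias below are made-up random numbers, used only so the loop has something to add; they do not come from any real image or trained model.

```python
import random

n = 784                      # one input per pixel of a 28x28 image
random.seed(0)               # fixed seed so the run is repeatable
w = [random.uniform(-1, 1) for _ in range(n)]   # made-up weights
x = [random.random() for _ in range(n)]         # made-up inputs in [0, 1)
b = 0.1                                         # made-up bias

# Math counts i = 1..n; Python lists are indexed 0..n-1.
total = 0.0
for i in range(n):
    total += w[i] * x[i]
z = total + b

# The built-in sum() expresses the same "loop and add" in one line.
z_short = sum(w[i] * x[i] for i in range(n)) + b

print(abs(z - z_short) < 1e-9)
```

The two results agree up to floating-point rounding; the comparison uses a small tolerance rather than == because sum() may add the floats slightly more accurately than the hand-written loop.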

③ The popular way — vectors (what pros use)

In real ML papers and code you'll almost always see this shortest form:

a = σ( w · x + b )

Here w and x are bold because they're vectors — just lists of numbers, exactly like a Swift [Double]. The dot · between them is the dot product, which is a one-word name for "multiply the lists element-by-element and add it all up" — i.e. the exact same Σ from ②. It's popular because (a) it's short, and (b) computers run it blazingly fast on whole lists at once. Same neuron, fewer symbols.
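
As a rough check that form ③ really is form ① in disguise, this Python sketch computes the neuron both ways. The dot helper is a hand-written stand-in for illustration; numerical libraries such as NumPy offer the same operation as a single call.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def dot(w, x):
    # Dot product: multiply the lists element-by-element, then add it all up.
    return sum(wi * xi for wi, xi in zip(w, x))

w = [1.5, 0.5]
x = [1.0, -1.0]
b = 0.0

a_spelled_out = sigmoid(w[0] * x[0] + w[1] * x[1] + b)   # form 1
a_vector = sigmoid(dot(w, x) + b)                         # form 3

print(a_spelled_out, a_vector)
```

The vector form never mentions how many inputs there are, so the same line works for 2 inputs or 784.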

The one truly new piece: the sigmoid. All three forms end in σ(z), defined as
σ(z) = 1 / (1 + e^(−z))
e is just a fixed number (≈ 2.718, like π). You don't compute this by hand — you call exp(). All it does is bend any input into a smooth S-curve between 0 and 1: very negative → near 0, zero → exactly 0.5, very positive → near 1. That's why activations are always a clean "how active."
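
A quick way to convince yourself of that S-curve is to evaluate the sigmoid at a few points. The sample z values in this Python sketch are arbitrary.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

for z in [-10.0, -2.0, 0.0, 2.0, 10.0]:   # arbitrary sample points
    print(f"z = {z:+5.1f}   sigmoid(z) = {sigmoid(z):.5f}")

# The curve is symmetric around z = 0: sigmoid(-z) equals 1 - sigmoid(z).
print(abs(sigmoid(-2.0) - (1 - sigmoid(2.0))) < 1e-12)
```

The printed values climb from just above 0 to just below 1 and pass through 0.5 at z = 0.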

Bottom line: ①, ②, ③ are the same neuron. Keep ① in your head for understanding; recognise ② and ③ when you read other people's math and code.

Check yourself

No peeking back. Pull it from memory.

1. What does the weight on a connection represent?
2. What is the job of the activation function σ?
3. If z equals exactly zero, what is the sigmoid activation?

Watch this next

Primary source — the single best 20 minutes you can spend on this: "But what is a Neural Network?" by 3Blue1Brown. It animates exactly what you just tuned, then shows how thousands of these neurons stack into layers. That stacking is Lesson 2 (coming next).