Before you optimize software, you need to know what hardware is doing. Today: ALU, registers, control unit, and the fetch-decode-execute cycle.
The CPU has four core components: the ALU (does arithmetic and logic), registers (tiny, ultra-fast storage inside the core), the control unit (decodes instructions, coordinates execution), and the clock (synchronizes everything). Modern x86-64 CPUs have 16 general-purpose 64-bit registers (RAX, RBX, RCX, and so on up to R15). Reading a register costs about one clock cycle, which makes registers the fastest storage in the machine, well ahead of cache and main memory.
The ISA (Instruction Set Architecture) is the contract between software and hardware: a list of every instruction the CPU understands, how each one is encoded in binary, and what it does. Two ISAs dominate: x86-64 (Intel/AMD, used in desktops and servers) and ARM (Apple Silicon, mobile). Every piece of code you write eventually becomes a sequence of ISA instructions.
Every CPU runs one loop forever: Fetch the next instruction from memory (at the address held in the Program Counter register), Decode it to determine the operation and operands, Execute it (in the ALU for arithmetic and logic operations), Writeback the result to a register or memory, then advance the PC. Modern CPUs pipeline and parallelize these stages, but the logical model is always this loop.
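; x86-64 assembly (NASM syntax): the kind of instructions your code compiles to
; Hand-written for illustration. Real gcc -O0 -S hello.c output uses AT&T syntax by default and includes more boilerplate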
section .text
global _start
_start:
mov rax, 10 ; load 10 into register RAX (1 cycle)
mov rbx, 32 ; load 32 into register RBX (1 cycle)
add rax, rbx ; RAX = RAX + RBX = 42 (1 cycle)
cmp rax, 42 ; set flags: ZF=1 if equal
je done ; jump if ZF=1 (taken here, since RAX is 42)
done:
mov rdi, 0 ; exit code 0
mov rax, 60 ; syscall number: exit
syscall
Run gcc -O0 -S file.c to see the assembly. Every C statement becomes a handful of mov/add/cmp instructions. This is what the CPU actually runs.
Try it yourself:
1. Write a small C program: int a=5,b=7,c=a+b; printf("%d\n",c);
2. Run gcc -O0 -S program.c and open the .s output file.
3. Find the add instruction. What registers hold a and b?
4. Recompile with -O2 and compare. The optimizer often reduces 10 instructions to 3.
Open godbolt.org (Compiler Explorer), paste a C function, and watch the assembly update in real time. Try optimizing the C code by hand: remove a branch, simplify math. See how each change affects the assembly output.