CNNs revolutionised computer vision by learning to detect edges, shapes, and objects from raw pixels — no manual feature engineering needed.

A CNN doesn't look at the whole image at once. It uses **sliding filters (kernels)** to scan for local patterns.

**Key layers:**

- **Conv layer**: apply learnable filters → detects edges, textures

- **ReLU**: adds non-linearity (negative values → 0)

- **Pooling**: reduce spatial dimensions (max-pool takes the biggest value in a region)

- **Fully connected**: combine learned features → make prediction

**The magic:** Early layers detect simple patterns (edges). Later layers combine them into complex objects (faces, cars). This hierarchical feature learning is why CNNs are so powerful for vision.

Convolutional Neural Networks (CNNs) Explained