Everything in AI starts with numbers. Text, images, audio — it all gets converted into vectors (lists of numbers) before a model can process it. Embeddings are how we make that conversion meaningful.
An embedding is a high-dimensional vector representation where **similar things end up close together**. 'Dog' and 'puppy' will have similar vectors. 'Dog' and 'spreadsheet' will be far apart.
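"Close together" is usually measured with cosine similarity — the cosine of the angle between two vectors, where 1.0 means "pointing the same direction." Here is a minimal sketch using tiny hand-made 4-dimensional vectors (real embeddings have hundreds or thousands of dimensions, and the values below are invented purely for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-D "embeddings" — made up for illustration only.
dog         = [0.90, 0.80, 0.10, 0.00]
puppy       = [0.85, 0.90, 0.15, 0.05]
spreadsheet = [0.00, 0.10, 0.90, 0.80]

print(cosine_similarity(dog, puppy))        # close to 1.0
print(cosine_similarity(dog, spreadsheet))  # much smaller
```

Same idea at production scale: embed everything with the same model, then rank by cosine similarity (or plain dot product, if the vectors are normalized).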
The magic: relationships are preserved in the math. The famous example:
`king − man + woman ≈ queen`
This works because, during training on billions of text examples, the model learned concepts like gender and royalty as consistent *directions* in its vector space — so moving along the "gender" direction from 'king' lands you near 'queen'.
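The arithmetic can be sketched with a toy 2-D space where one axis stands in for "royalty" and the other for "gender" (hand-crafted for illustration — in real models these directions are learned and spread across many dimensions):

```python
# Toy 2-D vectors: dimension 0 ≈ "royalty", dimension 1 ≈ "gender".
# Hand-crafted for illustration; real learned directions are not axes.
king  = [1.0,  1.0]
man   = [0.0,  1.0]
woman = [0.0, -1.0]
queen = [1.0, -1.0]

# king − man + woman, element-wise
result = [k - m + w for k, m, w in zip(king, man, woman)]
print(result)  # [1.0, -1.0] — the "queen" vector in this toy space
```

Subtracting 'man' removes the male-gender component while keeping royalty; adding 'woman' puts the female-gender component back in.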
**Where embeddings are used:**
- Semantic search: find documents by meaning, not just keywords
- RAG systems: convert documents to vectors for retrieval
- Recommendation engines: find similar items
- Clustering: automatically group related content
- Anomaly detection: outliers in vector space are unusual
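The first two use cases above reduce to the same loop: embed the query, then rank stored document vectors by similarity. A minimal sketch, using invented 3-D vectors in place of real model output:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

# Hypothetical pre-computed document embeddings (toy 3-D values).
docs = {
    "how to train a dog":     [0.90, 0.10, 0.00],
    "puppy obedience tips":   [0.70, 0.30, 0.20],
    "quarterly budget sheet": [0.00, 0.10, 0.90],
}

# Pretend this is the embedding of the query "dog training".
query = [0.85, 0.15, 0.05]

# Rank documents by semantic similarity to the query.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked[0])  # "how to train a dog"
```

Note that "quarterly budget sheet" ranks last even though no keywords are compared — the ranking is purely geometric.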
Modern embedding models produce vectors with hundreds to a few thousand dimensions — OpenAI's text-embedding-3 models output 1,536 or 3,072, while Google's Gecko uses 768. Each dimension loosely captures some aspect of meaning, though individual dimensions are not human-interpretable.
Vector databases (Pinecone, Weaviate, Qdrant) store millions of these vectors and do **approximate nearest-neighbor search** in milliseconds — finding the 10 most semantically similar items in a library of 10 million.
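What ANN indexes (HNSW, IVF, and friends) approximate is exact brute-force nearest-neighbor search, which is easy to write but O(n) per query. A sketch of the exact version over random stand-in vectors, to make the target of the approximation concrete:

```python
import heapq
import math
import random

random.seed(0)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# 10,000 random 32-D vectors standing in for a vector database.
library = [[random.gauss(0, 1) for _ in range(32)] for _ in range(10_000)]
query = [random.gauss(0, 1) for _ in range(32)]

# Exact top-10 by scanning every vector — O(n) per query.
# ANN indexes return near-identical results in sub-linear time.
top10 = heapq.nlargest(10, range(len(library)),
                       key=lambda i: cosine(query, library[i]))
print(top10[:3])  # indices of the three closest vectors
```

At 10 million vectors the full scan becomes too slow per query, which is exactly the gap the approximate indexes in these databases close.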
**Key takeaway:** Embeddings turn text into numbers that preserve meaning — the foundation of semantic search, RAG, and recommendation systems.