Model Parameters = Words × Dimensions
Key Concepts
- Word: A symbolic label (token) such as cat, dog, or car.
- Dimension: The number of numeric values used to represent a word.
- Parameter: A single stored numeric value. Every number in an embedding vector is one parameter.
Why This Example Has Exactly 6 Parameters
- There are 3 words: cat, dog, and car.
- Each word has 2 dimensions.
- Total parameters = 3 × 2 = 6.
Where the 6 Parameters Actually Live
Feature Dimensions: [Object, Animal]
cat → [0, 1] → 2 Dimensions
dog → [0, 1] → 2 Dimensions
car → [1, 0] → 2 Dimensions
-------------------------------------
Total: → 3 × 2 = 6 Parameters
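The tally above can also be checked programmatically. A minimal sketch, using the same [Object, Animal] values as the table:

```python
import numpy as np

# Toy embedding table from the tally above: [Object, Animal] features.
embeddings = {
    "cat": np.array([0, 1]),
    "dog": np.array([0, 1]),
    "car": np.array([1, 0]),
}

# Every stored number is one parameter, so sum the sizes of all vectors.
total_parameters = sum(vec.size for vec in embeddings.values())
print(total_parameters)  # 3 words × 2 dimensions = 6
```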
+---------------------------------------+
| Vector DB: with Feature Dimensions |
+--------+--------+--------+------------+
| Word | Object | Animal | Dimensions |
+--------+--------+--------+------------+
| cat | 0 | 1 | 2 |
| dog | 0 | 1 | 2 |
| car | 1 | 0 | 2 |
+--------+--------+--------+------------+
| Total | 3 × 2 = 6 Parameters |
+--------+--------+--------+------------+
Bridging the example to real models:
+------------------+------------+------------+------------------+
| Model | Words | Dimensions | Parameters |
+------------------+------------+------------+------------------+
| This example | 3 | 2 | 6 |
| GPT-2 embeddings | 50,257 | 768 | 38,597,376 |
+------------------+------------+------------+------------------+
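The table rows follow from one multiplication: vocabulary size times dimensions per word. A small sketch (the helper name `embedding_parameters` is made up for illustration; the GPT-2 numbers are the ones in the table):

```python
# Parameter count = words (vocabulary size) × dimensions per word.
def embedding_parameters(vocab_size: int, dimensions: int) -> int:
    return vocab_size * dimensions

print(embedding_parameters(3, 2))         # this example: 6
print(embedding_parameters(50_257, 768))  # GPT-2 embeddings: 38,597,376
```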
Python Code Example
import numpy as np

embedding_table = {
    # --- Feature Dim: [Object, Animal] ---
    "cat": np.array([0, 1]),
    "dog": np.array([0, 1]),
    "car": np.array([1, 0]),
}

def embed(word):
    return embedding_table[word]

print(embed("cat"))  # [0 1]
print(embed("dog"))  # [0 1]
Final Summary
- Words are labels.
- Dimensions count the numbers in each word's vector.
- Parameters are the total stored numeric values.
- This example has 6 parameters.
How the Embedding Numbers Are Computed
Goal
- Show a simple, concrete way to compute the numbers in embedding vectors.
- Explain where values like [0, 1] and [1, 0] come from.
Important Note
- In real models, these numbers are learned during training.
- Here we use a toy mathematical rule to make the idea understandable.
Simple Rule to Compute Embedding Numbers
- Assign each word two numeric features:
  - Feature 1: how “object-like” the word is
  - Feature 2: how “animal-like” the word is
- Each feature is a number between 0.0 and 1.0.
- The two feature values together form a 2-dimensional embedding, ordered [Object, Animal] to match the table above.
Manual Feature Assignment
- cat: mostly animal → [0, 1]
- dog: mostly animal → [0, 1]
- car: mostly object → [1, 0]
Python Example: Computing the Numbers
def compute_embedding(object_score, animal_score):
    # Combine the two numeric features into a vector: [Object, Animal]
    return [object_score, animal_score]

cat_embedding = compute_embedding(0, 1)  # object = 0, animal = 1
dog_embedding = compute_embedding(0, 1)
print("cat:", cat_embedding)
print("dog:", dog_embedding)
What This Represents
- Each number is a parameter value.
- The vector length (2 numbers) is the dimension.
- The rule that assigns numbers is the model logic.
Connection to Real Models
- Real embedding models start with random numbers.
- Numbers are adjusted using training data and optimization.
- After training, vectors encode semantic meaning.
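That process can be sketched in a few lines. This is a toy illustration, not any real model's training loop: the target vector and learning rate are made up, and the update is a plain gradient step on a squared error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Real models start from random parameters.
embedding = rng.normal(size=2)

# Hypothetical training signal: "cat" should end up near [0, 1].
target = np.array([0.0, 1.0])

learning_rate = 0.5
for _ in range(50):
    error = embedding - target          # gradient of ½‖embedding − target‖²
    embedding -= learning_rate * error  # optimization step

print(np.round(embedding, 3))  # approaches [0. 1.]
```

After enough steps the randomly initialized vector converges to the target, which is the intuition behind "numbers are adjusted using training data and optimization."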
Final Summary
- [0, 1] and [1, 0] are examples of computed feature values.
- The computation rule is simple here for clarity.
- Real models learn these values automatically.