torch.nn.CosineSimilarity
class CosineSimilarity extends Module

new CosineSimilarity(options?: CosineSimilarityOptions)

Properties:
- dim (number) - readonly
- eps (number)
Cosine similarity: measures angular distance between vectors, ignoring magnitude.
Computes the cosine of the angle between two vectors, producing values in [-1, 1]. Measures how aligned two vectors are: 1 = same direction, 0 = orthogonal, -1 = opposite. Unlike Euclidean distance, cosine similarity ignores vector magnitude and only looks at direction. Essential for:
- Semantic similarity in NLP (word embeddings, sentence representations)
- Information retrieval (document similarity)
- Recommendation systems (user/item similarity)
- Face verification (embedding similarity)
- Clustering with direction-based metrics
Cosine similarity is widely used in machine learning because it's scale-invariant and focuses on direction rather than magnitude. Two vectors of different scales but same direction have cosine similarity of 1.0, while Euclidean distance depends heavily on magnitude.
When to use CosineSimilarity:
- Comparing high-dimensional embeddings (text, images, audio)
- NLP tasks (semantic similarity, retrieval)
- Face/person recognition (normalized embeddings)
- Any task where direction matters more than magnitude
- Batch processing of multiple vector pairs
Compared to other metrics:
- vs Euclidean/L2: Cosine ignores magnitude; L2 sensitive to scale
- vs L1/Manhattan: L1 sums absolute differences; cosine uses angles
- vs dot product: Similar but normalized to [-1, 1] range
- Interpretation: Cosine similarity directly interpretable as angular similarity
Mathematical properties:
- Range: [-1, 1] (or [0, 1] if vectors have non-negative components)
- Symmetric: cos_sim(x, y) = cos_sim(y, x)
- Scale invariant: cos_sim(αx, y) = cos_sim(x, y) for α > 0
- Not a metric: Doesn't satisfy triangle inequality
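The symmetry and scale-invariance properties above can be checked numerically. Below is a plain-TypeScript sketch; the `cosineSim` helper is illustrative only and not part of the torch API:

```typescript
// Illustrative helper, not the torch API: cosine similarity of two plain arrays.
function cosineSim(a: number[], b: number[]): number {
  const dot = a.reduce((s, v, i) => s + v * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

const x = [1, 2, 3];
const y = [4, -1, 2];

// Symmetric: cos_sim(x, y) = cos_sim(y, x)
console.log(cosineSim(x, y) === cosineSim(y, x)); // true

// Scale invariant: scaling x by α = 3 leaves the similarity unchanged
const scaled = x.map((v) => 3 * v);
console.log(Math.abs(cosineSim(scaled, y) - cosineSim(x, y)) < 1e-12); // true
```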
Input shape expectations:
- Both inputs must have same shape
- Similarity computed along specified dimension
- Typically: (batch_size, feature_dim) for batch processing
CosineSimilarity computation: For vectors x1 and x2 along dimension dim:
- Compute dot product: x1 · x2 (sum of element-wise products)
- Compute norms: ||x1|| = sqrt(sum(x1²)), ||x2|| = sqrt(sum(x2²))
- Normalize: cos_sim = (x1 · x2) / (||x1|| × ||x2||)
- Clamp norms to prevent division by zero
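The steps above can be sketched in plain TypeScript (an illustrative stand-in for one vector pair, not the torch implementation), including the eps clamp from the final step:

```typescript
// Illustrative sketch of the computation, not the torch implementation.
function cosineSimilarity(x1: number[], x2: number[], eps = 1e-8): number {
  // 1. Dot product: sum of element-wise products
  const dot = x1.reduce((s, v, i) => s + v * x2[i], 0);
  // 2. L2 norms: ||x|| = sqrt(sum(x²))
  const norm1 = Math.sqrt(x1.reduce((s, v) => s + v * v, 0));
  const norm2 = Math.sqrt(x2.reduce((s, v) => s + v * v, 0));
  // 3-4. Normalize, clamping the denominator with eps to avoid division by zero
  return dot / Math.max(norm1 * norm2, eps);
}

console.log(cosineSimilarity([1, 2, 3], [2, 4, 6])); // ≈ 1.0 (same direction)
console.log(cosineSimilarity([1, 0], [0, 1]));       // 0 (orthogonal)
console.log(cosineSimilarity([1, 0], [-1, 0]));      // -1 (opposite)
```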
Key characteristics:
- Scale invariant: Doubling vector magnitude doesn't change similarity
- Direction focused: Only cares about angle, not magnitude
- Normalized output: Always in [-1, 1] range (interpretable as confidence)
- Symmetric: Cos(x, y) = Cos(y, x)
- Not a metric: Doesn't satisfy triangle inequality (not a true distance)
- Efficient: Faster than many other similarity metrics
Caveats:
- Dimension ordering: Similarity is reduced along the specified dimension; the other dimensions are preserved in the output
- Same shape required: Both input tensors must have identical shape
- Zero vector handling: Similarity undefined for zero vectors, eps prevents NaN
- Numerical stability: Very small vectors may give unstable results
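The zero-vector caveat can be seen directly: without the eps clamp the division yields NaN. A plain-TypeScript sketch (the `dot`/`norm` helpers are illustrative, not the torch API):

```typescript
// Illustrative helpers, not the torch API.
const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
const dot = (a: number[], b: number[]) => a.reduce((s, v, i) => s + v * b[i], 0);

const zero = [0, 0, 0];
const x = [1, 2, 3];

// Naive division: 0 / 0 = NaN
console.log(dot(zero, x) / (norm(zero) * norm(x))); // NaN

// With the eps clamp used by CosineSimilarity: a well-defined 0
const eps = 1e-8;
console.log(dot(zero, x) / Math.max(norm(zero) * norm(x), eps)); // 0
```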
Examples
// Basic cosine similarity
const cos_sim = new torch.nn.CosineSimilarity({ dim: 1, eps: 1e-8 });
const x1 = torch.tensor([[1, 2, 3]]);
const x2 = torch.tensor([[2, 4, 6]]);
const similarity = cos_sim.forward(x1, x2);
// similarity ≈ 1.0 (vectors point in same direction, even though x2 is scaled)

// Batch similarity computation (text embeddings)
const cos_sim = new torch.nn.CosineSimilarity({ dim: 1 });
const query_embeddings = torch.randn([32, 768]); // 32 BERT embeddings
const doc_embeddings = torch.randn([32, 768]); // 32 document embeddings
const similarities = cos_sim.forward(query_embeddings, doc_embeddings);
// Shape: [32] - similarity between each query and corresponding document

// Find most similar embedding from a set
const cos_sim = new torch.nn.CosineSimilarity({ dim: 1 });
const query = torch.randn([1, 512]); // Single query embedding
const database = torch.randn([1000, 512]); // 1000 document embeddings
// Expand query to match database size
const query_expanded = query.expand([1000, 512]);
const similarities = cos_sim.forward(query_expanded, database);
// similarities shape: [1000] - similarity to each document

// Face verification with embeddings
class FaceVerifier {
  cos_sim: torch.nn.CosineSimilarity;

  constructor() {
    this.cos_sim = new torch.nn.CosineSimilarity({ dim: 1 });
  }

  verify(embedding1: torch.Tensor, embedding2: torch.Tensor): boolean {
    const similarity = this.cos_sim.forward(embedding1, embedding2);
    const threshold = 0.6;
    return similarity.item() > threshold; // Same person if similarity > 0.6
  }
}