torch.Tensor.addcmul
Tensor.addcmul<T1 extends Shape, T2 extends Shape>(tensor1: Tensor<T1>, tensor2: Tensor<T2>, options?: ValueOptions): Tensor<DynamicShape, D, Dev>
Tensor.addcmul<T1 extends Shape, T2 extends Shape>(tensor1: Tensor<T1>, tensor2: Tensor<T2>, value: number, options?: ValueOptions): Tensor<DynamicShape, D, Dev>
Performs out = self + (tensor1 * tensor2) * value element-wise.
Computes a scaled element-wise product of two tensors and adds it to this tensor. Useful for:
- Weighted updates in gradient descent
- Attention-like operations (element-wise scaling)
- Masked element-wise operations
- Feature scaling and normalization
Notes:
- All three tensors must have broadcastable shapes
- Fuses a multiply and an add into a single, more efficient operation (see the sketch below)
- Counterpart of addcdiv, which divides instead of multiplying
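A minimal equivalence sketch, assuming mul and add (listed under See Also) are the usual element-wise operations and that mul also accepts a plain number:
// Fused: a + (t1 * t2) * 0.5
const a = torch.tensor([1.0, 2.0]);
const t1 = torch.tensor([3.0, 4.0]);
const t2 = torch.tensor([5.0, 6.0]);
const fused = a.addcmul(t1, t2, 0.5); // [8.5, 14.0]
// Unfused equivalent, at the cost of intermediate tensors:
const unfused = a.add(t1.mul(t2).mul(0.5)); // [8.5, 14.0]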
Parameters
tensor1: Tensor<T1> - First tensor (will be multiplied)
tensor2: Tensor<T2> - Second tensor (will be multiplied)
value: number - Scalar multiplier applied to the element-wise product (second overload)
options: ValueOptions - optional
Returns
Tensor<DynamicShape, D, Dev> – Tensor with out = self + (tensor1 * tensor2) * value
Examples
// Element-wise scaled addition
const params = torch.tensor([1.0, 2.0, 3.0]);
const grad = torch.tensor([0.1, 0.2, 0.3]);
const lr = 0.01;
const updated = params.addcmul(grad, grad.neg(), lr);
// updated = params + (grad * -grad) * lr; params itself is not modified in place
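The first overload omits value; a sketch assuming it then defaults to 1, as in PyTorch's torch.addcmul:
// Assumption: omitting value behaves like value = 1
const unscaled = params.addcmul(grad, grad.neg());
// unscaled = params + grad * (-grad) * 1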
// Attention-like masking
const values = torch.tensor([1, 2, 3, 4]);
const mask = torch.tensor([1, 0, 1, 0]); // 1 at even indices, 0 elsewhere
const scale = torch.tensor([2, 2, 2, 2]);
values.addcmul(mask.float(), scale); // [1+2, 2+0, 3+2, 4+0] = [3, 2, 5, 4]
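Because the operands only need broadcastable shapes (see Notes above), tensor1 and tensor2 may have lower rank than this tensor. A sketch assuming standard PyTorch-style broadcasting and that torch.tensor accepts nested arrays:
const base = torch.zeros(2, 3);
const rows = torch.tensor([[1.0], [2.0]]); // shape (2, 1)
const cols = torch.tensor([[10.0, 20.0, 30.0]]); // shape (1, 3)
// rows * cols broadcasts to (2, 3) before being added to base
base.addcmul(rows, cols); // [[10, 20, 30], [20, 40, 60]]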
// Feature scaling
const features = torch.randn(32, 64);
const scales = torch.ones(32, 1);
const biases = torch.zeros(32, 1);
features.addcmul(scales, torch.ones_like(features), 0.1); // features + 0.1 * scales, broadcast across the 64 columns
See Also
- PyTorch torch.addcmul()
- addcdiv - Similar but uses division instead of multiplication
- mul - Element-wise multiplication
- add - Element-wise addition