torch.profiler.profile
function profile(options: ProfilerOptions = {}): ProfilerContext

Creates a profiler context for measuring neural network performance.
Enables profiling of tensor operations to identify performance bottlenecks. Records timing information for forward/backward passes and tracks kernel execution. Useful for:
- Performance debugging: Finding slow layers and operations
- Training optimization: Identifying which operations consume most time
- Memory analysis: Tracking memory usage by operation
- Profiling models: Understanding computational bottlenecks
- Performance tuning: Comparing different implementations
The profiler captures timing for all tensor operations within the context, including GPU kernel execution, memory allocation, and data transfers. Use as a context manager to enable/disable profiling automatically.
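The enable/record/disable pattern described above can be sketched in plain JavaScript. This is an illustration of the pattern only, not the library's implementation; the `MiniProfiler` class and its methods are hypothetical:

```javascript
// Minimal sketch of the enable/record/report cycle a profiler context follows.
// Illustrative only -- the real profiler hooks into tensor operations automatically.
class MiniProfiler {
  constructor() {
    this.records = [];
    this.enabled = false;
  }
  enable() { this.enabled = true; }
  disable() { this.enabled = false; }
  // Run fn, recording its wall-clock time only while profiling is enabled.
  record(name, fn) {
    if (!this.enabled) return fn();
    const t0 = performance.now();
    const result = fn();
    this.records.push({ name, ms: performance.now() - t0 });
    return result;
  }
  // Format recorded timings, one operation per line.
  table() {
    return this.records
      .map(r => `${r.name}: ${r.ms.toFixed(3)} ms`)
      .join("\n");
  }
}

const prof = new MiniProfiler();
prof.enable();
prof.record("matmul", () => {
  let s = 0;
  for (let i = 0; i < 1e6; i++) s += i;
  return s;
});
prof.disable();
console.log(prof.table());
```

The real profiler records operations implicitly within the context rather than requiring an explicit wrapper call, but the lifecycle (enable, capture timings, disable, report) is the same.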
Notes
- GPU profiling: WebGPU profiling is limited; use it for relative comparisons
- Overhead: Profiling adds overhead; disable for production training
- Context manager: Can be used with enable()/disable() or start()/stop()
- Performance impact: Profiling slows down execution significantly
- Memory usage: Stores timing data for all operations (can be large)
- GPU limitations: Some WebGPU metrics may not be available
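To gauge the overhead noted above for your own workload, time the same function with and without per-call instrumentation. This plain-JavaScript sketch simulates the bookkeeping a profiler adds; the workload and repetition count are arbitrary:

```javascript
// Rough overhead estimate: compare a bare workload against the same
// workload wrapped in timestamp bookkeeping (illustrative only).
function workload() {
  let s = 0;
  for (let i = 0; i < 1e5; i++) s += Math.sqrt(i);
  return s;
}

function timeIt(fn, reps) {
  const t0 = performance.now();
  for (let i = 0; i < reps; i++) fn();
  return performance.now() - t0;
}

const plain = timeIt(workload, 50);

// Simulated instrumentation: record a timestamp pair around every call.
const records = [];
const instrumented = timeIt(() => {
  const t0 = performance.now();
  workload();
  records.push(performance.now() - t0);
}, 50);

console.log(`overhead: ${(instrumented / plain).toFixed(2)}x`);
```

If the ratio is far above 1x for tiny operations, prefer profiling larger batches or fewer iterations.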
Parameters
options ProfilerOptions optional – Profiler configuration:
- use_cuda: Enable CUDA-specific profiling (no-op for WebGPU)
- use_cpu: Enable CPU profiling
- use_kineto: Enable the Kineto backend (no-op for WebGPU)
- record_shapes: Record tensor shapes for each operation
- with_stack: Include stack traces (limited in JavaScript)
- with_flops: Estimate FLOPs for operations (approximate)
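For reference, the documented options can be collected in a single literal. The values shown are illustrative choices for CPU-side profiling, not the library's defaults:

```javascript
// Illustrative ProfilerOptions object; field names are from the table above.
const options = {
  use_cpu: true,        // time CPU-side tensor operations
  use_cuda: false,      // no-op on WebGPU backends
  use_kineto: false,    // no-op on WebGPU backends
  record_shapes: true,  // attach input shapes to each recorded op
  with_stack: false,    // stack traces are limited in JavaScript
  with_flops: false,    // FLOP estimates are approximate
};
```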
Returns
ProfilerContext – ProfilerContext object to use with the context-manager pattern
Examples
// Profile a forward pass
const profiler = torch.profiler.profile({ use_cpu: true });
profiler.enable();
model.forward(x).backward();
profiler.disable();
console.log(profiler.table()); // Print timing summary

// Context manager pattern (recommended)
const profiler = torch.profiler.profile({ record_shapes: true });
profiler.start();
// Operations here are profiled
const output = model.forward(x);
output.sum().backward();
profiler.stop();
console.log(profiler.key_averages());

// Compare layer performance
for (const layer of model.layers) {
  const profiler = torch.profiler.profile({ use_cpu: true });
  profiler.start();
  layer.forward(x);
  profiler.stop();
  console.log(`Layer ${layer.name}: ${profiler.total_time()}ms`);
}
See Also
- PyTorch torch.autograd.profiler.profile()
- emit_nvtx - NVIDIA-specific profiling markers
- ProfilerContext - The returned profiler object