torch.profiler.record_function_async
function record_function_async<T>(name: string, fn: () => Promise<T>): Promise<T>
Records an asynchronous function execution with GPU synchronization.
Similar to record_function(), but for asynchronous operations. This function awaits
the provided async callback and then synchronizes with the GPU device before recording
the measurement, so the recorded time covers GPU work that the callback queued.
Essential for profiling GPU-accelerated code with accurate timing. Useful for:
- Profiling GPU-bound operations with accurate end-to-end timing
- Measuring async tensor computations on WebGPU
- Breaking down async workflows into named profiling sections
- Debugging GPU performance bottlenecks with sync guarantees
This ensures that GPU queue submissions and completions are properly reflected
in the timing. Unlike record_function(), this waits for GPU work to complete.
- GPU synchronization: Includes GPU sync for accurate WebGPU timing
- Awaits completion: Waits for async work to finish before returning
- Memory tracking: Tracks memory changes across the entire async operation
- Accurate GPU timing: Unlike sync version, captures actual GPU execution time
- GPU sync overhead: GPU synchronization adds latency for measurement
- Not for CPU-only code: Use record_function() if no GPU operations are involved
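The await-then-synchronize order described above can be illustrated with a minimal sketch. This is not the library's implementation: FakeDevice, recordFunctionAsyncSketch, ProfileEntry, and entries are hypothetical names introduced only for illustration, and the fake device uses timers in place of real GPU work. In WebGPU, the role of sync() would plausibly be played by GPUQueue.onSubmittedWorkDone(), though whether the library uses that call is an assumption.

```typescript
// Hypothetical sketch; none of these names come from the library itself.
interface ProfileEntry {
  name: string;
  durationMs: number;
}

// Stand-in for a GPU queue: work is "submitted" immediately and completes later.
class FakeDevice {
  private pending: Promise<void>[] = [];
  completed = 0;

  // Enqueue work that finishes after `ms` milliseconds without blocking the caller.
  submit(ms: number): void {
    this.pending.push(
      new Promise<void>((resolve) => {
        setTimeout(() => {
          this.completed += 1;
          resolve();
        }, ms);
      }),
    );
  }

  // Resolve once all previously submitted work has finished.
  async sync(): Promise<void> {
    const inFlight = this.pending;
    this.pending = [];
    await Promise.all(inFlight);
  }
}

// Collected measurements (a real profiler would own this state).
const entries: ProfileEntry[] = [];

async function recordFunctionAsyncSketch<T>(
  name: string,
  fn: () => Promise<T>,
  device: FakeDevice,
): Promise<T> {
  const start = performance.now();
  const result = await fn(); // 1. await the async callback
  await device.sync(); // 2. wait for queued device work to complete
  entries.push({ name, durationMs: performance.now() - start }); // 3. record
  return result; // the callback's value passes through unchanged
}
```

In this sketch, a callback that calls device.submit(20) and returns immediately still produces an entry whose duration includes the queued work, because the measurement is taken only after sync() resolves. Without step 2, only the time needed to enqueue the work would be measured.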
Parameters
name: string - Human-readable name of the operation being profiled
fn: () => Promise<T> - Async callback function to execute and measure
Returns
Promise<T> - Promise resolving to the return value of the callback function
Examples
// Basic GPU operation profiling
const profiler = new Profiler();
profiler.start();
await record_function_async('gpu_operation', async () => {
  const x = torch.randn([1000, 1000], { device: 'webgpu' });
  const y = torch.randn([1000, 1000], { device: 'webgpu' });
  return x.matmul(y);
});
// Get profiling results with GPU timing
await profiler.stop();
const stats = profiler.key_averages();
console.log(stats.table()); // Includes GPU timing
// Profile multiple async operations
await record_function_async('data_load', async () => loadDataAsync());
await record_function_async('model_inference', async () => model.forward(batch));
await record_function_async('post_process', async () => postProcessResults());
See Also
- PyTorch torch.profiler.record_function()
- record_function - For synchronous operations without GPU sync
- Profiler - Main profiler class for managing profiling sessions