CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers
Threadblock swizzling function for batched GEMMs.
#include <threadblock_swizzle.h>
| Public Member Functions | |
| CUTLASS_HOST_DEVICE GemmCoord | get_tiled_shape (GemmCoord problem_size, int batch_count, GemmCoord tile_size) const |
| Returns the shape of the problem in units of logical tiles. | |
| CUTLASS_HOST_DEVICE dim3 | get_grid_shape (GemmCoord tiled_shape) const |
| Computes CUDA grid dimensions given a size in units of logical tiles. | |
| CUTLASS_DEVICE GemmCoord | get_tile_offset () const |
| Obtains the threadblock offset (in units of threadblock-scoped tiles). | |
| CUTLASS_DEVICE int | get_batch_idx () const |
| Gets the batch index. | |
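The member functions above can be illustrated with a small host-only sketch. It is not the CUTLASS implementation: `Coord3`, `Dim3`, `ceil_div`, and `BatchedIdentitySwizzleSketch` are hypothetical stand-ins introduced here so that the example compiles without CUDA or the CUTLASS headers. The sketch assumes that the tiled shape is a ceiling division of the problem extents by the tile extents, that the third component of the tiled shape carries the batch count, and that the grid's z dimension indexes the batch. Because there is no `blockIdx` on the host, the two device-side functions take the block index as an explicit argument, which differs from the zero-argument signatures listed above.

```cpp
#include <cassert>

// Hypothetical stand-in for GemmCoord: extents along M, N, and K.
struct Coord3 {
  int m, n, k;
};

// Hypothetical stand-in for CUDA's dim3 so the sketch builds without nvcc.
struct Dim3 {
  unsigned x, y, z;
};

// Ceiling division: number of tiles of size `tile` needed to cover `extent`.
inline int ceil_div(int extent, int tile) {
  assert(extent >= 0 && tile > 0);
  return (extent + tile - 1) / tile;
}

// Sketch of a batched identity swizzle (assumed behavior, not library code).
struct BatchedIdentitySwizzleSketch {
  // Tiles along M and N; the third component is assumed to hold the batch count.
  Coord3 get_tiled_shape(Coord3 problem_size, int batch_count,
                         Coord3 tile_size) const {
    return Coord3{ceil_div(problem_size.m, tile_size.m),
                  ceil_div(problem_size.n, tile_size.n),
                  batch_count};
  }

  // One threadblock per logical tile; grid z is assumed to index the batch.
  Dim3 get_grid_shape(Coord3 tiled_shape) const {
    return Dim3{static_cast<unsigned>(tiled_shape.m),
                static_cast<unsigned>(tiled_shape.n),
                static_cast<unsigned>(tiled_shape.k)};
  }

  // Identity mapping: block (x, y) handles tile (x, y).
  // The block index is passed in explicitly because this runs on the host.
  Coord3 get_tile_offset(Dim3 block_idx) const {
    return Coord3{static_cast<int>(block_idx.x),
                  static_cast<int>(block_idx.y), 0};
  }

  // The batch index is taken from the z component of the block index.
  int get_batch_idx(Dim3 block_idx) const {
    return static_cast<int>(block_idx.z);
  }
};
```

Under these assumptions, a 300 x 200 problem with 128 x 128 tiles and a batch count of 4 yields a tiled shape of 3 x 2 x 4, and therefore a grid of 3 x 2 x 4 threadblocks, with each z slice of the grid working on one GEMM of the batch.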