torchvision
- How is the C++ code compiled, and how is it loaded in PyTorch?
`from torchvision.io import read_file` reads a binary file and returns
a 1-D tensor of dtype `torch.uint8`.
There is also `write_file`, which writes a 1-D tensor of dtype `torch.uint8` to a file.
`from torchvision.io import read_image` reads an image. Supported formats are
JPEG, PNG, and GIF.
Color images are represented as a uint8 tensor of shape (channels, height, width).
`transforms.v2._utils` contains helper functions for checking and parsing sizes, paddings, and fill values.
Use `v2.ToDtype(torch.float32, scale=True)` to convert an image to float32 and rescale its values to [0, 1].
- `RandomAffine` calls `F.affine`. `RandomAffine` is in `transforms/v2/_geometry.py`. An affine transform has 4 parameters: angle (for rotation), translate, scale, and shear. `RandomAffine._get_params()` just returns these 4 parameters. Note that we can also call the static method `RandomAffine.get_params()`, which is defined in `transforms/transforms.py`.
- `F.affine()` is defined in `transforms/v2/functional/_geometry.py`; the kernel for image tensors is `affine_image`. `_get_inverse_affine_matrix()` is used to get the matrix for the affine transform. Inside the kernel, `_affine_grid` builds a sampling grid from that matrix, and `_apply_grid_transform` then uses the grid to transform the input.
TODO: look at torch.nn.functional.affine_grid() and grid_sample(). See also https://github.com/wuneng/WarpAffine2GridSample/blob/master/main.py#L56
Deep Learning Paper Implementations: Spatial Transformer Networks - Part I
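As a starting point for the TODO on `affine_grid()` and `grid_sample()`, here is a minimal sketch with two hand-written `theta` matrices (identity and horizontal flip). The 4x4 input is made up, and this is not the code path torchvision itself uses.

```python
import torch
import torch.nn.functional as F

# Batch of one single-channel 4x4 image, shape (N, C, H, W).
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

# theta maps normalized OUTPUT coordinates to normalized INPUT coordinates.
identity = torch.tensor([[[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0]]])
hflip = torch.tensor([[[-1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]]])

def warp(inp, theta):
    # Build the sampling grid, shape (N, H, W, 2), then resample the input.
    grid = F.affine_grid(theta, inp.shape, align_corners=False)
    return F.grid_sample(inp, grid, mode="bilinear",
                         padding_mode="zeros", align_corners=False)

same = warp(x, identity)   # should reproduce x
flipped = warp(x, hflip)   # should mirror x left to right

print(flipped[0, 0])
```

Because `theta` describes where each output pixel samples from, it plays the role of the inverse transform, which is consistent with torchvision computing an inverse affine matrix.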