medusa.io

Module with functionality (mostly) for working with video data.

The VideoLoader class allows for easy looping over frames of a video file, which is used in the reconstruction process (e.g., in the videorecon function).

Module Contents

class medusa.io.VideoLoader(video_path, dataset_type='iterable', batch_size=32, device=DEVICE, frames=None, **kwargs)

Contains (meta)data and functionality associated with video files (currently mp4 files only).

Parameters:
  • video_path (str, Path) – Path to mp4 file

  • dataset_type (str) – One of ‘iterable’, ‘map’, or ‘subset’. If ‘iterable’, batches are loaded sequentially from the video file. If ‘map’, frames can be loaded in any order. If ‘subset’, only the frames listed in the ‘frames’ argument are loaded.

  • batch_size (int) – Batch size to use when loading frames

  • device (str) – Either ‘cpu’ or ‘cuda’

  • frames (list) – List of frame indices to load; only relevant when dataset_type is ‘subset’

  • **kwargs – Extra keyword arguments passed to the initialization of the parent class
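To make the difference between sequential batching and the ‘subset’ mode concrete, the following is a minimal sketch of how frame indices could be grouped into batches. It is not medusa's implementation; the helper name `batch_frame_indices` and its exact behavior are assumptions made for illustration only.

```python
def batch_frame_indices(n_frames, batch_size=32, frames=None):
    """Yield lists of frame indices, one list per batch.

    Hypothetical helper (not part of medusa): with frames=None every frame
    is visited in order, as in 'iterable' mode; passing a list mimics the
    'subset' mode by visiting only those indices.
    """
    indices = list(range(n_frames)) if frames is None else list(frames)
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]


# 70 frames in batches of 32: the last batch holds the remainder
sizes = [len(b) for b in batch_frame_indices(70, batch_size=32)]

# Only three selected frames: they fit in a single batch
subset = list(batch_frame_indices(70, batch_size=32, frames=[0, 10, 69]))
```

Note that the final batch is usually smaller than batch_size, so downstream code should not assume a fixed batch dimension.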

get_metadata()

Returns all (meta)data needed for initialization of a Data object.

Returns:

A dictionary with keys “img_size” (image size of frames), “n_img” (total number of frames), and “fps” (frames-per-second)

close()

Closes the opencv videoloader in the underlying pytorch Dataset.

crop(batch, crop_params)

Crops an image batch.

Parameters:
  • batch (torch.tensor) – A B x H x W x 3 tensor with image data

  • crop_params (tuple) – A tuple of (x0, y0, x1, y1) crop parameters
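As an illustration of the crop parameters, the sketch below slices a B x H x W x 3 batch with a (x0, y0, x1, y1) box. It uses numpy rather than torch to stay lightweight (the slicing syntax is the same for torch tensors). The function name `crop_batch` is hypothetical, and the sketch assumes that x indexes the width axis, y indexes the height axis, and that x1 and y1 are exclusive; the source does not state these conventions.

```python
import numpy as np


def crop_batch(batch, crop_params):
    """Crop a B x H x W x 3 batch to the box (x0, y0, x1, y1).

    Assumption: x indexes width, y indexes height, and x1/y1 are exclusive.
    """
    x0, y0, x1, y1 = crop_params
    # Height is axis 1 and width is axis 2 in a B x H x W x 3 layout
    return batch[:, y0:y1, x0:x1, :]


batch = np.zeros((4, 384, 480, 3), dtype=np.uint8)
cropped = crop_batch(batch, (100, 50, 300, 250))
print(cropped.shape)  # (4, 200, 200, 3)
```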

class medusa.io.VideoIterableDataset(video_path)

A pytorch Dataset class based on loading frames from a single video.

Parameters:
  • video_path (Path, str) – A video file (any format that pyav can handle)

  • device (str) – Either ‘cuda’ (for GPU) or ‘cpu’

class medusa.io.VideoMapDataset(video_path)

A pytorch Dataset class based on loading frames from a single video.

Parameters:
  • video_path (Path, str) – A video file (any format that pyav can handle)

  • device (str) – Either ‘cuda’ (for GPU) or ‘cpu’

class medusa.io.VideoWriter(path, fps, codec='libx264', pix_fmt='yuv420p', size=None)

A PyAV-based images-to-video writer.

Parameters:
  • path (str, Path) – Output path (including extension)

  • fps (float, int) – Frames per second of output video; if float, it’s rounded and cast to int

  • codec (str) – Video codec to use (e.g., ‘mpeg4’, ‘libx264’, ‘h264’)

  • pix_fmt (str) – Pixel format; should be compatible with codec

  • size (tuple[int]) – Desired output size of video (if None, it will be set the first time a frame is written)

write(imgs)

Writes one or more images to the video stream.

Parameters:

imgs (array_like) – A torch tensor or numpy array with image data; can be a single image or batch of images

close()

Closes the video stream.

medusa.io.load_inputs(inputs, load_as='torch', channels_first=True, with_batch_dim=True, dtype='float32', device=DEVICE)

Generic image loader function, which also performs some basic preprocessing and checks. Is used internally for detection, crop, and reconstruction models.

Parameters:
  • inputs (str, Path, iterable, array_like) – String or Path to a single image or an iterable (list, tuple) with multiple image paths, or a numpy array or torch Tensor with already loaded images (in which the first dimension represents the number of images)

  • load_as (str) – Either ‘torch’ (returns torch Tensor) or ‘numpy’ (returns numpy ndarray)

  • to_bgr (bool) – Whether the color channel is ordered BGR (True) or RGB (False); only works when inputs are image path(s)

  • channels_first (bool) – Whether the data is ordered as (batch_size, 3, h, w) (True) or (batch_size, h, w, 3) (False)

  • with_batch_dim (bool) – Whether a singleton batch dimension should be added if there’s only a single image

  • dtype (str) – Data type to be used for loaded images (e.g., ‘float32’, ‘float64’, ‘uint8’)

  • device (str) – Either ‘cuda’ (for GPU) or ‘cpu’; ignored when load_as='numpy'

Returns:

imgs – Images loaded in memory; object depends on the load_as parameter

Return type:

np.ndarray, torch.tensor
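The effect of channels_first, with_batch_dim, and dtype on the output layout can be sketched with plain numpy. This is not how load_inputs is implemented; `to_layout` is a hypothetical helper that only mirrors the documented parameter semantics for a single, already loaded H x W x 3 image.

```python
import numpy as np


def to_layout(img, channels_first=True, with_batch_dim=True, dtype="float32"):
    """Rearrange a single H x W x 3 image (hypothetical helper)."""
    out = np.asarray(img).astype(dtype)
    if channels_first:
        out = out.transpose(2, 0, 1)  # (H, W, 3) -> (3, H, W)
    if with_batch_dim:
        out = out[None, ...]  # prepend a singleton batch dimension
    return out


img = np.zeros((384, 480, 3), dtype=np.uint8)
a = to_layout(img)                        # batch dim, channels first
b = to_layout(img, channels_first=False)  # batch dim, channels last
c = to_layout(img, with_batch_dim=False)  # no batch dim, channels first
```

The resulting shapes follow the same pattern as the examples in this section: (1, 3, 384, 480), (1, 384, 480, 3), and (3, 384, 480).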

Examples

Load a single image as a torch Tensor:

>>> from medusa.io import load_inputs
>>> from medusa.data import get_example_image
>>> path = get_example_image()
>>> img = load_inputs(path, device='cpu')
>>> img.shape
torch.Size([1, 3, 384, 480])

Or as a numpy array (without batch dimension):

>>> img = load_inputs(path, load_as='numpy', with_batch_dim=False)
>>> img.shape
(3, 384, 480)

Putting the channel dimension last:

>>> img = load_inputs(path, load_as='numpy', channels_first=False)
>>> img.shape
(1, 384, 480, 3)

Setting the data type to uint8 instead of float32:

>>> img = load_inputs(path, load_as='torch', dtype='uint8', device='cpu')
>>> img.dtype
torch.uint8

Loading in a list of images:

>>> img = load_inputs([path, path], load_as='numpy')
>>> img.shape
(2, 3, 384, 480)

medusa.io.download_file(url, f_out, data=None, verify=True, overwrite=False, cmd_type='post')

Downloads a file using requests. Used internally to download external data.

Parameters:
  • url (str) – URL of file to download

  • f_out (Path) – Where to save the downloaded file

  • data (dict) – Extra data to pass to post request

  • verify (bool) – Whether to verify the request

  • overwrite (bool) – Whether to overwrite the file when it already exists

  • cmd_type (str) – Either ‘get’ or ‘post’
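The overwrite flag suggests that an existing file is left untouched unless overwrite=True. A minimal sketch of that logic is shown below, under the assumption that the download is skipped when the target exists. To keep it runnable without network access, the request itself is abstracted behind `fetch`, a hypothetical zero-argument callable returning bytes; `save_if_needed` is likewise an illustrative name, not a medusa function.

```python
from pathlib import Path
import tempfile


def save_if_needed(fetch, f_out, overwrite=False):
    """Write fetch() to f_out unless the file exists and overwrite is False.

    `fetch` is a hypothetical zero-argument callable returning bytes; in
    practice it could wrap something like requests.get(url).content.
    Returns True when the file was written.
    """
    f_out = Path(f_out)
    if f_out.exists() and not overwrite:
        return False  # keep the existing file and skip the download
    f_out.parent.mkdir(parents=True, exist_ok=True)
    f_out.write_bytes(fetch())
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data" / "model.bin"
    first = save_if_needed(lambda: b"v1", target)    # file is new: written
    second = save_if_needed(lambda: b"v2", target)   # file exists: skipped
    third = save_if_needed(lambda: b"v3", target, overwrite=True)
    content = target.read_bytes()
```

Passing the fetch step in as a callable keeps the existence check separate from the HTTP call, which also makes the logic easy to test offline.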

medusa.io.load_obj(f, device=None)

Loads data from obj file, based on the DECA implementation, which in turn is based on the pytorch3d implementation.

Parameters:
  • f (str, Path) – Filename of object file

  • device (str, None) – If None, returns numpy arrays. Otherwise, returns torch tensors on this device

Returns:

out – Dictionary with outputs (keys: ‘v’, ‘tris’, ‘vt’, ‘tris_uv’)

Return type:

dict
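To show where the output keys come from, here is a simplified sketch of parsing OBJ text into a dictionary with the same keys. It is not the DECA or pytorch3d code that load_obj is based on: `parse_obj` is a hypothetical function that returns plain Python lists and ignores normals, polygons with more than three corners, and negative indices.

```python
def parse_obj(text):
    """Parse 'v', 'vt' and triangular 'f' records from OBJ text.

    Simplified sketch: normals, polygons with more than three corners,
    and negative indices are not handled.
    """
    out = {"v": [], "vt": [], "tris": [], "tris_uv": []}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            out["v"].append([float(x) for x in parts[1:4]])
        elif parts[0] == "vt":
            out["vt"].append([float(x) for x in parts[1:3]])
        elif parts[0] == "f":
            corners = [p.split("/") for p in parts[1:4]]
            # OBJ indices are 1-based; convert to 0-based
            out["tris"].append([int(c[0]) - 1 for c in corners])
            if all(len(c) > 1 and c[1] for c in corners):
                out["tris_uv"].append([int(c[1]) - 1 for c in corners])
    return out


obj_text = """
v 0.0 0.0 0.0
v 1.0 0.0 0.0
v 0.0 1.0 0.0
vt 0.0 0.0
vt 1.0 0.0
vt 0.0 1.0
f 1/1 2/2 3/3
"""
mesh = parse_obj(obj_text)
```

Here ‘v’ holds vertex coordinates, ‘vt’ texture coordinates, ‘tris’ the vertex indices of each triangle, and ‘tris_uv’ the matching texture-coordinate indices.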

medusa.io.save_obj(f, data)

Saves data to an obj file, based on the implementation from PRNet.

Parameters:
  • f (str, Path) – Path to save file to

  • data (dict) – Dictionary with 3D mesh data, with keys ‘v’, ‘tris’, and optionally ‘vt’
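The expected structure of the data dictionary can be illustrated with a small writer that emits OBJ text. This is a sketch, not the PRNet-based implementation used by save_obj: `write_obj` is a hypothetical function that writes to any text stream and, for brevity, does not reference texture coordinates in the face records.

```python
import io


def write_obj(stream, data):
    """Write a mesh dict with keys 'v', 'tris' and optionally 'vt' as OBJ text."""
    for x, y, z in data["v"]:
        stream.write(f"v {x} {y} {z}\n")
    for u, v in data.get("vt", []):
        stream.write(f"vt {u} {v}\n")
    for a, b, c in data["tris"]:
        # OBJ face indices are 1-based
        stream.write(f"f {a + 1} {b + 1} {c + 1}\n")


triangle = {
    "v": [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "tris": [[0, 1, 2]],
}
buf = io.StringIO()
write_obj(buf, triangle)
obj_lines = buf.getvalue().splitlines()
```

The main detail to get right is the index offset: triangle indices are 0-based in memory but 1-based in the OBJ format.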