visionsim.cli package

Submodules

visionsim.cli.blender module

visionsim.cli.blender.sequence_info(dataset: str | PathLike, keyframe_multiplier: float = 1.0, original_fps: int = 50, output: str | PathLike | None = None)[source]

Query the dataset to collect extra metadata and write it to a JSON file

Parameters:
  • dataset (str | os.PathLike) – Root path of dataset

  • keyframe_multiplier (float, optional) – Keyframe stretch amount.

  • original_fps (int, optional) – Framerate of native Blender animation. Defaults to 50fps.

  • output (str | os.PathLike | None, optional) – Path of output info file. Defaults to “info.json” in the dataset’s root directory.

visionsim.cli.blender.render_animation(blend_file: str | PathLike, root_path: str | PathLike, /, render_config: RenderConfig, frame_start: int | None = None, frame_end: int | None = None, output_blend_file: str | PathLike | None = None, dry_run: bool = False)[source]

Create datasets by rendering out a sequence from a _single_ blend-file.

Parameters:
  • blend_file (str | os.PathLike) – Path to blend file.

  • root_path (str | os.PathLike) – Dataset output folder.

  • render_config (RenderConfig) – Render configuration.

  • frame_start (int) – Start rendering at this frame index (inclusive).

  • frame_end (int) – Stop rendering at this frame index (inclusive).

  • output_blend_file (str | os.PathLike | None, optional) – If set, write the modified blend file to this path. Helpful for troubleshooting. Defaults to not saving.

  • dry_run (bool, optional) – If true, nothing will be rendered at all. Defaults to False.

visionsim.cli.dataset module

visionsim.cli.dataset.imgs_to_npy(input_dir: str | PathLike, output_dir: str | PathLike, bitpack: bool = False, bitpack_dim: int | None = None, batch_size: int = 4, alpha_color: str = '(255, 255, 255)', is_grayscale: bool = False, force: bool = False)[source]

Convert an image-folder based dataset to an NPY dataset

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save npy file

  • bitpack – if true, each chunk of 8 binary pixels will be packed into a single byte. Only enable if data is binary valued

  • bitpack_dim – axis along which to pack bits (H=1, W=2)

  • batch_size – number of frames to write at once

  • alpha_color – if set, blend with this background color and do not store alpha channel

  • is_grayscale – If set, assume images are grayscale and only save first channel

  • force – if true, overwrite output file(s) if present
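Bitpacking exploits the fact that binary-valued frames can be stored 8 pixels per byte along a chosen axis. The following is a minimal sketch of the underlying idea using NumPy's packbits, with hypothetical frame data; it is not the visionsim implementation itself:

```python
import numpy as np

# Hypothetical stack of binary frames: (N, H, W) with values in {0, 1}.
frames = np.random.default_rng(0).integers(0, 2, size=(4, 8, 16), dtype=np.uint8)

# Pack each run of 8 binary pixels along the width axis (axis=2) into one byte,
# shrinking that axis by 8x. The axis length should be a multiple of 8.
packed = np.packbits(frames, axis=2)
print(packed.shape)  # (4, 8, 2)

# np.unpackbits recovers the original binary values losslessly.
unpacked = np.unpackbits(packed, axis=2)
assert np.array_equal(frames, unpacked)
```

Packing along H vs. W (bitpack_dim) only changes which axis shrinks by 8x; the round-trip through unpackbits is lossless either way.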

visionsim.cli.dataset.npy_to_imgs(input_dir: str | PathLike, output_dir: str | PathLike, batch_size: int = 4, pattern: str = 'frame_{:06}.png', step: int = 1, force: bool = False)[source]

Convert an NPY based dataset to an image-folder dataset

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save image frames

  • batch_size – number of frames to write at once

  • pattern – filenames of frames will match this

  • step – skip some frames when converting between formats

  • force – if true, overwrite output file(s) if present

visionsim.cli.dataset.info(input_dir: str | PathLike, as_json: bool = False)[source]

Print information about the dataset

Parameters:
  • input_dir – directory in which to look for dataset

  • as_json – print the output as a JSON-formatted string

visionsim.cli.emulate module

visionsim.cli.emulate.spad(input_dir: str | PathLike, output_dir: str | PathLike, pattern: str = 'frame_{:06}.png', factor: float = 1.0, seed: int = 2147483647, mode: Literal['npy', 'img'] = 'npy', batch_size: int = 4, force: bool = False)[source]

Perform Bernoulli sampling on linearized RGB frames to yield binary frames

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save binary frames

  • pattern – filenames of frames should match this

  • factor – multiplicative factor controlling dynamic range of output

  • seed – random seed to use while sampling, ensures reproducibility

  • mode – how to save binary frames

  • batch_size – number of frames to write at once

  • force – if true, overwrite output file(s) if present
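Conceptually, each binary pixel fires with a probability that grows with the incident linear intensity. A minimal sketch of one common single-photon detection model, p = 1 − exp(−factor · intensity), with hypothetical inputs; the exact model visionsim uses may differ:

```python
import numpy as np

rng = np.random.default_rng(2147483647)

# Hypothetical linearized intensity frame, values in [0, 1].
intensity = rng.random((4, 4))
factor = 1.0

# Each pixel detects at least one photon with probability 1 - exp(-factor * intensity);
# a per-pixel Bernoulli trial against that probability yields the binary frame.
p_detect = 1.0 - np.exp(-factor * intensity)
binary = (rng.random(intensity.shape) < p_detect).astype(np.uint8)

assert set(np.unique(binary)) <= {0, 1}
```

Raising factor pushes detection probabilities toward 1 everywhere, which is why it controls the dynamic range of the output.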

visionsim.cli.emulate.events(input_dir: str | PathLike, output_dir: str | PathLike, fps: int, pos_thres: float = 0.2, neg_thres: float = 0.2, sigma_thres: float = 0.03, cutoff_hz: int = 200, leak_rate_hz: float = 1.0, shot_noise_rate_hz: float = 10.0, seed: int = 2147483647, force: bool = False)[source]

Emulate an event camera using v2e and high-speed input frames

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save events

  • fps – frame rate of input sequence

  • pos_thres – nominal threshold of triggering positive event in log intensity

  • neg_thres – nominal threshold of triggering negative event in log intensity

  • sigma_thres – std deviation of threshold in log intensity

  • cutoff_hz – 3dB cutoff frequency in Hz of DVS photoreceptor, default: 200

  • leak_rate_hz – leak event rate per pixel in Hz, from junction leakage in reset switch

  • shot_noise_rate_hz – shot noise rate in Hz

  • seed – random seed to use while sampling, ensures reproducibility

  • force – if true, overwrite output file(s) if present
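v2e implements a full photoreceptor model (bandwidth filtering, leak and shot noise). The core DVS principle it builds on is log-intensity change detection against a threshold, sketched here in heavily simplified form with hypothetical two-frame input:

```python
import numpy as np

pos_thres, neg_thres = 0.2, 0.2
eps = 1e-6  # avoid log(0)

# Two consecutive hypothetical grayscale frames in [0, 1].
prev = np.array([[0.10, 0.50], [0.90, 0.30]])
curr = np.array([[0.30, 0.50], [0.40, 0.30]])

# An event fires wherever the change in log intensity crosses a threshold.
dlog = np.log(curr + eps) - np.log(prev + eps)
pos_events = dlog >= pos_thres   # brightness increased
neg_events = dlog <= -neg_thres  # brightness decreased
print(pos_events.astype(int) - neg_events.astype(int))
```

The sigma_thres, cutoff_hz, leak_rate_hz and shot_noise_rate_hz parameters perturb this ideal comparator with per-pixel threshold mismatch, finite bandwidth, and background noise events.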

visionsim.cli.emulate.rgb(input_dir: str | PathLike, output_dir: str | PathLike, chunk_size: int = 10, factor: float = 1.0, readout_std: float = 20.0, fwc: int | None = None, duplicate: float = 1.0, pattern: str = 'frame_{:06}.png', mode: Literal['npy', 'img'] = 'npy', force: bool = False)[source]

Simulate a real camera, adding read/Poisson noise and tonemapping

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save binary frames

  • chunk_size – number of consecutive frames to average together

  • factor – multiply image’s linear intensity by this weight

  • readout_std – standard deviation of gaussian read noise

  • fwc – full well capacity of sensor in arbitrary units (relative to factor & chunk_size)

  • duplicate – when chunk_size is too small, this model is ill-suited and creates unrealistic noise. This parameter artificially increases the effective chunk size by using each input image duplicate times

  • pattern – filenames of frames should match this

  • mode – how to save binary frames

  • force – if true, overwrite output file(s) if present
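The pipeline can be summarized as: accumulate linear frames over a chunk, apply Poisson (shot) noise, clip at the full-well capacity, add Gaussian read noise, then tonemap. A simplified sketch of that sequence under assumed conventions, with hypothetical inputs; it is not the exact visionsim model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stack of linear-intensity frames (chunk_size, H, W) in [0, 1].
chunk = rng.random((10, 8, 8))
factor, readout_std, fwc = 1000.0, 20.0, 8000

# Accumulate photo-electrons over the chunk with shot noise, clip at the full well.
electrons = rng.poisson(factor * chunk).sum(axis=0)
electrons = np.clip(electrons, 0, fwc)

# Add Gaussian read noise, normalize by the full well, apply a simple gamma tonemap.
signal = electrons + rng.normal(0.0, readout_std, electrons.shape)
img = np.clip(signal / fwc, 0.0, 1.0) ** (1 / 2.2)
assert img.shape == (8, 8)
```

This is why fwc is documented relative to factor and chunk_size: all three jointly determine where the clip saturates.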

visionsim.cli.emulate.imu(input_dir: str | PathLike = '.', output_file: str | PathLike = '', seed: int = 2147483647, gravity: str = '(0.0, 0.0, -9.8)', dt: float = 0.00125, init_bias_acc: str = '(0.0, 0.0, 0.0)', init_bias_gyro: str = '(0.0, 0.0, 0.0)', std_bias_acc: float = 5.5e-05, std_bias_gyro: float = 2e-05, std_acc: float = 0.008, std_gyro: float = 0.0012)[source]

Simulate data from a co-located IMU using the poses in transforms.json.

Parameters:
  • input_dir – directory in which to look for transforms.json

  • output_file – file in which to save simulated IMU data. Prints to stdout if empty. default: ‘’

  • seed – RNG seed value for reproducibility. default: 2147483647

  • gravity – gravity vector in world coordinate frame. Given in m/s^2. default: [0,0,-9.8]

  • dt – time between consecutive transforms.json poses (assumed regularly spaced). Given in seconds. default: 0.00125

  • init_bias_acc – initial bias/drift in accelerometer reading. Given in m/s^2. default: [0,0,0]

  • init_bias_gyro – initial bias/drift in gyroscope reading. Given in rad/s. default: [0,0,0]

  • std_bias_acc – stdev for random-walk component of error (drift) in accelerometer. Given in m/(s^3 sqrt(Hz))

  • std_bias_gyro – stdev for random-walk component of error (drift) in gyroscope. Given in rad/(s^2 sqrt(Hz))

  • std_acc – stdev for white-noise component of error in accelerometer. Given in m/(s^2 sqrt(Hz))

  • std_gyro – stdev for white-noise component of error in gyroscope. Given in rad/(s sqrt(Hz))
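The std_bias_* parameters drive a random-walk (drift) term and the std_* parameters a white-noise term; the unusual units come from discretizing continuous-time noise densities at step dt, where the bias increment scales with sqrt(dt) and the white noise with 1/sqrt(dt). A sketch of this standard discretization for the accelerometer channel, with hypothetical ground-truth input; visionsim's implementation may differ in details:

```python
import numpy as np

rng = np.random.default_rng(2147483647)
dt = 0.00125
std_bias_acc, std_acc = 5.5e-5, 0.008

# Hypothetical ground-truth accelerometer samples (N, xyz), in m/s^2.
true_acc = np.zeros((100, 3))

bias = np.zeros(3)  # init_bias_acc
measured = np.empty_like(true_acc)
for k, a in enumerate(true_acc):
    # White-noise term: continuous density in m/(s^2 sqrt(Hz)), scaled by 1/sqrt(dt).
    white = std_acc / np.sqrt(dt) * rng.standard_normal(3)
    measured[k] = a + bias + white
    # Bias random walk: density in m/(s^3 sqrt(Hz)), scaled by sqrt(dt).
    bias = bias + std_bias_acc * np.sqrt(dt) * rng.standard_normal(3)

assert measured.shape == (100, 3)
```

The gyroscope channel follows the same recursion with std_bias_gyro and std_gyro, in rad/s units.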

visionsim.cli.ffmpeg module

visionsim.cli.ffmpeg.animate(input_dir: str | PathLike, pattern: str = 'frame_*.png', outfile: str | PathLike = 'out.mp4', fps: int = 25, crf: int = 22, vcodec: str = 'libx264', step: int = 1, multiple: int = 2, force: bool = False, bg_color: str = 'black', strip_alpha: bool = False)[source]

Combine generated frames into an MP4 using ffmpeg wizardry

Parameters:
  • input_dir – directory in which to look for frames

  • pattern – filenames of frames should match this

  • outfile – where to save generated mp4

  • fps – frames per second in video

  • crf – constant rate factor for video encoding (0-51); lower values give better quality but larger files

  • vcodec – video codec to use (either libx264 or libx265)

  • step – drop some frames when making video, use frames 0+step*n

  • multiple – some codecs require size to be a multiple of n

  • force – if true, overwrite output file if present

  • bg_color – for images with transparencies, namely PNGs, use this color as a background

  • strip_alpha – if true, do not pre-process PNGs to remove transparencies

visionsim.cli.ffmpeg.combine(matrix: str, outfile: str = 'combined.mp4', mode: str = 'shortest', color: str = 'white', multiple: int = 2, force: bool = False)[source]

Combine multiple videos into one by stacking, padding and resizing them using ffmpeg.

Internally this task will first optionally pad all videos to the same length using ffmpeg’s tpad filter, then scale all videos in each row to the same height, combine each row using the hstack filter, and finally scale the row-videos to the same width and vstack them together.

Parameters:
  • matrix – Way to specify videos to combine as a 2D matrix of file paths

  • outfile – where to save generated mp4

  • mode – if ‘shortest’, the combined video will last as long as the shortest input video. If ‘static’, the last frame of videos shorter than the longest input video will be repeated. If ‘pad’, all videos are padded with frames of color to last the same duration.

  • color – color to pad videos with, only used if mode is ‘pad’

  • multiple – some codecs require size to be a multiple of n

  • force – if true, overwrite output file if present

Example

The input videos can also be specified in a 2D array using the --matrix argument like so:

$ visionsim ffmpeg.combine --matrix='[["a.mp4", "b.mp4"]]' --outfile="output.mp4"

visionsim.cli.ffmpeg.grid(input_dir: str | PathLike, width: int = -1, height: int = -1, pattern: str = '*.mp4', outfile: str = 'combined.mp4', force: bool = False)[source]

Make a mosaic from videos in a folder, organizing them in a grid

Parameters:
  • input_dir – directory containing all video files (mp4’s expected)

  • width – width of video grid to produce

  • height – height of video grid to produce

  • pattern – use files that match this pattern as inputs

  • outfile – where to save generated mp4

  • force – if true, overwrite output file if present

visionsim.cli.ffmpeg.count_frames(input_file: str | PathLike)[source]

Count the number of frames a video file contains using ffprobe

Parameters:

input_file – video file input

visionsim.cli.ffmpeg.duration(input_file: str | PathLike, /)[source]

Return duration (in seconds) of first video stream in file using ffprobe

Parameters:

input_file – video file input

visionsim.cli.ffmpeg.dimensions(input_file: str | PathLike)[source]

Return size (WxH in pixels) of first video stream in file using ffprobe

Parameters:

input_file – video file input

visionsim.cli.ffmpeg.extract(input_file: str | PathLike, output_dir: str | PathLike, pattern: str = 'frames_%06d.png')[source]

Extract frames from a video file

Parameters:
  • input_file – path to video file from which to extract frames

  • output_dir – directory in which to save extracted frames

  • pattern – filenames of frames will match this pattern

visionsim.cli.interpolate module

visionsim.cli.interpolate.video(input_file: str | PathLike, output_file: str | PathLike, method: str = 'rife', n: int = 2)[source]

Interpolate a video by extracting all frames, performing frame-wise interpolation, and re-assembling the video

Parameters:
  • input_file – path to video file from which to extract frames

  • output_file – path in which to save interpolated video

  • method – interpolation method to use, only RIFE (ECCV22) is supported for now, default: ‘rife’

  • n – interpolation factor, must be a multiple of 2, default: 2

visionsim.cli.interpolate.frames(input_dir: str | PathLike, output_dir: str | PathLike, method: Literal['rife'] = 'rife', file_name: str = 'transforms.json', n: int = 2)[source]

Interpolate poses and frames separately, then combine into a transforms.json file

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save interpolated frames

  • method – interpolation method to use, only RIFE (ECCV22) is supported for now, default: ‘rife’

  • file_name – name of file containing transforms, default: ‘transforms.json’

  • n – interpolation factor, must be a multiple of 2, default: 2

visionsim.cli.transforms module

visionsim.cli.transforms.colorize_depths(input_dir: str | PathLike, output_dir: str | PathLike, pattern: str = 'depth_*.exr', cmap: str = 'turbo', ext: str = '.png', vmin: float | None = None, vmax: float | None = None, quantile: float = 0.01, step: int = 1)[source]

Convert .exr depth maps into color-coded images for visualization

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save colorized frames

  • pattern – filenames of frames should match this

  • cmap – which matplotlib colormap to use

  • ext – which format to save colorized frames as

  • vmin – minimum expected depth used to normalize colormap

  • vmax – maximum expected depth used to normalize colormap

  • quantile – if vmin/vmax are None, use this quantile to estimate them

  • step – drop some frames when colorizing, use frames 0+step*n
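When vmin/vmax are not given, robust bounds can be estimated from quantiles so that a few outlier depths (e.g. background at near-infinite distance) do not wash out the colormap. A sketch of that estimation step with hypothetical depth values; the colormap lookup itself is left to matplotlib:

```python
import numpy as np

quantile = 0.01
rng = np.random.default_rng(0)

# Hypothetical depth map with one extreme outlier.
depth = rng.uniform(1.0, 10.0, (64, 64))
depth[0, 0] = 1e6  # e.g. a background pixel at "infinity"

# Estimate robust bounds from the 1%/99% quantiles, then normalize to [0, 1].
vmin, vmax = np.quantile(depth, [quantile, 1.0 - quantile])
normalized = np.clip((depth - vmin) / (vmax - vmin), 0.0, 1.0)
assert normalized.min() >= 0.0 and normalized.max() <= 1.0
```

With min/max normalization instead, the single outlier would compress every other depth into a sliver of the colormap; the quantile bounds clip it to 1.0 and preserve contrast elsewhere.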

visionsim.cli.transforms.colorize_flows(input_dir: str | PathLike, output_dir: str | PathLike, direction: Literal['forward', 'backward'] = 'forward', pattern: str = 'flow_*.exr', ext: str = '.png', vmax: float | None = None, quantile: float = 0.01, step: int = 1)[source]

Convert .exr optical flow maps into color-coded images for visualization

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save colorized frames

  • direction – direction of flow to colorize

  • pattern – filenames of frames should match this

  • ext – which format to save colorized frames as

  • vmax – maximum expected flow magnitude

  • quantile – if vmax is None, use this quantile to estimate it

  • step – drop some frames when colorizing, use frames 0+step*n

visionsim.cli.transforms.colorize_normals(input_dir: str | PathLike, output_dir: str | PathLike, pattern: str = 'normal_*.exr', ext: str = '.png', step: int = 1)[source]

Convert .exr normal maps into color-coded images for visualization

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save colorized frames

  • pattern – filenames of frames should match this

  • ext – which format to save colorized frames as

  • step – drop some frames when colorizing, use frames 0+step*n

visionsim.cli.transforms.colorize_segmentations(input_dir: str | PathLike, output_dir: str | PathLike, pattern: str = 'segmentation_*.exr', ext: str = '.png', num_objects: int | None = None, shuffle: bool = True, seed: int = 1234, step: int = 1)[source]

Convert .exr segmentation maps into color-coded images for visualization

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save colorized frames

  • pattern – filenames of frames should match this

  • ext – which format to save colorized frames as

  • num_objects – number of unique objects to expect in the scene

  • shuffle – if true, colorize items in a random order

  • seed – seed used when shuffling colors

  • step – drop some frames when colorizing, use frames 0+step*n

visionsim.cli.transforms.tonemap_exrs(input_dir: str | PathLike, output_dir: str | PathLike | None = None, batch_size: int = 4, hdr_quantile: float = 0.01, force: bool = False)[source]

Convert .exr linear intensity frames into tone-mapped sRGB images

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save tone-mapped frames; if not specified, the dynamic range is calculated but no tonemapping occurs

  • batch_size – number of frames to write at once

  • hdr_quantile – calculate dynamic range using brightness quantiles instead of extrema

  • force – if true, overwrite output file(s) if present
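Converting linear intensities to sRGB typically uses the standard sRGB transfer function (a linear segment near zero, a gamma-2.4 curve above), applied after normalizing by the estimated dynamic range. The transfer function below is the standard one; its combination with a quantile-based range estimate is a sketch of the approach, not necessarily visionsim's exact pipeline:

```python
import numpy as np

def linear_to_srgb(x: np.ndarray) -> np.ndarray:
    """Standard sRGB opto-electronic transfer function, for x in [0, 1]."""
    x = np.clip(x, 0.0, 1.0)
    return np.where(x <= 0.0031308, 12.92 * x, 1.055 * x ** (1 / 2.4) - 0.055)

# Hypothetical linear HDR frame; normalize by a robust maximum before encoding.
linear = np.linspace(0.0, 2.0, 16).reshape(4, 4)
hdr_max = np.quantile(linear, 0.99)
srgb = linear_to_srgb(linear / hdr_max)
assert srgb.min() >= 0.0 and srgb.max() <= 1.0
```

Using a brightness quantile rather than the true maximum (hdr_quantile above) keeps a handful of specular highlights from darkening the entire tone-mapped image.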

Module contents

visionsim.cli.post_install(executable: str | PathLike | None = None, editable: bool = False)[source]

Install additional dependencies

Parameters:
  • executable (str | os.PathLike | None, optional) – Path to Blender executable. Defaults to one found on $PATH.

  • editable (bool, optional) – If set, install the current visionsim as editable in Blender. Only works if visionsim is already installed as editable locally.

visionsim.cli.main()[source]