visionsim.cli package

Submodules

visionsim.cli.blender module

visionsim.cli.blender.sequence_info(dataset: str | PathLike, keyframe_multiplier: float = 1.0, original_fps: int = 50, output: str | PathLike | None = None)[source]

Query the dataset to collect extra metadata and write it to a JSON file

Parameters:
  • dataset (str | os.PathLike) – Root path of dataset

  • keyframe_multiplier (float, optional) – Keyframe stretch amount.

  • original_fps (int, optional) – Framerate of native Blender animation. Defaults to 50fps.

  • output (str | os.PathLike | None, optional) – Path of output info file. Defaults to “info.json” in the dataset’s root directory.

visionsim.cli.blender.render_animation(blend_file: str | PathLike, root_path: str | PathLike, /, render_config: RenderConfig, frame_start: int | None = None, frame_end: int | None = None, output_blend_file: str | PathLike | None = None, dry_run: bool = False)[source]

Create datasets by rendering out a sequence from a _single_ blend-file.

Parameters:
  • blend_file (str | os.PathLike) – Path to blend file.

  • root_path (str | os.PathLike) – Dataset output folder.

  • render_config (RenderConfig) – Render configuration.

  • frame_start (int) – Start rendering at this frame index (inclusive).

  • frame_end (int) – Stop rendering at this frame index (inclusive).

  • output_blend_file (str | os.PathLike | None, optional) – If set, write the modified blend file to this path. Helpful for troubleshooting. Defaults to not saving.

  • dry_run (bool, optional) – If true, nothing will be rendered at all. Defaults to False.

visionsim.cli.dataset module

visionsim.cli.dataset.imgs_to_npy(input_dir: str | PathLike, output_dir: str | PathLike, bitpack: bool = False, bitpack_dim: int | None = None, batch_size: int = 4, alpha_color: str = '(255, 255, 255)', is_grayscale: bool = False, force: bool = False)[source]

Convert an image-folder based dataset to an NPY dataset

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save npy file

  • bitpack – if true, each chunk of 8 binary pixels will be packed into a single byte. Only enable if data is binary valued

  • bitpack_dim – axis along which to pack bits (H=1, W=2)

  • batch_size – number of frames to write at once

  • alpha_color – if set, blend with this background color and do not store alpha channel

  • is_grayscale – If set, assume images are grayscale and only save first channel

  • force – if true, overwrite output file(s) if present
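Bitpacking exploits the fact that binary-valued frames can be stored 8 pixels per byte along a chosen axis. The following is a minimal sketch of the underlying idea using NumPy's packbits, with hypothetical frame data; it is not the visionsim implementation itself:

```python
import numpy as np

# Hypothetical stack of binary frames: (N, H, W) with values in {0, 1}.
frames = np.random.default_rng(0).integers(0, 2, size=(4, 8, 16), dtype=np.uint8)

# Pack each run of 8 binary pixels along the width axis (axis=2) into one byte,
# shrinking that axis by 8x. The axis length should be a multiple of 8.
packed = np.packbits(frames, axis=2)
print(packed.shape)  # (4, 8, 2)

# np.unpackbits recovers the original binary values losslessly.
unpacked = np.unpackbits(packed, axis=2)
assert np.array_equal(frames, unpacked)
```

Packing along H vs. W (bitpack_dim) only changes which axis shrinks by 8x; the round-trip through unpackbits is lossless either way.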

visionsim.cli.dataset.npy_to_imgs(input_dir: str | PathLike, output_dir: str | PathLike, batch_size: int = 4, pattern: str = 'frame_{:06}.png', step: int = 1, force: bool = False)[source]

Convert an NPY based dataset to an image-folder dataset

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save image frames

  • batch_size – number of frames to write at once

  • pattern – filenames of frames will match this

  • step – skip some frames when converting between formats

  • force – if true, overwrite output file(s) if present

visionsim.cli.dataset.info(input_dir: str | PathLike, as_json: bool = False)[source]

Print information about the dataset

Parameters:
  • input_dir – directory in which to look for dataset

  • as_json – print the output as a JSON-formatted string

visionsim.cli.emulate module

visionsim.cli.emulate.spad(input_dir: str | PathLike, output_dir: str | PathLike, pattern: str = 'frame_{:06}.png', factor: float = 1.0, seed: int = 2147483647, mode: Literal['npy', 'img'] = 'npy', batch_size: int = 4, force: bool = False)[source]

Perform Bernoulli sampling on linearized RGB frames to yield binary frames

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save binary frames

  • pattern – filenames of frames should match this

  • factor – multiplicative factor controlling dynamic range of output

  • seed – random seed to use while sampling, ensures reproducibility

  • mode – how to save binary frames

  • batch_size – number of frames to write at once

  • force – if true, overwrite output file(s) if present
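Conceptually, each binary pixel fires with a probability that grows with the incident linear intensity. A minimal sketch of one common single-photon detection model, p = 1 − exp(−factor · intensity), with hypothetical inputs; the exact model visionsim uses may differ:

```python
import numpy as np

rng = np.random.default_rng(2147483647)

# Hypothetical linearized intensity frame, values in [0, 1].
intensity = rng.random((4, 4))
factor = 1.0

# Each pixel detects at least one photon with probability 1 - exp(-factor * intensity);
# a per-pixel Bernoulli trial against that probability yields the binary frame.
p_detect = 1.0 - np.exp(-factor * intensity)
binary = (rng.random(intensity.shape) < p_detect).astype(np.uint8)

assert set(np.unique(binary)) <= {0, 1}
```

Raising factor pushes detection probabilities toward 1 everywhere, which is why it controls the dynamic range of the output.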

visionsim.cli.emulate.events(input_dir: str | PathLike, output_dir: str | PathLike, fps: int, pos_thres: float = 0.2, neg_thres: float = 0.2, sigma_thres: float = 0.03, cutoff_hz: int = 200, leak_rate_hz: float = 1.0, shot_noise_rate_hz: float = 10.0, seed: int = 2147483647, force: bool = False)[source]

Emulate an event camera using v2e and high-speed input frames

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save events

  • fps – frame rate of input sequence

  • pos_thres – nominal threshold of triggering positive event in log intensity

  • neg_thres – nominal threshold of triggering negative event in log intensity

  • sigma_thres – std deviation of threshold in log intensity

  • cutoff_hz – 3dB cutoff frequency in Hz of DVS photoreceptor, default: 200

  • leak_rate_hz – leak event rate per pixel in Hz, from junction leakage in reset switch

  • shot_noise_rate_hz – shot noise rate in Hz

  • seed – random seed to use while sampling, ensures reproducibility

  • force – if true, overwrite output file(s) if present
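v2e implements a full photoreceptor model (bandwidth filtering, leak and shot noise). The core DVS principle it builds on is log-intensity change detection against a threshold, sketched here in heavily simplified form with hypothetical two-frame input:

```python
import numpy as np

pos_thres, neg_thres = 0.2, 0.2
eps = 1e-6  # avoid log(0)

# Two consecutive hypothetical grayscale frames in [0, 1].
prev = np.array([[0.10, 0.50], [0.90, 0.30]])
curr = np.array([[0.30, 0.50], [0.40, 0.30]])

# An event fires wherever the change in log intensity crosses a threshold.
dlog = np.log(curr + eps) - np.log(prev + eps)
pos_events = dlog >= pos_thres   # brightness increased
neg_events = dlog <= -neg_thres  # brightness decreased
print(pos_events.astype(int) - neg_events.astype(int))
```

The sigma_thres, cutoff_hz, leak_rate_hz and shot_noise_rate_hz parameters perturb this ideal comparator with per-pixel threshold mismatch, finite bandwidth, and background noise events.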

visionsim.cli.emulate.rgb(input_dir: str | PathLike, output_dir: str | PathLike, chunk_size: int = 10, factor: float = 1.0, readout_std: float = 20.0, fwc: int | None = None, duplicate: float = 1.0, pattern: str = 'frame_{:06}.png', mode: Literal['npy', 'img'] = 'npy', force: bool = False)[source]

Simulate a real camera, adding read/Poisson noise and tonemapping

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save binary frames

  • chunk_size – number of consecutive frames to average together

  • factor – multiply image’s linear intensity by this weight

  • readout_std – standard deviation of gaussian read noise

  • fwc – full well capacity of sensor in arbitrary units (relative to factor & chunk_size)

  • duplicate – when chunk_size is too small, this model is ill-suited and creates unrealistic noise. This parameter artificially increases the effective chunk size by using each input image duplicate times

  • pattern – filenames of frames should match this

  • mode – how to save binary frames

  • force – if true, overwrite output file(s) if present
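The pipeline can be summarized as: accumulate linear frames over a chunk, apply Poisson (shot) noise, clip at the full-well capacity, add Gaussian read noise, then tonemap. A simplified sketch of that sequence under assumed conventions, with hypothetical inputs; it is not the exact visionsim model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stack of linear-intensity frames (chunk_size, H, W) in [0, 1].
chunk = rng.random((10, 8, 8))
factor, readout_std, fwc = 1000.0, 20.0, 8000

# Accumulate photo-electrons over the chunk with shot noise, clip at the full well.
electrons = rng.poisson(factor * chunk).sum(axis=0)
electrons = np.clip(electrons, 0, fwc)

# Add Gaussian read noise, normalize by the full well, apply a simple gamma tonemap.
signal = electrons + rng.normal(0.0, readout_std, electrons.shape)
img = np.clip(signal / fwc, 0.0, 1.0) ** (1 / 2.2)
assert img.shape == (8, 8)
```

This is why fwc is documented relative to factor and chunk_size: all three jointly determine where the clip saturates.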

visionsim.cli.emulate.imu(input_dir: str | PathLike = '.', output_file: str | PathLike = '', seed: int = 2147483647, gravity: str = '(0.0, 0.0, -9.8)', dt: float = 0.00125, init_bias_acc: str = '(0.0, 0.0, 0.0)', init_bias_gyro: str = '(0.0, 0.0, 0.0)', std_bias_acc: float = 5.5e-05, std_bias_gyro: float = 2e-05, std_acc: float = 0.008, std_gyro: float = 0.0012)[source]

Simulate data from a co-located IMU using the poses in transforms.json.

Parameters:
  • input_dir – directory in which to look for transforms.json

  • output_file – file in which to save simulated IMU data. Prints to stdout if empty. default: ‘’

  • seed – RNG seed value for reproducibility. default: 2147483647

  • gravity – gravity vector in world coordinate frame. Given in m/s^2. default: [0,0,-9.8]

  • dt – time between consecutive transforms.json poses (assumed regularly spaced). Given in seconds. default: 0.00125

  • init_bias_acc – initial bias/drift in accelerometer reading. Given in m/s^2. default: [0,0,0]

  • init_bias_gyro – initial bias/drift in gyroscope reading. Given in rad/s. default: [0,0,0]

  • std_bias_acc – stdev for random-walk component of error (drift) in accelerometer. Given in m/(s^3 sqrt(Hz))

  • std_bias_gyro – stdev for random-walk component of error (drift) in gyroscope. Given in rad/(s^2 sqrt(Hz))

  • std_acc – stdev for white-noise component of error in accelerometer. Given in m/(s^2 sqrt(Hz))

  • std_gyro – stdev for white-noise component of error in gyroscope. Given in rad/(s sqrt(Hz))
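The std_bias_* parameters drive a random-walk (drift) term and the std_* parameters a white-noise term; the unusual units come from discretizing continuous-time noise densities at step dt, where the bias increment scales with sqrt(dt) and the white noise with 1/sqrt(dt). A sketch of this standard discretization for the accelerometer channel, with hypothetical ground-truth input; visionsim's implementation may differ in details:

```python
import numpy as np

rng = np.random.default_rng(2147483647)
dt = 0.00125
std_bias_acc, std_acc = 5.5e-5, 0.008

# Hypothetical ground-truth accelerometer samples (N, xyz), in m/s^2.
true_acc = np.zeros((100, 3))

bias = np.zeros(3)  # init_bias_acc
measured = np.empty_like(true_acc)
for k, a in enumerate(true_acc):
    # White-noise term: continuous density in m/(s^2 sqrt(Hz)), scaled by 1/sqrt(dt).
    white = std_acc / np.sqrt(dt) * rng.standard_normal(3)
    measured[k] = a + bias + white
    # Bias random walk: density in m/(s^3 sqrt(Hz)), scaled by sqrt(dt).
    bias = bias + std_bias_acc * np.sqrt(dt) * rng.standard_normal(3)

assert measured.shape == (100, 3)
```

The gyroscope channel follows the same recursion with std_bias_gyro and std_gyro, in rad/s units.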

visionsim.cli.ffmpeg module

visionsim.cli.ffmpeg.animate(input_dir: str | PathLike, pattern: str = 'frame_*.png', outfile: str | PathLike = 'out.mp4', fps: int = 25, crf: int = 22, vcodec: str = 'libx264', step: int = 1, multiple: int = 2, force: bool = False, bg_color: str = 'black', strip_alpha: bool = False)[source]

Combine generated frames into an MP4 using ffmpeg wizardry

Parameters:
  • input_dir – directory in which to look for frames

  • pattern – filenames of frames should match this

  • outfile – where to save generated mp4

  • fps – frames per second in video

  • crf – constant rate factor for video encoding (0-51); lower values give better quality but larger files

  • vcodec – video codec to use (either libx264 or libx265)

  • step – drop some frames when making video, use frames 0+step*n

  • multiple – some codecs require size to be a multiple of n

  • force – if true, overwrite output file if present

  • bg_color – for images with transparencies, namely PNGs, use this color as a background

  • strip_alpha – if true, do not pre-process PNGs to remove transparencies

visionsim.cli.ffmpeg.combine(matrix: str, outfile: str = 'combined.mp4', mode: str = 'shortest', color: str = 'white', multiple: int = 2, force: bool = False)[source]

Combine multiple videos into one by stacking, padding and resizing them using ffmpeg.

Internally this task will first optionally pad all videos to the same length using ffmpeg’s tpad filter, then scale all videos in each row to the same height, combine each row using the hstack filter, and finally scale the row-videos to the same width and vstack them together.

Parameters:
  • matrix – Way to specify videos to combine as a 2D matrix of file paths

  • outfile – where to save generated mp4

  • mode – if ‘shortest’, the combined video will last as long as the shortest input video. If ‘static’, the last frame of videos shorter than the longest input video will be repeated. If ‘pad’, all videos are padded with frames of color to last the same duration.

  • color – color to pad videos with, only used if mode is ‘pad’

  • multiple – some codecs require size to be a multiple of n

  • force – if true, overwrite output file if present

Example

The input videos can also be specified in a 2D array using the --matrix argument like so:

$ visionsim ffmpeg.combine --matrix='[["a.mp4", "b.mp4"]]' --outfile="output.mp4"

visionsim.cli.ffmpeg.grid(input_dir: str | PathLike, width: int = -1, height: int = -1, pattern: str = '*.mp4', outfile: str = 'combined.mp4', force: bool = False)[source]

Make a mosaic from videos in a folder, organizing them in a grid

Parameters:
  • input_dir – directory containing all video files (mp4’s expected)

  • width – width of video grid to produce

  • height – height of video grid to produce

  • pattern – use files that match this pattern as inputs

  • outfile – where to save generated mp4

  • force – if true, overwrite output file if present

visionsim.cli.ffmpeg.count_frames(input_file: str | PathLike)[source]

Count the number of frames a video file contains using ffprobe

Parameters:

input_file – video file input

visionsim.cli.ffmpeg.duration(input_file: str | PathLike, /)[source]

Return duration (in seconds) of first video stream in file using ffprobe

Parameters:

input_file – video file input

visionsim.cli.ffmpeg.dimensions(input_file: str | PathLike)[source]

Return size (WxH in pixels) of first video stream in file using ffprobe

Parameters:

input_file – video file input

visionsim.cli.ffmpeg.extract(input_file: str | PathLike, output_dir: str | PathLike, pattern: str = 'frames_%06d.png')[source]

Extract frames from a video file

Parameters:
  • input_file – path to video file from which to extract frames

  • output_dir – directory in which to save extracted frames

  • pattern – filenames of frames will match this pattern

visionsim.cli.interpolate module

visionsim.cli.interpolate.video(input_file: str | PathLike, output_file: str | PathLike, method: str = 'rife', n: int = 2)[source]

Interpolate a video by extracting all frames, performing frame-wise interpolation, and re-assembling the video

Parameters:
  • input_file – path to video file from which to extract frames

  • output_file – path in which to save interpolated video

  • method – interpolation method to use, only RIFE (ECCV22) is supported for now, default: ‘rife’

  • n – interpolation factor, must be a multiple of 2, default: 2

visionsim.cli.interpolate.frames(input_dir: str | PathLike, output_dir: str | PathLike, method: Literal['rife'] = 'rife', file_name: str = 'transforms.json', n: int = 2)[source]

Interpolate poses and frames separately, then combine into a transforms.json file

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save interpolated frames

  • method – interpolation method to use, only RIFE (ECCV22) is supported for now, default: ‘rife’

  • file_name – name of file containing transforms, default: ‘transforms.json’

  • n – interpolation factor, must be a multiple of 2, default: 2

visionsim.cli.transforms module

visionsim.cli.transforms.colorize_depths(input_dir: str | PathLike, output_dir: str | PathLike, pattern: str = 'depth_*.exr', cmap: str = 'turbo', ext: str = '.png', vmin: float | None = None, vmax: float | None = None, quantile: float = 0.01, step: int = 1)[source]

Convert .exr depth maps into color-coded images for visualization

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save colorized frames

  • pattern – filenames of frames should match this

  • cmap – which matplotlib colormap to use

  • ext – which format to save colorized frames as

  • vmin – minimum expected depth used to normalize colormap

  • vmax – maximum expected depth used to normalize colormap

  • quantile – if vmin/vmax are None, use this quantile to estimate them

  • step – drop some frames when colorizing, use frames 0+step*n
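When vmin/vmax are not given, robust bounds can be estimated from quantiles so that a few outlier depths (e.g. background at near-infinite distance) do not wash out the colormap. A sketch of that estimation step with hypothetical depth values; the colormap lookup itself is left to matplotlib:

```python
import numpy as np

quantile = 0.01
rng = np.random.default_rng(0)

# Hypothetical depth map with one extreme outlier.
depth = rng.uniform(1.0, 10.0, (64, 64))
depth[0, 0] = 1e6  # e.g. a background pixel at "infinity"

# Estimate robust bounds from the 1%/99% quantiles, then normalize to [0, 1].
vmin, vmax = np.quantile(depth, [quantile, 1.0 - quantile])
normalized = np.clip((depth - vmin) / (vmax - vmin), 0.0, 1.0)
assert normalized.min() >= 0.0 and normalized.max() <= 1.0
```

With min/max normalization instead, the single outlier would compress every other depth into a sliver of the colormap; the quantile bounds clip it to 1.0 and preserve contrast elsewhere.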

visionsim.cli.transforms.colorize_flows(input_dir: str | PathLike, output_dir: str | PathLike, direction: Literal['forward', 'backward'] = 'forward', pattern: str = 'flow_*.exr', ext: str = '.png', vmax: float | None = None, quantile: float = 0.01, step: int = 1)[source]

Convert .exr optical flow maps into color-coded images for visualization

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save colorized frames

  • direction – direction of flow to colorize

  • pattern – filenames of frames should match this

  • ext – which format to save colorized frames as

  • vmax – maximum expected flow magnitude

  • quantile – if vmax is None, use this quantile to estimate it

  • step – drop some frames when colorizing, use frames 0+step*n

visionsim.cli.transforms.colorize_normals(input_dir: str | PathLike, output_dir: str | PathLike, pattern: str = 'normal_*.exr', ext: str = '.png', step: int = 1)[source]

Convert .exr normal maps into color-coded images for visualization

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save colorized frames

  • pattern – filenames of frames should match this

  • ext – which format to save colorized frames as

  • step – drop some frames when colorizing, use frames 0+step*n

visionsim.cli.transforms.colorize_segmentations(input_dir: str | PathLike, output_dir: str | PathLike, pattern: str = 'segmentation_*.exr', ext: str = '.png', num_objects: int | None = None, shuffle: bool = True, seed: int = 1234, step: int = 1)[source]

Convert .exr segmentation maps into color-coded images for visualization

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save colorized frames

  • pattern – filenames of frames should match this

  • ext – which format to save colorized frames as

  • num_objects – number of unique objects to expect in the scene

  • shuffle – if true, colorize items in a random order

  • seed – seed used when shuffling colors

  • step – drop some frames when colorizing, use frames 0+step*n

visionsim.cli.transforms.tonemap_exrs(input_dir: str | PathLike, output_dir: str | PathLike | None = None, batch_size: int = 4, hdr_quantile: float = 0.01, force: bool = False)[source]

Convert .exr linear intensity frames into tone-mapped sRGB images

Parameters:
  • input_dir – directory in which to look for frames

  • output_dir – directory in which to save tone-mapped frames; if not specified, the dynamic range is calculated but no tonemapping occurs

  • batch_size – number of frames to write at once

  • hdr_quantile – calculate dynamic range using brightness quantiles instead of extrema

  • force – if true, overwrite output file(s) if present
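Converting linear intensities to sRGB typically uses the standard sRGB transfer function (a linear segment near zero, a gamma-2.4 curve above), applied after normalizing by the estimated dynamic range. The transfer function below is the standard one; its combination with a quantile-based range estimate is a sketch of the approach, not necessarily visionsim's exact pipeline:

```python
import numpy as np

def linear_to_srgb(x: np.ndarray) -> np.ndarray:
    """Standard sRGB opto-electronic transfer function, for x in [0, 1]."""
    x = np.clip(x, 0.0, 1.0)
    return np.where(x <= 0.0031308, 12.92 * x, 1.055 * x ** (1 / 2.4) - 0.055)

# Hypothetical linear HDR frame; normalize by a robust maximum before encoding.
linear = np.linspace(0.0, 2.0, 16).reshape(4, 4)
hdr_max = np.quantile(linear, 0.99)
srgb = linear_to_srgb(linear / hdr_max)
assert srgb.min() >= 0.0 and srgb.max() <= 1.0
```

Using a brightness quantile rather than the true maximum (hdr_quantile above) keeps a handful of specular highlights from darkening the entire tone-mapped image.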

Module contents

visionsim.cli.post_install(executable: str | PathLike | None = None, editable: bool = False)[source]

Install additional dependencies

Parameters:
  • executable (str | os.PathLike | None, optional) – Path to Blender executable. Defaults to one found on $PATH.

  • editable (bool, optional) – If set, install the current visionsim as editable in Blender. Only works if visionsim is already installed as editable locally.

visionsim.cli.main()[source]