cvcg_utils.image

The subpackage contains tools for

  1. image IO;

  2. image processing

cvcg_utils.image.image_io

In cvcg_utils.image.image_io, each read/write function is marked by the image format (e.g., RGB or RGBA) and the dtype (e.g., np.uint8 or np.float32) it handles. In research code, not knowing the exact data type, which usually happens when no type check is done, can cause very obscure bugs. We therefore choose to sacrifice flexibility for readability and stability.

Note

All image IO methods use the cv2 backend. The BGR-RGB conversion is handled by our wrappers.
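As a rough illustration of the strict-checking idea described above, the following sketch validates dtype and channel count before flipping the channel order. The helper names check_image and bgr_to_rgb are hypothetical and are not part of cvcg_utils; the actual wrappers may check and convert differently.

```python
import numpy as np


def check_image(img, channels, dtype):
    """Assert that img has exactly the expected dtype and channel count (hypothetical helper)."""
    assert isinstance(img, np.ndarray), "expected a numpy array"
    assert img.dtype == dtype, f"expected {np.dtype(dtype)}, got {img.dtype}"
    if channels == 1:
        # Accept (H, W) or (H, W, 1) for single-channel images.
        assert img.ndim == 2 or (img.ndim == 3 and img.shape[2] == 1)
    else:
        assert img.ndim == 3 and img.shape[2] == channels
    return img


def bgr_to_rgb(bgr):
    """Flip the channel order of a uint8 BGR image (cv2's layout) to RGB."""
    check_image(bgr, 3, np.uint8)
    # Reversing the last axis swaps B and R; copy to get a contiguous array.
    return np.ascontiguousarray(bgr[..., ::-1])
```

Failing early with an assertion keeps a float image from silently flowing through code that expects uint8.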

cvcg_utils.image.image_io.read_depth_compressed_dr_png(path: str | PathLike | IO, read_nan_inf_as_zero: bool = False) Tuple[ndarray, float]

Taken from https://github.com/microsoft/MoGe/blob/main/moge/utils/io.py#L89

Read a depth image, return float32 depth array of shape (H, W).

The depth should be stored as a uint16 PNG whose values are dynamically log-scaled to 1 ~ 65534.

cvcg_utils.image.image_io.read_grayscale(fn: str) ndarray

Reads and returns a grayscale image (cv2 backend).

Raises AssertionError if image read fails or if the image is not 1-channel.

cvcg_utils.image.image_io.read_rgb(fn: str) ndarray

Reads and returns a 3-channel RGB image (cv2 backend).

Raises AssertionError if image read fails or if the image is not 3-channel.

cvcg_utils.image.image_io.read_rgb_compressed_flat_dr_png(path: str | PathLike | IO, read_nan_as: float = 0.0) Tuple[ndarray, ndarray]

This is used to read images written with write_rgb_compressed_flat_dr_png.

cvcg_utils.image.image_io.read_rgb_exr(fn) ndarray

Reads and returns an exr format RGB image (cv2 backend).

Raises AssertionError if the image is not in exr format, if image read fails or if the image is not 3-channel.

cvcg_utils.image.image_io.read_rgba(fn: str) ndarray

Reads and returns a 4-channel RGBA image (cv2 backend).

Raises AssertionError if image read fails or if the image is not 4-channel.

cvcg_utils.image.image_io.to_u8_s255(src: ndarray)

Converts src to uint8, scaling the values by 255.
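A minimal sketch of such a conversion is given below. The name to_u8_scaled is hypothetical, and the clipping to [0, 255] and the rounding are assumptions of this sketch; the source does not say whether to_u8_s255 clips or rounds.

```python
import numpy as np


def to_u8_scaled(src):
    """Scale a float image by 255 and cast it to uint8 (hypothetical sketch)."""
    # Assumption: clip out-of-range values instead of letting them wrap around.
    scaled = np.clip(np.asarray(src, dtype=np.float32) * 255.0, 0.0, 255.0)
    # Assumption: round to the nearest integer before the cast.
    return np.round(scaled).astype(np.uint8)
```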

cvcg_utils.image.image_io.write_bgr(fn: str, bgr: ndarray)

Write np.uint8 BGR image bgr to path fn (cv2 backend). Supports only .png, .jpg and .jpeg formats.

Raises AssertionError if the image is not np.uint8 BGR, if fn has a wrong suffix, or if image write fails.

cvcg_utils.image.image_io.write_depth_compressed_dr_png(path: str | PathLike | IO, depth: ndarray, unit: float = None, max_range: float = 100000.0, compression_level: int = 7)

This is taken from https://github.com/microsoft/MoGe/blob/main/moge/utils/io.py#L110

This depth will be dynamically log-scaled to 1 ~ 65534 and converted to uint16 png format.

Encode and write a depth image as 16-bit PNG format.

Parameters:

  • path: Union[str, os.PathLike, IO]

    The file path or file object to write to.

  • depth: np.ndarray

    The depth array, float32 array of shape (H, W). May contain NaN for invalid values and Inf for infinite values.

  • unit: float = None

    The unit of the depth values.

Depth values are encoded as follows:

  • 0: unknown

  • 1 ~ 65534: depth values on a logarithmic scale

  • 65535: infinity

Metadata is stored in the PNG file as text fields:

  • near: the minimum depth value

  • far: the maximum depth value

  • unit: the unit of the depth values (optional)
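The encoding described above can be sketched with NumPy alone. This is an illustrative reconstruction, not the library's code: the function names encode_depth_u16 and decode_depth_u16 are hypothetical, PNG writing and the text metadata are left out, and the exact choice of near and far (including the 1e-5 floor and the 1.1 factor) is an assumption.

```python
import numpy as np


def encode_depth_u16(depth, max_range=1e5):
    """Log-scale finite depths to 1..65534; NaN -> 0, Inf -> 65535 (sketch)."""
    depth = np.asarray(depth, dtype=np.float32)
    finite = np.isfinite(depth)
    is_nan = np.isnan(depth)
    is_inf = np.isinf(depth)
    # Assumption: near/far come from the finite values, limited by max_range.
    near = max(float(depth[finite].min()), 1e-5)
    far = max(near * 1.1, min(float(depth[finite].max()), near * max_range))
    # Replace non-finite entries by a harmless value before taking the log.
    safe = np.where(finite, depth, near).clip(near, far)
    t = np.log(safe / near) / np.log(far / near)  # normalized to [0, 1]
    code = (1 + np.round(t.clip(0, 1) * 65533)).astype(np.uint16)
    code[is_nan] = 0
    code[is_inf] = 65535
    return code, near, far


def decode_depth_u16(code, near, far):
    """Invert encode_depth_u16 using the stored near/far values."""
    t = (code.astype(np.float32) - 1) / 65533
    depth = (near * (far / near) ** t).astype(np.float32)
    depth[code == 0] = np.nan
    depth[code == 65535] = np.inf
    return depth
```

Because the scale is logarithmic, the relative quantization error is roughly constant over the whole depth range, which suits depth maps that span several orders of magnitude.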

cvcg_utils.image.image_io.write_grayscale(fn: str, grayscale: ndarray)

Write np.uint8 grayscale image grayscale to path fn (cv2 backend). Supports only .png, .jpg and .jpeg formats.

Raises AssertionError if the image is not np.uint8 grayscale, if fn has a wrong suffix, or if image write fails.

cvcg_utils.image.image_io.write_grayscale_exr(fn: str, grayscale: ndarray)

Write np.float32 grayscale image grayscale to path fn (cv2 backend). Supports only .exr format.

Raises AssertionError if the image is not np.float32 grayscale, if fn has a wrong suffix, or if image write fails.

cvcg_utils.image.image_io.write_grayscale_uint16(fn: str, grayscale: ndarray)

Write np.uint16 grayscale image grayscale to path fn (cv2 backend). Supports only .png format.

Raises AssertionError if the image is not np.uint16 grayscale, if fn has a wrong suffix, or if image write fails.

cvcg_utils.image.image_io.write_rgb(fn: str, rgb: ndarray)

Write np.uint8 RGB image rgb to path fn (cv2 backend). Supports only .png, .jpg and .jpeg formats.

Raises AssertionError if the image is not np.uint8 RGB, if fn has a wrong suffix, or if image write fails.

cvcg_utils.image.image_io.write_rgb_compressed_flat_dr_png(path: str | PathLike | IO, data: ndarray, mask: ndarray = None, compression_level: int = 7)

This is taken from https://github.com/microsoft/MoGe/blob/main/moge/utils/io.py#L110

The data will be dynamically scaled to 1 ~ 65534 and converted to uint16 PNG format.

Encodes and writes a float image as a 16-bit PNG.

The range is computed channel-wise.

Background (BG) pixels are converted to NaN for compression.

Parameters:

  • path: Union[str, os.PathLike, IO]

    The file path or file object to write to.

  • data: np.ndarray

    The float32 image array. May contain NaN for invalid values and Inf for infinite values.

Values are encoded as follows:

  • 0: unknown

  • 1 ~ 65534: scaled data values

  • 65535: infinity

Metadata is stored in the PNG file as text fields:

  • near: the minimum value

  • far: the maximum value
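To illustrate what a channel-wise range means here, the sketch below scales each channel of a float image independently into 1 ~ 65534 and reserves 0 for masked-out pixels. The name encode_rgb_u16 is hypothetical, and the linear scaling, the meaning of mask (True for foreground) and the omission of Inf handling are assumptions of this sketch rather than documented behavior.

```python
import numpy as np


def encode_rgb_u16(data, mask=None):
    """Scale each channel of an (H, W, C) float image to 1..65534 (sketch)."""
    data = np.asarray(data, dtype=np.float32).copy()
    if mask is not None:
        # Assumption: mask is True on foreground; background becomes NaN.
        data[~mask] = np.nan
    # Per-channel range, ignoring NaN pixels.
    lo = np.nanmin(data, axis=(0, 1))
    hi = np.nanmax(data, axis=(0, 1))
    span = np.maximum(hi - lo, 1e-12)  # avoid division by zero
    t = (data - lo) / span
    code = np.zeros(data.shape, dtype=np.uint16)  # 0 marks unknown pixels
    valid = np.isfinite(data)
    code[valid] = (1 + np.round(t[valid].clip(0, 1) * 65533)).astype(np.uint16)
    return code, lo, hi
```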

cvcg_utils.image.image_io.write_rgb_exr(fn: str, rgb: ndarray)

Write np.float32 RGB image rgb to path fn (cv2 backend). Supports only .exr format.

Raises AssertionError if the image is not np.float32 RGB, if fn has a wrong suffix, or if image write fails.

cvcg_utils.image.image_io.write_rgb_uint16(fn: str, rgb: ndarray)

Write np.uint16 RGB image rgb to path fn (cv2 backend). Supports only .png, .jpg and .jpeg formats.

Raises AssertionError if the image is not np.uint16 RGB, if fn has a wrong suffix, or if image write fails.

cvcg_utils.image.image_io.write_rgba(fn: str, rgba: ndarray)

Write np.uint8 RGBA image rgba to path fn (cv2 backend). Supports only .png, .jpg and .jpeg formats.

Raises AssertionError if the image is not np.uint8 RGBA, if fn has a wrong suffix, or if image write fails.

cvcg_utils.image.image_io.write_rgba_uint16(fn: str, rgba: ndarray)

Write np.uint16 RGBA image rgba to path fn (cv2 backend). Supports only .png, .jpg and .jpeg formats.

Raises AssertionError if the image is not np.uint16 RGBA, if fn has a wrong suffix, or if image write fails.

cvcg_utils.image.image_proc

cvcg_utils.image.image_proc.get_laplacian(H: int, W: int) csc_array

Computes the sparse Laplacian for an image of shape [H, W].

Pixels in the image are indexed from 0 to H*W-1 in a row-major order.

The output is a scipy.sparse.csc_array sparse matrix L such that

  • L[i,i] = -1

  • L[i,j] = 1 / deg(i), if j in the 4-neighborhood of i

where i, j here are flattened pixel indices.

Returns:

L – of shape (H*W, H*W)

Return type:

scipy.sparse.csc_array
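The definition above can be reproduced in a few lines of SciPy. The sketch below is an independent construction following the stated formula, not the library's implementation; the name grid_laplacian is hypothetical.

```python
import numpy as np
import scipy.sparse as sp


def grid_laplacian(H, W):
    """Sparse Laplacian with L[i,i] = -1 and L[i,j] = 1/deg(i) for 4-neighbors j."""
    idx = np.arange(H * W).reshape(H, W)  # row-major flattened pixel indices
    # Endpoints of all horizontal and vertical neighbor pairs.
    a = np.concatenate([idx[:, :-1].ravel(), idx[:-1, :].ravel()])
    b = np.concatenate([idx[:, 1:].ravel(), idx[1:, :].ravel()])
    # Each pair contributes an entry in both directions.
    rows = np.concatenate([a, b])
    cols = np.concatenate([b, a])
    deg = np.bincount(rows, minlength=H * W)  # number of neighbors per pixel
    off_vals = 1.0 / deg[rows]
    diag = np.arange(H * W)
    vals = np.concatenate([off_vals, -np.ones(H * W)])
    return sp.csc_array(
        (vals, (np.concatenate([rows, diag]), np.concatenate([cols, diag]))),
        shape=(H * W, H * W),
    )
```

With this normalization every row of L sums to zero whenever the pixel has at least one neighbor, so L applied to a constant image gives zero.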

cvcg_utils.image.image_proc.get_value_and_laplacian(H: int, W: int, mask: ndarray, value_scale: float = 1.0, return_separate: bool = False)

Computes a selection matrix V for a masked region given by mask together with the Laplacian L as in get_laplacian.

Returns the vertical concatenation VL of V and L.

Parameters:
  • H (int) – image height

  • W (int) – image width

  • mask (np.ndarray[bool]) – whose shape must be equal to (H, W)

  • value_scale (float) – a constant scale applied to the selection matrix V

  • return_separate (bool) – If True, returns (VL, V, L). If False, V and L are set to None

Returns:

  • VL (scipy.sparse.csc_array) – of shape (N + H*W, H*W), where N is the number of valid pixels in mask

  • V (scipy.sparse.csc_array) – of shape (N, H*W)

  • L (scipy.sparse.csc_array) – of shape (H*W, H*W)
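The shapes listed above follow from how a selection matrix is built: one row per valid pixel, with a single nonzero entry at that pixel's flattened index. The sketch below illustrates this under the assumption that rows follow the row-major order of the valid pixels; selection_matrix and stack_value_and_operator are hypothetical names, and L can be any (H*W, H*W) sparse operator.

```python
import numpy as np
import scipy.sparse as sp


def selection_matrix(mask, value_scale=1.0):
    """Build V of shape (N, H*W) that picks the N True pixels of mask."""
    flat = np.flatnonzero(mask)  # row-major flat indices of valid pixels
    n = flat.size
    return sp.csc_array(
        (np.full(n, value_scale), (np.arange(n), flat)),
        shape=(n, mask.size),
    )


def stack_value_and_operator(mask, L, value_scale=1.0):
    """Stack V on top of L, giving a matrix of shape (N + H*W, H*W)."""
    V = selection_matrix(mask, value_scale)
    return sp.vstack([V, L], format="csc"), V
```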

cvcg_utils.image.image_proc.get_value_and_laplacian_masked(H: int, W: int, vmask: ndarray, dmask: ndarray, value_scale: float = 1.0)

Same as get_value_and_laplacian, but the Laplacian is computed only for dmask.

cvcg_utils.image.image_proc.get_value_and_uv_laplacian_masked(H: int, W: int, vmask: ndarray, dmask: ndarray, value_scale: float = 1.0)

Same as get_value_and_laplacian_masked, but only 1D Laplacians are computed.