Visual Information Fidelity (VIF)

Module Interface

class torchmetrics.image.VisualInformationFidelity(sigma_n_sq=2.0, reduction='mean', **kwargs)

Compute Pixel-Based Visual Information Fidelity (VIF).

As input to forward and update, the metric accepts the following:

  • preds (Tensor): Predictions from model of shape (N,C,H,W) with H,W ≥ 41

  • target (Tensor): Ground truth values of shape (N,C,H,W) with H,W ≥ 41

As output of forward and compute, the metric returns the following:

  • vif-p (Tensor):
    • If reduction='mean' (default), returns a scalar tensor with the mean VIF score over the batch.

    • If reduction='none', returns a tensor of shape (N,) with VIF values per sample.

Parameters:
  • sigma_n_sq (float) – variance of the visual noise

  • reduction (Literal['mean', 'none']) –

    The reduction method for aggregating scores.

    • 'mean': return the average VIF across the batch.

    • 'none': return a VIF score for each sample in the batch.

  • kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.

Example

>>> from torch import randn
>>> from torchmetrics.image import VisualInformationFidelity
>>> preds = randn([32, 3, 41, 41])
>>> target = randn([32, 3, 41, 41])
>>> vif = VisualInformationFidelity(reduction='mean')
>>> vif(preds, target)
tensor(0.0032)

Functional Interface

torchmetrics.functional.image.visual_information_fidelity(preds, target, sigma_n_sq=2.0)

Compute Pixel-Based Visual Information Fidelity (VIF).

Parameters:
  • preds (Tensor) – predicted images of shape (N,C,H,W). Both H and W must be at least 41.

  • target (Tensor) – ground truth images of shape (N,C,H,W). Both H and W must be at least 41.

  • sigma_n_sq (float) – variance of the visual noise

Return type:

Tensor

Returns:

Tensor with vif-p score

Raises:

ValueError – If predicted or ground truth image shape is not at least (41, 41)