Visual Information Fidelity (VIF)

Module Interface

class torchmetrics.image.VisualInformationFidelity(sigma_n_sq=2.0, reduction='mean', **kwargs)

Compute pixel-based Visual Information Fidelity (VIF).

As input to forward and update, the metric accepts the following inputs:

preds (Tensor): Predictions from model of shape (N,C,H,W) with H,W ≥ 41
target (Tensor): Ground truth values of shape (N,C,H,W) with H,W ≥ 41

As output of forward and compute, the metric returns the following output:

vif-p (Tensor):
- If reduction='mean' (default), returns a scalar tensor with the mean VIF score over the batch.
- If reduction='none', returns a tensor of shape (N,) with VIF values per sample.

Parameters:

sigma_n_sq (float) – variance of the visual noise
reduction (Literal['mean', 'none']) – The reduction method for aggregating scores.
- 'mean': return the average VIF across the batch.
- 'none': return a VIF score for each sample in the batch.
kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.

Example

>>> from torch import randn
>>> from torchmetrics.image import VisualInformationFidelity
>>> preds = randn([32, 3, 41, 41])
>>> target = randn([32, 3, 41, 41])
>>> vif = VisualInformationFidelity(reduction='mean')
>>> vif(preds, target)
tensor(0.0032)

Functional Interface

torchmetrics.functional.image.visual_information_fidelity(preds, target, sigma_n_sq=2.0)

Compute pixel-based Visual Information Fidelity (VIF).

Parameters:

preds (Tensor) – predicted images of shape (N,C,H,W). (H, W) has to be at least (41, 41).
target (Tensor) – ground truth images of shape (N,C,H,W). (H, W) has to be at least (41, 41).
sigma_n_sq (float) – variance of the visual noise

Return type:

Tensor

Returns:

Tensor with vif-p score

Raises:

ValueError – If the spatial size (H, W) of the predicted or ground truth images is smaller than (41, 41)