
Comparing Quality Metrics Up and Down the Encoding Ladder


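When encoding an adaptive bitrate ladder, you often have to compare videos with different resolutions, which raises multiple issues. For example, when you measure peak signal-to-noise ratio (PSNR) or Video Multimethod Assessment Fusion (VMAF) to compare a 640x360 video with an 854x480 video, what resolution do you compare them at? How do you interpret the PSNR or VMAF scores, and which metric is best? In this column, I’ll tackle all of these issues.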

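Regarding the first issue, there’s a theoretically correct answer, and then there’s how it’s generally done, and they don’t always correspond. The theoretically correct answer is to compare at the resolution at which the video will be viewed. For example, if you know for certain that the video will be watched in a 480p window, you should scale the source and output files to 480p as needed and run the comparison there. However, few publishers have that degree of certainty, so most scale the encoded files up to the resolution of the source video and compare there. This certainly makes sense for OTT providers whose videos are almost always watched full screen, and is a nice compromise position for other publishers.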
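As a rough illustration of comparing at a single resolution, here is a minimal Python sketch that builds an FFmpeg command scaling both the encoded file and the source to one comparison resolution before running a metric filter. The function names, the bicubic scaler, and the file names are assumptions made for this example, and the `libvmaf` option only works if your FFmpeg build includes that filter.

```python
def comparison_filtergraph(width, height, metric="psnr"):
    """Build a filtergraph that scales both inputs, then compares them.

    Input 0 is the encoded (distorted) file and input 1 is the source.
    The bicubic scaler is an assumption; pick whatever matches your players.
    """
    if metric not in ("psnr", "libvmaf"):
        raise ValueError("metric must be 'psnr' or 'libvmaf'")
    scale = f"scale={width}:{height}:flags=bicubic"
    return f"[0:v]{scale}[dist];[1:v]{scale}[ref];[dist][ref]{metric}"


def comparison_command(encoded, source, width, height, metric="psnr"):
    """Return the FFmpeg argument list; the decoded output is discarded."""
    return [
        "ffmpeg",
        "-i", encoded,   # first input: the encoded rung
        "-i", source,    # second input: the reference
        "-lavfi", comparison_filtergraph(width, height, metric),
        "-f", "null", "-",
    ]


if __name__ == "__main__":
    # Compare a 360p rung against a 1080p source at the source resolution.
    print(" ".join(comparison_command("rung_360p.mp4", "source_1080p.mp4", 1920, 1080)))
```

If you do know the viewing window, pass that resolution instead (for example 854 and 480); both files are then scaled to it before the comparison.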

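Some programs handle this scaling behind the scenes; for most others, you have to scale in FFmpeg, which is a pain from both a time and a disk-space perspective. One trick I use is to convert the encoded files to the Y4M container format, rather than YUV, because the Y4M header contains the resolution, frame rate, and format information, which simplifies comparisons in quality-control tools. If you use the YUV container format, you’ll have to insert the resolution, frame rate, and format data into the command line or enter it into the program itself, which can be time-consuming.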
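The Y4M conversion described above might look like the following sketch, which only assembles the FFmpeg arguments. The helper name, the bicubic scaler, the yuv420p pixel format, and the file names are assumptions for illustration, not settings taken from the column.

```python
def y4m_command(encoded, output, width, height):
    """Return FFmpeg arguments that scale a file and write it as Y4M.

    The Y4M header carries resolution, frame rate, and pixel format,
    so a quality tool does not need them supplied separately.
    """
    return [
        "ffmpeg",
        "-i", encoded,
        "-vf", f"scale={width}:{height}:flags=bicubic",  # assumed scaler
        "-pix_fmt", "yuv420p",                           # assumed 8-bit 4:2:0
        "-f", "yuv4mpegpipe",                            # FFmpeg's Y4M muxer
        output,
    ]


if __name__ == "__main__":
    # Scale a 360p rung up to 1080p for comparison against a 1080p source.
    print(" ".join(y4m_command("rung_360p.mp4", "rung_360p_1080.y4m", 1920, 1080)))
```

Keep in mind that the output is uncompressed, so each converted rung takes far more disk space than the encoded file it came from.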

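The second issue is how to interpret the scores once you have them. If you compare cross-resolution files against the source, understand that scores drop at lower resolutions, because the smaller files contain more scaling artifacts and lose more detail. This means that the file encoded at the source resolution will score the highest, with lower resolutions scoring increasingly lower.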

For example, in an article I wrote on per-title encoding, I compared encoding ladder technologies from 1080p down to 180p. Typical PSNR scores for the 1080p rung were 45-50 dB, and dropped to around 30 dB for the lowest rung. That’s not a lot of range. The rule of thumb for PSNR is that quality improvements above 45 dB are generally not noticeable to viewers, while scores below 35 dB often presage visible artifacts. But that’s only for the 1080p rung; the 180p rung will never get close to 45 dB, although the files might look good at 32 dB. So you can’t predict how a viewer will perceive a 360p file with a PSNR value of 38 dB, although when you’re comparing cross-resolution results, higher is always better.
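For readers who want to see where those decibel figures come from, here is a minimal sketch of the standard PSNR calculation, 10 * log10(MAX^2 / MSE), applied to two equal-length sequences of 8-bit sample values. The function name and the flat-list input are assumptions for illustration; real tools compute this per frame, usually on the luma plane, and then average.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error across all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Every sample off by one gives an MSE of 1.
    print(round(psnr([10, 20, 30], [11, 21, 31]), 2))
```

Because the scale is logarithmic, large differences in error compress into a few decibels, which is one reason the 1080p-to-180p spread looks so narrow.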

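The great thing about VMAF is that it was designed for this type of cross-resolution analysis. Specifically, a score of 100 is mapped to a 1080p file encoded at a constant rate factor (CRF) of 22, while a score of 20 is mapped to a 240p file encoded at a CRF value of 28. In the same per-title analysis, typical 1080p scores were in the mid- to upper 90s, while the 180p files often scored in the single digits.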

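This range makes VMAF scores easier to interpret than PSNR, but you still can’t predict how viewers will perceive the quality of an intermediate clip, say a 480p clip with a VMAF score of 42. However, you do know that 6 VMAF points equals one just noticeable difference (JND). Technically, this means that 75 percent of viewers would notice a 6-point swing, while closer to 90 percent would notice a 12-point, two-JND swing.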
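The 6-points-per-JND rule is easy to turn into a small helper. The sketch below is only an illustration of that arithmetic; the function names and the sample scores are hypothetical.

```python
VMAF_POINTS_PER_JND = 6.0  # rule of thumb quoted above


def jnd_count(vmaf_a, vmaf_b):
    """Express the gap between two VMAF scores in just noticeable differences."""
    return abs(vmaf_a - vmaf_b) / VMAF_POINTS_PER_JND


def is_noticeable(vmaf_a, vmaf_b):
    """True when the scores are at least one JND apart."""
    return jnd_count(vmaf_a, vmaf_b) >= 1.0


if __name__ == "__main__":
    # Two hypothetical encodes of the same rung.
    print(jnd_count(93, 87), is_noticeable(93, 87))
    print(jnd_count(93, 90), is_noticeable(93, 90))
```

A gap of less than one JND between two candidate encodes suggests most viewers would not see the difference, so the cheaper of the two may be the better choice.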

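The ability to identify a JND is useful for a range of encoding decisions, from configuring an encoding ladder to choosing an encoder or codec. If you haven’t started working with VMAF, now is the time to give it a try.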

[This article appears in the October 2017 issue of Streaming Media Magazine as "Quality Metrics Up and Down the Encoding Ladder."]

Related Articles

How YouTube Encodes Videos

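Looking for insights into how YouTube encodes billions of videos? Jan Ozer goes down the rabbit hole and shares his findings on AV1, VP9, and resolution.

QoE Working Group to Deliver Standards Document by Year's End

A working group overseen by the CTA is developing recommendations for measuring performance quality, and some of the biggest names in the industry are taking part.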

How to Choose and Use Objective Video Quality Benchmarks

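If you're not using video quality measurement tools, you're falling behind. Here are the most popular tools and how they work.

One Title at a Time: Comparing Per-Title Video Encoding Options

Save on bandwidth and keep costs down: Per-title video encoding solutions let publishers break free from fixed encoding ladders. Explore the benefits of four methods.

Going Low: The Encoding Secrets for Small Sizes With High Quality

Netflix's compact mobile download files look great. Here's how video creators can make their own low-bitrate files look just as impressive.

Review: Capella Systems Cambria FTC Offers Per-Title Encoding

The benefits of per-title optimization are no longer only for the major players. Streaming Media reviews the first solution aimed at smaller content owners and finds the results promising.

How Netflix Pioneered Per-Title Video Encoding Optimization

One-size-fits-all encoding doesn't produce the best results, so Netflix recently moved to per-title optimization. Learn why this improves video quality and saves bandwidth, but isn't the right model for every company.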