Comparing compression in AV1, x264, and x265

DaGeek247@kbin.social · edit-2 2 years ago

Shdwdrgn@mander.xyz · 2 years ago

You might want to use a code block instead of bullet points for your table. The way you presented it is unreadable, but I found the info on your blog page.

One of my criteria for video formats is portability. Sometimes I might watch something through a web browser, which natively supports x264. Yeah, x265 provides better compression, and AV1 certainly looks interesting, but they both require the addition of codecs on most of my viewing devices, and in some cases that’s not possible.

For most cases I’ve found that CRF25 with x264 works reasonably well. I tend to download 720p videos to watch on our 1080p TV and don’t notice the difference except in very minor situations like rapid motion on a solid-color background (usually only seen on movie studio logo screens). Any sort of animated shows can go even lower without noticeable degradation.

DaGeek247@kbin.social · 2 years ago

I did try to format the table here better. I used code blocks the first time, and it ended up being even uglier. After about four edit attempts I kinda just gave up. Tables don’t seem to exist as far as I can tell either.

Your experience with x264 just about matches up with mine. As long as I don’t pixel peep, crf 24 does a pretty great job of conveying the information. It also does a pretty great job of working with just about everything compatibility-wise. I don’t expect it to go away any time soon specifically because of that.

AV1 is super neat in that we can buy hardware accelerated encoding for it for really cheap using the Intel Arc video cards, and it can be decoded by their latest CPU generation. It makes for a great choice for something like security camera footage where playback compatibility is good enough (you can play it on a modern PC), hardware encoding works with a 200$ card, and you save a lot of money using the video card instead of buying extra storage space.

Atemu@lemmy.ml · 2 years ago

Tables	do	exist	!

| Tables | do | exist | ! |
|--------|----|-------|---|

DaGeek247@kbin.social · 2 years ago

Stolen. Thank you.

Atemu@lemmy.ml · 2 years ago

> with a 200$ card, and you save a lot of money using the video card instead of buying extra storage space.

With $200, you could buy ~12TB worth of HDD(s) instead. You’d need >36TB of video for that to make financial sense and you’d always lose quality.

Additionally, you’d have to factor in the power it needs to transcode but, with HW accel, it’s not quite as much as with CPUs.
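To make the break-even arithmetic explicit, here is a rough sketch. The ~12 TB per $200 figure comes from the comment above; the one-third space saving is an assumption, picked because it is the value that reproduces the 36 TB threshold. Power cost and quality loss are ignored, and the function name is made up for illustration.

```python
def break_even_library_tb(card_cost_usd: float, usd_per_tb: float, fraction_saved: float) -> float:
    """Library size (in TB) at which the space freed by transcoding pays for the card."""
    if not 0 < fraction_saved <= 1:
        raise ValueError("fraction_saved must be in (0, 1]")
    # Storage you could have bought with the card's price instead.
    storage_instead_tb = card_cost_usd / usd_per_tb
    # The library must be big enough that the saved fraction equals that storage.
    return storage_instead_tb / fraction_saved


usd_per_tb = 200 / 12  # ~12 TB of HDD for $200, as quoted above
threshold_tb = break_even_library_tb(200, usd_per_tb, fraction_saved=1 / 3)  # assumed saving
print(f"break-even library size: {threshold_tb:.0f} TB")
```

A larger assumed saving lowers the threshold: at a 50% saving the same card would break even at about 24 TB.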

DaGeek247@kbin.social · 2 years ago

Sure, but that is a choice that couldn’t be made without first checking how much space is saved by switching codecs. This helps with making that decision, but I’m well aware it is only part of the information needed.

Atemu@lemmy.ml · 2 years ago

Oh the data is absolutely fine and helpful; I only take issue with the conclusion ;)

umulu@lemmy.world · 2 years ago

I would like to have seen more data on that table. The time it took to run each video compression… the final bitrate of each stream. Besides that, very interesting results.

Atemu@lemmy.ml · 2 years ago

The “av1” numbers, which encoder is that? There are many AV1 encoders, and even for Intel HW accel there are at least two.

vividspecter@lemm.ee · 2 years ago

It’s svt-av1, as can be seen from the ffmpeg command in the article.

DaGeek247@kbin.social · 2 years ago

From my blogpost, I’m using the following command to encode the video:

ffmpeg -i source.2160p.mkv \
  -map 0:v:0 \
  -map -0:a -map -0:s -map_metadata -1 \
  -c:v libsvtav1 \
  -preset 3 \
  -vf scale=w=1920:-2 \
  -crf 23 \
  dest.1080p.av1.mkv
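For anyone wanting to repeat the CRF sweep from the table, a small sketch that only builds the command lines (it does not run ffmpeg). Every flag is copied verbatim from the command above; the function name and the per-CRF output filename pattern are my own assumptions, not something from the blog post.

```python
import shlex


def svtav1_sweep_commands(source: str, crfs=range(18, 64, 3)) -> list[list[str]]:
    """Build one ffmpeg argument list per CRF value (18, 21, ..., 63 by default)."""
    commands = []
    for crf in crfs:
        commands.append([
            "ffmpeg", "-i", source,
            "-map", "0:v:0",                                   # first video stream only
            "-map", "-0:a", "-map", "-0:s", "-map_metadata", "-1",  # drop audio, subs, metadata
            "-c:v", "libsvtav1",
            "-preset", "3",
            "-vf", "scale=w=1920:-2",                          # filter string copied as posted
            "-crf", str(crf),
            f"dest.1080p.av1.crf{crf}.mkv",                    # hypothetical naming scheme
        ])
    return commands


for cmd in svtav1_sweep_commands("source.2160p.mkv"):
    print(shlex.join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` if you actually want to run the encodes.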

Atemu@lemmy.ml · 2 years ago

That is not representative of what you’d get with an Intel card then. While they implement the same standard (AV1), they’re entirely different encoders with entirely different image quality characteristics.

Victor@lemmy.world · 2 years ago

How does that work? Aren’t two encoders of the same format supposed to produce the same output for the same input and configuration using some given algorithm? Otherwise I’d consider them different formats/codecs… 🤷‍♂️ Maybe that’s wrong of me?

LufyCZ@lemmy.world · 2 years ago

The issue is, you can optimize a software encoder continually, you can use tricks for better quality, etc.

A hardware encoder is just that - hardware. As soon as it’s burned to the silicon, you’re not making any (at least substantial) changes to it. You might also be limited by what you can actually do directly in hardware without using too much die space.

Tldr.: no, you won’t get the same result

Victor@lemmy.world · 2 years ago

> Tldr.: no, you won’t get the same result

What I’m saying is, shouldn’t you?

rentar42@kbin.social · edit-2 2 years ago

What you describe is true for many file formats, but for most lossy compression systems the “standard” basically only strictly explains how to decode the data and any encoder that produces output that successfully decodes that way is fine.

And the standard defines a collection of “tools” that the encoders can use; how exactly to use, combine and tweak those tools is up to the encoder.

And over time new/better combinations of these tools are found for specific scenarios. That’s how different encoders of the same codec can produce very different output.

As a simple example, almost all video codecs by default describe each frame relative to the previous one (i.e. it describes which parts moved and what new content appeared). There is of course also the option to send a completely new frame, which usually takes up more space. But when one scene cuts to another, sending a new frame can be much better. A “bad” encoder might not have “new scene” detection and still try to “explain the difference” to the previous scene, which can easily take up more space than just sending the entire new frame.
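A toy sketch of that idea, with a completely made-up cost model (real codecs use motion compensation and entropy coding, not per-pixel diffs). Frames are flat lists of pixel values, a "P" packet lists the changed pixels and an "I" packet carries the whole frame. Both encoders below are "compliant" because the same decoder reconstructs the same picture from either; they just spend a different number of units doing it.

```python
def encode_frame(prev, cur, scene_cut_detection=True):
    """Return (packet, cost). Packet is ("P", changes) or ("I", full_frame)."""
    changes = [(i, v) for i, (p, v) in enumerate(zip(prev, cur)) if p != v]
    delta_cost = 2 * len(changes)  # assumed cost: one index + one value per changed pixel
    full_cost = len(cur)           # assumed cost: one value per pixel
    if scene_cut_detection and delta_cost > full_cost:
        return ("I", list(cur)), full_cost
    return ("P", changes), delta_cost


def decode_frame(prev, packet):
    """The only part a 'standard' would pin down: how to rebuild a frame from a packet."""
    kind, payload = packet
    if kind == "I":
        return list(payload)
    out = list(prev)
    for i, v in payload:
        out[i] = v
    return out


old_scene = [0] * 8
new_scene = [9] * 8  # every pixel changes: a scene cut

smart_packet, smart_cost = encode_frame(old_scene, new_scene, scene_cut_detection=True)
naive_packet, naive_cost = encode_frame(old_scene, new_scene, scene_cut_detection=False)

print(smart_packet[0], smart_cost)  # full frame, cheaper
print(naive_packet[0], naive_cost)  # diff against the old scene, more expensive
print(decode_frame(old_scene, smart_packet) == decode_frame(old_scene, naive_packet))
```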

Victor@lemmy.world · 2 years ago

> the “standard” basically only strictly explains how to decode the data and any encoder that produces output that successfully decodes that way is fine

Ah, okay, this explains the whole aspect of it then, for me. :-) If this is how a certain format is described, then it makes sense that encoders can produce different data, which then will be decoded as different output as well, all while all parties are compliant with the specification. That makes much more sense. Thanks for taking the time to explain everything, including I-frames and P-frames! ;-)

jbk@discuss.tchncs.de · 2 years ago

Doesn’t libsvtav1 do the same on all platforms since it’s CPU-based? At least that’s the exact encoder OP specified

Atemu@lemmy.ml · 2 years ago

Yes, yes it will. (Well, at least it should. If it doesn’t, that’s a bug.)

The problem here is that the premise of this post is evaluating buying a GPU with an AV1 encoder in order to transcode a media library. Any GPU-based AV1 encoder will produce very different results than svt-av1, and likely much worse ones.

OpticalMoose@discuss.tchncs.de · 2 years ago

deleted by creator

GenderNeutralBro@lemmy.sdf.org · 2 years ago

Can you explain what you mean by “visually lossless”? Is this a purely subjective classification, or is there a specific definition or benchmark you used?

DaGeek247@kbin.social · edit-2 2 years ago

Visually lossless means I couldn’t tell an image difference even when pixel peeping with imgsli. Good enough means I couldn’t tell a difference in video, but could occasionally see a compression artifact in imgsli.

The VMAF results are purely objective measurements. You can read more about it here: https://en.wikipedia.org/wiki/Video_Multimethod_Assessment_Fusion

exu@feditown.com · 2 years ago

I’ve also gone down that rabbit hole and found Vivictpp pretty good. It allows you to play two videos so you can swipe between them like imgsli you mentioned.

There’s a whole range of measurements trying to approximate quality differences between a video source and an encode: PSNR, SSIM, VMAF, MS-SSIM. All of them have some strong areas, and tricks you can use to cheat them.
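PSNR is the simplest of these, so here is a minimal sketch of it on plain lists of 8-bit pixel values. It is just the textbook formula, 10·log10(MAX² / MSE), not how any particular tool implements it, and it also shows why the metric is easy to cheat: it only looks at per-pixel error, not at what a viewer would notice.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(max_value ** 2 / mse)


print(psnr([100, 100], [100, 100]))            # identical frames
print(round(psnr([100, 100], [110, 90]), 2))   # every pixel off by 10
```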

Eskuero@lemmy.fromshado.ws · 2 years ago

Just buy bigger disks 🫢

force@lemmy.world · 2 years ago

Larger file size means significantly larger cost when you’re working with lots of data… especially when transferring data over the internet

| crf | av1 KB | x265 KB | x264 KB |
|-----|--------|---------|---------|
| 18 | 419,261 | 632,079 | 685,217 – x264 visually lossless |
| 21 | 352,337 | 390,358 – x265 visually lossless | 411,439 |
| 24 | 301,517 – av1 VMAF visually lossless | 250,426 | 263,524 – x264 good enough |
| 27 | 245,685 | 165,079 – x265 good enough | 176,919 |
| 30 | 205,008 | 110,062 | 122,458 |
| 33 | 168,192 | 73,528 | 86,899 |
| 36 | 139,379 – av1 my visually lossless | 48,516 | 63,214 |
| 39 | 116,096 | 31,670 | 47,161 |
| 42 | 97,365 – av1 my good enough | 20,636 | 35,801 |
| 45 | 81,805 | 13,598 | 27,484 |
| 48 | 69,044 | 9,726 | 20,823 |
| 51 | 58,316 | 8,586 – worst possible | 16,120 – worst possible |
| 54 | 48,681 | - | - |
| 57 | 39,113 | - | - |
| 60 | 29,062 | - | - |
| 63 | 16,533 – worst possible | - | - |
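Since CRF numbers are not comparable across encoders, the fairer comparison is between the rows marked with the same quality label. A quick sketch of that arithmetic, using only the sizes from the table and x264 as the baseline for each tier; the labels are the OP's subjective picks, and the variable names are mine.

```python
# File sizes in KB, taken from the annotated rows of the table above.
tiers = {
    "visually lossless": {"x264 (crf 18)": 685_217, "x265 (crf 21)": 390_358, "av1 (crf 36)": 139_379},
    "good enough": {"x264 (crf 24)": 263_524, "x265 (crf 27)": 165_079, "av1 (crf 42)": 97_365},
}


def savings_vs_baseline(sizes: dict[str, int], baseline_key: str) -> dict[str, float]:
    """Fraction of space saved by each encode relative to the baseline encode."""
    baseline = sizes[baseline_key]
    return {name: 1 - size / baseline for name, size in sizes.items() if name != baseline_key}


lossless_savings = savings_vs_baseline(tiers["visually lossless"], "x264 (crf 18)")
good_enough_savings = savings_vs_baseline(tiers["good enough"], "x264 (crf 24)")

for tier, result in (("visually lossless", lossless_savings), ("good enough", good_enough_savings)):
    for name, saved in result.items():
        print(f"{tier}: {name} is {saved:.1%} smaller than x264")
```

At the "visually lossless" picks that works out to roughly 43% smaller for x265 and roughly 80% smaller for av1 than x264.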