
Kraken Vs ZLIB: 29% Smaller Game Sizes Losslessly, 297% Faster Decompression on PS5

GreyHand23

Member
Good luck with that.

Unless Oodle starts to move those tasks towards the GPU, it seems like a niche for console-exclusive games.

Oodle data compression is already used in many PC games; compression in general is, and decompression of that data is already being done by the CPU. No developer has yet made a PC game that requires the decompression throughput of a fast SSD, most likely because most of their potential buyers don't have one, and because there are many other bottlenecks on PC that prevent the full utilization of SSD speed in games. Once DirectStorage arrives for PC, you will see more developers designing games with higher SSD speeds in mind. Of course it would be better to decompress on the GPU, and Nvidia has already shown a commitment to that. Most likely AMD will as well, but PC has always been about options, so being able to decompress on the CPU will remain an option too.
 

Bo_Hazem

Banned
Good luck with that.

Unless Oodle starts to move those tasks towards the GPU, it seems like a niche for console-exclusive games.

PS5's Kraken HW decompressor is equivalent to 9 Zen 2 cores; the whole I/O complex is equivalent to 11-12 Zen 2 cores.

Compression Offload Hardware
Console: Xbox Series X (Microsoft) | PlayStation 5 (Sony)
Algorithm: ZLIB | Kraken and ZLIB
Maximum Output Rate: 6 GB/s | 22 GB/s
Average Output Rate: 4.8 GB/s (6 GB/s max with BCPack) | 8–9 GB/s (17 GB/s with Oodle Texture)
Equivalent Zen 2 CPU Cores: 3* + 1/10 of main CPU core | 9x independent


The CPU time saved by these decompression units sounds astounding: the equivalent of about 9 Zen 2 CPU cores for the PS5, and about 5 for the Xbox Series X. Keep in mind these are peak numbers that assume the SSD bandwidth is being fully utilized—real games won't be able to keep these SSDs 100% busy, so they wouldn't need quite so much CPU power for decompression.


That's the advantage next gen consoles have that PC gamers seem to not understand.

EDIT: *It's 3, not 5:

 
Last edited:

Kenpachii

Member
Oodle data compression is already used in many PC games; compression in general is, and decompression of that data is already being done by the CPU. No developer has yet made a PC game that requires the decompression throughput of a fast SSD, most likely because most of their potential buyers don't have one, and because there are many other bottlenecks on PC that prevent the full utilization of SSD speed in games. Once DirectStorage arrives for PC, you will see more developers designing games with higher SSD speeds in mind. Of course it would be better to decompress on the GPU, and Nvidia has already shown a commitment to that. Most likely AMD will as well, but PC has always been about options, so being able to decompress on the CPU will remain an option too.

SSD decompression gets done on the GPU or not at all; there is no other option. If this stuff doesn't work on PC GPUs there is no adoption, and the whole thing falls flat on its face, as nobody has the CPU cores to spare on it.

PS5's Kraken HW decompressor is equivalent to 9 Zen 2 cores; the whole I/O complex is equivalent to 11-12 Zen 2 cores.

Compression Offload Hardware
Console: Xbox Series X (Microsoft) | PlayStation 5 (Sony)
Algorithm: BCPack | Kraken (and ZLib?)
Maximum Output Rate: 6 GB/s | 22 GB/s
Typical Output Rate: 4.8 GB/s | 8–9 GB/s
Equivalent Zen 2 CPU Cores: 5 | 9


The CPU time saved by these decompression units sounds astounding: the equivalent of about 9 Zen 2 CPU cores for the PS5, and about 5 for the Xbox Series X. Keep in mind these are peak numbers that assume the SSD bandwidth is being fully utilized—real games won't be able to keep these SSDs 100% busy, so they wouldn't need quite so much CPU power for decompression.


That's the advantage next gen consoles have that PC gamers seem to not understand.

And the Switch GPU is 500 Zen 2 cores performance-wise. That's how useless that information is when you talk about PC architectures.

Now make a comparison that actually is useful in the real world such as GPU performance.

Here i will give you a hint.

b067ff41ae80d1a109656800cc595a72.png


CPU performance is barely relevant for SSD compression on PC.
 
Last edited:

Bo_Hazem

Banned
SSD decompression gets done on the GPU or not at all; there is no other option. If this stuff doesn't work on PC GPUs there is no adoption, and the whole thing falls flat on its face, as nobody has the CPU cores to spare on it.



And the Switch GPU is 500 Zen 2 cores performance-wise. That's how useless that information is when you talk about PC architectures.

Now make a comparison that actually is useful in the real world such as GPU performance.

Here i will give you a hint.

b067ff41ae80d1a109656800cc595a72.png


CPU performance is barely relevant for SSD compression on PC.

Looks interesting; better to wait for the UE5 demo to be tested for comparison. Also, it peaks at 14 GB/s vs 22 GB/s. All will be clear in the future, as that tech isn't ready until around 2022.
 

Aladin

Member
Kraken VS ZLIB:

According to the official graph from RAD Game Tools, Kraken has a 29% higher compression ratio than ZLIB (used on PS4 and other platforms), and compresses 3-5x faster (which doesn't really concern us as gamers, only devs). Both Kraken and ZLIB are lossless, meaning data is kept exactly as it is: textures, audio, game files, etc. Another advantage is 297% (~4x) faster decompression, which is critical for data streaming at the very least.
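The headline numbers can be sanity-checked with some back-of-envelope arithmetic. This is an illustrative sketch only: the 100 GB install, the 10 GB level, and the 1 GB/s ZLIB baseline are assumed round numbers, not measured figures.

```python
# Illustrative arithmetic for "29% smaller, 297% faster" (assumed inputs).
zlib_install_gb = 100.0                            # hypothetical install size with ZLIB
kraken_install_gb = zlib_install_gb * (1 - 0.29)   # "29% smaller" -> 71 GB

zlib_speed_gbps = 1.0                              # assumed ZLIB decompression speed
kraken_speed_gbps = zlib_speed_gbps * (1 + 2.97)   # "297% faster" = 3.97x the speed

level_gb = 10.0                                    # hypothetical level to stream in
print(round(kraken_install_gb, 1))                 # -> 71.0 GB on disk
print(round(level_gb / zlib_speed_gbps, 2))        # -> 10.0 s to decompress with ZLIB
print(round(level_gb / kraken_speed_gbps, 2))      # -> 2.52 s with Kraken
```

The absolute times are meaningless (real decompression speed depends on the CPU or hardware unit doing the work); only the ratios carry over from the quoted percentages.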

oodle-typical-vbar.png


Oodle Texture:

Textures take a big chunk of a game's total size. To push compression even further, Oodle Texture offers a further 5-15% size reduction while staying (near) lossless:

Lossless Transform for BC7: Oodle Texture also includes a lossless transform for BC7 blocks called "BC7Prep" that makes them more compressible. BC7Prep takes BC7 blocks that are often very difficult to compress and rearranges their bits, yielding 5-15% smaller files after subsequent compression. BC7Prep does require runtime reversal of the transform, which can be done on the GPU. BC7Prep can be used on existing BC7 encoded blocks, or for additional savings can be used with Oodle Texture RDO in near lossless mode. This allows significant size reduction on textures where maximum quality is necessary.
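The principle behind a compressibility-improving lossless transform can be sketched in a few lines. To be clear, this is not BC7Prep itself (its bit-level rearrangement of BC7 blocks is RAD's own); it's a toy byte-plane deinterleave that illustrates the two properties the quote describes: the transform must round-trip exactly, and the payoff only appears after a general-purpose compressor runs over the rearranged data.

```python
import zlib

def deinterleave(data: bytes, stride: int = 4) -> bytes:
    """Regroup interleaved byte positions into contiguous planes (lossless)."""
    return b"".join(data[i::stride] for i in range(stride))

def interleave(planes: bytes, stride: int = 4) -> bytes:
    """Exact inverse of deinterleave: restore the original byte order."""
    n = len(planes) // stride
    out = bytearray(len(planes))
    for i in range(stride):
        out[i::stride] = planes[i * n:(i + 1) * n]
    return bytes(out)

# Toy "RGBA" data: a slowly varying red channel, constant green/blue/alpha.
pixels = bytes(b for i in range(4096) for b in ((i // 16) % 256, 128, 64, 255))

plain = zlib.compress(pixels, 9)                      # compress as-is
transformed = zlib.compress(deinterleave(pixels), 9)  # compress after transform
assert interleave(deinterleave(pixels)) == pixels     # round-trips exactly
print(len(pixels), len(plain), len(transformed))      # sizes depend on the data
```

Whether the transformed stream ends up smaller depends entirely on the data; BC7Prep's claim is that for real BC7 blocks the rearrangement reliably yields the quoted 5-15%.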

oodle_texture_bc7prep_ratios.png


Pushing compression with a slight loss, you can go up to 50% smaller texture files:

Oodle Texture Rate Distortion Optimization (RDO), sometimes known as "super compression", lets you encode BCN textures with your choice of size-quality tradeoff. Oodle Texture RDO searches the space of possible ways to convert your source texture into BCN, finding encodings that are both high visual quality and smaller after compression. RDO can often find near lossless encodings that save 10% in compressed size, and with only small visual difference can save 20-50%.

Oodle Texture RDO provides fine control with the lambda parameter, ranging from the same quality as non-RDO BCN encoding, to gradually lower quality, with no sudden increase in distortion, or unexpected bad results on some textures. Oodle Texture RDO is predictable and consistent across your whole data set; you can usually use the same lambda level on most of your textures with no manual inspection and tuning. Visit Oodle Texture RDO examples to see for yourself.

RDO encoding just produces BCN block data, which can be put directly into textures, and stored in hardware tiled order. Oodle Texture RDO does not make a custom output format and therefore requires no runtime unpacking.
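The lambda tradeoff described in the two quotes above is the classic rate-distortion cost J = D + λR. A minimal sketch of the selection rule (the candidate names and numbers are invented for illustration; nothing here is Oodle's actual API):

```python
# Candidate encodings of one block: (name, distortion, size in bits).
# The numbers are made up purely to illustrate the selection rule.
candidates = [
    ("exact",  0.0, 96),
    ("near",   1.5, 64),
    ("coarse", 6.0, 40),
]

def pick(lam: float) -> str:
    """Choose the candidate minimizing distortion + lambda * rate."""
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]

print(pick(0.0))    # lambda = 0: pure quality -> "exact"
print(pick(0.05))   # mild size pressure -> "near"
print(pick(1.0))    # strong size pressure -> "coarse"
```

This is also why, as the quote says, quality degrades gradually with lambda rather than jumping: the optimizer only switches candidates when the size saving outweighs the added distortion.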


Kraken Hardware Decompressor:

That said, the Kraken hardware decompressor on PS5 should work best with Oodle Texture, but Oodle Texture can work with any lossless decompressor, like the ZLIB HW decompressors on PS4 and Xbox Series X|S. Xbox Series X|S already has a strong solution for texture compression known as BCPack.

Oodle Texture RDO optimizes for the compressed size of the texture after packing with any general purpose lossless data compressor. This means it works great with hardware decompression. On consoles with hardware decompression, RDO textures can be loaded directly to GPU memory, Oodle Texture doesn't require any runtime unpacking or CPU interaction at all. Of course it works great with Kraken and all the compressors in Oodle Data Compression, but you don't need Oodle Data to use Oodle Texture, they are independent.

Sources:



PS5 vs Xbox Series X

As Xbox Series X|S sees roughly similar texture reduction, we can't say Oodle Texture is better than BCPack or vice versa. But Sony has licensed Kraken for all PS4/PS5 developers, and it still holds a 29% size-reduction advantage over the ZLIB used on PS4 and Xbox Series X|S, mated with the Kraken hardware decompressor. We also know that Xbox Series S will get cut-down versions of the Series X textures, so we'll set it aside from the comparison. Assuming the textures/assets are the same on both consoles:

meta-chart.jpg


But with 297% (~4x) faster decompression, which should be even better with the Kraken HW decompressor, PS5 might leverage the extra space to use higher-quality assets.

It doesn't stop there, though: with the PS5's unprecedented SSD and I/O, it could eliminate the use of LODs (levels of detail), which can mean 5-7 copies of an asset at different resolutions/qualities. Sony's Atom View introduced this technique in 2017, using a single high-quality asset and streaming more polygons as you get closer, resulting in much higher-quality visuals without wasting space on duplicates (5-7 versions of the same asset) as in the old-fashioned LOD system. Atom View works as a plugin for the UE and Unity engines, and probably with other Sony WWS proprietary engines:
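The storage cost of a baked LOD chain is easy to estimate. Assuming each successive LOD is half resolution (so roughly a quarter of the data, which holds for textures; mesh LODs vary), the chain converges to about 4/3 of the base asset. The 64 MB base size is an assumed number for illustration:

```python
# Back-of-envelope cost of storing a 6-level LOD chain vs one streamed asset.
# The 64 MB base size and the quarter-per-level ratio are assumptions.
base_mb = 64.0
levels = 6                                   # "5-7 versions", per the post
chain_mb = sum(base_mb * 0.25 ** k for k in range(levels))
print(round(chain_mb, 1))                    # -> 85.3, i.e. ~33% extra on disk
```

For textures that overhead is modest; the bigger win described here is authoring and streaming a single source asset instead of maintaining 5-7 hand-made versions of it.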




Sony has helped Epic Games achieve similar tech, known as "Nanite", in their upcoming Unreal Engine 5: hundreds of billions of polygons of uncompressed 8K Hollywood-grade assets with up to 16K shadows, all crunched losslessly down to around 20 million polygons per frame for this 4.2-million-pixel (1440p) gameplay demo:

(Timestamped)





TL;DR

PS5 should benefit from this headroom to use higher-quality assets than other platforms, or be more efficient with its storage for the same asset quality but smaller game sizes. All this while still being able to eliminate LODs and use that extra space for a bump in asset quality at the same, smaller overall game size!

Thank you for reading.

EDIT: Thanks to GreyHand23 for the addition:

This is facelifted PS5 SSD thread.
 

GreyHand23

Member
SSD decompression gets done on the GPU or not at all; there is no other option. If this stuff doesn't work on PC GPUs there is no adoption, and the whole thing falls flat on its face, as nobody has the CPU cores to spare on it.



And the Switch GPU is 500 Zen 2 cores performance-wise. That's how useless that information is when you talk about PC architectures.

Now make a comparison that actually is useful in the real world such as GPU performance.

Here i will give you a hint.

b067ff41ae80d1a109656800cc595a72.png


CPU performance is barely relevant for SSD compression on PC.

Here's the problem: neither you nor I know exactly how the CPU is being used in current PC games. Can you offer evidence that data isn't currently compressed for PC games and that decompression isn't happening on the CPU? Now, I agree with you that if a high decompression speed is needed, it's much more likely the work will be done on a GPU, because even if you have 16 cores, most PC users wouldn't want to use half of them just for decompression.

As for the RTX IO graph, Nvidia shared very few details about it. We don't know the performance hit to achieve that 14 GB/s figure, nor what kind of compression is being used, or anything else. Not to mention that RTX IO literally doesn't matter until DirectStorage comes out, and it has to be implemented separately in each game. I'll wait to see what it actually is before falling for Nvidia's likely misleading PR machine.
 

Kumomeme

Member
..On the PC you don't have hardware Kraken, so software Kraken is used on the CPU. To keep up with the fastest SSD speeds this requires several cores; luckily high end PC's also have lots of CPU cores!

So for the long term, more CPU cores is better for PC, assuming RTX IO + the DirectStorage API and equivalent tech from AMD GPUs will take quite some time to become the new standard.
 
Last edited:

Soodanim

Gold Member
Even ignoring the one man hype show, we're getting to the stage where speculation is mostly done and we're moving into facts. But that just makes me want to see it all in action. I want to know what the reality is.
 

Bo_Hazem

Banned
Exactly. A waste of resources. I doubt anyone will bother.

It's not a waste at all. The closer you get, the more sharpness and quality is still preserved with higher-quality assets. Not necessarily 8K, but 4K should be the standard next gen. Yakuza, for example, uses more like 1080p models at best on next gen; that's why skins are flat and blurry, and the overall graphics are pretty last gen.

Even ignoring the one man hype show, we're getting to the stage where speculation is mostly done and we're moving into facts. But that just makes me want to see it all in action. I want to know what the reality is.

Reality is the better compression you get, the more room for higher quality assets to be stored in the game, the faster decompression you get + faster SSD + faster I/O, the higher assets you can use directly from the storage without choking the RAM, directly to GPU caches.
 
Last edited:
It's not a waste at all. The closer you get, the more sharpness and quality is still preserved with higher-quality assets. Not necessarily 8K, but 4K should be the standard next gen. Yakuza, for example, uses more like 1080p models at best on next gen; that's why skins are flat and blurry, and the overall graphics are pretty last gen.



Reality is the better compression you get, the more room for higher quality assets to be stored in the game, the faster decompression you get + faster SSD + faster I/O, the higher assets you can use directly from the storage without choking the RAM, directly to GPU caches.
Exactly. Just looking at the Unreal Engine 5 demo's movie quality, it looks as if it was using 8K assets running at 1440p, and that's without optimization. I personally think once Oodle comes into play and optimization matures, it's crystal clear that further down the line it could be 8K assets at around 1800p. I'm not complaining; 4K is gonna be around for a good while. So having 8K assets at 1800p, using AI image scaling up to 4K, is really going to look amazing.
 
Last edited:

Bo_Hazem

Banned
Exactly. Just looking at the Unreal Engine 5 demo's movie quality, it looks as if it was using 8K assets running at 1440p, and that's without optimization. I personally think once Oodle comes into play and optimization matures, it's crystal clear that further down the line it could be 8K assets at around 1800p. I'm not complaining; 4K is gonna be around for a good while. So having 8K assets at 1800p, using AI image scaling up to 4K, is really going to look amazing.

Those were uncompressed 8K movie assets, not 8K gameplay assets. Those assets are 5.4x higher than the assets used in the famous Rebirth trailer from Quixel, which are 4K assets with 25% compression:







As you can see, 4K with 25% compression looks insanely good up close, and that's 5.4x less than what was used in the UE5 demo gameplay on PS5. That UE5 demo was more of a stress test: they said it was hitting around 40-50fps but was capped at 30fps, while aiming for 60fps. So you could go for a solid 4K@30fps even with those crazy assets! Also, that Lumen tech is software-based raytracing for global illumination; if they can make it compatible with raytracing cores (intersection engines), that will lift performance even higher.

Not just that: new, smarter ways of raytracing are being introduced with the LocalRay engine by Adshir:







It seems Sony has secured a deal with them, as they mentioned a next-gen "console" and used Spider-Man in their trailer, which is a property of Sony and Marvel. It makes real-time raytracing possible even on smartphones! And it works on PC as well.

Exciting times ahead!
 
Those were uncompressed 8K movie assets, not 8K gameplay assets. Those assets are 5.4x higher than the assets used in the famous Rebirth trailer from Quixel, which are 4K assets with 25% compression:







As you can see, 4K with 25% compression looks insanely good up close, and that's 5.4x less than what was used in the UE5 demo gameplay on PS5. That UE5 demo was more of a stress test: they said it was hitting around 40-50fps but was capped at 30fps, while aiming for 60fps. So you could go for a solid 4K@30fps even with those crazy assets! Also, that Lumen tech is software-based raytracing for global illumination; if they can make it compatible with raytracing cores (intersection engines), that will lift performance even higher.

Not just that: new, smarter ways of raytracing are being introduced with the LocalRay engine by Adshir:







It seems Sony has secured a deal with them, as they mentioned a next-gen "console" and used Spider-Man in their trailer, which is a property of Sony and Marvel. It makes real-time raytracing possible even on smartphones! And it works on PC as well.

Exciting times ahead!


Who knows? We'll see what happens further down the console life cycle, with new assets and new game engines.
 

Bo_Hazem

Banned
Who knows? We'll see what happens further down the console life cycle, with new assets and new game engines.

Indeed, it's just like every generation: new engines should arrive with smarter, more efficient techniques, and current ones will need to be overhauled. The deeper we get into next gen, the better the games we should see after support for current gen and HDD storage is dropped.
 
Seems like there would be a point of diminishing returns with this. Makes me curious what game sizes will look like, even with compression, with these ultra-high-quality textures. It's great that they look super clean zoomed in an insane amount, but the average user doesn't do that.
 
Last edited:

GymWolf

Gold Member
PS5's Kraken HW decompressor is equivalent to 9 Zen 2 cores; the whole I/O complex is equivalent to 11-12 Zen 2 cores.

Compression Offload Hardware
Console: Xbox Series X (Microsoft) | PlayStation 5 (Sony)
Algorithm: BCPack | Kraken (and ZLib?)
Maximum Output Rate: 6 GB/s | 22 GB/s
Typical Output Rate: 4.8 GB/s | 8–9 GB/s
Equivalent Zen 2 CPU Cores: 5 | 9


The CPU time saved by these decompression units sounds astounding: the equivalent of about 9 Zen 2 CPU cores for the PS5, and about 5 for the Xbox Series X. Keep in mind these are peak numbers that assume the SSD bandwidth is being fully utilized—real games won't be able to keep these SSDs 100% busy, so they wouldn't need quite so much CPU power for decompression.


That's the advantage next gen consoles have that PC gamers seem to not understand.
Is that new tech from Nvidia called RTX IO a different thing?
 

Bo_Hazem

Banned
Is that new tech from Nvidia called RTX IO a different thing?

Yes, it uses Microsoft's DirectStorage, which will arrive around 2022, and uses GPU power to replicate the decompressors instead of relying on the CPU as PCs traditionally do, which puts PCs a bit behind PS5 and XSX in that regard. Kraken HW on PS5 is fully independent; the ZLIB HW decompressor on XSX uses a tiny assist from the CPU.

For a PC equivalent, you might need an AMD Threadripper with around 18 cores (PS5's I/O is equivalent to ~11-12 cores in total, 9 for Kraken alone). Also, to replicate PS5's GPU-based Tempest Engine (HRTF-based true 3D audio), PC GPUs must spend some resources. With the new massive RTX 30 series and the upcoming Big Navi, they might brute-force their way there, assuming developers take the effort to code their games to leverage it, which is critical. If developers code their games to take advantage of 32-128 GB of RAM, you could probably match PS5's I/O throughput even with SATA3 SSDs, but the market isn't big enough for that yet.
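The "how many cores would a PC need" estimate follows directly from the figures quoted earlier in the thread. These are Oodle's and the table's quoted numbers, not independent measurements, and the division is only a rough equivalence:

```python
# Per-core throughput implied by "22 GB/s cap == 9 Zen 2 cores" (quoted figures).
peak_output_gbps = 22.0
core_equivalent = 9
per_core_gbps = peak_output_gbps / core_equivalent
print(round(per_core_gbps, 2))                 # -> 2.44 GB/s of output per Zen 2 core

# At the *typical* 8-9 GB/s output rate, far fewer cores are needed.
typical_gbps = 8.5
print(round(typical_gbps / per_core_gbps, 1))  # -> 3.5 cores
```

This is also the point the Oodle quote makes: the 9-core figure is a peak that assumes the SSD is fully saturated; at typical rates the CPU cost of software Kraken is much lower.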
 

Bo_Hazem

Banned
Seems like there would be a diminishing returns point with this. Makes me curious what game sizes would be around, even with compression, with these ultra high quality textures. It's great that they look super clean zoomed in an insane amount, but the average user doesn't do this.

Yup, it depends on the type of game you're making. But high quality assets can be reused endlessly without bloating the overall game size like rocks/dirt/forest/etc, those reuse assets extensively from a handful of unique assets.
 
Last edited:

GymWolf

Gold Member
Yes, it uses Microsoft's DirectStorage, which will arrive around 2022, and uses GPU power to replicate the decompressors instead of relying on the CPU as PCs traditionally do, which puts PCs a bit behind PS5 and XSX in that regard. Kraken HW on PS5 is fully independent; the ZLIB HW decompressor on XSX uses a tiny assist from the CPU.

For a PC equivalent, you might need an AMD Threadripper with around 18 cores (PS5's I/O is equivalent to ~11-12 cores in total, 9 for Kraken alone). Also, to replicate PS5's GPU-based Tempest Engine (HRTF-based true 3D audio), PC GPUs must spend some resources. With the new massive RTX 30 series and the upcoming Big Navi, they might brute-force their way there, assuming developers take the effort to code their games to leverage it, which is critical. If developers code their games to take advantage of 32-128 GB of RAM, you could probably match PS5's I/O throughput even with SATA3 SSDs, but the market isn't big enough for that yet.
I don't care about the Tempest audio, but don't you think a PC can cover any difference in decompression with a fuckload more power in literally every part? From GPU to CPU to RAM to SSD (future SSDs are gonna crush the SSD inside PS5), etc.

Or is this something you can't surpass just with raw power/speed?!
 

Bo_Hazem

Banned
I don't care about the Tempest audio, but don't you think a PC can cover any difference in decompression with a fuckload more power in literally every part? From GPU to CPU to RAM to SSD (future SSDs are gonna crush the SSD inside PS5), etc.

Or is this something you can't surpass just with raw power/speed?!

They can, even with the RTX 20 series. But the major problem is whether devs are gonna go hard for like 1-3% of the PC gaming community. At least now, with that theoretical 14 GB/s from Nvidia, we can expect PCs to catch up by 2022, and to surpass PS5's raw speed with PCIe 5.0 drives (an x4 link tops out around 16 GB/s, double PCIe 4.0). But decompression speed would be the remaining bottleneck, which should also be much better by then.


You can expect a PS5 Pro with 72 CUs and another version from Xbox to compete with the mid-gen refreshes, as PCs in 3-4 years will be way overpowered and a 20-23 TF PS5 Pro would still be low-to-mid tier.
 
Last edited:
Sony could double the CUs on the PS5 Pro, which equates to around 72 CUs (possibly 70), which would be a cut-down version of Big Navi, with PCIe 5 if that's released by then, and if not, a more powerful SSD than the current PS5 model.
 

GymWolf

Gold Member
GymWolf So my current suggestion is: keep your current PC for now, and get a PS5 or XSX or both. Upgrade your whole PC by 2023, or when PCIe 5.0 is out.
The PS5 is already on preorder, but I'm not gonna wait 2 full years to upgrade my PC. I have a 4K TV and I love 60fps; my GPU doesn't cut it at all, I wanna use that RTX IO thing from Nvidia, and I want an 8-core CPU like the consoles.
Imagine having a PC and playing worse than a console. Not on my watch, sir :lollipop_grinning_sweat:

Also, third-party devs are gonna develop for the Series X SSD, not the one inside PS5, so I only need to get on par with that one to get all the SSD fuckery. Outside of PS5 devs, nobody on this earth is gonna do groundbreaking stuff with the PS5 SSD; the lowest common denominator is the general rule.

I hate the waiting game on PC, but it is what it is.
 
Last edited:

Bo_Hazem

Banned
Sony could double the CUs on the PS5 Pro, which equates to around 72 CUs (possibly 70), which would be a cut-down version of Big Navi, with PCIe 5 if that's released by then, and if not, a more powerful SSD than the current PS5 model.

They would do something like the PS4 Pro: wait for like 3-5nm, make a PS5 Slim that's slightly more powerful, then make a butterfly/stacked 2x-die version of it for the Pro, bringing it to 72 CUs. With their weird motherboard shown recently, you could even expect 2 dies, one on each side, connected together, with double-sided cooling. But the patents already shown suggest stackable dies, something like AMD's roadmap. Everything should become clear with the RDNA2 reveal later this month, I guess.

The PS5 is already on preorder, but I'm not gonna wait 2 full years to upgrade my PC. I have a 4K TV and I love 60fps; my GPU doesn't cut it at all, I wanna use that RTX IO thing from Nvidia, and I want an 8-core CPU like the consoles.

Also, third-party devs are gonna develop for the Series X SSD, not the one inside PS5, so I only need to get on par with that one to get all the SSD fuckery.

I hate the waiting game on PC, but it is what it is.

I would highly suggest you wait for PCIe 5.0 motherboards, at least. I built my PC, then a few months later PCIe 4.0 came out. It fucking stings. :lollipop_tears_of_joy: But you won't be disappointed at all building your PC now for the RTX 30 series, as things will get better around 2021-2022 when DirectStorage is out.
 

GymWolf

Gold Member
They would do something like the PS4 Pro: wait for like 3-5nm, make a PS5 Slim that's slightly more powerful, then make a butterfly/stacked 2x-die version of it for the Pro, bringing it to 72 CUs. With their weird motherboard shown recently, you could even expect 2 dies, one on each side, connected together, with double-sided cooling. But the patents already shown suggest stackable dies, something like AMD's roadmap. Everything should become clear with the RDNA2 reveal later this month, I guess.



I would highly suggest you wait for PCIe 5.0 motherboards, at least. I built my PC, then a few months later PCIe 4.0 came out. It fucking stings. :lollipop_tears_of_joy: But you won't be disappointed at all building your PC now for the RTX 30 series, as things will get better around 2021-2022 when DirectStorage is out.
Dude, there is no way I can wait until 2023 playing third-party games worse than a console with a gaming PC at home.

Third-party games are 95% of the market; the 3-4 good exclusives from Sony every year are not nearly enough to play for an entire year.

I mean, it's different from your case: you just told me that PCIe 5.0 won't come out for 2 years, so if I buy something now I know for sure it's gonna be top-tier hardware for at least 2 years.

I can wait like 3-4 months tops, and I'm still overestimating my patience...
 

Bo_Hazem

Banned
Dude, there is no way I can wait until 2023 playing third-party games worse than a console with a gaming PC at home.

Third-party games are 95% of the market; the 3-4 good exclusives from Sony every year are not nearly enough to play for an entire year.

I mean, it's different from your case: you just told me that PCIe 5.0 won't come out for 2 years, so if I buy something now I know for sure it's gonna be top-tier hardware for at least 2 years.

I can wait like 3-4 months tops, and I'm still overestimating my patience...

Don't worry mate, you already have a good, powerful PC; you could just upgrade the GPU to something like a 3080. Go ahead and enjoy ;)
 

GymWolf

Gold Member
Don't worry mate, you already have a good, powerful PC; you could just upgrade the GPU to something like a 3080. Go ahead and enjoy ;)
I already have a good PC (but maybe you weren't asking) with a 2070 Super and an 8600K, but it's not nearly enough for 4K60... My plan is a complete renovation except maybe the RAM, because I'm not gonna wait for DDR5 and my current RAM is pretty good; maybe I'll add another 16GB just for laughs.

So GPU, CPU, SSD, mobo, PSU, and probably a new case too. It's gonna be a fucking bloodbath 🕺
 
Last edited:

Bo_Hazem

Banned
I already have a good PC (but maybe you weren't asking) with a 2070 Super and an 8600K, but it's not nearly enough for 4K60... My plan is a complete renovation except maybe the RAM, because I'm not gonna wait for DDR5 and my current RAM is pretty good; maybe I'll add another 16GB just for laughs.

So GPU, CPU, SSD, mobo, PSU, and probably a new case too. It's gonna be a fucking bloodbath 🕺

Sounds solid to me! Hope you enjoy it. :messenger_winking: (y)
 
For those who have trouble visualizing what high-bandwidth data access does, think of the Neo Geo games ported to the PS1... The PS1 had more processing power (enough to do 3D, even) and much more RAM and video memory than the Neo Geo, yet Neo Geo games had to be seriously compromised when ported to it.

That's... not exactly 100% the reason, for this particular example. Many of those same games got ported to the Saturn and played better on that platform, more or less on par with the Neo Geo AES/MVS versions aside from load times (for games not utilizing expansion carts)...

Want to know what really made the difference for Neo-Geo ports (and Capcom ones, while we're at it) on Saturn compared to PS1? Extra RAM. If those games used the 4MB RAM cartridge that basically cut down on a metric ton of the load times and limitations in having fast access to graphics data. Some games like the KOF '97 port on Saturn used ROM cartridges instead that essentially did the same thing, but obviously were read-only format.

So you're looking at two platforms (Saturn, PS1) that had their advantages and disadvantages to one another, but when you look at those Neo-Geo ports you bring up, the main factor between them in Saturn getting the better ports was because it had more (optional) RAM. Actually, the probably bigger reason for it is because the Saturn had actual dedicated 2D hardware designed into it (technically you don't need 2D hardware like blitters and VDPs to do 2D, especially nowadays, but back then 3D console-level hardware was not capable enough to provide AAA 2D-style games as good as a system that had dedicated 2D hardware in its design).

In a way, you can transform that argument in favor of PS5 to a modern context in that it has more "hardware" built into it to handle faster data transfer to and from storage...but you can also make that argument in favor of Series X to a modern context because it has some custom "hardware" (mip-blending hardware unit on the GPU for SFS) built into it to handle expedient usage of a specific type of data (textures). Can be looked at either way, or (preferably) both ways.
 

Bo_Hazem

Banned
That's... not exactly 100% the reason, for this particular example. Many of those same games got ported to the Saturn and played better on that platform, more or less on par with the Neo Geo AES/MVS versions aside from load times (for games not utilizing expansion carts)...

Want to know what really made the difference for Neo-Geo ports (and Capcom ones, while we're at it) on Saturn compared to PS1? Extra RAM. If those games used the 4MB RAM cartridge that basically cut down on a metric ton of the load times and limitations in having fast access to graphics data. Some games like the KOF '97 port on Saturn used ROM cartridges instead that essentially did the same thing, but obviously were read-only format.

So you're looking at two platforms (Saturn, PS1) that had their advantages and disadvantages to one another, but when you look at those Neo-Geo ports you bring up, the main factor between them in Saturn getting the better ports was because it had more (optional) RAM. Actually, the probably bigger reason for it is because the Saturn had actual dedicated 2D hardware designed into it (technically you don't need 2D hardware like blitters and VDPs to do 2D, especially nowadays, but back then 3D console-level hardware was not capable enough to provide AAA 2D-style games as good as a system that had dedicated 2D hardware in its design).

In a way, you can transform that argument in favor of PS5 to a modern context in that it has more "hardware" built into it to handle faster data transfer to and from storage...but you can also make that argument in favor of Series X to a modern context because it has some custom "hardware" (mip-blending hardware unit on the GPU for SFS) built into it to handle expedient usage of a specific type of data (textures). Can be looked at either way, or (preferably) both ways.

Great input! But the data goes directly from SSD > I/O > GPU caches, and with GPU cache scrubbers you can stream what's needed on the fly more efficiently. The advantage is massively in favor of PS5.

By the way, you can still use Oodle Texture instead of BCPack on Xbox:



But overall, Kraken decompression is around 297% faster than ZLIB. That's combined with other advantages in the SSD subsystem:

PS5's SSD has DRAM; XSX's is DRAM-less. PS5 uses 4x PCIe 4.0 lanes; XSX uses 2x PCIe 4.0 lanes. PS5 has 12 channels for 12 chips (1:1); XSX has 4 channels for probably 16 chips (1:4). PS5 has 6 priority levels, so it can keep 6 distinct request queues to the SSD; XSX has 2 priority levels. PS5 is 5.5 GB/s raw; XSX is 2.4 GB/s raw. PS5's decompressor is capped at 22 GB/s; XSX's at 6 GB/s.

As you can see, it's the sum of many factors, and physically it's impossible for XSX to surpass ~25% of PS5's total speed; that gap widens with the 297% decompression-speed advantage for PS5, and files overall ~29% smaller for the same game assets.
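Those throughput figures can be sanity-checked with back-of-the-envelope math. A minimal sketch, where the raw bandwidths and decoder caps come from the quoted specs and the average compression ratios are assumptions chosen to reproduce the quoted "typical" rates, not measured values:

```python
# Rough effective-throughput model: raw SSD bandwidth times the average
# compression ratio, clamped by the decompressor's output cap.
# All figures are the quoted marketing numbers, not measurements.

def effective_throughput(raw_gbps, compression_ratio, decoder_cap_gbps):
    """Typical decompressed output rate in GB/s."""
    return min(raw_gbps * compression_ratio, decoder_cap_gbps)

# PS5: 5.5 GB/s raw, Kraken averaging ~1.55x (assumed), 22 GB/s hardware cap
ps5 = effective_throughput(5.5, 1.55, 22)   # ~8.5 GB/s, in line with the 8-9 GB/s claim

# XSX: 2.4 GB/s raw, ZLIB/BCPack averaging ~2x (assumed), 6 GB/s hardware cap
xsx = effective_throughput(2.4, 2.0, 6)     # 4.8 GB/s, the quoted typical rate

print(f"PS5 ~{ps5:.1f} GB/s, XSX ~{xsx:.1f} GB/s, ratio {ps5 / xsx:.2f}x")
```

Note the caps only matter for exceptionally compressible data: even Oodle Texture's 17 GB/s figure sits below PS5's 22 GB/s ceiling.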
 
Last edited:

Vaztu

Member
                             Microsoft          Sony
Console                      Xbox Series X      Playstation 5
Algorithm                    BCPack             Kraken (and ZLib?)
Maximum Output Rate          6 GB/s             22 GB/s
Typical Output Rate          4.8 GB/s           8–9 GB/s
Equivalent Zen 2 CPU Cores   5                  9

You are wrong here, Bo. For decompression only, XSX uses the equivalent of 3 Zen2 CPU cores, not the 5 you have written.

The 5-Zen2-core-equivalent workload is for the total I/O system, which on XSX is handled by the decompression chip + 1/10 of a Zen2 CPU core.

So it still has to talk to the CPU, as the workload is not completely offloaded. There is still latency, and you can't compare it to the PS5 solution.

Source: https://www.eurogamer.net/articles/digitalfoundry-2020-inside-xbox-series-x-full-specs
 

Bo_Hazem

Banned
You are wrong here, Bo. For decompression only, XSX uses the equivalent of 3 Zen2 CPU cores, not the 5 you have written.

The 5-Zen2-core-equivalent workload is for the total I/O system, which on XSX is handled by the decompression chip + 1/10 of a Zen2 CPU core.

So it still has to talk to the CPU, as the workload is not completely offloaded. There is still latency, and you can't compare it to the PS5 solution.

Source: https://www.eurogamer.net/articles/digitalfoundry-2020-inside-xbox-series-x-full-specs

Thanks a lot for correcting me! I'll edit that post now and use the source provided. That adds further unwanted latency, even if it's only 1/10th of a Zen2 core.
 

geordiemp

Member
Yes:


Patent:


What's exciting is that the upscaling is a mix of temporal and ML.

It would be nice if this somehow replaced the PS4 Pro checkerboarding, so we'd get a natural enhancement for back compat.

That would be AWESOME, as all PS4 Pro compatible titles, which covers most of the AAA games, would be 60 FPS and enhanced.

There are most likely more secrets to be revealed.
 
Last edited:

geordiemp

Member
Thanks a lot for correcting me! I'll edit that post now and use the source provided. That adds further unwanted latency, even if it's only 1/10th of a Zen2 core.

What we are maybe missing is how PS4 games have historically been compressed on disc; if they currently use Kraken, then even BC titles could have fast loads, though not up to full PS5 titles, as Kraken is so much faster to unpack.

We have no idea if the system APIs can trick PS4 games into using the dedicated HW decoder; that would be something.
 
Last edited:

Vaztu

Member
What we are maybe missing is how PS4 games have historically been compressed on disc; if they currently use Kraken, then even BC titles could have fast loads, though not up to full PS5 titles, as Kraken is so much faster to unpack.

We have no idea if the system APIs can trick PS4 games into using the dedicated HW decoder; that would be something.

IIRC that dedicated HW decompression chip supports Zlib too. So older PS4 games that use Zlib should see an automatic improvement.
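For reference, this is what software zlib inflate looks like when a CPU has to do it — the work that the hardware block takes over. A sketch using Python's standard `zlib` module purely as a stand-in; the console toolchains obviously differ:

```python
import zlib

# A blob of repetitive "game asset" bytes; repetitive data compresses
# very well with zlib's LZ77 + Huffman scheme.
asset = b"texture_block_" * 4096

packed = zlib.compress(asset, level=9)   # done offline at build time
unpacked = zlib.decompress(packed)       # done at load time (CPU or HW block)

assert unpacked == asset                 # lossless round trip
print(f"{len(asset)} -> {len(packed)} bytes "
      f"(ratio {len(asset) / len(packed):.1f}:1)")
```

The decompress step is the CPU cost being discussed: on PS5 the same zlib bitstream format can be fed to the I/O complex instead, so an old game's data would not need repacking for the hardware path to apply, assuming the system routes it there.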

Edit: Bo_Hazem, looking at your table above, you should add that PS5 supports the Kraken AND Zlib algorithms.
 
Last edited:
Great input! But the data goes directly from SSD > I/O > GPU caches, and with GPU cache scrubbers you can stream what's needed on the fly more efficiently. The advantage is massively in favor of PS5.

By the way, you can still use Oodle Texture instead of BCPack on Xbox:



But overall, Kraken decompression is around 297% faster than ZLIB. That's combined with other advantages in the SSD subsystem:

PS5's SSD has DRAM; XSX's is DRAM-less. PS5 uses 4x PCIe 4.0 lanes; XSX uses 2x PCIe 4.0 lanes. PS5 has 12 channels for 12 chips (1:1); XSX has 4 channels for probably 16 chips (1:4). PS5 has 6 priority levels, so it can keep 6 distinct request queues to the SSD; XSX has 2 priority levels. PS5 is 5.5 GB/s raw; XSX is 2.4 GB/s raw. PS5's decompressor is capped at 22 GB/s; XSX's at 6 GB/s.

As you can see, it's the sum of many factors, and physically it's impossible for XSX to surpass ~25% of PS5's total speed; that gap widens with the 297% decompression-speed advantage for PS5, and files overall ~29% smaller for the same game assets.


Gotta correct you on just a few slight things there. In a traditional sense, XSX doesn't have a DRAM cache, true. But as some people such as geordiemp have mentioned in the past, that's not necessarily a requirement for a performant SSD. And as some people on B3D have speculated, they (MS) probably use some of the OS-reserved RAM for SSD cache and/or memory-mapping functions (you can look into the Flashmap papers for more info on this).

PS5 actually has 6x 128 GB modules, not 12x 64 GB ones. Easy mistake to make; I thought it was 12 chips too until the teardown video last week. Regarding priority levels, MS haven't actually said what theirs is; again, if you look at the Flashmap papers, they do talk about addressing priority levels, and that's most likely the solution being implemented (in some form at least) with XvA. So they may have more than 2 priority levels; maybe not 6, but possibly more than 2, if they have indeed identified that as something to rectify with DirectStorage (and XvA) going forward.

I'd suspect for Sony, since they're actually 6x 128 GB modules, they're muxing two channels to a single channel like some multi-channel NAND designs, so the single channel the two mux to would provide the required bandwidth. Dunno what the NAND and channel setup is with XSX, but it's probably fewer channels, as you've mentioned. And nothing to contest in terms of the raw SSD bandwidths or compressed figures.

I think in practice, and I've said this for a long time now, it's things like what I've mentioned (and stuff anyone can research) that explain why the two systems will probably perform closer in terms of data I/O than the paper specs suggest. Not in absolute terms to close the raw or compressed bandwidth caps; those are still going to be in PS5's favor. But enough that leveraging stuff like XvA should make it perform better than some in gaming think it will, and that's ultimately best for 3P devs (the vast majority of devs).

It's why I actually mentioned the Saturn in that other response; ultimately, effective data throughput in the I/O isn't simply down to the delivery medium. The raw capabilities of the delivery medium play a big part, but not the only big part. The "tricky" part with XvA is that devs have to program against it more explicitly, so things related to it (like SFS) might come with a higher learning curve. It could take time for some devs to leverage it, depending on how well abstractions for it are integrated into various game engines (UE5, Unity, etc.).
 
Last edited:

geordiemp

Member
IIRC that dedicated HW decompression chip supports Zlib too. So older PS4 games that use Zlib should see an automatic improvement.

It depends on whether the game needs to be recompiled for the new I/O / disk format, or the hardware can help old games as they are. We don't know yet.

I recall Cerny saying some PS4 games have used Kraken for a while, but when the switch from zlib happened is anybody's guess.
 
Last edited:

Vaztu

Member
It depends on whether the game needs to be recompiled for the new I/O / disk format, or the hardware can help old games as they are. We don't know yet.

Aye, that's true. The bottleneck could be somewhere else for PS4 games to see an automatic improvement.

Didn't the PS4 Pro use this new file I/O mapping system first? My memory is fuzzy on this.
 

Bo_Hazem

Banned
Gotta correct you on just a few slight things there. In a traditional sense, XSX doesn't have a DRAM cache, true. But as some people such as geordiemp have mentioned in the past, that's not necessarily a requirement for a performant SSD. And as some people on B3D have speculated, they (MS) probably use some of the OS-reserved RAM for SSD cache and/or memory-mapping functions (you can look into the Flashmap papers for more info on this).

PS5 actually has 6x 128 GB modules, not 12x 64 GB ones. Easy mistake to make; I thought it was 12 chips too until the teardown video last week. Regarding priority levels, MS haven't actually said what theirs is; again, if you look at the Flashmap papers, they do talk about addressing priority levels, and that's most likely the solution being implemented (in some form at least) with XvA. So they may have more than 2 priority levels; maybe not 6, but possibly more than 2, if they have indeed identified that as something to rectify with DirectStorage (and XvA) going forward.

I'd suspect for Sony, since they're actually 6x 128 GB modules, they're muxing two channels to a single channel like some multi-channel NAND designs, so the single channel the two mux to would provide the required bandwidth. Dunno what the NAND and channel setup is with XSX, but it's probably fewer channels, as you've mentioned. And nothing to contest in terms of the raw SSD bandwidths or compressed figures.

I think in practice, and I've said this for a long time now, it's things like what I've mentioned (and stuff anyone can research) that explain why the two systems will probably perform closer in terms of data I/O than the paper specs suggest. Not in absolute terms to close the raw or compressed bandwidth caps; those are still going to be in PS5's favor. But enough that leveraging stuff like XvA should make it perform better than some in gaming think it will, and that's ultimately best for 3P devs (the vast majority of devs).

It's why I actually mentioned the Saturn in that other response; ultimately, effective data throughput in the I/O isn't simply down to the delivery medium. The raw capabilities of the delivery medium play a big part, but not the only big part. The "tricky" part with XvA is that devs have to program against it more explicitly, so things related to it (like SFS) might come with a higher learning curve. It could take time for some devs to leverage it, depending on how well abstractions for it are integrated into various game engines (UE5, Unity, etc.).

OK, before digging deep into your thoughtful post, I'll post this first: it's 12 chips across 12 channels:

ps5-ssd-gdc-presentation.jpg


You can also check the Road to PS5 video. The dies can be separate inside a package while indistinguishable from the outside. Having 12 chips is much better for data access that could be scattered across many chips, and you can't have more channels than chips!
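The channel-count point can be illustrated with a toy model: scattered reads are serviced in parallel, one block per channel per unit of time, so wall time scales with ceil(blocks / channels). This is a deliberate oversimplification — real controllers interleave planes and queue requests far more cleverly — but it shows why more channels help with scattered data:

```python
import math

def read_time_units(num_blocks, num_channels):
    """Toy model: each channel services one block per time unit."""
    return math.ceil(num_blocks / num_channels)

# 48 blocks scattered across the NAND:
print(read_time_units(48, 12))  # 4 time units on a 12-channel controller
print(read_time_units(48, 4))   # 12 time units on a 4-channel controller
```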

Here is an album of screenshots of the PS5 that might help:


kyliethicc suggested DDR4 here, next to the SSD field:

vlcsnap-2020-10-07-16h53m21s829.png


I'll check if I can get a clearer image from the video.
 
Last edited:

Bo_Hazem

Banned
Maybe this? At the top:

vlcsnap-2020-10-07-16h53m33s380.png


Samsung DDR4:

samsung_ddr4_10nm-v2-2.jpg


As it usually happens with Samsung’s major DRAM-related announcements, the news today consists of two parts: the first one is about the new DDR4 IC itself, the second part is about the second generation '10 nm-class' (which Samsung calls '1y' nm) manufacturing technology that will be used for other DRAM products by the company. Both parts are important, but let’s start with the new chip.


Or is it just DRAM that looks similar?
 
Last edited:

RaySoft

Member
I think it's fair to conclude that Sony first party, at the very least, will be hitting that theoretical peak very soon.
Possibly, but designing games the same way it's been done for ages is not the way to go. This I/O opens up new opportunities to design your pipeline differently.
As Bo points out, LODs are part of a legacy game mechanic that will die very soon. More and more tools support importing the primary model and automatically downscaling from there in small incremental steps, instead of the few steps we have today.
Computing power has increased exponentially in a few years, which opens up this ability and more, so they don't have to bake everything offline. This gen arrives right at the cusp of that threshold where many of the things we had to do offline can now be done in real time. The last drawback was the I/O, but now that's also been taken care of.
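The "many small steps instead of a few" idea can be sketched as picking a LOD level from distance: coarse LOD snaps between a handful of hand-authored models, while fine-grained LOD approaches continuous detail. All thresholds and step counts here are hypothetical illustrations, not any engine's actual scheme:

```python
import math

def coarse_lod(distance):
    # Legacy approach: a few hand-authored LOD models with hard switch
    # points; detail jumps visibly when a threshold is crossed.
    for threshold, lod in ((10, 0), (40, 1), (160, 2)):
        if distance < threshold:
            return lod
    return 3

def fine_lod(distance, steps_per_doubling=4):
    # Incremental approach: several small steps per doubling of distance,
    # so each transition sheds only a little detail.
    return max(0, round(math.log2(max(distance, 1) / 10) * steps_per_doubling))

for d in (5, 25, 100, 400):
    print(d, coarse_lod(d), fine_lod(d))
```

The fine-grained version is only practical when the I/O can stream the in-between detail levels on demand, which is the connection to the SSD discussion above.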

On a more personal note, I just wish people could have some friggin' patience, but everything today has to materialize at goddamn lightspeed for millennials not to lose interest.
If they can't have their instant gratification, they turn to the internet to spew their utter bollocks "opinions". When Sony said "We believe in generations", you should know they were talking about the whole next-gen cycle, not just the games released in the first year. peace
 

RaySoft

Member
SSD compression gets done on the GPU or not at all; there is no other option. If their shit doesn't work on PC GPUs, there is no adoption there and the whole thing falls flat on its face, as nobody has the cores to spend on it.



And the Switch GPU is 500 Zen 2 cores performance-wise. That's how useless that information is when you talk about PC architectures.

Now make a comparison that is actually useful in the real world, such as GPU performance.

Here I will give you a hint.

b067ff41ae80d1a109656800cc595a72.png


CPU performance is barely relevant for SSD compression on PC.
As if a GPU could do a CPU's general-purpose tasks. gl.
Now explain to us how the graph you presented shows much more....
 

RaySoft

Member
That's...not exactly 100% the reason, for this particular example. Many of those same games got ported to the Saturn and played better on that platform, more or less on par with the Neo-Geo AVS/MVS versions aside from load times (for games not utilizing expansion carts)...

Want to know what really made the difference for Neo-Geo ports (and Capcom ones, while we're at it) on Saturn compared to PS1? Extra RAM. If those games used the 4MB RAM cartridge that basically cut down on a metric ton of the load times and limitations in having fast access to graphics data. Some games like the KOF '97 port on Saturn used ROM cartridges instead that essentially did the same thing, but obviously were read-only format.

So you're looking at two platforms (Saturn, PS1) that had their advantages and disadvantages relative to one another, but for those Neo-Geo ports you bring up, the main factor in the Saturn getting the better ports was that it had more (optional) RAM. Actually, probably the bigger reason is that the Saturn had actual dedicated 2D hardware designed into it (technically you don't need 2D hardware like blitters and VDPs to do 2D, especially nowadays, but back then 3D console-level hardware was not capable enough to provide AAA 2D-style games as well as a system that had dedicated 2D hardware in its design).

In a way, you can transform that argument in favor of PS5 to a modern context in that it has more "hardware" built into it to handle faster data transfer to and from storage...but you can also make that argument in favor of Series X to a modern context because it has some custom "hardware" (mip-blending hardware unit on the GPU for SFS) built into it to handle expedient usage of a specific type of data (textures). Can be looked at either way, or (preferably) both ways.
Saturn was always a 2D console and a beast at it to boot. The 3D portion was an afterthought when Sega got wind of Sonys PSX.
PSX was primarily a 3D console, so Saturn was much better at 2D than the PSX.
 
Last edited:

Soodanim

Gold Member
It's not a waste at all. The closer you get, the more sharpness and quality is still reserved for higher assets. Not necessarily 8K, but 4K should be the standard next gen. Yakuza, for example, uses more like 1080p models at best on next gen; that's why skins are flat and blurry, and the overall graphics are pretty last gen.



Reality is, the better compression you get, the more room there is for higher-quality assets in the game; and the faster the decompression + SSD + I/O, the higher-quality the assets you can use directly from storage without choking the RAM, straight to GPU caches.
But I want to SEEEEEEE it. Done talking. I want to see results!
 