
Digital Foundry - PlayStation 5 Pro specs analysis, also new information

SlimySnake

Flashless at the Golden Globes
I was watching the DF video on Control, and while the average delta between the XSX and PS5 was 16% across Alex's 25 scenarios, the corridor of doom, which is the toughest RT room in the game, had the XSX with just a 1 fps lead.

This corridor is full of reflections within reflections, and the high-CU XSX just doesn't give the same bang for buck. The other, simpler scenarios with fewer RT spaces gave a 10-20% increase in framerate.

To me, it's clear that the 44% extra CUs didn't really matter when it came to actual RT performance. RT performance is tied either to higher clocks or to infinity cache, or both. The PS5 Pro better have infinity cache or they will run into the same bottlenecks as the XSX, since they are all but confirmed to be using 60 CUs and 2.1 GHz clocks. The leaked docs showing a 2-4x increase hopefully mean that infinity cache is on there.

hpx3HHP.jpg
 

Senua

Gold Member
Richard also labeled outlets referring to PS5 Pro as a "RT monster" as overblown. He said about PS5, "it's not a RT monster, it's just better at RT." Excuse me? The hypocrisy is beyond insane now. No way would leaked docs from Xbox claiming a 2-4x boost in RT get the same downplay and subdued reaction from them, and rightfully so.

Also, 60 CU (PS5 Pro) vs 96 CU (7900xtx) delta isn't far off from 36 vs 52 CU of PS5 and Series X. And the base consoles are of the same core architecture! It's no wonder why he/they are still confused about PS5 and Series X performance differences.
"Richard also labeled outlets referring to PS5 Pro as a "RT monster" as overblown."

Yes because he's not a fool.
 

SlimySnake

Flashless at the Golden Globes
On DF Direct (Early Access), they took a question asking whether the PS5 Pro could open the door for path-traced applications like CP and AW2. They all laughed and basically said no. I was actually shocked to see Richard one-up Alex with his ignorant reasoning, saying that since the 7900 XTX has relatively "bad" performance with CP path tracing and has more CUs than the PS5 Pro, the PS5 Pro will not be able to run it. He gives no consideration to architectural improvements and other important unknowns, for both raster and RT, between RDNA 3 and RDNA 3.5/4. Then he basically says that since Sony didn't mention anything about developers being able to do PT, we should assume PT isn't feasible on PS5 Pro.
Cyberpunk with PT should be possible. Especially if they use 1 ray instead of 2 rays.

Now if the 2x RT performance increase is what most devs can get out of it, then no, you are not getting PT on the PS5 Pro. You can look at RDNA2 cards that are 2x more powerful trying to run PT and it's virtually impossible. Or you are running at resolutions so low that it looks like shit because of all the RT noise artifacts.

Now if we are looking at 4x, then it puts the PS5 Pro at around 3080 levels of performance, and I've been able to run it at 30 fps at 4K DLSS Performance with some drops to 25 fps. If they are able to lower the number of rays from 2 to 1 like that PC mod does, then it should be doable.

But you have to take the higher end of Sony's own estimates to get PT, and we can't say just how likely that is going to be. If Sony had simply said 4x instead of 2-4x, then I would be more upset at Richard.

Path tracing on AW2 is very expensive and I wasn't able to run it over 20 fps. Most of the time it was under 20 fps at 4K DLSS Performance.
 

ChiefDada

Gold Member
I don't know what you are smoking here, this is the difference between 7900XTX and 7700XT (closest GPU to Pro in raster power):

gT4j8Eo.jpg


PS5 - 10.29 TFLOPS
XSX - 12.15 TFLOPS

vs.

7900XTX - 61.39 TFLOPS
7700XT - 35.17 TFLOPS (more than Pro)

Isn't far off...
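Worked out as ratios (a quick sketch using only the figures quoted above), the two gaps aren't remotely comparable:

```python
# Relative deltas, using the TFLOPS and CU figures quoted above.
tflops_pairs = {
    "PS5 -> XSX":          (10.29, 12.15),
    "7700 XT -> 7900 XTX": (35.17, 61.39),
}
for name, (low, high) in tflops_pairs.items():
    print(f"{name}: +{(high / low - 1) * 100:.0f}% TFLOPS")  # ~+18% vs ~+75%

cu_pairs = {
    "PS5 -> XSX":          (36, 52),
    "PS5 Pro -> 7900 XTX": (60, 96),
}
for name, (low, high) in cu_pairs.items():
    print(f"{name}: +{(high / low - 1) * 100:.0f}% CUs")     # ~+44% vs ~+60%
```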

1. Richard specifically mentioned CU count for his reasoning. Your inquiry should be directed at him.

2. We are talking about Ray Tracing performance, where PS5 Pro has better RT hardware. Do you remember the RDNA 2 TF count Digital Foundry/Microsoft equated the Series X RT hardware to? To refresh your memory:

b43yItd.png
 

Gaiff

SBI’s Resident Gaslighter
Cyberpunk with PT should be possible. Especially if they use 1 ray instead of 2 rays.

Now if the 2x RT performance increase is what most devs can get out of it, then no, you are not getting PT on the PS5 Pro. You can look at RDNA2 cards that are 2x more powerful trying to run PT and it's virtually impossible. Or you are running at resolutions so low that it looks like shit because of all the RT noise artifacts.

Now if we are looking at 4x, then it puts the PS5 Pro at around 3080 levels of performance, and I've been able to run it at 30 fps at 4K DLSS Performance with some drops to 25 fps. If they are able to lower the number of rays from 2 to 1 like that PC mod does, then it should be doable.

But you have to take the higher end of Sony's own estimates to get PT, and we can't say just how likely that is going to be. If Sony had simply said 4x instead of 2-4x, then I would be more upset at Richard.

Path tracing on AW2 is very expensive and I wasn't able to run it over 20 fps. Most of the time it was under 20 fps at 4K DLSS Performance.
Do you run the game at Ultra settings or optimized settings? That’s a massive difference right there.

Not exactly betting on the PS5 Pro to be able to do path tracing but it isn’t a completely insane question either.
 

Bojji

Member
1. Richard specifically mentioned CU count for his reasoning. Your inquiry should be directed at him.

2. We are talking about Ray Tracing performance, where PS5 Pro has better RT hardware. Do you remember the RDNA 2 TF count Digital Foundry/Microsoft equated the Series X RT hardware to? To refresh your memory:

b43yItd.png

So far AMD RT performance is linked to number of cores, but of course overall GPU performance is the most important thing. Take for example 6700XT vs 6800

6700XT has 40CUs while 6800 has 50% more, difference is that 6700XT has higher clock (2581 MHz vs 2105 MHz) so overall performance difference is much less than 50%

rCIJwpf.jpg


Around 30% in very RT heavy scenarios:

RU12x5E.jpg
xBu5yVr.jpg
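The theoretical compute gap can be sanity-checked from those specs (a rough sketch using the standard peak-FP32 formula for RDNA 2, 64 shaders per CU x 2 ops per clock, with the boost clocks quoted above):

```python
# Peak FP32 throughput from CU count and clock (RDNA 2: 64 shaders/CU, 2 ops/clock).
def tflops(cus, clock_mhz):
    return cus * 64 * 2 * clock_mhz / 1e6

rx6700xt = tflops(40, 2581)  # ~13.2 TF
rx6800   = tflops(60, 2105)  # ~16.2 TF
print(f"6700 XT: {rx6700xt:.1f} TF, 6800: {rx6800:.1f} TF")
print(f"6800 advantage: +{(rx6800 / rx6700xt - 1) * 100:.0f}%")  # ~+22% despite +50% CUs
```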


What you are forgetting here is that yeah... Pro may be 2-4x (that's a big fucking difference...) faster in RT but no game is JUST RT, even CP with PT is a mix of raster and RT calculations so only PART of rendering will be 2-4x faster compared to PS5.
 

ChiefDada

Gold Member
Not exactly betting on the PS5 Pro to be able to do path tracing but it isn’t a completely insane question either.

That's really all I'm getting at here. It's not them saying it can't be done that's frustrating, it's their reasoning behind it, or lack thereof. As a tech outlet, they should avoid the overt cynicism especially when it's backed by rudimentary (flawed) logic.
 

SlimySnake

Flashless at the Golden Globes
Do you run the game at Ultra settings or optimized settings? That’s a massive difference right there.

Not exactly betting on the PS5 Pro to be able to do path tracing but it isn’t a completely insane question either.
I did several tests on the settings. Medium-high settings vs medium-to-low settings was roughly a 15% gain. It was 25-26 fps vs 29-30 fps in the most reflective area I could find.

You can see me play around with the settings to try and get 30 fps locked. I ended up turning all the volumetric settings down to low to get 30+ fps consistently. This was when PT was still in beta. The performance has improved since, and I think 30 fps locked is definitely doable at medium-low settings.



I was actually able to get it to run at 60 fps at 1440p Performance, or 720p internal, only dropping to the low 50s and high 40s when the game became CPU bound in the city.

Edit: I should mention that GeForce capture also shaves off 2-3 fps when turned on.
 

ChiefDada

Gold Member
I did several tests on the settings. Medium-high settings vs medium-to-low settings was roughly a 15% gain. It was 25-26 fps vs 29-30 fps in the most reflective area I could find.

You can see me play around with the settings to try and get 30 fps locked. I ended up turning all the volumetric settings down to low to get 30+ fps consistently. This was when PT was still in beta. The performance has improved since, and I think 30 fps locked is definitely doable at medium-low settings.



I was actually able to get it to run at 60 fps at 1440p Performance, or 720p internal, only dropping to the low 50s and high 40s when the game became CPU bound in the city.

Edit: I should mention that GeForce capture also shaves off 2-3 fps when turned on.


I sincerely appreciate you uploading in HDR. It looks stunning on my fancy new Samsung S24 Ultra whatever whatever.
 

Gaiff

SBI’s Resident Gaslighter
I did several tests on the settings. Medium-high settings vs medium-to-low settings was roughly a 15% gain. It was 25-26 fps vs 29-30 fps in the most reflective area I could find.

You can see me play around with the settings to try and get 30 fps locked. I ended up turning all the volumetric settings down to low to get 30+ fps consistently. This was when PT was still in beta. The performance has improved since, and I think 30 fps locked is definitely doable at medium-low settings.



I was actually able to get it to run at 60 fps at 1440p Performance, or 720p internal, only dropping to the low 50s and high 40s when the game became CPU bound in the city.

Edit: I should mention that GeForce capture also shaves off 2-3 fps when turned on.

I think the vanilla game has 2x bounces and 2x rays. As you mentioned before, 1x bounce and 2x rays could be feasible on the PS5 Pro. It's much more performant than the default 2x2 while not looking much worse. Also, I assume if Sony wanted it, they would aim for 30fps, not 60, so something like 1440p with an internal res of 1080p (or 960p?) with PSSR, and 1x bounce and 2x rays sounds plausible with a 30fps lock. Visually, it would look way better than the RT mode on the regular PS5 minus a softer IQ but if PSSR is decent, the IQ should still be passable.

The thing is, I'm not sure how much interest Sony will have in implementing something of the sort.

Whatever the case, I think it's a fair question, and Rich just laughing it off is bizarre. At that point, best come clean and just admit you were completely wrong about the performance of the machines instead of being befuddled whenever the Series X doesn't perform like you think it should. Didn't Cerny mention that it was much more difficult to keep a higher number of CUs busy, thus explaining in part why he went with 36? I vaguely remember him touching upon this in Road to PS5. Not that he'd need to tell us that anyway because it was obvious beforehand. DF are really looking like a bunch of tools there.
 
Whatever the case, I think it's a fair question, and Rich just laughing it off is bizarre. At that point, best come clean and just admit you were completely wrong about the performance of the machines instead of being befuddled whenever the Series X doesn't perform like you think it should. Didn't Cerny mention that it was much more difficult to keep a higher number of CUs busy, thus explaining in part why he went with 36? I vaguely remember him touching upon this in Road to PS5. Not that he'd need to tell us that anyway because it was obvious beforehand.

Yes he did @ 32:55

 

ChiefDada

Gold Member
So far AMD RT performance is linked to number of cores, but of course overall GPU performance is the most important thing. Take for example 6700XT vs 6800

You start off with bad logic - clearly PS5 Pro does not follow such a trend as a 67% CU increase and 28% bandwidth uplift doesn't explain a 2x-4x RT performance boost.

What you are forgetting here is that yeah... Pro may be 2-4x (that's a big fucking difference...) faster in RT but no game is JUST RT, even CP with PT is a mix of raster and RT calculations so only PART of rendering will be 2-4x faster compared to PS5.

I'm not forgetting anything. The more RT calculations a game has, the more favorably the PS5 Pro will compare to any RDNA 3 card. Furthermore, PS5 Pro raster isn't RDNA 3. Haven't you seen the discussions in this very thread suggesting RDNA 3.5 solves performance issues that dogged RDNA 3?
 
You start off with bad logic - clearly PS5 Pro does not follow such a trend as a 67% CU increase and 28% bandwidth uplift doesn't explain a 2x-4x RT performance boost.



I'm not forgetting anything. The more RT calculations a game has, the more favorably the PS5 Pro will compare to any RDNA 3 card. Furthermore, PS5 Pro raster isn't RDNA 3. Haven't you seen the discussions in this very thread suggesting RDNA 3.5 solves performance issues that dogged RDNA 3?


For all we know, PS5 Pro could be a prototype of RDNA 4 even in raster

I don't know how we can speculate on an architecture that basically doesn't exist yet.

What we learned today is that the power efficiency of RDNA 4 should be WAY better than it is for RDNA 3 (which is very poor compared to NVIDIA Ada).
 

Gaiff

SBI’s Resident Gaslighter
Yes he did @ 32:55


Thanks, yes, so I don't see why, in 2024, Rich is still wondering what's going on. You can even see this on the PC side, where the RTX 4090 has an insane compute advantage of 70% over the RTX 4080 but in gaming scenarios it's more like 30%, less than half of that. Parallelization becomes increasingly difficult the more CUs/Shader Cores/CUDA cores you add, especially in gaming workloads, which feature a slew of other bottlenecks besides raw compute.

Why doesn't Rich wonder why the 7900 XTX sometimes gets within 10-15% of the 4090 and can sometimes even match it when it should be losing by 50%?

Dumb takes from him and dismissing the Pro's RT performance as overblown when he hasn't even witnessed it.
 

FireFly

Member
Thanks, yes, so I don't see why, in 2024, Rich is still wondering what's going on. You can even see this on the PC side, where the RTX 4090 has an insane compute advantage of 70% over the RTX 4080 but in gaming scenarios it's more like 30%, less than half of that. Parallelization becomes increasingly difficult the more CUs/Shader Cores/CUDA cores you add, especially in gaming workloads, which feature a slew of other bottlenecks besides raw compute.

Why doesn't Rich wonder why the 7900 XTX sometimes gets within 10-15% of the 4090 and can sometimes even match it when it should be losing by 50%?

Dumb takes from him and dismissing the Pro's RT performance as overblown when he hasn't even witnessed it.
On the other hand the 7600 XT - 7800 XT and 6600 XT - 6800 XT scaling is very close to the theoretical differences.
 

Bojji

Member
You start off with bad logic - clearly PS5 Pro does not follow such a trend as a 67% CU increase and 28% bandwidth uplift doesn't explain a 2x-4x RT performance boost.



I'm not forgetting anything. The more RT calculations a game has, the more favorably the PS5 Pro will compare to any RDNA 3 card. Furthermore, PS5 Pro raster isn't RDNA 3. Haven't you seen the discussions in this very thread suggesting RDNA 3.5 solves performance issues that dogged RDNA 3?

The RT difference obviously comes from something else, but you still have that "basic" 67% increase just from having more cores.

So far RDNA1-RDNA2-RDNA3 show no IPC gains (unlike the GCN revisions), so I doubt "RDNA 3.5" will bring anything when it comes to raster. This GPU is rumored to have 33.5 TF, so it should perform exactly like a 33.5 TF RDNA3 GPU in raster.

RT performance will obviously be different, but I doubt this console will outperform the top-dog RDNA3 GPU even with RT improvements; outside of PT, the 7900 XTX is ~3090 Ti level:

relative-performance-rt_3840-2160.png


 

Mr.Phoenix

Member
4 Shader Engines seems logical to me, but at that point, might as well use 10 WGP per SE like with the PS5 instead of 8.

That would be 80 CUs total, with 72 CUs active. And it would still play nice with BC.

PS4: 1SE, 18CU on.
PS4 Pro/PS5: 2SE, 36CU on.
PS5 Pro: 2SE + 2SE / 36CU + 36CU
Butterfly mode like PS4 Pro.

Then the PS6 can use the same strategy and keep the same CU count as the PS5 Pro, just as the PS5 kept the same CU count as the PS4 Pro.
Nah, they won't do that. The only reason to make an 80CU chip with 8 CUs inactive is that, where that's done, there is usually a lower-tier product the cut-down chip can be sold as. If Sony is going the 4SE route, then it would make sense to target 32 WGP. 8 WGP/SE is the most efficient config they can go with. That also lets them just make the jump to 10 WGP/SE for the PS6.

I was watching the DF video on Control, and while the average delta between the XSX and PS5 was 16% across Alex's 25 scenarios, the corridor of doom, which is the toughest RT room in the game, had the XSX with just a 1 fps lead.

This corridor is full of reflections within reflections, and the high-CU XSX just doesn't give the same bang for buck. The other, simpler scenarios with fewer RT spaces gave a 10-20% increase in framerate.

To me, it's clear that the 44% extra CUs didn't really matter when it came to actual RT performance. RT performance is tied either to higher clocks or to infinity cache, or both. The PS5 Pro better have infinity cache or they will run into the same bottlenecks as the XSX, since they are all but confirmed to be using 60 CUs and 2.1 GHz clocks. The leaked docs showing a 2-4x increase hopefully mean that infinity cache is on there.

hpx3HHP.jpg
I doubt there is a snowball's chance in hell that the PS5 Pro has an infinity cache. Even if they went with something as little as 16-32 MB of it, that takes up a shit ton of space on a die. Memory transistors don't shrink at the same pace/scale as logic transistors. This is the reason AMD has the cache on separate dies in their chiplet designs. They use the more expensive fab node for the logic, and the less expensive one for the MCDs, which hold the infinity cache and the memory bus.

With the stuff you are talking about above, I expect there is another reason the PS5 closes the gap when it comes to RT performance. Or maybe a better way to say it is that in those situations the advantages the PS5 has over the XSX may be playing a more important role. It's also possible that the PS5 may be using more aggressive DRS than the XSX.
 

Loxus

Member
Nah, they won't do that. The only reason to make an 80CU chip with 8 CUs inactive is that, where that's done, there is usually a lower-tier product the cut-down chip can be sold as. If Sony is going the 4SE route, then it would make sense to target 32 WGP. 8 WGP/SE is the most efficient config they can go with. That also lets them just make the jump to 10 WGP/SE for the PS6.
PS5 has 10WGP per SE.
1WGP or 2CU are disabled in each SE for yields.

Matching PS5 10WGP per SE with 1WGP disabled.
2SE = 20WGP / 2WGP disabled = 36CU
3SE = 30WGP / 3WGP disabled = 54CU
4SE = 40WGP / 4WGP disabled = 72CU

Unless some major change happens, this is a pattern I expect Sony to follow.
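A minimal sketch of that pattern, assuming 10 WGP per SE with 1 WGP disabled per SE as above:

```python
# Active CU count for a given Shader Engine count, following the PS5-style layout:
# 10 WGP per SE, 1 WGP (2 CUs) disabled per SE for yields, 2 CUs per WGP.
def active_cus(shader_engines, wgp_per_se=10, disabled_wgp_per_se=1):
    return shader_engines * (wgp_per_se - disabled_wgp_per_se) * 2

for se in (2, 3, 4):
    print(f"{se} SE -> {active_cus(se)} active CUs")  # 36, 54, 72
```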
 

Bojji

Member
Then why isn't that the case with Series X compared to PS5 with 44% more CUs?

I showed you that the 6800 has 50% more CUs but in reality performs only 25-30% better than the 6700 XT thanks to its much lower clock speed; this situation is very similar to PS5 and XSX.

What is bizarre is that the 6800 has a 22% TF advantage yet performs even better than that (memory bus?), while the XSX has more TF, a wider memory bus and more cores and still loses to the PS5 in many games (or it's a tie most of the time). People shouldn't be wondering why Richard is baffled by this; it is abnormal compared to PC RDNA2 scaling.
 

mckmas8808

Mckmaster uses MasterCard to buy Slave drives
I did several tests on the settings. Medium-high settings vs medium-to-low settings was roughly a 15% gain. It was 25-26 fps vs 29-30 fps in the most reflective area I could find.

You can see me play around with the settings to try and get 30 fps locked. I ended up turning all the volumetric settings down to low to get 30+ fps consistently. This was when PT was still in beta. The performance has improved since, and I think 30 fps locked is definitely doable at medium-low settings.



I was actually able to get it to run at 60 fps at 1440p Performance, or 720p internal, only dropping to the low 50s and high 40s when the game became CPU bound in the city.

Edit: I should mention that GeForce capture also shaves off 2-3 fps when turned on.


This looks incredible. Thanks for the update. So you think the PS5 Pro can get 90% of this at 30 fps if PSSR can do 1080p-to-4K upscaling?
 

ChiefDada

Gold Member
I showed you that the 6800 has 50% more CUs but in reality performs only 25-30% better than the 6700 XT thanks to its much lower clock speed; this situation is very similar to PS5 and XSX.

What? You can't have it both ways. You can't acknowledge evidence of RDNA not being anywhere close to scaling linearly with CU count, then go on to laugh at the possibility of the PS5 Pro besting the 7900 XTX in RT simply because it has more CUs, as Richard and the DF crew have done.
 

SlimySnake

Flashless at the Golden Globes
With the stuff you are talking about above, I expect there is another reason the PS5 closes the gap when it comes to RT performance. Or maybe a better way to say it is that in those situations the advantages the PS5 has over the XSX may be playing a more important role. It's also possible that the PS5 may be using more aggressive DRS than the XSX.
There is no DRS. Alex talked to Remedy themselves and was told both XSX and PS5 are a settings match right down to resolution.

The consoles have virtually identical CPUs so it can't be that, especially not at 30 fps. They have the same GPU architecture. The only things different are the clocks and the CU count. We know that for RDNA 1.0 cards, which had no infinity cache, AMD refused to release anything over 40 CUs. Why? Maybe because they were like the Vega cards that didn't scale well, like Vega 56 and Vega 64. They needed the infinity cache to scale performance 1:1 with CUs.

Maybe that's why, despite the 65% increase in tflops, they are only seeing 45% gains in GPU performance. We saw the exact same thing with the 12 tflops XSX acting more like the 10 tflops PS5 in many, many games. Something Richard couldn't wrap his head around.

BTW, if RT performance has increased 2-4x with just a 65% increase in raw power, then that means they have added extra hardware to the GPU. So that would increase the size. Also, if they are able to do machine learning upscaling, then they would need something equivalent to the tensor cores in Nvidia GPUs, which also take up a lot of space. This is going to be a big GPU. Bigger than the 60 CU 6800 for sure.
 

onQ123

Member
That has nothing to do with the number of WGP per Shader Engine.

Below you can see 8 WGP per Shader Engine.
Navi48 is the same, but with 4 Shader Engines.
bDatbVp.jpg



RDNA3.5
Strix Point: 1SE, 8WGP, 16CU (8WGP per SE)
Strix Halo: 2SE, 20WGP, 40CU (10WGP per SE)

RDNA4
Navi44: 2SE, 16WGP, 32CU (8WGP per SE)
Navi48: 4SE, 32WGP, 64CU (8WGP per SE)

PS5 Pro: 2SE, 30WGP, 64CU (16 WGP per SE)
See how different the PS5 Pro Shader Engines are compared to RDNA3.5/4?

Like I said. Something is missing from the puzzle in regards to the PS5 Pro Shader Engines.
It's 3 Shader Engines going by the 45% rendering boost, unless they upgraded the rendering pipeline.
 

Mr.Phoenix

Member
PS5 has 10WGP per SE.
1WGP or 2CU are disabled in each SE for yields.

Matching PS5 10WGP per SE with 1WGP disabled.
2SE = 20WGP / 2WGP disabled = 36CU
3SE = 30WGP / 3WGP disabled = 54CU
4SE = 40WGP / 4WGP disabled = 72CU

Unless some major change happens, this is a pattern I expect Sony to follow.
I just don't see Sony making hardware and having to disable 10% of said hardware to improve yields. Just doesn't make sense to me. Especially when there is a better more practical alternative.
The consoles have virtually identical CPUs so it can't be that, especially not at 30 fps. They have the same GPU architecture. The only things different are the clocks and the CU count. We know that for RDNA 1.0 cards, which had no infinity cache, AMD refused to release anything over 40 CUs. Why? Maybe because they were like the Vega cards that didn't scale well, like Vega 56 and Vega 64. They needed the infinity cache to scale performance 1:1 with CUs.
Nope. First off, that's not the only difference. While there is no doubt an infinity cache helps, you need to realize/remember what it's for. It's there to increase "effective" bandwidth. It's AMD's approach to getting more bandwidth without having to use a bigger memory bus or faster RAM. At least that was why they initially did it.

A more important difference between the PS5 and XSX is the amount of GPU cache they have. Think it's the L2 cache. The PS5 has 4MB, the XSX has 5MB. That basically translates to roughly 111KB/CU for the PS5 and 96KB/CU for the XSX. And you need the cache to keep the GPU fed. This is why it has been highlighted that the XSX's wide GPU is being underutilized. It's possible that RT tasks are more cache sensitive.
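Those per-CU figures are just the L2 pool divided by the active CU count (a quick sketch; decimal KB, which appears to be how the ~111/~96 numbers were derived):

```python
# L2 cache per active CU, using the figures quoted above (decimal KB assumed).
l2_kb = {"PS5": 4000, "XSX": 5000}   # 4 MB and 5 MB of L2
cus   = {"PS5": 36,   "XSX": 52}     # active CU counts
for console in ("PS5", "XSX"):
    print(f"{console}: ~{l2_kb[console] / cus[console]:.0f} KB of L2 per CU")  # ~111 vs ~96
```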

As for them stopping at 40 CUs with RDNA 1, I think it's obvious RDNA came in hot. It didn't even have RT.
Maybe that's why, despite the 65% increase in tflops, they are only seeing 45% gains in GPU performance. We saw the exact same thing with the 12 tflops XSX acting more like the 10 tflops PS5 in many, many games. Something Richard couldn't wrap his head around.
There are just too many other things that can lead to that. Thinking it all comes down to cache... is confirmation bias.
BTW, if RT performance has increased 2-4x with just a 65% increase in raw power, then that means they have added extra hardware to the GPU. So that would increase the size. Also, if they are able to do machine learning upscaling, then they would need something equivalent to the tensor cores in Nvidia GPUs, which also take up a lot of space. This is going to be a big GPU. Bigger than the 60 CU 6800 for sure.
All the more reason why they won't be adding an infinity cache. As I said, AMD puts the infinity cache (aka L3 cache) on the MCD chiplets for a reason. And those chiplets are made on an older node. E.g. the leaked Navi 48 spec has the GPU die at around 240mm2. I can guarantee that that is only referring to the GCD and doesn't include the 4 MCDs. The PS5 Pro wouldn't have that luxury.

So that 240mm2 is RDNA4 with 64 CUs. And each RDNA4 CU has the better RT units and the AI units. This is the GCD. It would not have the L3 cache or even the memory PHY controllers/bus, which would all be in the MCDs. The PS5 Pro would have to add a CPU, all 8x32-bit PHY memory controllers, the IO unit, and whatever extra stuff Sony adds to their chips on top of that 240mm2 chip. In addition to maybe 8MB of L2 GPU cache at best (or thereabouts), which would be double the cache in the current PS5.

Everything I have just described puts the PS5 Pro at my best-guess size of around 320mm2, or even less. If there is one thing Sony has always shown, it's that they never take the excessive, go-for-broke route in hardware design that companies like MS would take. Hell, even the PS4 Pro APU came in at a smaller size than the launch PS4's APU. Yeah, yeah, different fab process, but you get my point.
 

Loxus

Member
I just don't see Sony making hardware and having to disable 10% of said hardware to improve yields. Just doesn't make sense to me. Especially when there is a better more practical alternative.
36/40 is disabling 10% as well.

I see you seem to have forgotten that the first set of rumored PS5 Pro specs had it with 72 CUs.

Everyone went with that rumor as making sense, since it copied what the PS4 Pro did.
Vk9a0w3.jpg
 

solidus12

Member
Richard also labeled outlets referring to PS5 Pro as a "RT monster" as overblown. He said about PS5, "it's not a RT monster, it's just better at RT." Excuse me? The hypocrisy is beyond insane now. No way would leaked docs from Xbox claiming a 2-4x boost in RT get the same downplay and subdued reaction from them, and rightfully so.

Also, 60 CU (PS5 Pro) vs 96 CU (7900xtx) delta isn't far off from 36 vs 52 CU of PS5 and Series X. And the base consoles are of the same core architecture! It's no wonder why he/they are still confused about PS5 and Series X performance differences.
 
This is quite a good test; an underclocked 7800 XT is pretty much a PS5 Pro (aside from RT performance).
A console environment should mean it performs the same as the 7800 XT. Remember, these tests have a better CPU than the one leaked for the Pro, which can affect the results.
 

Bojji

Member
What? You can't have it both ways. You can't acknowledge evidence of RDNA not being anywhere close to scaling linearly with CU count, then go on to laugh at the possibility of the PS5 Pro besting the 7900 XTX in RT simply because it has more CUs, as Richard and the DF crew have done.

Most GPUs in one family have similar clocks, so you can look at the number of cores and they scale quite linearly (like the Ada cards: 4070, 4070 Ti, 4080). With RDNA2 there is this weird situation where the 6700 XT has a higher clock than parts with more cores.

Still, TF scaling is quite close to real life; the 6800 has 22% more TF and performs about 25% better in real life. This breaks down with the higher models.

Returning to the 7900 XTX, you know that leakers were saying it may be close to the 4070. The 4070 is way ahead of comparable AMD cards in RT but still overall much slower than the 3090 Ti/7900 XTX. AMD would have to be more performant in RT than Nvidia to close this gap:

fu0gzvQ.jpg


It's over 30% in 4K ^

A console environment should mean it performs the same as the 7800 XT. Remember, these tests have a better CPU than the one leaked for the Pro, which can affect the results.

No. The PS5 is like a 6700 and sometimes performs better, sometimes worse in games; there is no big advantage for the PS5. The "console environment" is a thing of the past; console parts are mostly comparable to PC parts.
 

Lysandros

Member
A more important difference between the PS5 and XSX is the amount of GPU cache they have. Think it's the L2 cache. The PS5 has 4MB, the XSX has 5MB. That basically translates to roughly 111KB/CU for the PS5 and 96KB/CU for the XSX. And you need the cache to keep the GPU fed. This is why it has been highlighted that the XSX's wide GPU is being underutilized. It's possible that RT tasks are more cache sensitive.
An even more important difference is that each PS5 CU has, on average, 14.2 KB of L1 cache within its shader array, while an XSX CU has just 9.8 KB. That's a big difference of ~45% (at significantly higher bandwidth due to frequency). And yes, as far as I know, RT performance can be very cache sensitive, especially regarding BVH structures. That's an often overlooked hardware advantage favoring the PS5 in terms of RT and overall compute efficiency/CU saturation.

Edit: By the way, with the amount of GPU L1 cache being the same (128 KB per SA) between the systems and this pool of cache having 22% higher bandwidth on PS5, each PS5 CU should have ~76% more L1 bandwidth available on average (1.22 ÷ 9 vs 1 ÷ 13) compared to the XSX. (Please correct me if there is a mistake.)
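For what it's worth, here is that arithmetic spelled out (a quick sketch; 128 KB of L1 per shader array, ~9 CUs per SA on PS5 vs ~13 on XSX, and a ~22% clock advantage for PS5, all per the figures above):

```python
# L1 cache per CU within a shader array, and each CU's share of L1 bandwidth.
L1_PER_SA_KB = 128
cus_per_sa   = {"PS5": 9, "XSX": 13}
clock_factor = {"PS5": 1.22, "XSX": 1.0}  # PS5's L1 runs ~22% faster due to clocks

for c in ("PS5", "XSX"):
    print(f"{c}: {L1_PER_SA_KB / cus_per_sa[c]:.1f} KB of L1 per CU")  # 14.2 vs 9.8

ps5_share = clock_factor["PS5"] / cus_per_sa["PS5"]
xsx_share = clock_factor["XSX"] / cus_per_sa["XSX"]
print(f"PS5 per-CU L1 bandwidth advantage: +{(ps5_share / xsx_share - 1) * 100:.0f}%")  # ~+76%
```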
 

winjer

Gold Member
An even more important difference is that each PS5 CU has, on average, 14.2 KB of L1 cache within its shader array, while an XSX CU has just 9.8 KB. That's a big difference of ~45% (at significantly higher bandwidth due to frequency). And yes, as far as I know, RT performance can be very cache sensitive, especially regarding BVH structures. That's an often overlooked hardware advantage favoring the PS5 in terms of RT and overall compute efficiency/CU saturation.

Are you talking about the L2 cache?
Because RDNA 1 and 2 have a 128KB L1 cache per SA.
 
3SE = 30WGP / 3WGP disabled = 54CU

That's what I thought 6 months ago as well, but now it doesn't match the latest Tom Henderson leaks.


"According to documents sent to Insider Gaming, this is possible because of faster RAM (28% faster) and a faster GPU that is 67% larger than the standard console (45% faster) -

If the PS5 Pro GPU is said to be 67% LARGER, it means the active CU count is 60.

36->60 = +67%

36->54 = +50%
 

sncvsrtoip

Member
I was watching the DF video on Control, and while the average delta between the XSX and PS5 was 16% across Alex's 25 scenarios, the corridor of doom, which is the toughest RT room in the game, had the XSX with just a 1 fps lead.

This corridor is full of reflections within reflections, and the high-CU XSX just doesn't give the same bang for buck. The other, simpler scenarios with fewer RT spaces gave a 10-20% increase in framerate.

To me, it's clear that the 44% extra CUs didn't really matter when it came to actual RT performance. RT performance is tied either to higher clocks or to infinity cache, or both. The PS5 Pro better have infinity cache or they will run into the same bottlenecks as the XSX, since they are all but confirmed to be using 60 CUs and 2.1 GHz clocks. The leaked docs showing a 2-4x increase hopefully mean that infinity cache is on there.

hpx3HHP.jpg
High RT usage is also high CPU usage, and we regularly see in CPU-heavy scenarios that the PlayStation 5 is closer to the XSX or even faster (probably because of a lighter API).
 

winjer

Gold Member
No, I am talking about the amount of L1 cache available 'per CU' (per SA) on average. As in 128 ÷ 9 = 14.22 in PS5's case.

But L1 cache is only shared across each shader array. So it's the same.
It's the L2 cache that is shared across all CUs. And in this case, the PS5 does have a bit more L2 cache per CU.
 

Lysandros

Member
But L1 cache is only shared across each shader array. So it's the same.
It's the L2 cache that is shared across all CUs. And in this case, the PS5 does have a bit more L2 cache per CU.
128 KB of L1 cache is shared between ~9 CUs for the PS5 and ~13 CUs for the XSX in each shader array. I am talking on a 'per array' basis. Per CU in each SA, if you will.
 

Gaiff

SBI’s Resident Gaslighter
No. The PS5 is like a 6700 and sometimes performs better, sometimes worse in games; there is no big advantage for the PS5. The "console environment" is a thing of the past; console parts are mostly comparable to PC parts.
Heisenberg has it on pretty good authority that it performs like a 4070/3080, presumably closer to the 3080 thanks to its significantly wider bus and higher bandwidth.
A console environment should mean it performs the same as the 7800 XT. Remember, these tests have a better CPU than the one leaked for the Pro, which can affect the results.
Yeah, so about a 4070/3080. They're all in the same tier, but the 4070 starts suffering at higher resolutions while the 7800 XT suffers in ray tracing. The 3080 is the most balanced of the bunch but has the smallest amount of VRAM. At least the 10GB model.
 

Bojji

Member
Heisenberg has it on pretty good authority that it performs like a 4070/3080, presumably closer to the 3080 thanks to its significantly wider bus and higher bandwidth.

Yeah, so about a 4070/3080. They're all in the same tier, but the 4070 starts suffering at higher resolutions while the 7800 XT suffers in ray tracing. The 3080 is the most balanced of the bunch but has the smallest amount of VRAM. At least the 10GB model.

I think what he is talking about is its performance with RT; in raster alone it should be below the 7700 XT:

pQ0MzuO.jpg


But of course time will tell, DF will obviously compare it to PC parts.
 
An even more important difference is that each PS5 CU has, on average, 14.2 KB of L1 cache within its shader array, while an XSX CU has just 9.8 KB. That's a big difference of ~45% (at significantly higher bandwidth due to frequency). And yes, as far as I know, RT performance can be very cache sensitive, especially regarding BVH structures. That's an often overlooked hardware advantage favoring the PS5 in terms of RT and overall compute efficiency/CU saturation.

Edit: By the way, with the amount of GPU L1 cache being the same (128 KB per SA) between the systems and this pool of cache having 22% higher bandwidth on PS5, each PS5 CU should have ~76% more L1 bandwidth available on average (1.22 ÷ 9 vs 1 ÷ 13) compared to the XSX. (Please correct me if there is a mistake.)
I think we should stop looking at this 45% figure. It reminds me of the 14 + 4 CU narrative we got on PS4 (which was from Cerny himself). They are engineers and just too honest with numbers. We know TFLOPS don't scale linearly with real performance. We won't get 65% (or even 130%) higher rendering in the same non-RT game from PS5 to Pro. We'll get much better performance when adding RT effects and much better IQ from PSSR though, like they themselves said (2 to 4x better perf with RT), but obviously people here won't trust those engineers' numbers.
128 KB of L1 cache is shared between ~9 CUs for the PS5 and ~13 CUs for the XSX in each shader array. I am talking on a 'per array' basis. Per CU in each SA, if you will.
Yes this is likely the main problem on XSX. They'll probably increase L1 cache per CU for PS5 Pro.
 

onQ123

Member
I think we should stop looking at this 45% figure. It reminds me of the 14 + 4 CU narrative we got on PS4 (which was from Cerny himself). They are engineers and just too honest with numbers. We know TFLOPS don't scale linearly with real performance. We won't get 65% (or even 130%) higher rendering in the same non-RT game from PS5 to Pro. We'll get much better performance when adding RT effects and much better IQ from PSSR though, like they themselves said (2 to 4x better perf with RT), but obviously people here won't trust those engineers' numbers.

Yes this is likely the main problem on XSX. They'll probably increase L1 cache per CU for PS5 Pro.
The 45% is most likely the rendering pipeline.

The 45% is tied to the fixed-function units in the Shader Engines.

So if it's going from 2 SE to 3 SE:

The simple way of getting the answer is 2 x 2.23 GHz = 4.46 units of fixed-function rendering,

while 3 x 2.18 GHz = 6.54 units of fixed-function rendering.


4.46 + 46% ≈ 6.51

(This is rough math, without using the exact clock speeds.)
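As a rough sketch of that back-of-the-envelope math (clocks approximate, units arbitrary):

```python
# Relative fixed-function throughput ~ Shader Engines x clock (arbitrary units).
ps5     = 2 * 2.23  # 2 SE at ~2.23 GHz -> 4.46
ps5_pro = 3 * 2.18  # 3 SE at ~2.18 GHz -> 6.54
print(f"PS5: {ps5:.2f}, Pro: {ps5_pro:.2f}")
print(f"Increase: +{(ps5_pro / ps5 - 1) * 100:.0f}%")  # ~+47%, close to the quoted 45%
```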
 

Fafalada

Fafracer forever
High RT usage is also high CPU usage
Even as a rule of thumb, that's not true. Anyway, if that particular area is CPU bound (what's on screen doesn't look like it, given it's a static corridor, but maybe something under the hood doesn't play nice), then DF should not have used it as their GPU RT benchmark.

we regularly see in CPU-heavy scenarios that the PlayStation 5 is closer to the XSX or even faster (probably because of a lighter API)
And we see it go the other way too. PS4/XB1 were the same. The graphics API 'was' better at extracting CPU perf on PS4, but it didn't offer any guarantees, and it was entirely possible to end up on the opposite side.
 

Loxus

Member
That's what I thought 6 months ago as well, but now it doesn't match the latest Tom Henderson leaks.




If the PS5 Pro GPU is said to be 67% LARGER, it means the active CU count is 60.

36->60 = +67%

36->54 = +50%
Depends on how you look at it.
PS5 GPU has 40CUs total.
40 + 67% = 66.8

Not only that, a GPU consists of CUs, ROPs, Cache, Geometry Processor, etc.

He has to be more specific. He should have said 67% more active CUs or WGP instead of just saying a 67% larger GPU.
 
Depends on how you look at it.
PS5 GPU has 40CUs total.
40 + 67% = 66.8

Not only that, a GPU consists of CUs, ROPs, Cache, Geometry Processor, etc.

He has to be more specific. He should have said 67% more active CUs or WGP instead of just saying a 67% larger GPU.

In Road to PS5, the RDNA2 CU is referred to as much LARGER than a GCN CU; it's the same thing: @35:09 to 35:28



I don't get why you put 40 in there... The count doesn't make sense as 4 are disabled

Why should they count disabled CUs???

Every leak suggests it's 60 active CUs; 54 seems out of the window now.

Also Navi 48 has 64 CUs
 