
[Computerbase] RDNA 2 CU Scaling

jroc74

Phone reception is more important to me than human rights
[Jurassic Park Ian Malcolm GIF]
Yup.

The more things change, the more they stay the same. It's the middle of 2022... how old narratives still get pushed is wild.

I applaud the OP for the thread tho; it has the makings of a good discussion if some folks would just let go of hopes and dreams.
 

01011001

Banned
If it's because the XSX has a fixed clock, then why does it use practically the same CPU clock as the PS5? And yes, it does. The higher frequency is only available in a limited scenario.

I have no idea what you are talking about now...

And no, the XSX has higher clocks in all modes:
3.8 GHz without SMT and 3.66 GHz with SMT, while the PS5's max CPU clock is 3.5 GHz.
 

Tripolygon

Banned
Microsoft needed to reach 12 TFLOPs because that spec was a requirement for Project Scarlett's secondary use as an Azure cloud processor capable of FP32/FP16 GPGPU tasks.
All GPUs support FP32/FP16 operations. It is not a specific requirement for the cloud.
Unlike Sony who made a SoC solely for gaming consoles, Microsoft had their Azure Silicon Architecture Team provide the specs for AMD to produce the semi-custom design.
PS5 supports FP32/FP16 and can be used in the "cloud" for cloud gaming. There are PS3 and PS4 server blades running PS Now with plans to add PS5 servers in the future.
There's a very interesting piece on the Series X SoC by Anandtech where they reveal that Microsoft even considered having a 56 CU design running at an even slower 1.675GHz clock because it would have 20% better power efficiency for cloud compute tasks.
You have it backwards. Microsoft says Xbox Cloud gaming can use lower binned chips at higher clocks than consumer Xbox Series X consoles.

Xbox Cloud Gaming can use any chip with at least 24 WGPs.

24 WGP = 48 CU = 3072 SPs @ 2 GHz ≈ 12.3 TF
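For reference, a rough sketch (in Python) of the FP32 throughput math behind those figures; the 64 shader processors per CU and the 2 FMA ops per clock are the standard RDNA 2 factors, and the clocks are the ones quoted in this thread:

    # rough FP32 throughput: CUs x 64 shader processors x 2 ops/clock (FMA) x clock
    def fp32_tflops(cus, clock_ghz, sp_per_cu=64):
        return cus * sp_per_cu * 2 * clock_ghz / 1000.0

    print(fp32_tflops(48, 2.0))    # ~12.3 TF for the 48 CU / 2 GHz cloud configuration
    print(fp32_tflops(52, 1.825))  # ~12.15 TF for the retail Series X
    print(fp32_tflops(36, 2.23))   # ~10.28 TF for the PS5 at its maximum clock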




Though this could mean that the Series X would perform worse than the PS5 in videogames.
Having all 56 CUs at a lower clock does not mean it would perform worse than the PS5. The gap would still be the same.
 

Riky

$MSFT
Nice deflection, but your remark was about the PS5... This is you, despite all the explanation that had been provided on how the console architecture works:

[GIF]
PS5? I didn't even mention it. I said Microsoft wanted sustained performance, and Microsoft makes the Xbox console.
 

ChiefDada

Gold Member
That said, it's a very interesting article, especially considering the 2.0 GHz, 40 CU 6700 XT is basically a 10.2 TFLOPS PS5. I think the Death Stranding benchmark is a great example because it scales very well on PS5, hitting native 4K 60 fps at times: a 4x increase in pixel count and a further 2x increase in framerate, i.e. an 8x larger pixel budget, on a console only about 5x more powerful than the PS4 in raw TFLOPS. It's obvious that the ~50% IPC gains from GCN 1.1 through Polaris to RDNA make the 10.23 TFLOPS PS5 act like a ~16 TFLOPS GCN 1.1 card, which means the PS5's ~8x effective compute (over the PS4) directly resulted in an ~8x performance boost. Here we see the 10.2 TFLOPS 6700 XT average 64 fps, which is pretty much what we see on the PS5 during normal gameplay, though it's capped at 60 fps so we don't get an accurate fps average.
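As a rough back-of-the-envelope check of that reasoning (in Python; the PS4 baseline and the ~1.5x GCN-to-RDNA per-TFLOP uplift are assumed round numbers, not exact figures):

    ps4_tf = 1.84      # base PS4, GCN
    ps5_tf = 10.23     # PS5, RDNA 2
    ipc_gain = 1.5     # assumed combined GCN -> RDNA 2 per-TFLOP ("IPC") uplift

    raw_ratio = ps5_tf / ps4_tf              # ~5.6x raw TFLOPS
    effective_ratio = raw_ratio * ipc_gain   # ~8.3x "GCN-equivalent" compute
    pixel_budget = 4 * 2                     # 4x resolution x 2x frame rate = 8x pixels per second

    print(raw_ratio, effective_ratio, pixel_budget)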

What I'm very curious about is whether we will see instances of the PS5 performing better than the 6700 XT in new-gen games despite fewer CUs, lower clocks and fewer TFLOPS. The 6700 XT's Infinity Cache is significantly smaller than on the larger 6000-series cards, and memory bandwidth favors the PS5. My gut feeling is that the PS5's cache setup would be more favorable than Infinity Cache if it were feasible on PC. I also could just be talking nonsense.
 

assurdum

Banned
I have no idea what you are talking about now...

And no, the XSX has higher clocks in all modes:
3.8 GHz without SMT and 3.66 GHz with SMT, while the PS5's max CPU clock is 3.5 GHz.
Wow, 3.6 GHz with a fixed clock. That's a huge boost in CPU speed :messenger_tears_of_joy:. Seriously, are you trolling? All the developers say a frequency difference like that is completely irrelevant in terms of performance; that's why I maintained they are the same. Furthermore, the higher frequency without SMT is practically useless for the more advanced engines. That's what I'm talking about.
 
Last edited:

01011001

Banned
Wow, 3.6 GHz with a fixed clock. That's a huge boost in CPU speed :messenger_tears_of_joy:. Seriously, are you trolling? All the developers said a frequency difference like that is completely irrelevant in terms of performance; that's why I said they are the same. Plus, the higher frequency without SMT is practically useless for the more modern engines. That's what I'm talking about.

I don't even know what the fuck you are talking about. I think I said this in another thread already but I think your English is not good enough to follow some of the discussion... like wtf?
 

assurdum

Banned
I don't even know what the fuck you are talking about. I think I said this in another thread already but I think your English is not good enough to follow some of the discussion... like wtf?
Are you kidding me now? WTF are you trying to do with this reply? Now my English is the issue? Lol. Are you trying to say 3.5 vs 3.6 GHz CPUs aren't practically the same in terms of performance? I don't think debating a 0.1 GHz CPU difference makes any sense. And again, running the CPU without SMT just to get the higher frequency that is only available that way is a waste of CPU performance.
 
Last edited:

01011001

Banned
Are you kidding me now? WTF are you trying to do with this reply? Now my English is the issue? Lol. Are you trying to say 3.5 vs 3.6 GHz CPUs aren't practically the same in terms of performance? I don't think debating a 0.1 GHz CPU difference makes any sense.

No one was talking about Xbox vs PS5, and suddenly you start that shit while apparently not understanding the conversation at all... it was at no point about that.
 

assurdum

Banned
No one was talking about Xbox vs PS5, and suddenly you start that shit while apparently not understanding the conversation at all... it was at no point about that.
OK, let me recap. You said in a previous post that the PS5 can't have a higher CPU frequency because its clock is variable. So I was being ironic, asking "why doesn't the XSX CPU have a higher frequency then?". That's what the conversation has revolved around.
 
Last edited:

01011001

Banned
OK, let me recap. You said in a previous post that the PS5 can't have a higher CPU frequency because its clock is variable. So I was being ironic, asking "why doesn't the XSX CPU have a higher frequency then?". What exactly have I confused?

See, you have zero idea what's going on.
DaGwaphics was talking about PC CPUs and variable clocks, then you misunderstood what he said and thought he was talking about consoles and the PS5.
Then I told you that the PS5's clock speeds are not high enough for the variance in CPU bins to take effect.
 

assurdum

Banned
See, you have zero idea what's going on.
DaGwaphics was talking about PC CPUs and variable clocks, then you misunderstood what he said and thought he was talking about consoles and the PS5.
Then I told you that the PS5's clock speeds are not high enough for the variance in CPU bins to take effect.
The user you mention said the PS5 can't use AMD SmartShift because it's unpredictable and can cause performance deficiencies, which honestly is totally absurd, because the PS5 absolutely does use AMD SmartShift and it has no performance deficiency.
 
Last edited:

01011001

Banned
The user you mention said the PS5 can't use AMD SmartShift because it's unpredictable and can cause performance deficiencies, which honestly is totally absurd, because the PS5 absolutely does use AMD SmartShift and it has no performance deficiency.

That's also true: SmartShift wouldn't work on a console as-is. Sony most likely doesn't use it, or at least not the version seen on PC.
 

Panajev2001a

GAF's Pleasant Genius
That's also true: SmartShift wouldn't work on a console as-is. Sony most likely doesn't use it, or at least not the version seen on PC.


"The CPU and GPU each have a power budget, of course the GPU power budget is the larger of the two," adds Cerny. "If the CPU doesn't use its power budget - for example, if it is capped at 3.5GHz - then the unused portion of the budget goes to the GPU. That's what AMD calls SmartShift. There's enough power that both CPU and GPU can potentially run at their limits of 3.5GHz and 2.23GHz, it isn't the case that the developer has to choose to run one of them slower."
 

SlimySnake

Flashless at the Golden Globes

You would think a next-gen-only game like the Matrix demo, which hits the CPU harder than any current-gen game, would make people see that this does not impact the GPU in any way, but nope.

If anything, the question should be why this fixed CPU and GPU clock XSX design is dropping frames just like the variable-clocked PS5 despite a significant 18% power gap.
 

Panajev2001a

GAF's Pleasant Genius
It's still most likely not the same way SmartShift works on PC; that version is unpredictable, and Cerny said theirs is almost 100% predictable.
Improved SmartShift, plus the fact that, well, this is not a PC solution meant to work with thick abstraction/compatibility layers, but something customised to the way the overall clocking solution works inside the PS5 (ah, fixed specs and console designs, eh :D). But it is SmartShift ;).
 
Last edited:

01011001

Banned
Improved SmartShift, plus the fact that, well, this is not a PC solution meant to work with thick abstraction/compatibility layers, but something customised to the way the overall clocking solution works inside the PS5 (ah, fixed specs and console designs, eh :D). But it is SmartShift ;).

It's not SmartShift. SmartShift is AMD's solution for laptops.
SmartShift is a brand name, and there are other ways to do similar things.

If it works differently, it's not SmartShift. Very simple.
 

Lysandros

Member
Having all 56 CUs at a lower clock does not mean it would perform worse than the PS5. The gap would still be the same.
This would increase the PS5's fixed-function (clock) advantage to +33% instead of +22%, along with its even faster/higher-bandwidth L1 and L2 caches. How would this not impact performance throughput? What kind of "gap" are we talking about, precisely?
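For reference, the clock ratios behind those percentages (retail clocks from the public specs; the 1.675 GHz figure is from the Anandtech piece mentioned earlier in the thread):

    ps5_gpu_clock = 2.23      # GHz, PS5 maximum GPU clock
    xsx_gpu_clock = 1.825     # GHz, retail Series X
    xsx_56cu_clock = 1.675    # GHz, the 56 CU variant Microsoft reportedly considered

    print(ps5_gpu_clock / xsx_gpu_clock - 1)    # ~0.22 -> ~22% clock advantage vs retail XSX
    print(ps5_gpu_clock / xsx_56cu_clock - 1)   # ~0.33 -> ~33% vs the 56 CU configuration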
 

Lysandros

Member
If anything, the question should be why this fixed CPU and GPU clock XSX design is dropping frames just like the variable-clocked PS5 despite a significant 18% power gap.
Because there isn't "a significant 18% power gap". The 18% you are focusing on is the difference in compute, which happens to be only one metric contributing to a GPU's overall power.
 
Last edited:

ChiefDada

Gold Member
If anything, the question should be why this fixed CPU and GPU clock XSX design is dropping frames just like the variable-clocked PS5 despite a significant 18% power gap.

On that note, the PS5's CPU SMT is always engaged IIRC, whereas the Series X has presumably continued with SMT off, as it does with cross-gen games, to tap into the 3.8 GHz CPU profile. Since the Matrix demo was single-thread-friendly and CPU-bound at points, why do we not see performance differentials? And presumably the gap would only become narrower with multi-threaded games, as the Series X then drops to 3.66 GHz. Which reinforces my question of why that power wasn't given to the GPU instead; it seems it would have gone further there.
 

winjer

Gold Member
On that note, the PS5's CPU SMT is always engaged IIRC, whereas the Series X has presumably continued with SMT off, as it does with cross-gen games, to tap into the 3.8 GHz CPU profile. Since the Matrix demo was single-thread-friendly and CPU-bound at points, why do we not see performance differentials? And presumably the gap would only become narrower with multi-threaded games, as the Series X then drops to 3.66 GHz. Which reinforces my question of why that power wasn't given to the GPU instead; it seems it would have gone further there.

Bollocks.
The SMT function is chosen by the devs on the Xbox. It's not enforced by MS.
Devs can choose one of two options: SMT with slightly lower clocks, or SMT off with higher clocks.
 

Tripolygon

Banned
This would increase the PS5's fixed-function (clock) advantage to +33% instead of +22%, along with its even faster/higher-bandwidth L1 and L2 caches. How would this not impact performance throughput? What kind of "gap" are we talking about, precisely?
The current “gap” that exists now would be the same. How that plays out in actual games would be interesting to see.
 

Fafalada

Fafracer forever
Sure, but you can still analyze it in a vacuum. Some games can utilize a high core count better or worse than others.
Much more common is that some games are more or less compute-bound than others (and these deltas are not just statistical noise). This is why both the One X and the Pro substantially widened the rest of the pipeline, not just added a flops multiplier, and both of them scaled very close to linearly with available compute, relative to the base consoles, as a result.

CU utilization can't be compared in a vacuum either: the respective hardware relies on the rest of the infrastructure to keep a different number of CUs fed. Sure, software 'can' play a part, but that's much more likely to be specific to individual workloads within a frame, not games as a whole.
 

Lysandros

Member
The current “gap” that exists now would be the same. How that plays out in actual games would be interesting to see.
I see. So you were referring only to the XSX's compute edge, while excluding the rest of the GPU's metrics/specs and real-world performance; I get it. We were not discussing the same thing, then.
 

ChiefDada

Gold Member
Do you realize that sometimes SMT can bring lower performance?
It's an advantage to have the option to disable it. Even more so if it gives a clock boost.

Yes, which goes back to my question. The Matrix demo/UE5 favors single-threaded performance. I thought maybe we'd see some hint of a performance difference since there were CPU-related bottlenecks.

For the exact same reason that it is enabled on your PC CPU and PS5 for current/cross gen games.
But I'm talking about the Series X, since that's the only console that can enable/disable it, right?

Clearly my wording was off and I'm doing a poor job of communicating the question so we can just leave it lol.
 

Lysandros

Member
Actually, they were running with that spec as late as the end of 2019. I wonder if the shift to a lower CU count actually improved the yields though (more room for errors?).
A console GPU with 56 CUs, none of them disabled for yields, would be quite problematic, I think.
 
Last edited:

Loxus

Member
It's still most likely not the same way SmartShift works on PC; that version is unpredictable, and Cerny said theirs is almost 100% predictable.
So how does it work on PC then?
As far as I know, there's only one version of SmartShift, and both the PS5 and PC laptops use it.

From Road to PS5.
"While we're at it, we also use AMD's Smart Shift technology and send any unused power from the CPU to the GPU so it can squeeze out a few more pixels."

From AMD
AMD SmartShift technology dynamically shifts power in your laptop to help boost performance for gaming, video editing, 3D rendering, content creation and productivity.
 

Pedro Motta

Member
You would think a next-gen-only game like the Matrix demo, which hits the CPU harder than any current-gen game, would make people see that this does not impact the GPU in any way, but nope.

If anything, the question should be why this fixed CPU and GPU clock XSX design is dropping frames just like the variable-clocked PS5 despite a significant 18% power gap.
Because...when triangles are small...you know the rest.
 

01011001

Banned
So how does it work on PC then?
As far as I know, there's only one version of SmartShift, and both the PS5 and PC laptops use it.

From Road to PS5.
"While we're at it, we also use AMD's Smart Shift technology and send any unused power from the CPU to the GPU so it can squeeze out a few more pixels."

From AMD
AMD SmartShift technology dynamically shifts power in your laptop to help boost performance for gaming, video editing, 3D rendering, content creation and productivity.

Cerny said developers basically see some kind of expected performance profile, or something along those lines, and that it's basically always easy to know how the hardware will behave at any given moment.

On PC that's not really the case. CPU and GPU clocks fluctuate like hell at times; my old 1070 often went from 1,600 MHz to 2,100 MHz and up and down between those while playing. The PS5 isn't doing that. It also doesn't change clocks depending on how much power is needed; it changes clocks only if the predetermined power limit is reached (at least that's how it sounded in the presentation).

So SmartShift in a laptop works very differently from that. It might be similar if both the CPU and GPU run at absolute maximum power draw, but that's rarely the case while gaming.
In a laptop and/or PC, thermals are usually the determining factor for throttling.
 
Last edited:

Md Ray

Member
Xbox One X wasn't RDNA, so no. They have stated many times they were looking for 2x the performance of the Xbox One X, and they achieved that.

The closest RDNA 2 card in the range at launch was the 6800, and its game clock is almost identical to the Series X's.
They didn't exactly achieve 2x the performance of the X1X. Remember the early DF article/video on this? The SX's memory bandwidth didn't get a 2x boost, which keeps the machine from reaching 2x performance in every game. It was around ~1.75x or thereabouts, IIRC.

They only achieved 2x the TFLOPS number on the spec sheet.
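The back-of-the-envelope ratios behind that, for reference:

    print(12.15 / 6.0)   # ~2.0x -> Series X vs One X in raw TFLOPS
    print(560 / 326)     # ~1.7x -> peak bandwidth, XSX fast 10 GB segment vs One X's 326 GB/s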
 

winjer

Gold Member
Yes, which goes back to my question. The Matrix demo/UE5 favors single-threaded performance. I thought maybe we'd see some hint of a performance difference since there were CPU-related bottlenecks.

I don't think you know what single thread means.
Single thread means using just one thread. But UE5 can use dozens of threads.

What it does have is a couple of main threads that hit the CPU very hard and can limit performance somewhat.
But if it used a single thread, performance would be in the single digits of frames per second.

 

assurdum

Banned
Cerny said developers basically see some kind of expected performance profile, or something along those lines, and that it's basically always easy to know how the hardware will behave at any given moment.

On PC that's not really the case. CPU and GPU clocks fluctuate like hell at times; my old 1070 often went from 1,600 MHz to 2,100 MHz and up and down between those while playing. The PS5 isn't doing that. It also doesn't change clocks depending on how much power is needed; it changes clocks only if the predetermined power limit is reached (at least that's how it sounded in the presentation).

So SmartShift in a laptop works very differently from that. It might be similar if both the CPU and GPU run at absolute maximum power draw, but that's rarely the case while gaming.
In a laptop and/or PC, thermals are usually the determining factor for throttling.
SmartShift does not work differently on the PS5. Otherwise Sony would have said so.
 

Amiga

Member
Preface: THIS THREAD IS FOR THOSE INTERESTED IN TECH ANALYSIS/DISCUSSION OF RDNA 2 AND THE DESIGN CHOICES FOR PS5 AND SERIES X. NO CONSOLE WARS PLEASE!

https://www.computerbase.de/2021-03/amd-radeon-rdna2-rdna-gcn-ipc-cu-vergleich/2/

Awesome analysis, and I think the most comprehensive comparison of RDNA 2 CU scaling across AMD 6000-series cards, all fixed at a 2 GHz clock frequency. Results are posted below, but I wanted to point out a crucial finding from a different but related test they conducted, which determined that RDNA 1 CUs are actually faster than RDNA 2 CUs due to the shorter ALU pipeline (this should serve as a reminder to some that RDNA 2 isn't inherently better than RDNA 1 across the board).



Applying the findings below to the premium consoles, there are a few interesting facts from my perspective:

1. Cerny's argument about CU utilization doesn't pan out in the games tested below: CU scaling remains relatively constant from 1080p to 4K. Although it is possible that current-gen games could yield a different outcome.

2. The Series consoles chose an RDNA 2 design with an inherent CU latency increase compared to RDNA 1, without offsetting it via a higher clock frequency (2 GHz+) as AMD and Sony have done with the 6000 series and the PS5, respectively. Based on the testing, it doesn't appear as though the Series X's 560 GB/s throughput would be enough to compensate and reach the levels AMD achieved with faster clocks/cache. As shown below, a 50% CU advantage AND a 33% bandwidth advantage for the 6800 over the 6700 XT resulted in only a 36% performance increase at 4K (again, with both GPU clock frequencies fixed at 2 GHz). The Series X has a 44% CU advantage and, at best, a 25% bandwidth advantage over the PS5 with its fastest memory segment. If anyone has information as to why Microsoft deviated from AMD's RDNA 2 strategy, I would be interested to learn about it (proprietary hw/sw, etc.).
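For reference, a quick check of the ratios being compared (CU counts and bandwidths from the public specs; the ~36% figure is the article's measured 4K result):

    rx6800_vs_6700xt_cu = 60 / 40 - 1     # 0.50 -> 50% more CUs for the 6800
    rx6800_vs_6700xt_bw = 512 / 384 - 1   # ~0.33 -> 33% more memory bandwidth
    # ...which the article measured as only ~36% more 4K performance at matched 2 GHz clocks
    xsx_vs_ps5_cu = 52 / 36 - 1           # ~0.44 -> Series X CU advantage over PS5
    xsx_vs_ps5_bw = 560 / 448 - 1         # 0.25 -> bandwidth advantage of XSX's fast 10 GB segment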


Data Parameters
[chart]

1080p Performance
[chart]

1440p Performance
[chart]

4K Performance
[chart]

What about GPU compute on RDNA vs CDNA? Can RDNA still do compute, with CDNA just being better and more focused?
Compute was a fundamental part of GCN on PS4. PS5 would have to retain those features for full backwards compatibility.

As a layman on this topic, I have 4 takeaways:

1. PS5: narrower but faster.
2. Xbox Series X: wider but slower.

3. Both consoles are virtually identical.

4. The difference is made by the devs.

The biggest difference is the I/O.

Even on the PC front the bottleneck is data transfer; high-end GPUs struggle running The Matrix Awakens. DirectStorage and SmartAccess Storage will be fundamental going forward.
 

Amiga

Member
I don't think you know what single thread means.
Single thread means using just one thread. But UE5 can use dozens of threads.

What it does have is a couple of main threads that hit the CPU very hard and can limit performance somewhat.
But if it used a single thread, performance would be in the single digits of frames per second.

Jobs are not equal. Games depend on a main thread, and most games' task splitting is optimized for around 6 threads.
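A rough Amdahl's-law style illustration (Python) of why a heavy main thread caps scaling even when a job system spreads work across many cores; the 40% serial share is an assumed number purely for illustration:

    def speedup(n_threads, serial_fraction):
        # Amdahl's law: the serial (main-thread) portion limits overall speedup
        return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_threads)

    print(speedup(6, 0.4))    # 2.0x with 6 threads if 40% of the frame is main-thread work
    print(speedup(16, 0.4))   # ~2.3x with 16 threads, so extra cores barely help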
 

Loxus

Member
Cerny said developers basically see some kind of expected performance profile, or something along those lines, and that it's basically always easy to know how the hardware will behave at any given moment.

On PC that's not really the case. CPU and GPU clocks fluctuate like hell at times; my old 1070 often went from 1,600 MHz to 2,100 MHz and up and down between those while playing. The PS5 isn't doing that. It also doesn't change clocks depending on how much power is needed; it changes clocks only if the predetermined power limit is reached (at least that's how it sounded in the presentation).

So SmartShift in a laptop works very differently from that. It might be similar if both the CPU and GPU run at absolute maximum power draw, but that's rarely the case while gaming.
In a laptop and/or PC, thermals are usually the determining factor for throttling.
What you described about PC laptops is not how SmartShift works.

I literally posted AMD's own description of how it works.
It's about power management within a set power budget, not clocks.
 

MikeM

Member
What you described about PC laptops is not how SmartShift works.

I literally posted AMD's own description of how it works.
It's about power management within a set power budget, not clocks.
The problem is people largely believe clocks = power consumption, when in fact it's generally a combination of clocks + the instructions being executed that determines power consumption.

Some instruction sets are more power intensive than others.
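A rough sketch of the usual dynamic-power relation behind that point (the activity factor is where the instruction mix comes in; all numbers here are illustrative assumptions):

    def dynamic_power_w(activity, switched_capacitance_f, voltage_v, freq_hz):
        # P_dyn ~= alpha * C * V^2 * f: same clock, heavier instruction mix (higher alpha) -> more power
        return activity * switched_capacitance_f * voltage_v**2 * freq_hz

    light = dynamic_power_w(0.3, 1e-8, 1.0, 3.5e9)   # lighter scalar workload (assumed alpha)
    heavy = dynamic_power_w(0.9, 1e-8, 1.0, 3.5e9)   # wide SIMD-heavy workload at the same clock
    print(light, heavy)   # ~3x the power at identical frequency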
 