
[Computerbase] RDNA 2 CU Scaling

Considering that RDNA2 has the capability of reaching clocks much higher than 2.2 GHz, it's very unlikely that clock speed will limit yields for the PS5.

I remember Sony made a comment on getting people to transition to the PS5 as fast as possible. For that to happen, production has to be extremely good, and you can't achieve that if you have bad yields. I believe the smaller chip size and an architecture that can reach those speeds easily help production quite a bit.
 

Lysandros

Member
In a way PS5 was designed to mitigate the downsides of narrow but fast and seems to have done its job well. When it comes to running code on it, last-gen code that is, it might even be easier to make it perform better, as it doesn't need to be as parallelized, theoretically.

Xbox Series S/X is theoretically the most forward looking, and that might give it an edge towards the end of the generation. But it also might be a non-factor. Depends on how hardware (and code) evolves around these consoles.

Yes.

And lower clocks with less aggressive turbo will also make a lot of chips more passable for Microsoft.
Concerning the future of this generation's game engines and the question of which console architecture is more forward looking, do you consider the push towards more complex geometry and lower level control of primitives as a non-factor? Related to this, what about the new Geometry Engine's capabilities in the machines? One machine's GE is running at 22% higher frequency, shouldn't it be somewhat more capable/faster at enabling those new techniques, counterbalancing the XSX ALU advantage to some degree? There is also the streaming side of things, but it is slightly outside the scope since we are focusing on GPUs. Also, I didn't understand the "less aggressive turbo" part, what do you mean by it?
 

Loxus

Member
The clock speed for the Series X was set to meet the target of twice the Xbox One X power, 12 TFLOPs. That was the target and the clock speed is set exactly for that. It could probably clock higher, but they didn't feel the need to, as there is also a 25% uplift in efficiency per CU over last gen on top. Size, heat and noise also come into this.
Jason Ronald has talked about this several times.
Well, you're partly right about the clocks.
But imo, it's the amount of CUs per Shader Array.


In both RDNA 1 & 2, there is a limit of 5 WGPs or 10 CUs per Shader Array.
Based on how AMD structured RDNA, it would have been difficult to reach that 12TF target without adding more CUs per Shader Array.

6900XT

Adding another Shader Engine would have made the die too large, as they would have been adding more ROPs, Rasterizers, Cache, etc.



Microsoft had to make a decision in order to reach 12TF.
Add another Shader Engine, add more CUs per Shader Array or clock the GPU at 2600+ MHz with 2 Shader Engines. They went with more CUs per Shader Array.
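For anyone who wants to check the arithmetic behind those options, here is a minimal sketch. It assumes the usual RDNA figures of 64 FP32 lanes per CU and 2 ops per lane per clock (FMA), plus the public CU counts and clocks (52 active CUs at 1.825 GHz for Series X, 36 active CUs at 2.23 GHz for PS5). The function names are just for illustration.

```python
# Assumptions (not stated in this post): 64 FP32 lanes per CU, 2 ops per
# lane per clock (fused multiply-add), and the public active-CU counts.
LANES_PER_CU = 64
OPS_PER_LANE_PER_CLOCK = 2


def tflops(cus: int, clock_ghz: float) -> float:
    """Peak FP32 TFLOPs for a given CU count and clock."""
    return cus * LANES_PER_CU * OPS_PER_LANE_PER_CLOCK * clock_ghz / 1000.0


def clock_for_target(cus: int, target_tflops: float) -> float:
    """Clock in GHz needed to hit a TFLOPs target with a given CU count."""
    return target_tflops * 1000.0 / (cus * LANES_PER_CU * OPS_PER_LANE_PER_CLOCK)


series_x = tflops(52, 1.825)  # more CUs per Shader Array
ps5 = tflops(36, 2.23)        # 2 Shader Engines, 10 CUs per array, 4 disabled

# What a 2 Shader Engine layout would need to clock at to reach 12 TF
clock_36_cus = clock_for_target(36, 12.0)  # 36 active CUs
clock_40_cus = clock_for_target(40, 12.0)  # all 40 CUs active

print(f"Series X: {series_x:.2f} TF, PS5: {ps5:.2f} TF")
print(f"12 TF with 36 CUs: {clock_36_cus:.2f} GHz, with 40 CUs: {clock_40_cus:.2f} GHz")
```

Under those assumptions a 36 CU part would need roughly 2.6 GHz to hit 12 TF, which lines up with the 2600+ MHz option.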
 

winjer

Member
I remember Sony made a comment on getting people to transition to the PS5 as fast as possible. For that to happen, production has to be extremely good, and you can't achieve that if you have bad yields. I believe the smaller chip size and an architecture that can reach those speeds easily help production quite a bit.

Yes, Sony having better yields and a smaller chip means they can get more chips out of each wafer.
 

Loxus

Member
Preface: THIS THREAD IS FOR THOSE INTERESTED IN TECH ANALYSIS/DISCUSSION OF RDNA 2 AND THE DESIGN CHOICES FOR PS5 AND SERIES X. NO CONSOLE WARS PLEASE!

https://www.computerbase.de/2021-03/amd-radeon-rdna2-rdna-gcn-ipc-cu-vergleich/2/

Awesome analysis that I think is the most comprehensive in comparing RDNA 2 CU scalability between AMD 6000 series cards, all fixed at 2 GHz clock frequencies. Results posted below, but I wanted to point out a crucial point from a different but related test they conducted that determined RDNA 1 CUs are actually faster than RDNA 2 CUs due to the shorter ALU pipeline (this should serve as a reminder to some that RDNA 2 isn't inherently better than RDNA 1 across the board).



Applying the findings below to the premium consoles, there are a few interesting facts from my perspective:

1. Cerny's argument of CU utilization doesn't pan out in the games tested below. CU scalability remains relatively constant from 1080p-4k. Although it is possible that current gen games could yield a different outcome.

2. Series consoles chose an RDNA 2 design with an inherent CU latency increase compared to RDNA 1, without offsetting it via higher clock frequency (2 GHz+) as AMD and Sony have done with the 6000 series and PS5, respectively. Based on the testing, it doesn't appear as though Series X's 560 GB/s throughput would be enough to compensate and reach the levels AMD was able to with faster clocks/cache. As shown below, a 50% CU advantage AND 33% bandwidth advantage for the 6800 over the 6700xt resulted in only a 36% performance increase at 4k (again, both GPU clock frequencies fixed at 2 GHz). Series X has a 44% CU advantage and a 25% bandwidth advantage ceiling over PS5 with its fastest memory segment. If anyone has information as to why Microsoft deviated from AMD's RDNA 2 strategy, I would be interested to learn about this (proprietary hw/sw, etc.).
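A rough way to put numbers on point 2, as a sketch only. The 6800 vs 6700xt percentages are the ones quoted above; the console CU counts (52 vs 36) and the 448 GB/s PS5 bandwidth are the public specs, assumed here rather than taken from the article. "Scaling efficiency" is just performance gain divided by resource gain, a made-up helper for illustration.

```python
def gain(new: float, old: float) -> float:
    """Fractional advantage of `new` over `old` (0.5 means +50%)."""
    return new / old - 1.0


def scaling_efficiency(perf_gain: float, resource_gain: float) -> float:
    """How much of the extra resources showed up as extra performance."""
    return perf_gain / resource_gain


# Figures quoted from the article: 6800 vs 6700xt, both fixed at 2 GHz, 4k
CU_GAIN_6800 = 0.50
BW_GAIN_6800 = 0.33
PERF_GAIN_4K = 0.36

eff_vs_cu = scaling_efficiency(PERF_GAIN_4K, CU_GAIN_6800)
mean_resource_gain = (CU_GAIN_6800 + BW_GAIN_6800) / 2
eff_vs_mean = scaling_efficiency(PERF_GAIN_4K, mean_resource_gain)

# Console ratios (assumed public specs: 52 vs 36 CUs, 560 vs 448 GB/s)
xsx_cu_gain = gain(52, 36)
xsx_bw_gain = gain(560, 448)

print(f"6800 scaling vs CU count: {eff_vs_cu:.0%}")
print(f"6800 scaling vs mean of CU and bandwidth: {eff_vs_mean:.0%}")
print(f"Series X over PS5: +{xsx_cu_gain:.0%} CUs, +{xsx_bw_gain:.0%} bandwidth")
```

This only restates the ratios at fixed clocks. It says nothing about how the consoles compare once the clock difference is included.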


[Charts from the article, not reproduced here: Data Parameters, 1080p Performance, 1440p Performance, 4k Performance]

Isn't it supposed to be same TF with different CU count and clock speed, and not same clock with different TF and CU count?
 
Considering that RDNA2 has the capability of reaching clocks much higher than 2.2 GHz, it's very unlikely that clock speed will limit yields for the PS5.
Are PS5 pipelines even RDNA2? In hindsight I'd say they probably are, but... The chip is certainly not confirmed to be.
Concerning the future of this generation's game engines and the question of which console architecture is more forward looking, do you consider the push towards more complex geometry and lower level control of primitives as a non-factor? Related to this, what about the new Geometry Engine's capabilities in the machines? One machine's GE is running at 22% higher frequency, shouldn't it be somewhat more capable/faster at enabling those new techniques, counterbalancing the XSX ALU advantage to some degree? There is also the streaming side of things, but it is slightly outside the scope since we are focusing on GPUs. Also, I didn't understand the "less aggressive turbo" part, what do you mean by it?
I think super wide architecture designs are the future so Xbox Series X might benefit a bit down the line with PC crossplatform. But that guess is as good as any other.

I don't think the clock rate of the respective consoles' "geometry engines" is going to make much of a difference, but PS5 has a proprietary Geometry Engine which might have some tricks up its sleeve, otherwise why bother. And Xbox Series has VRR's as a hardware feature. In regards to the rest, since the 7th generation peak geometry performance hasn't mattered as much as before. There's no advantage in going near the theoretical limit. Also, looking at stuff like Nanite, it seems like they're achieving parity quite easily and later demos have moved that into the GPU.

Efficiency really depends on both bottlenecks and how code is being written for a heterogeneous "common" denominator (or alternatively, exclusively for one machine). PS5's Geometry Engine could be either something proprietary that ultimately does the same thing as the competition or a major advantage (if it removes or reduces a bottleneck, like they were able to do with the streaming side of things plus on-the-fly compression). So far it seems neither a problem nor an advantage.

Also i didn't understand the "less aggressive turbo" part, what do you mean by it?
PS5 being designed as a variable clock speed machine, means both CPU and GPU have "turbo speeds" above what's considered the sustainable performance. Almost like customizing a load, you opt for less cpu speed to push more flops out of the GPU kind of deal.

Perhaps calling it an aggressive turbo is not quite right (albeit it's more aggressive than Xbox Series for sure), as it's basically more about managing a watt and heat budget.
 

Allandor

Member
How can a PC bench lead to console wars again?
The behavior is totally expected. Not everything on the GPU-boards scales with the peak TF performance. With every added shader all other things inside the GPU must scale with it to get a near linear scaling (including bandwidth, ...). If not everything is scaling with it you will only get 1:1 performance gains in certain scenarios. So everything is up to the developer how the hardware is used.

Just stop these stupid console wars.
 
Sustained performance, really? You still live in 2019, don't you? 😁

It doesn't mean stable performance. It's just how the power consumption works in relation to the clock speeds.

I mean if people think the PS5 is unstable due to its variable clocks they might have to rethink that.
 

DaGwaphics

Member
Obviously bigger chips at higher frequencies have a much bigger chance of failure than a smaller chip like the one inside the PS5. And that's not a diss of the PS5, on the contrary. I sometimes wonder how many chips for the 3080/3090 end up in the bin. After all, the 3080 has almost 30 billion transistors, it's crazy that they even make it work...

That's why you always end up with some 60 series cards based on the large chip, they will bin for their lives. With consoles they must just throw them away, although there were some embedded systems released last-gen that featured a binned version of the X1 SOC (mostly for the Chinese and Indian markets).
 

Ev1L AuRoN

Member
In a world of dynamic resolution, image reconstruction and long, expensive development, I think Sony was very smart with the PS5: their SoC is smaller and therefore cheaper, they deploy a cheaper cooling method, and they have the talent to make their machine shine.

It's clear that Microsoft is faring much better this time around, with a powerful system and a plethora of services and price proposition unmatched thanks to Game pass.

This generation is shaping up to be one of the best in terms of competition. The only aspect I don't really like is the consolidation of the market, with the platform holders buying out all the 3P studios.
 

DaGwaphics

Member
AMD calls 1815 MHz the Game clock.

The architecture isn't based on clocks, you could have a 1,000MHz RDNA2 part if that hit the power targets you were looking for, anything to the contrary is nonsense. The Steam Deck base clock is only like 1,300MHz.

The desktop space is a race for max performance, plus you don't need to be concerned about consistency between systems or power draw/heat for the most part.
 

Lysandros

Member
I think super wide architecture designs are the future so Xbox Series X might benefit a bit down the line with PC crossplatform. But that guess is as good as any other.

PS5 being designed as a variable clock speed machine, means both CPU and GPU have "turbo speeds" above what's considered the sustainable performance. Almost like customizing a load, you opt for less cpu speed to push more flops out of the GPU kind of deal.
XSX isn't really any wider than PS5 in architecture; both machines have 2 shader engines and four shader arrays. XSX just has more CUs squished per array for a higher theoretical compute ceiling, at the cost of reduced L1 cache amount and bandwidth available per CU, which should diminish real world compute efficiency to some degree.

As to PS5's continuous boost solution, I think you have slight misunderstandings about it. Both the CPU and GPU run at max frequencies most of the time, it is not necessarily a game of trade-offs. Please see Road to PS5 and Cerny's DF interview about the matter again.
 
That's true, but even in the case of that Series X is kind of an outlier because there are RDNA2 GPUs much bigger than it that can also clock higher, so it seems MS's reasoning was a deliberate choice not just for their console footprint design & cooling budget, but also because they wanted something that could be cheap but powerful enough to run in an Azure server cluster without pushing the energy bill too high.

Plus, again, they wanted something fit for BC and running multiple Series S simulated instances. They needed a wider design for that at scale.



Guess some people just want to hear about dat "power delta" finally manifesting as if by magic any time now. But those same people only look at TF and raw peak RAM bandwidth, and don't usually understand intricacies in other parts of system design like cache scrubbing, bus latency, cache coherency, pixel fillrate (< all particular features of PS5), virtualized RAM partition, mip-blending (< all particular features of Series X) etc.
Each console offers tools and techniques to tackle specific situations in their own unique ways. There's no secret sauce though or some imaginary, insane performance delta that allows one console to push far above the other. Of course certain outliers that believe everything that is thrown at them will tell you that exhibit A is better than Exhibit B, because of some software or hardware based secret sauce.
E.g.
•Playstation is dramatically faster than Xbox in storage I/O because of the I/O Complex or
•Xbox is much better than Playstation because of Mesh Shaders & whatnot
Now, frankly, those same individuals have not seen devs test both consoles side by side, to see which is better and at which workload. So why make assumptions out of thin air without any evidence to back them up. Because throwing out there a feature with a fancy name, makes plastic box A look better than plastic box B in the eyes of the individual living in his or her fantasy world, where everything is perfect. Frankly for me, what you already said above makes absolute sense and is logical, to someone else who worships one very specific box over another though... It's bs to fuel console wars.
 

Tripolygon

Member
Me? I didn't design the console.
No, you are stuck in 2019 and Microsoft has done a wonderful job at misinforming people about how processors work. There is no such thing as sustained performance or sustained teraflops like you are trying to imply. All modern GPUs can sustain a high clock assuming they don't throttle based on thermals. All modern processors use variable frequency scaling.

A good explanation by Intel on Gamers Nexus.

AMD and Nvidia tend to advertise a conservative clock on their website, but what these GPUs do in practice is another thing.

An RX6800 never runs at 1.815GHz when gaming, it stays above 2GHz.

 

DaGwaphics

Member
No, you are stuck in 2019 and Microsoft has done a wonderful job at misinforming people about how processors work. There is no such thing as sustained performance or sustained teraflops like you are trying to imply. All modern GPUs can sustain a high clock assuming they don't throttle based on thermals. All modern processors use variable frequency scaling.

On PCs, that's true. On console the traditional method is to lock the clocks so that performance is consistent, you can't have timing variations based on temps. Sony has worked around this by basing the variable clocks on power usage, that allows them to make it predictable. Consoles should run exactly the same regardless of if Dick is in Florida and Jane is in Maine.
 

winjer

Member
On PCs, that's true. On console the traditional method is to lock the clocks so that performance is consistent, you can't have timing variations based on temps. Sony has worked around this by basing the variable clocks on power usage, that allows them to make it predictable. Consoles should run exactly the same regardless of if Dick is in Florida and Jane is in Maine.

Sony is just using the same tech as AMD Smartshift. Nothing exclusive to consoles, as it's been used on laptops for a while.

 

Tripolygon

Member
On PCs, that's true. On console the traditional method is to lock the clocks so that performance is consistent, you can't have timing variations based on temps. Sony has worked around this by basing the variable clocks on power usage, that allows them to make it predictable.
On every "modern" processor that is true, everyone has moved on from the days of fixed clock, Microsoft is the only one who decided to use a fixed clock. All processors use different variables to determine optimum clock, voltage, power, thermals etc. What Sony did was take thermals out of the equation so all PS5 will behave the same regardless of ambient temperature because ambient temperature can affect how hot a given processor is.
 
I wanted to point out a crucial point from a different but related test they conducted that determined RDNA 1 CUs are actually faster than RDNA 2 CUs due to the shorter ALU pipeline (this should serve as a reminder to some that RDNA 2 isn't inherently better than RDNA 1 across the board).
This is not what's being shown in the article you referred to.
At ISO clocks, the 5700XT is 4% faster at 1080p and 2% slower at 1440p. That's within margin of error and it could be that driver optimizations for Infinity Cache in the meantime have put the RDNA2 GPU on top. For example, AMD just recently rewrote their entire DX11 driver which gave RDNA2 GPUs a 10% performance boost on average.


Cerny's argument of CU utilization doesn't pan out in the games tested below. CU scalability remains relatively constant from 1080p-4k.
You can't compare CU utilization when the rest of the GPU isn't scaling accordingly. Navi 22 has 50% more L2 cache, 50% more L3/IC, 50% more VRAM bandwidth, 50% more ROP units (pixel fillrate) and 100% more ACE units (work schedulers) per CU than Navi 21. Navi 21 is clearly a more compute-oriented and less gaming-oriented GPU than Navi 22. The chances for having idling ALUs in a Navi 21 are much greater than with Navi 22.


A 50% CU advantage AND 33% bandwidth advantage for the 6800 over the 6700xt resulted in only a 36% performance increase at 4k (again, both GPU clock frequencies fixed at 2 GHz).
An average 41% scaling in units resulting in a 36% performance increase seems like great scaling, again considering the fact that Navi 21 and Navi 22 have the same amount of ACEs.


Series X has 44% CU advantage and 25% bandwidth advantage ceiling over PS5 with its fastest memory segment.
The Series X also has 18% slower caches, 18% slower geometry processor, 18% slower ROPs / lower fillrate and no cache scrubbers that might give the PS5 a bit better effective bandwidth ratio.
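Those 18% figures are just the GPU clock ratio, which can be checked with a quick sketch. It assumes the public clocks of 1.825 GHz (Series X) and 2.23 GHz (PS5) and that caches, geometry front end and ROPs all run at the GPU clock. The 64 ROPs per console used for the fillrate line is also an assumption that isn't stated in this thread.

```python
XSX_CLOCK_GHZ = 1.825  # assumed public spec
PS5_CLOCK_GHZ = 2.23   # assumed public spec (max boost clock)
ROPS = 64              # assumption: same ROP count on both consoles

# Series X clock-bound units relative to PS5
xsx_deficit = 1.0 - XSX_CLOCK_GHZ / PS5_CLOCK_GHZ
# The same gap seen from the PS5 side
ps5_advantage = PS5_CLOCK_GHZ / XSX_CLOCK_GHZ - 1.0

# Peak pixel fillrate in Gpixels/s: one pixel per ROP per clock
xsx_fillrate = ROPS * XSX_CLOCK_GHZ
ps5_fillrate = ROPS * PS5_CLOCK_GHZ

print(f"Series X clock-bound units: {xsx_deficit:.1%} slower")
print(f"PS5 clock-bound units: {ps5_advantage:.1%} faster")
print(f"Fillrate: XSX {xsx_fillrate:.1f} vs PS5 {ps5_fillrate:.1f} Gpix/s")
```

Note that "18% slower" and "22% faster" describe the same clock gap from opposite sides.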


If anyone has information as to why Microsoft deviated from AMD's RDNA 2 strategy, I would be interested to learn about this (proprietary hw/sw, etc.).
Microsoft needed to reach 12 TFLOPs because that spec was a requirement for Project Scarlett's secondary use as an Azure cloud processor capable of FP32/FP16 GPGPU tasks.
Unlike Sony who made a SoC solely for gaming consoles, Microsoft had their Azure Silicon Architecture Team provide the specs for AMD to produce the semi-custom design.
There's a very interesting piece on the Series X SoC by Anandtech where they reveal that Microsoft even considered having a 56 CU design running at an even slower 1.675GHz clock because it would have 20% better power efficiency for cloud compute tasks.




Though this could mean that the Series X would perform worse than the PS5 in videogames.
 

DaGwaphics

Member
On every "modern" processor that is true, but everyone has moved on from the days of fixed clock, Microsoft is the only one who decided to use a fixed clock. All processors use different variables to determine optimum clock. Voltage, Power, Thermals etc. What Sony did was take thermals out of the equation so all PS5 will behave the same regardless of ambient temperature because ambient temperature can affect how hot a given processor is.

You could just as easily say that Sony is the "Only one, shifting clocks". LOL

There are only two companies making comparable consoles here, the XSX is just built the way they always have been. PC CPUs are not consistent between machines, even in the same room on the same model MB. That's not ideal in a console. Too much variation in timing. You can work with that, obviously PC gaming does. But, you will lose some efficiency.
 
Sorry but what?? Not this again after all this time...
I stand corrected. PS5 is RDNA2 at least on the core behaviour of the processor.

That's what I was uncertain of. I saw that presentation but it was a while ago.
AMD Smartshift isn't 100% predictable, I don't think the PS5 is using that tech exactly.
It basically is, thing is with laptops you always have shitty cooling solutions, you never have a cooling solution that is over-engineered.

That's precisely what Sony did, so on PS5 the temperature of the room, how long it has been under full load and such have less of an impact. That makes everything more predictable (otherwise it would be a disaster)
 

DaGwaphics

Member
Do you have anything to back that statement?

No lab reports, but the results you get from that aren't necessarily identical between two laptops of the same model under the same workload. Sony claimed their method was extremely consistent.
 

winjer

Member
No lab reports, but the results you get from that aren't necessarily identical between two laptops of the same model under the same workload. Sony claimed their method was extremely consistent.

Two different laptops can have different power targets, batteries, cooling system, etc. So that is not a fair comparison.
Regardless, Smartshift, in both laptops will try to do the same thing. And it ends up being the same as the PS5. Might I remind you that AMD is the one that designed the SoC for the PS5.
 

DaGwaphics

Member
Two different laptops can have different power targets, batteries, cooling system, etc. So that is not a fair comparison.
Regardless, Smartshift, in both laptops will try to do the same thing. And it ends up being the same as the PS5. Might I remind you that AMD is the one that designed the SoC for the PS5.

It very well could be the same thing. I have no idea.
 

Tripolygon

Member
You could just as easily say that Sony is the "Only one, shifting clocks". LOL
You couldn't say that because that is how every modern processor works. Look at, Intel, AMD, Nvidia, Qualcomm, MediaTek, Huawei, Apple, ImaginationTech, Samsung, etc.

There are only two companies making comparable consoles here, the XSX is just built the way they always have been. PC CPUs are not consistent between machines, even in the same room on the same model MB. That's not ideal in a console. Too much variation in timing.
Both consoles use the same PC architecture and thermals is not an issue because both consoles provide enough thermal headroom for the average living room. If you are building your own PC, you get to decide what your thermal headroom is. Consoles are supposed to create a consistent uniform experience for everyone that is why certain concessions are made. You give away your ability to overclock and tweak your hardware for the simplicity and consistency.

AMD Smartshift isn't 100% predictable, I don't think the PS5 is using that tech exactly.
It is 100% predictable. Just because something is variable does not mean it is not predictable. The algorithm decides when to move power because it is entirely predictable. It wouldn't make sense to just arbitrarily move power around without a certain level of predictability of what type of workload is being done and what is going to be needed in the next few seconds.
 

DaGwaphics

Member
You couldn't say that because that is how every modern processor works. Look at, Intel, AMD, Nvidia, Qualcomm, MediaTek, Huawei, Apple, ImaginationTech, Samsung, etc.


Both consoles use the same PC architecture and thermals is not an issue because both consoles provide enough thermal headroom for the average living room. If you are building your own PC, you get to decide what your thermal headroom is. Consoles are supposed to create a consistent uniform experience for everyone that is why certain concessions are made. You give away your ability to overclock and tweak your hardware for the simplicity and consistency.


It is 100% predictable. Just because something is variable does not mean it is not predictable. The algorithm decides when to move power because it is entirely predictable. It wouldn't make sense to just arbitrarily move power around without a certain level of predictability of what type of workload is being done and what is going to be needed in the next few seconds.

I'm talking about consoles. PC CPUs have had variable clocks for decades (All the way back to Prescott I think). There's a reason the PS3, Xbox360, PS4 and x1 don't work like that, same as the XSX. Different design principles/considerations. The tech has always been around, it's not like it's a new thing. Although the X1/PS4 do have Cool n' Quiet built in there, it's just disabled in game mode.
 

SlimySnake

Member
Applying the findings below to the premium consoles, there are a few interesting facts from my perspective:

1. Cerny's argument of CU utilization doesn't pan out in the games tested below. CU scalability remains relatively constant from 1080p-4k. Although it is possible that current gen games could yield a different outcome.

2. Series consoles chose an RDNA 2 design with an inherent CU latency increase compared to RDNA 1, without offsetting it via higher clock frequency (2 GHz+) as AMD and Sony have done with the 6000 series and PS5, respectively. Based on the testing, it doesn't appear as though Series X's 560 GB/s throughput would be enough to compensate and reach the levels AMD was able to with faster clocks/cache. As shown below, a 50% CU advantage AND 33% bandwidth advantage for the 6800 over the 6700xt resulted in only a 36% performance increase at 4k (again, both GPU clock frequencies fixed at 2 GHz). Series X has a 44% CU advantage and a 25% bandwidth advantage ceiling over PS5 with its fastest memory segment. If anyone has information as to why Microsoft deviated from AMD's RDNA 2 strategy, I would be interested to learn about this (proprietary hw/sw, etc.).


Data Parameters

Don't think you can apply the findings of this particular test, which normalizes clocks, to the PS5 and XSX archs, which have a pretty significant difference in clocks.

Now if the test compared the 40 CU 6700xt with the 40 CU 5700xt at 2.23 GHz and 1.825 GHz then we might be able to make a proper comparison between the two consoles. Maybe DemonCleaner can do the comparison because DF doesn't seem to be particularly interested.

Cerny's argument also doesn't apply here because again, the clocks are fixed. In his example, he had the same tflops count but arrived at the tflops number using different clocks. A better test would be to take the 32 CU 6600xt, lock it at 2.23 GHz, benchmark it, and then downclock the 40 CU 6700xt by whatever percentage gets us to the 6600xt's new tflops count. Is it better? Is it worse? The same?
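To make that proposed test concrete, here's a small sketch of the clock the 40 CU card would need to match the 32 CU card's tflops. It assumes the usual 64 lanes per CU and 2 FP32 ops per lane per clock, and the helper names are made up for illustration.

```python
FLOPS_PER_CU_PER_CLOCK = 64 * 2  # assumption: 64 lanes, 2 FP32 ops each (FMA)


def tflops(cus: int, clock_ghz: float) -> float:
    """Peak FP32 TFLOPs for a given CU count and clock."""
    return cus * FLOPS_PER_CU_PER_CLOCK * clock_ghz / 1000.0


def matching_clock(target_tflops: float, cus: int) -> float:
    """Clock in GHz at which `cus` CUs reach `target_tflops`."""
    return target_tflops * 1000.0 / (cus * FLOPS_PER_CU_PER_CLOCK)


narrow_fast = tflops(32, 2.23)                     # 32 CU 6600xt locked at 2.23 GHz
wide_slow_clock = matching_clock(narrow_fast, 40)  # 40 CU 6700xt, downclocked

print(f"32 CUs @ 2.23 GHz = {narrow_fast:.2f} TF")
print(f"40 CUs need {wide_slow_clock:.3f} GHz for the same TF")
```

So the 6700xt would be run at 80% of 2.23 GHz, which keeps the tflops equal and leaves CU count versus clock as the main variable. The two cards still differ in cache and memory bandwidth.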

That said, very interesting article, especially considering the 2.0 GHz 40 CU 6700xt is basically a 10.2 tflops PS5. I think the Death Stranding benchmark is a great example because it scales very well on the PS5, hitting native 4k 60 fps at times. That's a 4x increase in pixel count and another 2x increase in framerate, resulting in a massive 8x bigger pixel budget on a console just 5x more powerful in raw tflops. It's obvious that the 50% IPC gains from GCN 1.1 to Polaris to RDNA are making the 10.23 tflops PS5 act like a 16 tflops GCN 1.1 card, which means the PS5's 8x computing power directly resulted in an 8x performance boost. Here we see the 10.2 tflops 6700xt average 64 fps, which is pretty much what we see on the PS5 during normal gameplay, though it's capped at 60 fps so we don't get an accurate fps average.
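The 8x math above, spelled out as a back-of-envelope sketch. The 1.84 tflops PS4 figure and the 1080p/30 fps PS4 baseline for Death Stranding are assumptions on my part, and the 1.5x factor is the rough 50% IPC gain mentioned above, not a measured number.

```python
PS4_TFLOPS = 1.84   # assumption: base PS4 peak FP32
PS5_TFLOPS = 10.23
IPC_GAIN = 1.5      # the rough GCN 1.1 -> RDNA gain claimed above

# Assumed PS4 baseline: 1080p at 30 fps; PS5: native 4k at 60 fps
pixel_ratio = (3840 * 2160) / (1920 * 1080)
fps_ratio = 60 / 30
pixel_budget_ratio = pixel_ratio * fps_ratio

raw_ratio = PS5_TFLOPS / PS4_TFLOPS
gcn_equivalent_tflops = PS5_TFLOPS * IPC_GAIN
effective_ratio = gcn_equivalent_tflops / PS4_TFLOPS

print(f"Pixel budget: {pixel_budget_ratio:.0f}x")
print(f"Raw tflops: {raw_ratio:.1f}x")
print(f"GCN-equivalent: {gcn_equivalent_tflops:.1f} TF, {effective_ratio:.1f}x")
```

Under those assumptions the IPC-adjusted ratio lands a little above 8x, in line with the pixel budget.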

The PS5 results also match up with the AC Valhalla 1440p benchmarks since it averages 60 fps here similar to the PS5 during normal gameplay.
 

assurdum

Banned
You could just as easily say that Sony is the "Only one, shifting clocks". LOL

There are only two companies making comparable consoles here, the XSX is just built the way they always have been. PC CPUs are not consistent between machines, even in the same room on the same model MB. That's not ideal in a console. Too much variation in timing. You can work with that, obviously PC gaming does. But, you will lose some efficiency.
What the hell kind of bullshit is that? Where have you seen the PS5 lose efficiency? In what way do shifting clocks compromise console performance? Thank God we were supposed to avoid nonsense console wars, but nope, some of you can't really resist.
 

DaGwaphics

Member
What the hell kind of bullshit is that? Where have you seen the PS5 lose efficiency? In what way do shifting clocks compromise console performance? Thank God we were supposed to avoid console wars.

I was specifically talking about PC gaming. Learn to read.

As far as I'm aware, every PS5 operates identical to each other, so, this could never be an issue there.
 

ChiefDada

Member
This is not what's being shown in the article you referred.
At ISO clocks, the 5700XT is 4% faster at 1080p and 2% slower at 1440p. That's within margin of error and it could be that driver optimizations for Infinity Cache in the meantime have put the RDNA2 GPU on top. For example, AMD just recently rewrote their entire DX11 driver which gave

That's the point I was trying to make. The IC is making up the difference at higher resolution even though ALU pipeline is slower, performance is regained and then some with infinity cache banking data on gpu and significantly cutting trip back to RAM. If IC wasn't present, then we wouldn't see the tide shift from 1080p to 1440p, correct? Remember, 5700XT had larger memory bus and throughput in the tests. Drawing back to consoles, PS5 has similar strategy with cache coherency hardware, but series X went with larger bus/bandwidth that theoretically wouldn't compensate on a per CU basis based on results we're seeing here.

Microsoft needed to reach 12 TFLOPs because that spec was a requirement for Project Scarlett's secondary use as an Azure cloud processor capable of FP32/FP16 GPGPU tasks.
Unlike Sony who made a SoC solely for gaming consoles, Microsoft had their Azure Silicon Architecture Team provide the specs for AMD to produce the semi-custom design.
There's a very interesting piece on the Series X SoC by Anandtech where they reveal that Microsoft even considered having a 56 CU design running at an even slower 1.675GHz clock because it would have 20% better power efficiency for cloud compute tasks.

Thanks for sharing. I saw the link but didn't see part where they tied the 12 tflop metric to requirements for cloud. But I take your word for it and if that's true, it sounds like a nightmare creating a system with divergent focus.
 

assurdum

Banned
I was specifically talking about PC gaming. Learn to read.

As far as I'm aware, every PS5 operates identical to each other, so, this could never be an issue there.
You said in your post that variable clocks are too unpredictable for a console. You were not talking just about PC.
 

DaGwaphics

Member
You said in your post that variable clocks are too unpredictable for a console. You were not talking just about PC.

The preceding sentence was "PC CPUs are not consistent between machines, even in the same room on the same model MB.", I was talking about that, obviously.

Specifically, I was referring to the stepping, Speedstep and Cool n' Quiet (or whatever AMD is calling it now) are not exactly the same between PCs with the same CPU (most of the time due to different MBs, but it can be the quality of the chip itself too). PC games work just fine, devs just can't rely on any specific timing (obviously, since there are a gazillion CPUs, but even in cases of the same CPU), which is going to hurt efficiency a bit. One could argue that games shouldn't be built to the metal that much anyway, but as anyone that's used emulators much can tell you, there are definitely places where they do that on console. Older systems are particularly bad with that, DC and GC especially. But, it must be happening on newer systems too or they wouldn't work so hard to have BC modes that can support the right clocks, etc.
 

Tripolygon

Member
I'm talking about consoles. PC CPUs have had variable clocks for decades (all the way back to Prescott, I think).
I'm talking about processors in general. The higher a given architecture can clock, the more sense it makes to adopt a variable frequency scaling strategy, because not all workloads scale with clock and power does not scale linearly with clock.
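To put a rough number on the "power does not scale linearly with clock" point, here is a minimal sketch using the textbook dynamic-power relation P ~ C * V^2 * f. It assumes voltage has to rise in proportion to frequency, which is a simplification and not a measured curve for any console; the function name and the voltage_exponent parameter are made up for illustration.

```python
def relative_dynamic_power(clock_ratio, voltage_exponent=1.0):
    """Dynamic power relative to baseline for a given clock ratio.

    Uses P ~ C * V^2 * f with capacitance held constant.
    Assumption: voltage scales as clock_ratio ** voltage_exponent.
    voltage_exponent=1.0 gives the cubic rule of thumb;
    voltage_exponent=0.0 models a fixed voltage (linear scaling).
    """
    voltage_ratio = clock_ratio ** voltage_exponent
    return voltage_ratio ** 2 * clock_ratio


for ratio in (1.0, 1.1, 1.2):
    power = relative_dynamic_power(ratio)
    print(f"clock x{ratio:.1f} -> dynamic power x{power:.3f}")
```

Under that assumption a 10% clock bump costs about 33% more dynamic power, which is why letting the clock drop on workloads that don't benefit from it is attractive.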

There's a reason the PS3, Xbox 360, PS4 and X1 don't work like that, same as the XSX. Different design principles/considerations. The tech has always been around; it's not like it's a new thing. Although the X1/PS4 do have Cool'n'Quiet built in there, it's just disabled in game mode.

Modern consoles are not architecturally different from PCs, especially since they adopted PC architecture. Consoles are typically designed to last 6 to 10 years, so they "had" to stay within a given power budget while taking thermals into consideration. The easiest way to keep both under control is to adopt a fixed, conservative clock. Microsoft stuck with the same strategy even though all modern processors use variable frequency scaling. Sony chose to adapt with the times, and it pays off for them because they can achieve similar performance with a smaller SoC at the cost of slightly higher power draw. Both are equally valid design choices, and there are some tradeoffs for both.

We know RDNA 2 clocks as high as 2.5GHz, and Microsoft is well below what the architecture is capable of, at 1.565GHz and 1.825GHz for the Series S/X, and RDNA 3 is supposed to reach 2.8 - 3GHz comfortably. My prediction is that next generation Microsoft will adopt a variable frequency scaling strategy. It is inevitable as these GPUs are getting massive and more expensive with each smaller node.
 

assurdum

Banned
The preceding sentence was "PC CPUs are not consistent between machines, even in the same room on the same model MB." I was talking about that, obviously.

Specifically, I was referring to the stepping: SpeedStep and Cool'n'Quiet (or whatever AMD is calling it now) are not exactly the same between PCs with the same CPU (most of the time due to different MBs, but it can be the quality of the chip itself too). PC games work just fine; devs just can't rely on any specific timing (obviously, since there are a gazillion CPUs, but even in cases of the same CPU), which is going to hurt efficiency a bit. One could argue that games shouldn't be built that close to the metal anyway, but as anyone who's used emulators much can tell you, there are definitely places where they do that on console. Older systems are particularly bad with that, DC and GC especially. But it must be happening on newer systems too, or they wouldn't work so hard to have BC modes that can support the right clocks, etc.
You weren't talking just about that. And as you can see above, I'm not the only user who hasn't "learned to read" your posts.
 

DaGwaphics

Member
Modern consoles are not architecturally different from PCs, especially since they adopted PC architecture. Consoles are typically designed to last 6 to 10 years, so they "had" to stay within a given power budget while taking thermals into consideration. The easiest way to keep both under control is to adopt a fixed, conservative clock. Microsoft stuck with the same strategy even though all modern processors use variable frequency scaling. Sony chose to adapt with the times, and it pays off for them because they can achieve similar performance with a smaller SoC at the cost of slightly higher power draw. Both are equally valid design choices, and there are some tradeoffs for both.

We know RDNA 2 clocks as high as 2.5GHz, and Microsoft is well below what the architecture is capable of, at 1.565GHz and 1.825GHz for the Series S/X, and RDNA 3 is supposed to reach 2.8 - 3GHz comfortably. My prediction is that next generation Microsoft will adopt a variable frequency scaling strategy. It is inevitable as these GPUs are getting massive and more expensive with each smaller node.

I can agree with that. You were making it sound like MS went back to Stonehenge days for their design, rather than the reality that Sony just implemented this feature first. No different than the unified shaders on 360, somebody will always go first in the console space. Good ideas that work well will be copied (especially if they save money). Who knows what MS will do with their next generation of systems, but it wouldn't surprise me if they did go with the "narrow as possible and clock to all hell" design.
 

Lysandros

Member
The Series X also has 18% slower caches, an 18% slower geometry processor, 18% slower ROPs / lower fillrate, and no cache scrubbers, which might give the PS5 a slightly better effective bandwidth ratio.
Indeed, the contribution of the PS5's faster caches and scrubbers to its overall real-world bandwidth is often overlooked. Additionally, you mentioned the GE: is it because it can cull 22% more geometry at any given time, and thus can also save more bandwidth? And by which mechanism(s) does a higher pixel fillrate contribute positively to bandwidth? I don't think I understood that part.
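The "18% slower" and "22% more" figures are the same clock gap seen from opposite sides. A quick sketch, assuming the commonly quoted GPU clocks of up to 2.23GHz for the PS5 (a variable clock, so this is the ceiling) and a fixed 1.825GHz for the Series X; neither number is stated in the posts above, and the helper names are made up for illustration.

```python
PS5_GPU_CLOCK_GHZ = 2.23   # assumed peak of the variable clock
XSX_GPU_CLOCK_GHZ = 1.825  # assumed fixed clock


def percent_faster(fast, slow):
    """How much faster `fast` is, measured against `slow` as the baseline."""
    return (fast / slow - 1.0) * 100.0


def percent_slower(slow, fast):
    """How much slower `slow` is, measured against `fast` as the baseline."""
    return (1.0 - slow / fast) * 100.0


ps5_advantage = percent_faster(PS5_GPU_CLOCK_GHZ, XSX_GPU_CLOCK_GHZ)
xsx_deficit = percent_slower(XSX_GPU_CLOCK_GHZ, PS5_GPU_CLOCK_GHZ)

print(f"PS5 clock is {ps5_advantage:.1f}% higher than Series X")
print(f"Series X clock is {xsx_deficit:.1f}% lower than PS5")
```

Anything that runs at GPU clock (caches, geometry engine, ROPs) inherits this ratio per unit, which says nothing by itself about how many such units each chip has.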
 