
DF: The Touryst PS5 - The First 8K 60fps Console Game

Yes, the X1 GPU was clocked a bit higher, but it had so few CUs that it wasn't enough to close the gap.
But it did have an advantage. The CPU of the X1 had a higher clock speed than the PS4's. And it had DDR3, which has lower latency.
This meant that in a few heavily CPU-bound games, the X1 was able to match or beat the PS4.
The PS4 also had more fillrate, more (I can't recall the technical term) scheduling units to feed its GPU, etc. All in all a very similar approach to the PS5. Also, while the DDR3 may have had slightly lower latency, it also had much lower bandwidth to feed its CPU and GPU.
 

Lysandros

Member
Yes, the X1 GPU was clocked a bit higher, but it had so few CUs that it wasn't enough to close the gap.
But it did have an advantage. The CPU of the X1 had a higher clock speed than the PS4's. And it had DDR3, which has lower latency.
This meant that in a few heavily CPU-bound games, the X1 was able to match or beat the PS4.
The interesting thing is that the PS4 outperformed the Xbox One in quite a few (supposedly) CPU-bound games/scenarios too. Perhaps there was a customization beyond the clock frequency, or this was due to API differences.
 

Arioco

Member
[Image: FINgber.png]


Is that Jesús Quintero smoking a joint? lol

The thing I least expected to see on GAF.

I've always wondered why GAF is so full of Spanish freaks. 😂
 

rushgore

Member
You forget Leadbetter's very simplistic "graphics" argument in the video. Yes, because typically less powerful hardware handles "simplistic" graphics better, lol. The level of stupidity they reach to bias a multiplatform result is something else sometimes.
Did they really say that? LOL. By that logic, backwards compatible games should run worse on next-gen hardware, since they are rendering last-gen graphics with more powerful hardware.
 
It doesn't say anywhere that it's backwards compatibility; the SX also benefited from its native version at launch.
The devs explain clearly that it's thanks to the higher GPU frequency and unified RAM of the PS5 that they were able to push the resolution higher than it was on XSX.
[Image: FINgber.png]

Maybe the developers can release an XSX patch to put it in 8K as well, of course, if they manage to do so with further optimization, but for now it remains at 6K.
They don't say anything about an engine rewrite to take advantage of anything. They said it very specifically this time. This at least implies it wasn't done for the XS.
 

winjer

Gold Member
So doesn't this logic result in higher occupancy per CU 'by nature'? By the way, are you sure that asynchronous compute is directly tied '1:1' to ALU throughput like this?

XSX/PS5 'should' have 4 ACEs + 1 HWS like in the standard RDNA/2 SE architecture, unless Sony added more as in the PS4 (8 ACEs), but this has to be confirmed.

You have to think of it as the execution time to do a task.
If the clock is higher for the shaders, that means there's less time needed to execute the calculation.
This means there's less time to get a new work wave. But since the ACE also has a higher clock speed, it also takes less time to execute its task.
In a perfect world, each unit would do its task in the exact same time as all other units, thus passing its work along the execution pipeline without any unit having to wait.
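To make that concrete, here is a toy sketch of the "everything speeds up together" point. The cycle counts are invented; only the scaling matters:

# Toy numbers only: real shader waves and ACE dispatches are not fixed cycle counts.
def task_time_us(cycles, clock_mhz):
    # Time in microseconds to finish a task of `cycles` cycles at `clock_mhz`.
    return cycles / clock_mhz

for clock_mhz in (1825, 2233):  # roughly Series X-class vs PS5-class GPU clocks
    wave = task_time_us(10_000, clock_mhz)   # hypothetical shader wave
    dispatch = task_time_us(500, clock_mhz)  # hypothetical ACE dispatch
    print(f"{clock_mhz} MHz: wave {wave:.2f} us, dispatch {dispatch:.2f} us")
# Both times shrink by the same factor when the clock rises, so no stage of the
# pipeline ends up waiting longer on another just because the clock went up.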

The interesting thing is that the PS4 outperformed the Xbox One in quite a few (supposedly) CPU-bound games/scenarios too. Perhaps there was a customization beyond the clock frequency, or this was due to API differences.

I don't remember those situations. But it's very possible.
Jaguar didn't have a granular clock speed like Zen. So these differences were probably due to API, drivers, etc.


The PS4 also had more fillrate, more (I can't recall the technical term) scheduling units to feed its GPU, etc. All in all a very similar approach to the PS5. Also, while the DDR3 may have had slightly lower latency, it also had much lower bandwidth to feed its CPU and GPU.

A CU has TMUs, ROPs, vector units, a scheduler, cache, etc.
Add a CU and you add a proportional amount of those other units.
I don't think it's necessary to constantly repeat that the X1 had fewer of all of these units, when just saying that it had fewer CUs amounts to the same thing.

[Image: GCN_CU.jpg]
 
Last edited:

Heisenberg007

Gold Journalism

Trilobit

Member
The Touryst demo was one of the first games I tried on my Switch. It's just super gorgeous. Can't imagine what it'd look like in 8K 60fps. :O
 

M1chl

Currently Gif and Meme Champion

ethomaz

Banned
No.
The dev explained how he got the game to 8K on the PS5. They never said the XSX version could not do 8K. The talk about how the clock speed, CU difference and memory setup will affect performance between the PS5 and Series X is John and Richard's speculation.
Nope.

Dev said it.

“Shin'en tells us that in the case of its engine, the increase to clock frequencies and the difference in memory set-up makes the difference.”
 
Last edited:

ethomaz

Banned
Didn't the X1 also have this compared to the PS4 because of its higher clocked GPU?

Anyway, I'm sure if the dev did a native XSX version they could do an 8K mode too.

Comparing a modified BC Xbox One game to a native PS5 game is a foolish way to draw any conclusions about hardware performance. I'm surprised DF did this.
MS themselves said they got more performance from a 7% upclock than from a 17% increase in CUs on the Xbox One.

But hey, people forget, for whatever reason.
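Rough numbers behind that trade-off (the 40/60 split between CU-bound and other GPU work is made up; only the shape of the argument matters):

# Hypothetical split of frame time between work limited by CU throughput and
# work limited by everything else (front end, ROPs, caches, command processor).
alu_fraction = 0.4
other_fraction = 1 - alu_fraction

def speedup(alu_boost, rest_boost):
    return 1 / (alu_fraction / alu_boost + other_fraction / rest_boost)

print(f"+2 CUs (12 -> 14): {speedup(14 / 12, 1.0):.3f}x")              # ~1.06x: only CU-bound work gets faster
print(f"+53 MHz (800 -> 853): {speedup(853 / 800, 853 / 800):.3f}x")   # ~1.07x: the whole GPU gets faster

A smaller percentage clock bump can come out ahead because it accelerates every part of the chip, not just the shader array.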
 
Last edited:

ethomaz

Banned
This is a game that uses simple shaders, so it ends up not having its performance limited by shader throughput.
At 8K, it's probably just limited by ROPs.
The PS5 has double the Z/stencil ROPs of the Series X. And then it has the clock advantage.

As for the question of wide vs. clock speed:
Let's just remember that the calculations made by GPUs are very easily parallelized.
Besides, both this and the last generation had async compute to fill in the shaders that might otherwise sit empty.
This means that going wide will scale performance very well.
Comparing GPU workloads to CPU workloads is a bit silly. They work in different ways and deal with different types of loads.
Well, only if we forget how the big advantage in the GPU market over the past years was the high clocks of nVidia cards, giving them the edge in performance per TF even with a far lower number of units.

Until, of course, AMD found a way to increase the clock of their GPUs and catch up on the performance-per-TF difference.

It's not like AMD didn't know that.
 
Last edited:

jumpship

Member
That's still a straight port vs. an engine rewrite.
That's like saying Borderlands 3 on PS5 is the best it can be. It's native, after all.

This really points to the difference in how Sony and Microsoft handle their graphics APIs, doesn't it?

On Xbox, devs use the same one-size-fits-all GDK across all consoles/PC and adjust settings depending on the power of the machine. In this case they managed to push the Series X hardware up to 6K 60fps (optimized for Series X).

Sony do things differently in that they use a separate low-level API for each console. The devs could have used the base PS4 version in boost mode on PS5. Thankfully, they instead made a native PS5 version using the PS5 API and all its features to push 8K 60fps.

The devs have even explained why there is a difference between the two consoles; I think some just don't want to accept the answer.
 

winjer

Gold Member
Well, only if we forget how the big advantage in the GPU market over the past years was the high clocks of nVidia cards, giving them the edge in performance per TF even with a far lower number of units.

Until, of course, AMD found a way to increase the clock of their GPUs and catch up on the performance-per-TF difference.

It's not like AMD didn't know that.

You are comparing very different archs, which have units with very different levels of performance.
What a silly comparison to make when talking about wide vs. clock speed.

A better test is to use one GPU and change the clock speed. You will quickly realize that performance doesn't scale linearly.
 

ethomaz

Banned
You are comparing very different archs, which have units with very different levels of performance.
What a silly comparison to make when talking about wide vs. clock speed.

A better test is to use one GPU and change the clock speed. You will quickly realize that performance doesn't scale linearly.
A GPU with the same memory bandwidth, with 20 CUs at 1.5 GHz, will run faster than one with 30 CUs at 1 GHz.

RDNA 2 was where AMD finally fixed the clock disadvantage against nVidia, and the perf/TF became similar when it had been way below before that.
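As a quick sanity check on that example (illustrative math only, assuming the usual 64 FP32 lanes per CU and 2 ops per clock for FMA):

# Both hypothetical GPUs have identical theoretical FP32 throughput.
narrow_fast = 20 * 64 * 2 * 1.5e9 / 1e12   # 20 CUs at 1.5 GHz -> 3.84 TFLOPs
wide_slow   = 30 * 64 * 2 * 1.0e9 / 1e12   # 30 CUs at 1.0 GHz -> 3.84 TFLOPs
print(narrow_fast, wide_slow)
# Any real-world lead for the narrow/fast part therefore has to come from the
# blocks that scale with clock but not with CU count (command processor,
# rasterizers, ROPs, caches), not from raw shader FLOPs.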
 
Last edited:

winjer

Gold Member
A GPU with the same memory bandwidth, with 20 CUs at 1.5 GHz, will run faster than one with 30 CUs at 1 GHz.

RDNA 2 was where AMD finally fixed the clock disadvantage against nVidia, and the perf/TF became similar.

I would like to see proof of that. Can you provide?
 

ethomaz

Banned
I would like to see proof of that. Can you provide?
Yes..

GPU                          5700        5700 XT     5700 OC
CUs                          36          40          36
Clock                        1725 MHz    1905 MHz    2005 MHz
TFLOPs                       7.95        9.75        9.24
TFLOP diff.                  100%        123%        116%
Assassin's Creed Odyssey     50 fps      56 fps      56 fps
F1 2019                      95 fps      112 fps     121 fps
Far Cry: New Dawn            89 fps      94 fps      98 fps
Metro Exodus                 51 fps      58 fps      57 fps
Shadow of the Tomb Raider    70 fps      79 fps      77 fps
Performance difference       100%        112%        115%

All GPUs are based on AMD Navi 10 and have GDDR6 memory at 448 GB/s. Game benchmarks were done at 1440p.


PS: Of course there is a silicon limit for clock speeds, so after you cross that limit on a chip the performance won't scale anymore. For RDNA it was around 2000-2100 MHz, depending on the chip (bin lottery).
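For anyone who wants to check the TFLOP numbers in the table, a minimal sketch (assuming the usual 64 FP32 lanes per CU and 2 ops per clock for FMA):

# TFLOPs = CUs x 64 FP32 lanes x 2 ops (FMA) x clock
def tflops(cus, clock_mhz):
    return cus * 64 * 2 * clock_mhz / 1e6

for name, cus, clock in (("5700", 36, 1725), ("5700 XT", 40, 1905), ("5700 OC", 36, 2005)):
    print(f"{name}: {tflops(cus, clock):.2f} TFLOPs")
# 5700: 7.95, 5700 XT: 9.75, 5700 OC: 9.24 -- matching the table, so the
# overclocked 5700 trails the XT by ~5% in TFLOPs yet lands slightly ahead overall.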
 
Last edited:

benno

Member
Can't you only benefit from this 8K 60fps if you have an 8K TV? Does anyone even have one?
I used to own an 8K TV which I used as a PC monitor with an RTX 3090.

The input lag at 8K is much higher than at 4K. Using the same TV in PC/game mode, I noticed 4K was about 20ms while 8K was closer to 60ms.

8K does look so good on certain games, though. Cyberpunk looked amazing, although it ran at about 20fps on ultra.
 
Last edited:

assurdum

Banned
They don't say anything about an engine rewrite to take advantage of anything. They said it very specifically this time. This at least implies it wasn't done for the XS.
What's the point of rewriting the engine on XSX? Isn't it effectively coded with a multiplatform development kit? Some of you are something else sometimes. You don't have access to a low-level API on XSX, thanks to the DirectX Lord and MS's coding logic, so why blame the developers? They did what they could. Furthermore, if I'm not wrong, they also said the higher resolution was possible thanks to the faster frequency and the memory setup, not only because they "just" rewrote the engine.
 
Last edited:

ethomaz

Banned
What's the point of rewriting the engine on XSX? Isn't it effectively coded with a multiplatform development kit? Some of you are something else sometimes. You don't have access to a low-level API on XSX, thanks to the DirectX Lord and MS's coding logic, so why blame the developers? They did what they could.
That is the whole point of GDK… same development and code across all supported platforms.
 
Last edited:
A CU has TMUs, ROPs, vector units, a scheduler, cache, etc.
Add a CU and you add a proportional amount of those other units.
I don't think it's necessary to constantly repeat that the X1 had fewer of all of these units, when just saying that it had fewer CUs amounts to the same thing.
How do you cope with the fact that the PS5 has fewer units and a higher fill rate? (Rough numbers in the sketch below.)

The ratios are completely different from one console to the next. Actually, the Xbox One had more ROP units per CU (still less fill rate in total):
PS4
CUs: 18
ROPs: 32

PS4 PRO
CUs: 36
ROPs: 64

PS5
CUs: 36
ROPs: 64

One
CUs: 12
ROPs: 16

One X
CUs: 40
ROPs: 32

Series S
CUs: 20
ROPs: 32

Series X
CUs: 52
ROPs: 64

I have not looked at the TMUs, shader units, etc., but I assume they are added in units built to the specs of the client (MS, Sony, Atari, etc.).
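For reference, a rough peak pixel fill-rate comparison; a back-of-the-envelope sketch assuming one pixel per ROP per clock and the publicly listed (boost) clocks, not measured numbers:

# Rough sketch: peak pixel fill rate = ROPs x clock, assuming one pixel per ROP
# per clock and the public (boost) clocks. Real throughput depends on format/blending.
gpus = {
    "PS4":      (32, 0.800),   # ROPs, clock in GHz
    "Xbox One": (16, 0.853),
    "PS5":      (64, 2.233),
    "Series X": (64, 1.825),
}
for name, (rops, clock_ghz) in gpus.items():
    print(f"{name}: {rops * clock_ghz:.1f} Gpixels/s")
# PS5 comes out roughly 22% ahead of Series X here despite identical ROP counts,
# purely because of the clock difference.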

EDIT: Thanks Lysandros, I fixed the PS4 Pro ROPs number.
 
Last edited:
What is a straight port? You're assuming it was a "straight port".

It's a native XSX optimized game, according to Xbox's official website: https://news.xbox.com/en-us/2020/11/13/inside-xbox-series-xs-optimized-the-touryst/
So are many other games that they just run through the SDK-to-GDK process and call it a day. Again, Borderlands 3 is a native PS5 game. Is 1070-level performance the best we can expect from that engine on PS5, or do you think that if they rewrote their engine specifically for the PS5 it could do better?
This really points to the difference in how Sony and Microsoft handle their graphics APIs, doesn't it?

On Xbox, devs use the same one-size-fits-all GDK across all consoles/PC and adjust settings depending on the power of the machine. In this case they managed to push the Series X hardware up to 6K 60fps (optimized for Series X).

Sony do things differently in that they use a separate low-level API for each console. The devs could have used the base PS4 version in boost mode on PS5. Thankfully, they instead made a native PS5 version using the PS5 API and all its features to push 8K 60fps.

The devs have even explained why there is a difference between the two consoles; I think some just don't want to accept the answer.
No, Xbox gives the option of making it simple, not that it has to be. You can do pretty much the same for PS4-PS5, and it's been done quite often. Just not this time.
 

assurdum

Banned
So are many other games that they just run through the SDK-to-GDK process and call it a day. Again, Borderlands 3 is a native PS5 game. Is 1070-level performance the best we can expect from that engine on PS5, or do you think that if they rewrote their engine specifically for the PS5 it could do better?

No, Xbox gives the option of making it simple, not that it has to be. You can do pretty much the same for PS4-PS5, and it's been done quite often. Just not this time.
I mean, at this point I think you are free to believe whatever you want. The developers also said the PS5 specs give them some advantages for the resolution, but eh.
 

winjer

Gold Member
Yes..

GPU                          5700        5700 XT     5700 OC
CUs                          36          40          36
Clock                        1725 MHz    1905 MHz    2005 MHz
TFLOPs                       7.95        9.75        9.24
TFLOP diff.                  100%        123%        116%
Assassin's Creed Odyssey     50 fps      56 fps      56 fps
F1 2019                      95 fps      112 fps     121 fps
Far Cry: New Dawn            89 fps      94 fps      98 fps
Metro Exodus                 51 fps      58 fps      57 fps
Shadow of the Tomb Raider    70 fps      79 fps      77 fps
Performance difference       100%        112%        115%

All GPUs are based on AMD Navi 10 and have GDDR6 memory at 448 GB/s. Game benchmarks were done at 1440p.


PS: Of course there is a silicon limit for clock speeds, so after you cross that limit on a chip the performance won't scale anymore. For RDNA it was around 2000-2100 MHz, depending on the chip (bin lottery).

That's an interesting test. But very limited. First, the number of games is very small.
And then, we don't know which API is being used, and especially whether async compute is on.
For example, Far Cry is a DX11 game. And as far as I know, DX11 has no support for async compute.
This could explain why some games lose on the 5700 XT, and others win.

Regardless, I was expecting clock speed to perform worse, and you proved me wrong.
 

Heisenberg007

Gold Journalism
So are many other games that they just run through the SDK-to-GDK process and call it a day. Again, Borderlands 3 is a native PS5 game. Is 1070-level performance the best we can expect from that engine on PS5, or do you think that if they rewrote their engine specifically for the PS5 it could do better?

No, Xbox gives the option of making it simple, not that it has to be. You can do pretty much the same for PS4-PS5, and it's been done quite often. Just not this time.
You're making a huge left-field assumption that the devs did not create the Xbox version with the same effort as they did the PS5 version, while there is absolutely no reason/evidence to do so.

The devs have already told us the reason it's 8K on PS5, i.e., their game engine was more suitable to how the PS5 is designed: better memory setup and higher GPU clocks. That's it. It's really that simple.
 
Last edited:

Zathalus

Member
What's the point of rewriting the engine on XSX? Isn't it effectively coded with a multiplatform development kit? Some of you are something else sometimes. You don't have access to a low-level API on XSX, thanks to the DirectX Lord and MS's coding logic, not the developers.
The version of DX12 on the XSX is a low level API though. Heck, that was the entire point of DX12 in the first place, to compete with GNM on the PS4.


Q: Are you happy as DX12 as a low hardware API? A: DX12 is very versatile - we have some Xbox specific enhancements that power developers can use. But we try to have consistency between Xbox and PC. Divergence isn't that good. But we work with developers when designing these chips so that their needs are met. Not heard many complains so far (as a silicon person!). We have a SMASH driver model. The games on the binaries implement the hardware layed out data that the GPU eats directly - it's not a HAL layer abstraction. MS also re-writes the driver and smashes it together, we replace that and the firmware in the GPU. It's significantly more efficient than the PC.


DirectX 12 is a low-level programming API from Microsoft that reduces driver overhead in comparison to its predecessors.

It's also the reason DX12 has been such a hit and miss on the PC side of things; your game engine has to be specifically coded to work well with it. Hence some games saw massive gains (on AMD hardware) and others saw very little or actually regressed in performance.

Now it may very well be that the PS5 API has lower overhead than the DX12 version on the XSX, but it likely isn't anything huge, as both are effectively low-level APIs. Both consoles will continue to get performance enhancements as their lifecycles go on; we even have a recent example of this: Control was improved on both consoles with zero input from the developer.
 

ethomaz

Banned
That's an interesting test. But very limited. First, the number of games is very small.
And then, we don't know which API is being used, and especially whether async compute is on.
For example, Far Cry is a DX11 game. And as far as I know, DX11 has no support for async compute.
This could explain why some games lose on the 5700 XT, and others win.

Regardless, I was expecting clock speed to perform worse, and you proved me wrong.
API has no place in that type of test… after all, each game is using the same API across the two cards.
You can do it yourself if you have the card.

You asked for proof… I gave it to you in five seconds, but you chose to ignore it.

There is another German site with similar results and conclusions, but I don't speak German, so it would take too much time to find it.

Clock over cores has been a thing in CPU/GPU/etc. processors for decades… the reason cores were increased instead of clocks was just the limitation of silicon tech in reaching high clocks, so they found an alternative way to increase performance, which was to parallelize cores.

While a unit increase doesn't give the same increase in performance as a clock increase, it is still pretty good and very close, and it has no limitation regarding the silicon except its size and/or heat.
 
Last edited:

winjer

Gold Member
API has no place in that type of test… after all, each game is using the same API across the two cards.
You can do it yourself if you have the card.

I only have a 2070 Super.
API has a place, since async is the way to keep those shaders well fed.
It was one of the big improvements of the PS4 and X1 generation.
And it's only possible with low-level APIs, like DX12 and Vulkan.
Without it, the GPU scheduler is kind of blind and stupid.
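A toy model of that "keeping the shaders fed" point (all numbers invented, purely to show the mechanism):

# Hypothetical frame: graphics work keeps the shader ALUs busy only 70% of the time.
frame_ms = 16.7
gfx_busy_fraction = 0.70
idle_ms = frame_ms * (1 - gfx_busy_fraction)

async_compute_ms = 4.0  # hypothetical compute work (e.g. post-processing) queued async
overlapped = min(async_compute_ms, idle_ms)
serial_frame = frame_ms + async_compute_ms                  # without async: compute runs after graphics
async_frame = frame_ms + (async_compute_ms - overlapped)    # with async: it fills the idle gaps
print(f"serial: {serial_frame:.1f} ms, overlapped: {async_frame:.1f} ms")
# 20.7 ms vs 16.7 ms in this toy case: the compute work disappears into shader
# idle time instead of extending the frame.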
 
Last edited:

ethomaz

Banned
I only have a 2070 Super.
API has a place, since async is the way to keep those shaders well fed.
It was one of the big improvements of the PS4 and X1 generation.
And it's only possible with low-level APIs, like DX12 and Vulkan.
Without it, the GPU scheduler is kind of blind and stupid.
The fact that units keep going unused during render time already tells you a clock increase has an advantage over a unit increase.

You don't need workarounds like async compute to get more performance from an increase in clock.

You are just trying to find an excuse for a fact… a clock increase delivers more performance than a unit increase (MS themselves know that).
 
Last edited:

assurdum

Banned
The version of DX12 on the XSX is a low level API though. Heck, that was the entire point of DX12 in the first place, to compete with GNM on the PS4.







It's also the reason DX12 has been such a hit and miss on the PC side of things; your game engine has to be specifically coded to work well with it. Hence some games saw massive gains (on AMD hardware) and others saw very little or actually regressed in performance.

Now it may very well be that the PS5 API has lower overhead than the DX12 version on the XSX, but it likely isn't anything huge, as both are effectively low-level APIs. Both consoles will continue to get performance enhancements as their lifecycles go on; we even have a recent example of this: Control was improved on both consoles with zero input from the developer.
It's not exactly the same as having effective low-level API access to the hardware, however. It's the same argument MS used for the Velocity Architecture.
 
Last edited:

Lysandros

Member
A CU has TMUs, ROPs, vector units, a scheduler, cache, etc.
Add a CU and you add a proportional amount of those other units.
I don't think it's necessary to constantly repeat that the X1 had fewer of all of these units, when just saying that it had fewer CUs amounts to the same thing.

[Image: GCN_CU.jpg]
Thanks for the reply. How do you explain the same amount of ROPs, GPU L1 cache, ACEs and HWS, along with other units like rasterizers, prim units, etc., across PS5/XSX despite the CU number difference then? As far as I know, those scale with the shader engine/array, not the CU count.
 

winjer

Gold Member
The fact that units keep going unused during render time already tells you a clock increase has an advantage over a unit increase.

You don't need workarounds like async compute to get more performance from an increase in clock.

You are just trying to find an excuse for a fact… a clock increase delivers more performance than a unit increase (MS themselves know that).

Async compute is not some workaround. It's a significant improvement to GPU scheduling.
You wouldn't complain if Intel or AMD improved its CPU performance by improving its branch prediction.
The fact is that async compute has been a standard for the last two console generations, on both Sony and MS machines.
Even things like choosing the width of work groups can have an effect on performance.
For example, from GCN to RDNA2, AMD went from work waves of 64 to 32, so it can have better granularity in shader assignment.
Is this also a problem?

Thanks for the reply. How do you explain the same amount of ROPs, GPU L1 cache, ACEs and HWS, along with other units like rasterizers, prim units, etc., across PS5/XSX despite the CU number difference then? As far as I know, those scale with the shader engine/array, not the CU count.

I was considering the GPUs on PC, which have a very modular stack.
Sony and MS have a more custom GPU approach.
Like I said before, the PS5 has double the depth ROPs of the Series X, and a higher clock speed.
This is due in part to the PS5 back end being in line with RDNA1, and the render back end of the Series X being more in line with RDNA2.
But also, MS decided to sacrifice ROPs, maybe to give a bit more space to shaders.
 
Last edited:
I mean, at this point I think you are free to believe whatever you want. The developers also said the PS5 specs give them some advantages for the resolution, but eh.
Yes some, but double?
You're making a huge left-field assumption that the devs did not create the Xbox version with the same effort as they did the PS5 version, while there is absolutely no reason/evidence to do so.

The devs have already told us the reason it's 8K on PS5, i.e., their game engine was more suitable to how the PS5 is designed: better memory setup and higher GPU clocks. That's it. It's really that simple.
No, I'm not. One is a launch game; the other came out a year later. For one they said they rewrote their engine to better use the hardware, and for the other they said no such thing. I think it's more left-field to assume the small differences in hardware would ever give double the resolution.
 

Heisenberg007

Gold Journalism
Yes some, but double?

No, I'm not. One is a launch game; the other came out a year later. For one they said they rewrote their engine to better use the hardware, and for the other they said no such thing. I think it's more left-field to assume the small differences in hardware would ever give double the resolution.

Here is a quote from the developers:
“Shin'en tells us [Digital Foundry] that in the case of its engine, the increase to clock frequencies and the difference in memory set-up makes the difference.”
You're believing something that has no evidence and outright discarding what the developer is literally telling us.
 

Zathalus

Member
It's not exactly the same as having effective low-level API access, however.
But that is what it is. Both GNM and DX12 are low-level APIs. There is no real way to prove which one is more 'low level' than the other. We just know that both are. Unless you think Microsoft, Nvidia, and AMD are all lying?

The difference between DX11 and DX12 is the same as the difference between GNMX and GNM; heck, the PlayStation Shader Language is very similar indeed to the HLSL standard in DirectX, as confirmed by a developer:


Another key area of the game is its programmable pixel shaders. Reflections' experience suggests that the PlayStation Shader Language (PSSL) is very similar indeed to the HLSL standard in DirectX 11, with just subtle differences that were eliminated for the most part through pre-process macros and what O'Connor calls a "regex search and replace" for more complicated differences.

Also some good explanations about the difference between the PS4 GNM low level API and DX11:

A lot of work was put into the move to the lower-level GNM, and in the process the tech team found out just how much work DirectX does in the background in terms of memory allocation and resource management. Moving to GNM meant that the developers had to take on the burden there themselves, as O'Connor explains:

"The Crew uses a subset of the D3D11 feature-set, so that subset is for the most part easily portable to the PS4 API. But the PS4 is a console not a PC, so a lot of things that are done for you by D3D on PC - you have to do that yourself. It means there's more DIY to do but it gives you a hell of a lot more control over what you can do with the system."

The move from DX11 to DX12 is the exact same thing: many things that DX11 handled need to be optimised and coded by the developer:

[Image: slide_4.jpg]
 

Connxtion

Member
You're making a huge left-field assumption that the devs did not create the Xbox version with the same effort as they did the PS5 version, while there is absolutely no reason/evidence to do so.

The devs have already told us the reason it's 8K on PS5, i.e., their game engine was more suitable to how the PS5 is designed: better memory setup and higher GPU clocks. That's it. It's really that simple.
COD MW was run through the newer XDK and then reverted, as it did bugger all bar force the game onto the internal storage 😂

So it's 100% possible that they did the bare minimum to get it to 6K. (We just don't know; well, we know the renderer wasn't rewritten for the Xbox.) Seems the Xbox just brute-forces its way to 6K 🤷‍♂️

Also, the dev stated that the renderer was rewritten for the PS5 APIs; that's more than what was done for the Xbox version.
 
Here is a quote from the developers:

You're believing something that has no evidence and outright discarding what the developer is literally telling us.
No, not at all. You are disregarding the next line in the same quote. Unless writing engines for specific hardware doesn't give better performance?
 