
AMD video on shared cache and 22 % performance improvement

geordiemp

Member


The AMD patent is about a shared L1 cache, which gives an average of 22 % more performance for shader workloads, and almost 50 % for some workloads (BVH calcs in ray tracing).

Note that the technique has a shared and private mode for the L1 cache, as not every instruction benefits from sharing data between caches.
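
To make the shared versus private distinction concrete, here is a toy sketch, not AMD's actual design. It assumes (my simplification, not something the patent summary above states) that in shared mode each CU's L1 holds only its own slice of the address range, so the L1s pool their capacity, while in private mode every CU keeps its own copy of whatever it touches. The names `LRUCache` and `hit_rate` and all the sizes are made up for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache that only tracks which addresses are resident."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, addr):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if addr in self.lines:
            self.lines.move_to_end(addr)  # mark as most recently used
            return True
        self.lines[addr] = None
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        return False


def hit_rate(mode, num_cus=4, lines_per_l1=4, working_set=8, passes=10):
    """Hit rate when every CU walks the same small working set repeatedly.

    mode="private": each CU only ever uses its own L1.
    mode="shared":  address `a` lives only in the L1 of CU `a % num_cus`
                    (hypothetical slicing scheme, assumed for this sketch).
    """
    caches = [LRUCache(lines_per_l1) for _ in range(num_cus)]
    hits = 0
    total = 0
    for _ in range(passes):
        for addr in range(working_set):
            for cu in range(num_cus):
                if mode == "private":
                    target = caches[cu]
                else:
                    target = caches[addr % num_cus]
                hits += target.access(addr)
                total += 1
    return hits / total


if __name__ == "__main__":
    print("private hit rate:", hit_rate("private"))
    print("shared hit rate: ", hit_rate("shared"))
```

With these made-up sizes the private caches thrash, because the working set is twice the size of one L1, while the pooled caches hold every line exactly once. A workload where each CU only touches its own data would gain nothing from pooling, which fits with the technique keeping a private mode. The model ignores the extra latency of reaching another CU's L1.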

What is not clear is the mesh sharing of data, as this also brings L2 into consideration and how big the cache is on die overall.

The current rumour is that this will be a main part of AMD's RDNA2 reveal, as a 22 % performance increase from optimising an L1 cache is a big deal. What we don't know is how big the L2 cache will be to further increase the IPC, as it is also suggested that L2 is on the mesh connections between caches.

I doubt the recent rumours of 128 MB of L2 cache. That would be infinity in die terms, but it would take up so much die that it seems overkill. I would think that, as the caches are meshed to talk to each other, 128 MB is a combination of caches. Not long now, I guess, until the mystery clears.

22 % IPC would make a 23 to 24 TF card punch close to a 30 TF card; it's not all about TF. Certainly interesting and can't wait. This generation will be about speed and caches, from AMD at least.
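
As a rough check on that arithmetic, here is a minimal sketch that treats the 22 % figure as a straight multiplier on peak TF. That is a simplification: the 22 % is an average for shader workloads, and real games will not scale uniformly. The function name is just for illustration.

```python
def equivalent_tflops(peak_tflops: float, ipc_gain: float) -> float:
    """Peak TF a design without the IPC gain would need to keep up,
    assuming the gain applies uniformly (a simplification)."""
    return peak_tflops * (1.0 + ipc_gain)


if __name__ == "__main__":
    for peak in (23.0, 24.0):
        equiv = equivalent_tflops(peak, 0.22)
        print(f"{peak:.0f} TF with +22 % IPC behaves like roughly {equiv:.1f} TF")
```

That lands at roughly 28 to 29 TF, which is where the "close to a 30 TF card" comparison comes from.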

Has PS5 got infinity cache?

Who knows. Adding the infinity cache mesh connection only increases the die by 0.09 mm2 per CU, so we would never know from die shots, as 4 mm2 is pretty hard to detect. It's likely the mesh interconnects for all the cache would increase wafer cost and add more layers rather than die size.

So PS5 MIGHT have L1 infinity cache, but only if the technology was available in time to include it in the design.

We know PS5 won't have much L2, as the die size says we will be lucky to get 8 MB, or 16 MB being overly optimistic.

Cerny certainly would have gone for it, as the cache scrubbers, coherency and the Sony patent on data handling of pixel vertices to shaders tell us it was a strong Sony focus point.

Also note the 50 % BVH improvement. I guess we have seen some PS5 BVH ray tracing already, but it's not a given.

Will XSX have infinity cache?

Just as much chance as PS5, to be fair. Hot Chips did not show it, but MS could not talk about L1 at Hot Chips. The block diagram showed no common L1, but again it could have been altered to adhere to the NDA.

Playing devil's advocate, why do a Hot Chips talk and not show the major IPC improvement? I guess all will come clear soon.

MS will almost certainly have been offered the technology like Sony, and again it's a timescale thing: did console makers have time to include it?

The other confusing aspect of MS is that the XSX chip is dual purpose as a server, and when running 4 instances it may be arranged to have L1 private, as there is no point in sharing L1 cache among 4 different games.

But L1 infinity cache would add about 56 x 0.09 mm2 to the die, so we would not know unless MS tell us. We know the L2 cache is 5 MB, so just scaled by PHY.
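
For anyone who wants the die area numbers spelled out, a quick sketch using only the figures quoted in this post (0.09 mm2 per CU, and 56 CUs for XSX). The helper names are hypothetical, and the per-CU figure is taken at face value rather than verified.

```python
MESH_AREA_PER_CU_MM2 = 0.09  # per-CU figure quoted in this post, taken at face value


def mesh_area_mm2(cu_count: int) -> float:
    """Extra die area for the L1 mesh connection across `cu_count` CUs."""
    return cu_count * MESH_AREA_PER_CU_MM2


def cus_for_area(area_mm2: float) -> float:
    """How many CUs a given amount of mesh area would correspond to."""
    return area_mm2 / MESH_AREA_PER_CU_MM2


if __name__ == "__main__":
    print(f"56 CUs -> {mesh_area_mm2(56):.2f} mm2 of extra area")
    print(f"4 mm2  -> about {cus_for_area(4.0):.0f} CUs worth of mesh")
```

So the 56 CU case works out to about 5 mm2, and the "4 mm2" mentioned earlier corresponds to roughly 44 CUs at that rate. Either way it is small enough to be hard to spot on a die shot.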

We have not seen much ray tracing BVH running in game on XSX, except Minecraft, and I'm not sure if that was playable or how it ran.

Let's hope both consoles have infinity cache, at least L1, and higher BVH capacity, meaning we will see some ray tracing on next-gen consoles.
 

Bo_Hazem

Banned



Man, that should spike TF utilization much further than the current state! Amazing if those leaked performances about 6900XT vs 3080/3090 are true. Could see that as well being a massive thing for PS5 if it has them. One week to go!
 

mansoor1980

Gold Member
Man, that should spike TF utilization much further than the current state! Amazing if those leaked performances about 6900XT vs 3080/3090 are true. Could see that as well being a massive thing for PS5 if it has them. One week to go!
u mean it will utilize the teraflop number efficiently?
 

geordiemp

Member
u mean it will utilize the teraflop number efficiently?

Think of TF as a maximum calculation. In practice, in game workloads ANY GPU may be lucky, with good code and APIs, to use 40 % of that power.

IPC of 22 % is just that: you use up 22 % more of the power of whatever it was performing at for shading. There is also CU utilisation, cache speed and efficiency; it all adds up. TF is just one number.
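
A minimal sketch of that reasoning, assuming sustained throughput is simply peak TF times a utilisation fraction, with the IPC gain scaling whatever was being achieved before. The 10 TF peak is a made-up round number and the function name is hypothetical; the 40 % and 22 % are the ballpark figures from this thread.

```python
def sustained_tflops(peak_tflops: float, utilisation: float, ipc_gain: float = 0.0) -> float:
    """Throughput actually delivered: peak * utilisation, scaled by any IPC gain."""
    return peak_tflops * utilisation * (1.0 + ipc_gain)


if __name__ == "__main__":
    peak = 10.0  # hypothetical peak TF, chosen only to keep the numbers round
    before = sustained_tflops(peak, 0.40)
    after = sustained_tflops(peak, 0.40, ipc_gain=0.22)
    print(f"before: {before:.2f} TF sustained ({before / peak:.1%} of peak)")
    print(f"after:  {after:.2f} TF sustained ({after / peak:.1%} of peak)")
```

The peak number on the box never changes; what moves is how much of it real workloads get to use.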
 

mansoor1980

Gold Member
Think of TF as a maximum calculation. In practice, in game workloads ANY GPU may be lucky, with good code and APIs, to use 40 % of that power.

IPC of 22 % is just that: you use up 22 % more of the power of whatever it was performing at for shading. There is also CU utilisation, cache speed and efficiency; it all adds up. TF is just one number.
so they have been lying to us all along previously
 

geordiemp

Member
so they have been lying to us all along previously

Who has been lying? Nobody who understands how computing works just looks at TF. It's used as a simple metric for PR and marketing. Anyway, in a few days we will see a 22 TF card take on an Nvidia 30 TF card. Should be fun.

Don't be so sure of a clear difference.

does ampere have this infinity cache?

No, it's AMD only. Nvidia have their own tricks, optimisations and IPC gains in Ampere. Showdown.
 

Bo_Hazem

Banned
Is it possible that Cerny and AMD worked closely on this together, and it's a feature the PS5 has (but not XSX) and that's why Sony have been so shy on details, because of a possible NDA?

That could possibly be true. Here's an article by Forbes 2 years ago:

 

LordOfChaos

Member
so they have been lying to us all along previously

Not exactly. It's always just been a paper calculation of the maximum execution resources of the GPU hardware. Think of it like clock speed on a CPU. It doesn't tell you nearly everything: how many instructions it can do per clock, the cache hit rate, the branch prediction, the mispredict penalty, etc. Is clock speed lying? No, it is what it is.

Is it possible that Cerny and AMD worked closely on this together, and it's a feature the PS5 has (but not XSX) and that's why Sony have been so shy on details, because of a possible NDA?

I've wondered the same, if they have to wait for RDNA2 to be fully disclosed first. Xbox was able to say a few details but still not nearly everything.

does ampere have this infinity cache?

It's an AMD-patented technique with an AMD brand name for shared L1 cache. If Nvidia has any such thing, they have not been loud about it. Nvidia did quietly go tile-based a long time before anyone even knew they did, so it's always possible they're using performance techniques they haven't told anyone about.
 

Bo_Hazem

Banned
man microsoft wud be dumb to allow that

It's not about being dumb. Sony has been in collaboration with them, so terms can't be broken, especially if some of those techniques were already patented by Sony, like the GPU cache scrubbers that Sony didn't let AMD have on their GPUs, which should enhance CU utilization even further.

If Sony was using a Windows-based API, you could see MS having some exclusivity of DX12 Ultimate as they worked closely with AMD on the software end of things, but Sony uses OpenGL or something different for their API.

Sony is a hardware company, Microsoft is a software company, and both collaborate with AMD to offset some of those R&D costs and get better deals back for their dies.
 
im gonna call it .....silent sony sauce
Soylent Soyny Sauce. I like it :messenger_tears_of_joy:

For real though, Cerny is an engineering genius, regardless of what company he works for. I wouldn't be surprised if there was something on the PS5 that wasn't shared on the XSX architecture, just because of how closely AMD and Cerny have worked in the past.

Didn't he come up with some crazy shit this gen like checkerboard rendering or that fp16 to fp32 double TF magic or some shit?
 

mansoor1980

Gold Member
It's not about being dumb. Sony has been in collaboration with them, so terms can't be broken, especially if some of those techniques were already patented by Sony, like the GPU cache scrubbers that Sony didn't let AMD have on their GPUs, which should enhance CU utilization even further.

If Sony was using a Windows-based API, you could see MS having some exclusivity of DX12 Ultimate as they worked closely with AMD on the software end of things, but Sony uses OpenGL or something different for their API.

Sony is a hardware company, Microsoft is a software company, and both collaborate with AMD to offset some of those R&D costs and get better deals back for their dies.
this is interesting
i am looking forward to silent sony sauce
need to get GPU next month too for my new desktop
 

mansoor1980

Gold Member
Soylent Soyny Sauce. I like it :messenger_tears_of_joy:

For real though, Cerny is an engineering genius, regardless of what company he works for. I wouldn't be surprised if there was something on the PS5 that wasn't shared on the XSX architecture, just because of how closely AMD and Cerny have worked in the past.

Didn't he come up with some crazy shit this gen like checkerboard rendering or that fp16 to fp32 double TF magic or some shit?
i mean its not a secret any more.............right?
 

Bo_Hazem

Banned
Soylent Soyny Sauce. I like it :messenger_tears_of_joy:

For real though, Cerny is an engineering genius, regardless of what company he works for. I wouldn't be surprised if there was something on the PS5 that wasn't shared on the XSX architecture, just because of how closely AMD and Cerny have worked in the past.

Didn't he come up with some crazy shit this gen like checkerboard rendering or that fp16 to fp32 double TF magic or some shit?

Honestly, I'm not sure why Sony and Microsoft didn't just buy AMD. The longer they wait, the higher the acquisition price becomes. Just like Epic Games years back.
 

LordOfChaos

Member
Honestly, I'm not sure why Sony and Microsoft didn't just buy AMD. The longer they wait, the higher the acquisition price becomes. Just like Epic Games years back.


AMD's market cap is the same as Sony's, so I'm going with that haha.

Even before the big bull run, it's hard to buy something with even a third of your own market cap. Microsoft's biggest acquisitions were what, 22 billion or something; 100B is a different beast.
 

Bo_Hazem

Banned
No need to buy it. It wouldn't be a benefit to the business.

Sony could introduce them into new markets, like cameras, cellphones, etc. But seems like I thought they were cheap.

AMD's market cap is the same as Sony's, so I'm going with that haha.

Maybe I underestimated them a bit, lol.

Interactive chart of historical net worth (market cap) for AMD (AMD) over the last 10 years. How much a company is worth is typically represented by its market capitalization, or the current stock price multiplied by the number of shares outstanding. AMD net worth as of October 20, 2020 is $96.27B.

 

LordOfChaos

Member
Sony could introduce them into new markets, like cameras, cellphones, etc. But seems like I thought they were cheap.



Maybe I underestimated them a bit, lol.

Interactive chart of historical net worth (market cap) for AMD (AMD) over the last 10 years. How much a company is worth is typically represented by its market capitalization, or the current stock price multiplied by the number of shares outstanding. AMD net worth as of October 20, 2020 is $96.27B.



I remember seriously considering getting a bunch of them for 2 dollars a share and just leaving it for decades...Stupid, stupid...
 

Elog

Member
I really believe as well that this is one of the sources of secret sauce in the new cards and the main reason for the secrecy. IF - and that is a big if - they have stacked the L2 cache as speculated yesterday I think everything makes sense. Putting L2 off-die would take away a lot of mm2 from the main die that can be used to enlarge L0 and L1 without ending up with a main die that is too large.

Only a few days now!
 

geordiemp

Member
I really believe as well that this is one of the sources of secret sauce in the new cards and the main reason for the secrecy. IF - and that is a big if - they have stacked the L2 cache as speculated yesterday I think everything makes sense. Putting L2 off-die would take away a lot of mm2 from the main die that can be used to enlarge L0 and L1 without ending up with a main die that is too large.

Only a few days now!

You don't even need a big L2 cache to have infinity technology, just a shared L1 cache between all L0 in a mesh and a way to switch modes between private and shared to get 22 % IPC (and 50 % for BVH).

The big L2 just takes it to the nth degree and optimises bandwidth multipliers even further. We don't even know if the big PC infinity cache is L2 + L1 + L0, as it's all connected.

If it's a separate die, it would have to be a chip-to-chip, low-latency connection for the cache to be performant and effectively seen as on die. I don't think it will be wire bonded, but you never know.

I am done speculating lol, just show us AMD.
 

farmerboy

Member
At this point the infinity cache discussion is the most interesting one. From a design point of view, it sounds like something PS5 would have as the PS5 seems to be centred around optimisation and speed of moving info around the system.

Could this also be something co-designed by Sony and AMD?
 

geordiemp

Member
Ya'll really trying to make this about PS5 and secret sauce now? How well that work out last time? What was it last time...FP32?

I thought I was fairly equal in the possibility of Infinity Cache being included in either console. I am sorry if you read it differently.

It's fun not knowing, more to look forward to...
 
I thought I was fairly equal in the possibility of Infinity Cache being included in either console. I am sorry if you read it differently.

It's fun not knowing, more to look forward to...
Oh, it's fine by me. I'd just hate to see people get disappointed when none of this comes to consoles. It's fun to speculate though. I'd wager that hardware for consoles has been locked down for quite a while, and infinity cache seems to be a newer development. Not to say it couldn't have been added in time. I just don't think it is. Love to be wrong though, the more power the better.

English isn't his/her native language, I guess.

I'll have you refer to me as Theymers from now on thank you very much.
 