Support NeoGAF

thicc_girls_are_teh_best · Jul 13, 2020

Ascend Yeah, when we take this stuff and then line it up with mentions of how SFS works, you need extremely low latency to stream in textures of higher quality seamlessly. It starts to add up and that's good since MS want to take their sweet time to talk about the architecture themselves xD.

I think GDDR6's latency figures are way better than GDDR5's...wouldn't be surprised if it's very close to DDR4, though DDR5 and especially HBM2E will/do have better latency. MS have made some customization with the GDDR6 (either the chips themselves, the memory controller, features of the OS and kernel or some mixture of all of that) to implement ECC. Maybe that helps a bit?

Whatever the case I trust they've considered any issues regarding GDDR6 latency impacting any aspect of XvA they'd want to implement and design said parts of XvA around that.

oldergamer I think it comes down to some people not knowing that things like latency, bandwidth, speed etc. are different metrics and the technology standard being used in particular contextualizes them.

For example when talking about PCIe, bandwidth and speed are generally interchangeable; even though PCIe data is sent serial (PCI was parallel), you're still basically moving whatever amount of claimed data that spec rates at, at that amount per second. It's just serial rather than parallel over the lane. Latency can have a big impact on speed depending on what the task calls for (there's a good example in the last bits of the post from function I quoted earlier), and bandwidth (as in parallelism of data over a fabric or network) can as well.

They're not exactly one in the same even if many standards essentially blend performance to a point where it seems they could be used interchangeably.

Xplainin · Jul 13, 2020

thicc_girls_are_teh_best said:
Ascend Yeah, when we take this stuff and then line it up with mentions of how SFS works, you need extremely low latency to stream in textures of higher quality seamlessly. It starts to add up and that's good since MS want to take their sweet time to talk about the architecture themselves xD.

I think GDDR6's latency figures are way better than GDDR5's...wouldn't be surprised if it's very close to DDR4, though DDR5 and especially HBM2E will/do have better latency. MS have made some customization with the GDDR6 (either the chips themselves, the memory controller, features of the OS and kernel or some mixture of all of that) to implement ECC. Maybe that helps a bit?

Whatever the case I trust they've considered any issues regarding GDDR6 latency impacting any aspect of XvA they'd want to implement and design said parts of XvA around that.

Do we know what MS were talking about when they said signalling was an issue if they had of used 560gbs all round on the RAM? And would the PS5 also suffer from signalling with their RAM?
I never quite understood what MS were going on about with that.

thicc_girls_are_teh_best · Jul 13, 2020

Xplainin said:
Do we know what MS were talking about when they said signalling was an issue if they had of used 560gbs all round on the RAM? And would the PS5 also suffer from signalling with their RAM?
I never quite understood what MS were going on about with that.

Believe it has something to do with electrical interference. With highly parallel buses crosstalk can become an issue. It's the major reason why we moved from PCI (a parallel interface) to PCIe (a serial interface).

But with GDDR in particular, the other issue is the length of the traces between the chips and the APUs. As the speeds keep increasing you have to try keep making the traces shorter and shorter. But traditionally GDDR is flatly placed on the PCB around the processor component (one of the reasons why if you want a wide bandwidth you need a larger chip to mux more memory controllers through). I read some stuff about stacked GDDR6, but don't honestly know how it functions or if there's any retail products that have implemented it.

Potentially stacked GDDR could compete with HBM RAM, which attempts solving some of the issues mentioned about GDDR by stacking DRAM modules on top of another (it has its own complications though, mainly with the interposer that interfaces the chip lanes with the processor), but it's an unknown for me. But yeah, I figure signaling rate of electrical current on the traces possibly causing crosstalk is what MS are referring to, it could also factor into why (aside from price) they went with slower 14Gbps chips, same as Sony.

Speaking of Sony, the Oberon C0/E0 (Oberon is, essentially, the PS5 APU's GPU) revision increased the memory controller to support faster 16Gbps chips, so technically they could've pushed for 512 GB/s if they wanted. But they are bound by the same potential of crosstalk interference issues as MS and maybe moreso in Sony's case since the GPU itself is already running so fast! I don't think the crosstalk/electrical signal interference issues are too bad tbh, but MS and Sony have to design systems that'll guarantee a level of long-term stability over a period of years.

Due to that, they can't push certain specs too far if they only give them temporary performance boosts but long-term performance degradation problems (or worst).

pawel86ck · Jul 13, 2020

I was convinced PS5 I/O was far supperior compared to XSX because SSD RAW numbers looks better on PS5, but after reading Thicc and Ascend posts (thanks guys for sharing your knowledge) it looks like MS is doing things differently, therefore they dont need as fast SSD as Sony in order to achieve the same goal. We will see interesting battle in the near future for sure.

sinnergy · Jul 13, 2020

Like I said before , for the 100 GB IT Sounds more like adres mapping into/onto the SSD, much like they remapped the esram from Xbox one to work on the One X as it only had GDDR5 if I am correct . Would be more like a cartridge idea from nes snes age, but totally awesome with today’s tech.

and a nice extension what they did with One X.

thicc_girls_are_teh_best · Jul 13, 2020

pawel86ck said:
I was convinced PS5 I/O was far supperior compared to XSX because SSD RAW numbers looks better on PS5, but after reading Thicc and Ascend posts (thanks guys for sharing your knowledge) it looks like MS is doing things differently, therefore they dont need as fast SSD as Sony in order to achieve the same goal. We will see interesting battle in the near future for sure.

Yep, it would seem MS are focused more on cutting down latency while Sony is focused on increasing I/O bandwidth. MS's solution always required something that could be scalable even for lower-performance SSDs since XvA is also targeting PCs (or at least many parts of it, such as DirectStorage). So if you already know your solution can't rely on a standardized bandwidth (since different drives have different types of bandwidth), how do you still increase performance by a large factor? You optimize your solution around addressing latency.

Sony's solution only really need be concerned with PS5 and while they're allowing 3rd-parties to make compatible drives, they need hardware overhead to meet the raw throughput of Sony's approach which is designed around maximizing bandwidth. So a really simple way to express it is that Sony's is stronger, but MS's is smarter. But like the user 'function' was mentioning, a drawback with MS's approach is you need compatible drives made with custom firmware. That might explain why they've partnered with Seagate on the expansion cards.

sinnergy said:
Like I said before , for the 100 GB IT Sounds more like adres mapping into/onto the SSD, much like they remapped the esram from Xbox one to work on the One X as it only had GDDR5 if I am correct . Would be more like a cartridge idea from nes snes age, but totally awesome with today’s tech.

and a nice extension what they did with One X.

A dev on Twitter named Louise Kirby speculated on this. I remember a user here (very knowledgeable person seemed like) saying it was impractical since SSD speeds are way too slow compared to GDDR6. Which is true.

BUT, that might've been them looking at it the wrong way it seems. It was never about speed or bandwidth, but latency. And now I think we have enough converging speculations from various people both here and on other forums that support this. It also might support something I had thought of before but was talked out of: since just-in-time frame asset streaming (for things like textures) are more reliant on latency than bandwidth, there's the possibility the GPU could be able to stream in textures from storage directly to then either place in the GPU-optimized 10 GB pool, place in the 4 GB (as 4x 1 GB chips) pool while the CPU or other components are accessing the 6 GB pool (not really sure if this would be possible), or even just having a way to stream texture data directly through the GPU to load in the caches.

The third option might be possible if the GPU essentially sees the 100 GB partition package as virtual RAM, that way it's just treating it as RAM (in terms of memory mapping, not as a scratchpad because of NAND endurance levels), so rather than needing to put the data in physical RAM it can read it and then copy the data to the GPU cache. I think that might require a HBCC (someone else talked about that here, I think it was Ascend), and also some coprocessor component in the GPU which I personally think is the ARM cores mentioned in an Indian AMD engineer's LinkedIn from months back (this person was also a member of the Series X APU engineering team).

That would be a very big boost in executeIndirect features which were already present on XBO and a couple of Nvidia cards, and modern Nvidia GPUs already use FPGA cores for management of the onboard RAM plus supporting features like GPUDirectStorage, so it's a possibility the ARM cores mentioned in that profile could be a customization here (especially considering the XBO and One X don't have ARM cores AFAIK and handle background OS tasks differently than PS4 Pro, which did have an ARM core and 1 GB DDR3 for background tasks).

Deleted member 775630 · Jul 14, 2020

Xbox explained the XVA in more common language, here.

GODbody · Jul 14, 2020

X-Fighter said:
Xbox explained the XVA in more common language, here.

Some bullet points

Xbox Velocity Architecture

A revolutionary breakthrough in speed and performance beyond just hardware
Engineered to deliver the ultimate next-gen gaming experience

Custom NVMe SSD

Storage throughput 40x faster than Xbox One

DirectStorage API

Prioritizes game data for ultra low latency

Hardware-Accelerated Decompression

Maximizes throughput and CPU performance

Sampler Feedback Streaming

On-Demand texture detail eliminates wasted loading
Delivers a 2.5x multiplier of SSD performance and memory on average* (*2.5x performance improvement of Xbox Series X SSD and memory; varies by content)

T-Cake · Jul 14, 2020

I still can't get my head around how this is going to translate to the same games running on a PC.

Every time Xbox releases one of these tech spiels, I start salivating and want an Xbox Series X. Then I think about my PC and do I really need an Xbox. But then I think about my intial question. And so it goes round and round in a loop...

Jigsaah · Jul 14, 2020

GODbody said:
Some bullet points

Xbox Velocity Architecture

A revolutionary breakthrough in speed and performance beyond just hardware

Engineered to deliver the ultimate next-gen gaming experience

Custom NVMe SSD

Storage throughput 40x faster than Xbox One

DirectStorage API

Prioritizes game data for ultra low latency

Hardware-Accelerated Decompression

Maximizes throughput and CPU performance

Sampler Feedback Streaming

On-Demand texture detail eliminates wasted loading

Delivers a 2.5x multiplier of SSD performance and memory on average* (*2.5x performance improvement of Xbox Series X SSD and memory; varies by content)

So how does this compare to Kraken?

Bernkastel · Jul 14, 2020

New OP based on new info.

Bernkastel · Jul 14, 2020

Jigsaah said:
So how does this compare to Kraken?

Theres also lots of tweets on XSX hardware decompression by Richard Geldreich (worked at Space X, Valve and Ensemble)

Xbox Series X’s BCPack Texture Compression Technique 'might be' better than the PS5’s Kraken

Microsoft might have an ace up its sleeve to trump the PS5′ Kraken tech. Of the many things Mark Cerny explained in detail during Sony’s recent deep dive into the PS5’s tech, one thing he mentioned was the console’s new texture decompression tech, called Kraken. Though it’s too specific to be...

www.neogaf.com

Xbox Velocity Architecture - 100 GB is instantly accessible by the developer through a custom hardware decompression block

Perhaps you could show how you arrive at this conclusion.... Because clocks mean nothing naturally, the more CUs the better, even if they were clocked at 800mhz.

www.neogaf.com

Vasto · Jul 14, 2020

Velocity will be better than what PS5 has and Microsoft will actually show how their games are using it. With all of PS5s SSD talk it did not amount to anything. Now you hear crickets but before the conference that is all people talked about.

psorcerer · Jul 14, 2020

GODbody said:
Some bullet points

Xbox Velocity Architecture

A revolutionary breakthrough in speed and performance beyond just hardware

Engineered to deliver the ultimate next-gen gaming experience

Custom NVMe SSD

Storage throughput 40x faster than Xbox One

DirectStorage API

Prioritizes game data for ultra low latency

Hardware-Accelerated Decompression

Maximizes throughput and CPU performance

Sampler Feedback Streaming

On-Demand texture detail eliminates wasted loading

Delivers a 2.5x multiplier of SSD performance and memory on average* (*2.5x performance improvement of Xbox Series X SSD and memory; varies by content)

Not a single number. Just vague 2.5x of some unknown performance.
Pathetic.

Mochilador · Jul 14, 2020

T-Cake said:
I still can't get my head around how this is going to translate to the same games running on a PC.

Every time Xbox releases one of these tech spiels, I start salivating and want an Xbox Series X. Then I think about my PC and do I really need an Xbox. But then I think about my intial question. And so it goes round and round in a loop...

You may not need it. PC and consoles have different audiences.

LordOfChaos · Jul 14, 2020

X-Fighter said:
Xbox explained the XVA in more common language, here.

Definitely seems like they're trying to get ahead of being thought of as the more generic SSD and getting ahead of the IO complex (do they need a snazzy marketing term?) being marketed in earnest. And the focus on being a software + hardware solution. The PS5's SSD still does more, but it's going to be harder to explain the difference to most gamers who aren't into the intricacy of hardware, so smart move starting to market it in earnest first from Microsoft.

Jigsaah said:
So how does this compare to Kraken?

Both figures have already included compression figures. BCpack may have a higher ratio than Kraken, but the end total is already figured in. BCpack is also for game textures, Kraken afaik is applied to everything.

Thugnificient · Jul 14, 2020

I like reading about this because it's awesome to know how the tech works but ultimately, specs and numbers mean nothing. We'll know the real deal only when we see the games.

TheCrimsnFuckr · Jul 14, 2020

T-Cake said:
I still can't get my head around how this is going to translate to the same games running on a PC.

Every time Xbox releases one of these tech spiels, I start salivating and want an Xbox Series X. Then I think about my PC and do I really need an Xbox. But then I think about my intial question. And so it goes round and round in a loop...

Games will run on PC by both requiring larger amounts of RAM and a relatively fast SSD. A lot of the compression and DirectStorage API will also come over to Windows as well even if it won't work as well as the console.

Also, I use a PC and an Xbox for gaming, and I love that for Xbox first party games I can pick up right where I left off on either my PC or my Xbox. I wish it was also a thing for more third party games but who knows maybe it will. Personally I find the Xbox to be perfectly complimentary to my PC and as Xbox first party teams release more RPGs I am excited that if I want to just play on my couch after a long day I'll be able to.

Not enough to convince you to get an Xbox console if you already prefer PS I know, but I really enjoy the seamless cross save between PC and Xbox for first party games. Game Pass makes it much easier for me too but not everyone likes a subscription

Eliciel · Jul 14, 2020

X-Fighter said:
Xbox explained the XVA in more common language, here.

I think it has been specifically released to comment on PS5 SSD narrative.
This now makes it harder for PS5 to "stay ahead" or quality for a "stand-alone" argument except if Sony start talking about something we don't know yet..?

Jigsaah · Jul 14, 2020

Bernkastel said:
Theres also lots of tweets on XSX hardware decompression by Richard Geldreich (worked at Space X, Valve and Ensemble)

Xbox Series X’s BCPack Texture Compression Technique 'might be' better than the PS5’s Kraken

Microsoft might have an ace up its sleeve to trump the PS5′ Kraken tech. Of the many things Mark Cerny explained in detail during Sony’s recent deep dive into the PS5’s tech, one thing he mentioned was the console’s new texture decompression tech, called Kraken. Though it’s too specific to be...

www.neogaf.com

Xbox Velocity Architecture - 100 GB is instantly accessible by the developer through a custom hardware decompression block

Perhaps you could show how you arrive at this conclusion.... Because clocks mean nothing naturally, the more CUs the better, even if they were clocked at 800mhz.

www.neogaf.com

Appreciate it man. Seems like those in the know think Xbox Velocity may be better than what PS5 Kraken is offering when all factors are considered. Still need to see how it results in the games, but I mean the power is all the way there it seems.

oldergamer · Jul 14, 2020

these videos don't go into any specifics. Its just what was said before.

Thugnificient · Jul 14, 2020

Eliciel said:
I think it has been specifically released to comment on PS5 SSD narrative.
This now makes it very hard for PS5 to "stay ahead", except if they start talking about something we don't know yet..?

You can read more here.

Through the massive increase in I/O throughput, hardware-accelerated decompression, DirectStorage, and the significant increases in efficiency provided by Sampler Feedback Streaming, the Xbox Velocity Architecture enables the Xbox Series X to deliver effective performance well beyond the raw hardware specs, providing direct, instant, low-level access to more than 100GB of game data stored on the SSD just in time for when the game requires it.

enigmatic enthusiast · Jul 14, 2020

Eliciel said:
I think it has been specifically released to comment on PS5 SSD narrative.
This now makes it very hard for PS5 to "stay ahead", except if they start talking about something we don't know yet..?

What are you talking about? This proves the PS5 ssd advantage as it bears out the dominate throughput of the PS5. The ooDle license will widen the gap as it wasn't included in the 8-9Gbps>6.2Gbps

Eliciel · Jul 14, 2020

enigmatic enthusiast said:
What are you talking about? This proves the PS5 ssd advantage as it bears out the dominate throughput of the PS5. The ooDle license will widen the gap as it wasn't included in the 8-9Gbps>6.2Gbps

e.g. we didn't know decompression was Hardware based before

thicc_girls_are_teh_best · Jul 14, 2020

psorcerer said:
Not a single number. Just vague 2.5x of some unknown performance.
Pathetic.

They're probably referring to the raw speed, so 2.4 GB/s x 2.5 = 6 GB/s which is the same figure they stated way earlier for higher compressed data transfer rate.

Jigsaah said:
Appreciate it man. Seems like those in the know think Xbox Velocity may be better than what PS5 Kraken is offering when all factors are considered. Still need to see how it results in the games, but I mean the power is all the way there it seems.

I think the area where XvA beats PS5's SSD I/O is in latency. There's a lot of things pointing to it, and I've been talking about it the past few days plus getting insight into it from people on other forums like B3D (I'm just a lurker tho x3).

In terms of raw bandwidth Sony's solution is still faster, I don't think that can be denied. But if MS has been prioritizing extremely low latency the entire time (as it seems they have), that actually brings massive advantages in frame-to-frame asset streaming, prefetching etc. Basically taking what the DiRT 5 developer was speaking of and validating it.

enigmatic enthusiast said:
What are you talking about? This proves the PS5 ssd advantage as it bears out the dominate throughput of the PS5. The ooDle license will widen the gap as it wasn't included in the 8-9Gbps>6.2Gbps

Here's the problem. You're still looking at it apples-to-apples. Their approaches actually ARE quite different and while Sony's prioritized raw bandwidth, MS has prioritized extremely low latency. So a lot of the paper specs, in practice, can end up being cancelled out.

Which is why comparing things through the paper specs alone was never a good idea. But hey, people were doing the same thing with the GPUs and the TFs all up until March so it is what it is

Thugnificient · Jul 14, 2020

enigmatic enthusiast said:
What are you talking about? This proves the PS5 ssd advantage as it bears out the dominate throughput of the PS5. The ooDle license will widen the gap as it wasn't included in the 8-9Gbps>6.2Gbps

It's GB/s not Gbps.

Kraken won't speed it up. The purpose is that the algo is fast enough to process info that takes advantage of the PS5's SSD, unlike zlib. Kraken won't magically transform 8-9GB/s of raw data into 2,000 GB/s.

psorcerer · Jul 14, 2020

thicc_girls_are_teh_best said:
They're probably referring to the raw speed, so 2.4 GB/s x 2.5 = 6 GB/s which is the same figure they stated way earlier for higher compressed data transfer rate.

Yup. But no new info. Just vague marketing statements.

sendit · Jul 14, 2020

psorcerer said:
Yup. But no new info. Just vague marketing statements.

Yep. I don't understand how they can't explain their tech at a deeper level. NDA maybe? They add buzzwords to every tech they have and let people run wild with it.

GODbody · Jul 14, 2020

psorcerer said:
Yup. But no new info. Just vague marketing statements.

cyber69 said:
Yep. I don't understand how they can't explain their tech at a deeper level. NDA maybe? They add buzzwords to every tech they have and let people run wild with it.

They released an accompanying article about the Velocity Architecture here.

Kazekage1981 · Jul 14, 2020

Crimson_Fate · Jul 14, 2020

Thugnificient said:
It's GB/s not Gbps.

Kraken won't speed it up. The purpose is that the algo is fast enough to process info that takes advantage of the PS5's SSD, unlike zlib. Kraken won't magically transform 8-9GB/s of raw data into 2,000 GB/s.

The PS5 SSD rate is fixed at a maximum of 5.5GB/s raw and The decompression chip in the PS5 can handle the the maximum read rate of 5-5.5GB/s as in input . If the raw data it reads is compressed using Kraken or Oodle then this data after decompression can be as high as 22GB/s . I think they mentioned the average of about 9 for Kraken (likely more for Oodle)

So you are right, Kraken does not speed up the "RAW" read rate but it does increase the overall data read rate After decompression

Ascend · Jul 14, 2020

psorcerer said:
Not a single number. Just vague 2.5x of some unknown performance.
Pathetic.

It's not unknown. We have 2.4 GB/s raw, and 4.8 GB/s compressed. On top of that, you have this multiplier. So in practice, you would be getting 12GB/s equivalent throughput of doing things raw. I made a post about this quite a while back.

Edit: After reading through the link on MS website, it seems that it is above the raw throughput, not the compressed throughput. So the 12GB/s is incorrect, and it is indeedn 2.4 GB/s * 2.5, which gives you 6GB/s. Still fine. I guess I over-speculated about a few things back then ^_^

Edit2: Hm... I'm doubtful again.

This innovation results in approximately 2.5x the effective I/O throughput and memory usage above and beyond the raw hardware capabilities on average.

Xbox Velocity Architecture: A Closer Look at the Next-Gen Tech Driving Gaming Innovation Forward on Xbox Series X - Xbox Wire

When we set out to design the Xbox Series X, we aspired to build our most powerful console ever powered by next generation innovation and delivering consistent, sustained performance never before seen in a console with no compromises. To achieve this goal, we knew we needed to analyze each...

news.xbox.com

Must we interpret that as 2.5x above the 2.5GB/s, or, 2.5x above the 4.8GB/s, because, the compression is also hardware, so that means it should be 2.5x above 4.8GB/s. The word 'raw' can be interpreted to mean all hardware, or specifically the raw throughput of the I/O.

GODbody · Jul 14, 2020

thicc_girls_are_teh_best said:
They're probably referring to the raw speed, so 2.4 GB/s x 2.5 = 6 GB/s which is the same figure they stated way earlier for higher compressed data transfer rate.

I think the area where XvA beats PS5's SSD I/O is in latency. There's a lot of things pointing to it, and I've been talking about it the past few days plus getting insight into it from people on other forums like B3D (I'm just a lurker tho x3).

In terms of raw bandwidth Sony's solution is still faster, I don't think that can be denied. But if MS has been prioritizing extremely low latency the entire time (as it seems they have), that actually brings massive advantages in frame-to-frame asset streaming, prefetching etc. Basically taking what the DiRT 5 developer was speaking of and validating it.

Here's the problem. You're still looking at it apples-to-apples. Their approaches actually ARE quite different and while Sony's prioritized raw bandwidth, MS has prioritized extremely low latency. So a lot of the paper specs, in practice, can end up being cancelled out.

Which is why comparing things through the paper specs alone was never a good idea. But hey, people were doing the same thing with the GPUs and the TFs all up until March so it is what it is

Yeah the consoles are much more distinguished from each other in the coming generation. It's really hard not to just compare paper specs because that's what the current generation was built upon due to the Xbox One and PS4 being much less customized. In the coming gen with the addition of the SSDs asset streaming has become paramount and while Sony has been boasting about their SSD bandwidth it seems that Microsoft has not only bridged the gap through software but placed further distance between the Series X and the PS5.

The biggest game changer of the new consoles seems to me to be sampler feedback streaming as it will be delivering with a 2-3x multiplier on I/O Bandwidth and Memory. I've seen it reiterated many times by Microsoft and they seem pretty confident in that statement. That means the Series X can effectively transfer and store game assests that would have taken 20-30GBs of space in 10GB of RAM with a transfer rate of 4.8-7.2 GB/s raw and 9.6 - 14.4 GB/s compressed. (Please note that it's not going to literally be these speeds and sizes just how much equivalent data a system without Sampler feedback streaming would have to utilize).

Combining that with the ultra low latency and the speculated memory paging SSG style storage controller and the ceiling for what to expect from the Series X becomes much higher.

Jaxcellent · Jul 14, 2020

Snappy video, i like it, for real I'm so ultra curious of what the difference will be, could be huge, could be minimal.

ZywyPL · Jul 14, 2020

I think there's a huge misconception how th XB VA works, especially that "multiplier" part, so to make things simple, let's use an image:

On this picture we can see only three sides of the Rubic's cube, BUT - the system still loads and uses textures for all 6 of them, wasting memory space and bandwidth. And now what MS is doing with VA, specifically the SFS component, is making the system loading and using indeed only the textures for those three visible sides, and by doing so, they effectively cut the data size by half, which also means half the required bandwidth. Now if we want to talk about rendering a whole scene, for example instead of 6GB the same scene will now use just 2-3GB instead, so instead of 2.4GB/s they can achieve the exact same on-screen result by utilizing only 0.8GB/s of the SSD. Which then creates an opportunity/headroom to add 3-4GB worth of additional objects/textures to fit within that previous 6GB size, utilizing the 2.4GB/s bandwidth, which otherwise would need 18GB RAM and 7.2GB's bandwidth. long story short, they can achieve the same stuff, but with only half-third of the resources.

psorcerer · Jul 14, 2020

ZywyPL said:
I think there's a huge misconception how th XB VA works, especially that "multiplier" part, so to make things simple, let's use an image:

On this picture we can see only three sides of the Rubic's cube, BUT - the system still loads and uses textures for all 6 of them, wasting memory space and bandwidth. And now what MS is doing with VA, specifically the SFS component, is making the system loading and using indeed only the textures for those three visible sides, and by doing so, they effectively cut the data size by half, which also means half the required bandwidth. Now if we want to talk about rendering a whole scene, for example instead of 6GB the same scene will now use just 2-3GB instead, so instead of 2.4GB/s they can achieve the exact same on-screen result by utilizing only 0.8GB/s of the SSD. Which then creates an opportunity/headroom to add 3-4GB worth of additional objects/textures to fit within that previous 6GB size, utilizing the 2.4GB/s bandwidth, which otherwise would need 18GB RAM and 7.2GB's bandwidth. long story short, they can achieve the same stuff, but with only half-third of the resources.

There's a problem with the reasoning.
The thing is: finding which parts of the cube are visible and which textures are needed is called "rendering", so, you cannot render without rendering...

Lort · Jul 14, 2020

Microsoft ( https://news.xbox.com/en-us/2020/07...y-architecture/amp/?__twitter_impression=true )just confirmed average expected throughput of xbox series x ssd with sample feedback streaming is...

4.8x2.5 = 11.52 GB a sec( also with more ram free)

ps5 is 9 GB a sec (expected actual throughput)

Potential for another neogaf hype train crash gaining speed...

Lort · Jul 14, 2020

psorcerer said:
There's a problem with the reasoning.
The thing is: finding which parts of the cube are visible and which textures are needed is called "rendering", so, you cannot render without rendering...

...you dont render a texture to discover if its scene you “render” a polygon at the culling stage... when the cube is turning and the first part of the side becomes visible it uses a lower in mem asset then immediately starts streaming the higher asset before its needed.

psorcerer · Jul 14, 2020

Lort said:
...you dont render a texture to discover if its scene you “render” a polygon at the culling stage... when the cube is turning and the first part of the side becomes visible it uses a lower in mem asset then immediately starts streaming the higher asset before its needed.

Not so fast. Each pixel still needs a texture sample. You cannot know which samples are needed without that stage. And you do want to know it.

NullZ3r0 · Jul 14, 2020

I wonder if the PS5's SSD throughput numbers are peak or sustained?

Lort · Jul 14, 2020

psorcerer said:
Not so fast. Each pixel still needs a texture sample. You cannot know which samples are needed without that stage. And you do want to know it.

As i said a low res texture is used thats kept in memory .. and as the object comes partially into view the use of it will automatically trigger SFS to load the higher res texture. ( they specifally give a very similar example in a MiP map in the link i just gave).

Quite a few people on both sides are saying you can just load in textures with the frame but i think thats unlikely to be practical as it will cause massive stalls in the GPU waiting for data. I think like they do now on all systems textures that need to be full resolved in less than 1 frame will be forcibly cached in memory.. there are very few of these though usually.

geordiemp · Jul 14, 2020

From edge magazine, you have already seen a Texel version of SFS in UE5 demo, they only load whats in view already.

Its not special sauce loading whats in view, you have all seen it already....it will be in all UE5 Nanite games but there are no LODs.

The tech goes for beyond backface culling (which detects with polygons are facing away from a view, and doesn't draw them, saving on processing power).

"It's in the form of textures," Karis explains. It's actually like, what are the texels of the texture that are actually landing on pixels in your view? So, it's in the frustum......It's a very accurate algorithm, because when you're asking for it, it's requesting it. But because it's in that sort of paradigm, that means as soon as you request it, we need to get that data in very quickly."

Click to expand...

- Brian Karis (Nanite Inventor)

Lort · Jul 14, 2020

geordiemp said:
From edge magazine, you have already seen a Texel version of SFS in UE5 demo, they only load whats in view already.

Its not special sauce loading whats in view, you have all seen it already....it will be in all UE5 Nanite games but there are no LODs.

Its related but not the same ... the engine in that case requests just the texture thats required to be shown.. but that can happen in any game engine now ( but it doesnt always).

If doesnt mention anything about only requesting from the SSD small blocks from each individual texture file ( the presumptionhas to be that the whole texture is loaded in to ram from the ssd as thats how its traditionally done).

Nanites algorithm will likey work brilliantly with SFS, leveraging the ability to load partial texture files.

geordiemp · Jul 14, 2020

Lort said:
Its related but not the same ... the engine in that case requests just the texture thats required to be shown.. but that can happen in any game engine now ( but it doesnt always).

If doesnt mention anything about only requesting from the SSD small blocks from each individual texture file ( the presumptionhas to be that the whole texture is loaded in to ram from the ssd as thats how its traditionally done).

Nanites algorithm will likey work brilliantly with SFS, leveraging the ability to load partial texture files.

Its small triangles and texels, there is no LOD, there is no LOD blending from mipmaps, there is no mesh shading or geometry engine culling or whatever XSX and ps5 call it, its all done by nanite as the descriptions.

Good news is all games using Nanite can use it. So everyone wins.

Ascend · Jul 14, 2020

geordiemp said:
Its small triangles and texels, there is no LOD, there is no LOD blending from mipmaps, there is no mesh shading or geometry engine culling or whatever XSX and ps5 call it, its all done by nanite as the descriptions.

Good news is all games using Nanite can use it. So everyone wins.

Nanite can't handle animations, things like hair & grass, and transparent objects, exactly the things that were pretty much missing in the environments of the demo but are prevalent in many games. The character in the demo used traditional rendering techniques. So it has its limitations.

geordiemp · Jul 14, 2020

Ascend said:
Nanite can't handle animations, things like hair & grass, and transparent objects, exactly the things that were pretty much missing in the environments of the demo but are prevalent in many games. The character in the demo used traditional rendering techniques. So it has its limitations.

Oh I agree, Epic said they are working on hair / grass and objects that bend..., but you can also use both nanite and 4.25 traditional together, that was done anyway for the character in the demo.

Nanite can handle animations, as long as they are solids....

I hope they sort it out as Nanite looks great.

Dnice1 · Jul 14, 2020

NullZ3r0 said:
I wonder if the PS5's SSD throughput numbers are peak or sustained?

Microsoft constantly talking about Consistent & Sustained performance is probably no accident. I'm sure Mojang (Minecraft) have PS5 development kits so they should have first hand knowledge.

Lort · Jul 14, 2020

geordiemp said:
Its small triangles and texels, there is no LOD, there is no LOD blending from mipmaps, there is no mesh shading or geometry engine culling or whatever XSX and ps5 call it, its all done by nanite as the descriptions.

Good news is all games using Nanite can use it. So everyone wins.

There is nothing saying that each texel is an indivudal file .. its more likely a block of texels of say 10 Mb is used .. SFS will load only the part of the 10Mb required into RAM. Its possible pS5 has sub block caching which would help .. its also possible that the hardware SFS on xbox may not work as intended using nanite.

Overall i think both consoles will perform amazingly well and coding skills will be paramount in getting the most out of each console.

GODbody · Jul 14, 2020

geordiemp said:
Its small triangles and texels, there is no LOD, there is no LOD blending from mipmaps, there is no mesh shading or geometry engine culling or whatever XSX and ps5 call it, its all done by nanite as the descriptions.

Good news is all games using Nanite can use it. So everyone wins.

In Unreal engine 5 while geometry that you cannot see on screen is culled and not rendered. The full texture file that is mapped onto that geometry is still loaded into memory. Sampler Feedback Streaming enters here and only loads into memory the portion of texture that is visible on screen.

geordiemp · Jul 14, 2020

GODbody said:
In Unreal engine 5 while geometry that you cannot see on screen is culled and not rendered. The full texture file that is mapped onto that geometry is still loaded into memory. Sampler Feedback Streaming enters here and only loads into memory the portion of texture that is visible on screen.

Thats not what Brian said

"It's in the form of textures," Karis explains. It's actually like, what are the texels of the texture that are actually landing on pixels in your view? So, it's in the frustum......It's a very accurate algorithm, because when you're asking for it, it's requesting it. But because it's in that sort of paradigm, that means as soon as you request it, we need to get that data in very quickly."

Requesting the data and get it in very quickly - what do you think that refers to , memory or SSD ?

So I dont think so.

Support NeoGAF

Xbox Velocity Architecture - 100 GB is instantly accessible by the developer through a custom hardware decompression block

Gold Member

Banned

Gold Member

Banned

Member

Gold Member

Deleted member 775630

Unconfirmed Member

Member

Member

Gold Member

Ask me about my fanboy energy!

Ask me about my fanboy energy!

Member

Banned

Member

Member

Banned

Member

Member

Gold Member

Member

Banned

Member

Member

Gold Member

Banned

Banned

Member

Member

Member

Member

Member

Member

Member

Banned

Banned

Banned

Banned

Banned

Banned

Banned

Member

Banned

Member

Member

Member

Member

Banned

Member

Member

Similar threads