This is really interesting, actually. I'm wondering whether the utilization rate on Zen 2 will have improved over Jaguar. I'd certainly think so, so we could be looking at maybe 80% utilization, maybe even a tad higher depending on how some of the rumors floating around shake out (like a unified L3 cache between the CCXs). I remember @NXGamer doing a video on the next-gen GPUs where he threw out a GPU utilization number for the PS4 and XBO that seemed a bit low. I dunno if that was just a gut-feeling number or if there's research (by whomever) to back it up, but in that vid he also suggests that GPU throughput utilization for the next-gen systems should be much higher; his number was 65%, and that was maybe conservative IMHO.
...
After reading your post I went and checked the AMD stream processor info, and it turns out AMD's collaboration with Sony on PS4 resulted in the stream processors being wrapped back into the GPU as part of GCN (async compute, AFAIK), so the Zen2 cores are just as inefficient at such workloads as an Intel i7 or PPC, at 40-50%, because of the branch prediction logic. However, I take your point about the XsX VA probably using ARM or some other ASIC option that is as efficient (+90%) as the IO complex.
What we don't know is what Xbox really means when they say 10% of a Zen2 core. Do they mean 10% of theoretical, or 10% in real work done? If they mean the latter, the 50-60% inefficiency of the VA interfacing with a CPU core (to copy to memory) is in fact already factored in, so it would take 10/50 to 10/40, i.e. 20-25% of a core's theoretical throughput, to get 10% real work done: 760MHz-950MHz used, rather than 380MHz.
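To make that arithmetic easier to check, here is a rough Python sketch. The 3.8GHz clock, the 10% figure and the 40-50% efficiency range are just the numbers from this post, and `theoretical_share` is a made-up helper name, not anything from Microsoft or AMD.

```python
# Figures assumed from the post: 3.8 GHz core, 10% of a core as real work,
# 40-50% efficiency for this kind of copy workload.
CORE_CLOCK_MHZ = 3800
REAL_WORK_SHARE = 0.10


def theoretical_share(real_share, efficiency):
    """Fraction of a core's theoretical throughput needed to deliver
    `real_share` of real work at the given efficiency."""
    return real_share / efficiency


for efficiency in (0.40, 0.50):
    share = theoretical_share(REAL_WORK_SHARE, efficiency)
    mhz = share * CORE_CLOCK_MHZ
    print(f"efficiency {efficiency:.0%}: {share:.0%} of a core, about {mhz:.0f} MHz")
```

If Xbox actually means 10% of theoretical, none of this scaling applies and the cost stays at 380MHz.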
If that is the case, it is a relatively small overhead, and if it can be split evenly across the 8 cores and factored out for developers by the hypervisor, it would just mean the XsX CPU cores in SMT-disabled mode would be running at about 3.7GHz, or maybe at the full amount if Microsoft have clocked the CPU higher than 3.8GHz but opaquely held the extra back for the IO.
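A quick sketch of where the roughly 3.7GHz comes from, assuming the 760MHz-950MHz overhead really is spread evenly over all 8 cores (that even split is my assumption, and `per_core_clock_mhz` is just an illustrative name):

```python
# Assumed from the post: 3.8 GHz SMT-disabled clock, 8 cores,
# and a total decompression overhead of 760-950 MHz of core time.
CORE_CLOCK_MHZ = 3800
CORES = 8


def per_core_clock_mhz(overhead_mhz, cores=CORES, clock_mhz=CORE_CLOCK_MHZ):
    """Clock left per core for developers if the overhead is split evenly."""
    return clock_mhz - overhead_mhz / cores


for overhead in (760, 950):
    left = per_core_clock_mhz(overhead)
    print(f"{overhead} MHz overhead: about {left / 1000:.2f} GHz left per core")
```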
There is a possibility that the primary CPU core (which will need SMT mode enabled, AFAIK) will need to be used for such a high-priority, low-latency task. In that case I wouldn't be expecting the +90% of 6GB/s, but 40-50% of +90% of the 6GB/s, only because, whether the 380MHz-475MHz gets deducted from the 3.6GHz or is added as a boost clock on the main core, going above 4GHz on the main core or dropping below 3.2GHz (to partition that performance off) wouldn't seem like a good solution.
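For a feel of what that would mean in GB/s, a small sketch using only the figures above. It treats "+90%" as exactly 90% (so this is a floor), and `effective_gb_s` is a hypothetical helper name:

```python
# Assumed from the post: 6 GB/s peak, decompressor at 90% or better,
# and a single CPU core in the path running at 40-50% efficiency.
PEAK_GB_S = 6.0
DECOMPRESSOR_EFFICIENCY = 0.90  # "+90%" taken as its lower bound


def effective_gb_s(core_efficiency, peak=PEAK_GB_S, hw=DECOMPRESSOR_EFFICIENCY):
    """Throughput if the CPU core's efficiency multiplies the decompressor's."""
    return peak * hw * core_efficiency


for core_efficiency in (0.40, 0.50):
    rate = effective_gb_s(core_efficiency)
    print(f"core at {core_efficiency:.0%}: about {rate:.2f} GB/s")
```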
I suspect it is the former solution, with an invisible upclock across the 8 cores and XsX getting +90% of the 6GB/s theoretical. I still think all the info we have about the asymmetric access points towards GDDR6 memory contention by the IO decompressor lowering the data width down to 160 bits and the bandwidth down to 280GB/s for those transfers.
I'm a little disappointed that Xbox haven't offered more info on the VA and asymmetric memory, considering that at a glance (IMHO) it looks so unfavourable in real workloads compared to the simpler IO complex and unified RAM setup; even just using the Zen2 core for decompression copies presumably adds latency to the IO compared to the dedicated IO complex hardware.