
Reducing the CPU bound game lows on 2.9GHz/4.3GHz boost Core i5-10400

PaintTinJr

Member
/Summary
Regardless of CPU tier, gaming lows (choke points) on the CPU come from one of two things: OC stability limits, where the workload (SSE & AVX2) draws too much power and produces too much heat at that clockspeed, or CPU throttling, where the same workload trips the firmware power limits or the thermal (heat) limits.

Many games are still bound by single-core clockspeed, so using the AVX2 ratio offset to lower the AVX2 unit clockspeed, and with it the power draw and heat from AVX2, independently of the main Core SSE clockspeed (and cache clock) can yield better gaming performance. It either leaves more headroom to run the Core SSE clock faster in an OC, or lets you work Intel's SpeedStep logic in your system's favour so that the boost clock is maintained on CPUs like the Core i5-10400.
/summary

The basis of this thread comes from synthetic results using 3D Mark's free Demo (TimeSpy) from Steam. But the theory for why the results improve should be sound and should equally apply in games.

Anyway, my test rig was my nephew's newly built budget gaming PC, which I built and tuned for him. I'm certainly not overclocking his brand new GPU, and I can't apply even the small BCLK overclock the CPU allows without a Z class chipset motherboard, which would have cost £100 over his budget.

In descending order of cost, his £555 budget build comprises:
MSI 4GB AMD RX6500XT GPU (~£160)
Intel Core i5-10400F CPU (~£95 including stock cooler)
ASRock H470M-HDV/M.2 motherboard (~£75) (RAM limited to 2900MHz, no XMP, no CPU O/C, PCIe 3.0 only)
Corsair Carbide Delta RGB ATX Mid case (~£70)
Corsair TX550m 550w PSU(~£55)
Kingston Fury Beast 1x16GB DDR4 RAM module (~£40)
Windows 10 Pro x86/x64 license (~£35) (upgraded freely to Win11 Pro x64)
KIOXIA EXCERIA (formerly Toshiba Storage) 480GB SATA3 SSD (~£20) (NVMe is a later upgrade option)
1 to 3 splitter cable for 4/3PIN case fans (~£5)
https://uk.pcpartpicker.com/list/Lfpfwc

Everything is pretty standard in the hardware build, with a few exceptions. The 3 RGB fans have no RGB lighting, because a £15 RGB hub was beyond the budget and the motherboard doesn't have an RGB or aRGB header. All 3 front case fans connect to the CHASSIS_FAN1 header via the splitter cable and draw air in through the restricted vents on the front panel. The PSU fan points upwards into the case, so it draws warm case air down from the GPU's 2 fans and vents it out the back at the bottom. That effectively works in tandem with the 4th free case fan (CPU_FAN2), which draws warm case air over the stock CPU fan and vents it out the back at the top.

On the Windows config side, the Kioxia drive uses the manufacturer's software to enable 8GB of overprovisioning, which is typically the size of Windows in RAM. The Windows virtual memory setting has been manually changed to an initial size of 20GB (RAM + VRAM size) and a maximum of 40GB (the advised 2.5x physical RAM). In the power options the High performance profile has been selected, and that profile slightly tweaked to turn off all the timers that shut down or save power. In Windows Explorer's Folder Options->View, "Launch folder windows in a separate process" has been enabled, and the recycle bin has been capped at 2GB too, to limit background recycle bin activity from the O/S book-keeping its spare drive space.
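As a quick check of the virtual memory numbers above, here is a minimal sketch of the arithmetic. The function name is made up for illustration, and the conversion to MB is only there because the Windows virtual memory dialog takes its sizes in MB.

```python
# Hypothetical helper reproducing the page file sizing described in the post.
RAM_GB = 16   # 1x16GB Kingston Fury Beast module
VRAM_GB = 4   # 4GB RX 6500 XT

def pagefile_sizes_mb(ram_gb, vram_gb, max_factor=2.5):
    """Return (initial, maximum) page file sizes in MB.

    initial = RAM + VRAM, maximum = max_factor x physical RAM,
    following the rule of thumb used in the post.
    """
    initial_mb = (ram_gb + vram_gb) * 1024
    maximum_mb = int(ram_gb * max_factor * 1024)
    return initial_mb, maximum_mb

initial_mb, maximum_mb = pagefile_sizes_mb(RAM_GB, VRAM_GB)
print(initial_mb, maximum_mb)  # 20480 40960, i.e. 20GB and 40GB
```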

On the BIOS side, the obvious alterations to enable Clever Access Memory (C.A.M., 4G, Resizable BAR) are done, even though the H470 chipset doesn't support PCIe 4.0; if anything, ReBAR on PCIe 3.0 could be more beneficial. The following settings have also been set as advised in the BIOS to get more CPU performance, as the mobo marketing claims. They effectively raise both the CPU's Power Level 1 (PL1 is the quoted 65 watt CPU TDP, as far as I know) and PL2 (the level which, when exceeded, causes the 4.3GHz boost frequency to head towards the 2.9GHz base clock):

AVX2 Ratio Offset: Auto
BCLK Spread Spectrum: 0%
BCLK Aware Adaptive Voltage: Enable
Boot Performance Mode: Battery
FCLK Frequency: 400MHz
Ring to Core Ratio Offset: Enable
Intel SpeedStep Technology: Enable
Intel Turbo Boost Technology: Enable
Intel Speed Shift Technology: Enable
Intel Thermal Velocity Boost Voltage Optimizations: Enable

At this configuration point the CPU and case fan profiles hadn't been altered from the BIOS defaults, other than setting them all to monitor the CPU temperature and respond to it. The system still benchmarked quite well through to the TimeSpy CPU test at the end, but on trying to run Minecraft the CPU fan reaches screaming sound levels once it goes north of ~65%. That certainly isn't good for the fan, the noise, the CPU temperature or the resulting performance, because the temperature is being controlled reactively by the poorest fan in the system by all metrics, and it is too little too late.

I changed the fan profiles so that the CPU_FAN2 rear fan runs at 100% all the time, and the front 3 fans run at 50% until the CPU temp reaches 65°C and then immediately move to 100%.
The CPU fan is set to 47% all the way up to 62°C. Above that its speed increases in shallow amounts, so it only reaches 80% fan speed at 85°C, which it should never reach: the 4 case fans, which are remarkably quiet at full speed, will be fully engaged once the CPU hits 65°C, at which point its own fan is running at only about 50% and is still largely inaudible.
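To make the fan profiles concrete, here is a rough sketch of the three curves as functions of CPU temperature. The breakpoints are the ones given above; the straight-line ramp between 62 and 85 degrees and the hold at 80% above 85 degrees are my assumptions, since the BIOS curve only defines a handful of points, and the function names are just for illustration.

```python
def cpu_fan_percent(temp_c):
    """Stock CPU cooler: flat 47% up to 62C, shallow ramp to 80% at 85C."""
    if temp_c <= 62:
        return 47.0
    if temp_c >= 85:
        return 80.0  # assumption: hold at 80% beyond the last stated point
    # Assumption: linear interpolation between (62C, 47%) and (85C, 80%).
    return 47.0 + (temp_c - 62) * (80.0 - 47.0) / (85 - 62)

def front_fans_percent(temp_c):
    """Three front intake fans: 50% until the CPU reaches 65C, then 100%."""
    return 100.0 if temp_c >= 65 else 50.0

def rear_fan_percent(temp_c):
    """Rear exhaust fan on CPU_FAN2: always 100%."""
    return 100.0

for t in (40, 62, 65, 85):
    print(t, round(cpu_fan_percent(t), 1), front_fans_percent(t), rear_fan_percent(t))
```

With a linear ramp the CPU fan sits at roughly 51% at 65 degrees, which lines up with the "about 50%" figure: the quiet case fans are already flat out while the noisy stock fan has barely moved.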

Running the benchmark again with the new fan profiles improved the CPU score again, slightly, but altering the AVX2 Ratio Offset manually yielded the best result.

For anyone wondering what AVX2 Ratio Offset is, it is the number of 100MHz steps (eg 29) to subtract from the CPU boost frequency (4,300MHz – 29 x 100MHz = 1,400MHz) to work out what frequency the (A)dvanced (V)ector E(x)tensions (2) vector units should run at. Those units do largely sparse FMA (fused multiply-add) instructions in a single clock cycle, maybe used for decompression or physics in games, a style of work that gained traction on the Cell BE and Xenon in the PS3/360 generation.
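The offset arithmetic can be sketched in a few lines. This assumes a 100MHz BCLK and treats the 4.3GHz boost as a single 43x ratio; in practice the turbo ratio also varies with the number of active cores, which this ignores. The names are illustrative only and are not BIOS or Intel identifiers.

```python
BCLK_MHZ = 100     # assumption: stock 100MHz base clock
BOOST_RATIO = 43   # 4.3GHz boost on the Core i5-10400

def avx2_clock_mhz(offset, boost_ratio=BOOST_RATIO, bclk_mhz=BCLK_MHZ):
    """AVX2 clock in MHz for a given AVX2 Ratio Offset."""
    if not 0 <= offset < boost_ratio:
        raise ValueError("offset must be between 0 and boost_ratio - 1")
    return (boost_ratio - offset) * bclk_mhz

for offset in (0, 14, 29):
    print(offset, avx2_clock_mhz(offset))
# 0 4300
# 14 2900
# 29 1400
```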

Reading about AVX2 Ratio Offset on the internet in the context of CPU overclocking would give the exact opposite advice to what I’m going to propose here. Most people overclocking with top tier motherboards and top tier CPUs – or at least K class CPUs – would say that if they can’t set the offset to 0 (zero), so that the AVX2 units run stably at the overclocked frequency of the Core Streaming SIMD Extensions (Core SSE frequency for short), then their overclock isn’t stable and needs to be lowered.
But this CPU and chipset are far from top tier, and in most software, games especially, a higher clock frequency for SSE will yield more performance than a higher AVX2 clock.

AVX2 processing generates lots of heat from drawing far more power, because the vector units are doing 3 instructions per equivalent clock compared with their single-instruction SSE counterparts. So to avoid hitting a lowly CPU's PL2 early, which causes the boost clock to fall towards the base clock, it stands to reason that you want to bring the AVX2 clock down towards the optimal value: one that lets the Core SSE clock stay at its highest for longer, through power efficiency and pre-emptive cooling around 65°C, while still leaving enough clock cycles for the AVX2 processing that it doesn’t become a big bottleneck.
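For a rough feel of why a lower AVX2 clock saves power, the usual first-order approximation for CMOS dynamic power is below. This is a textbook model rather than anything measured on this system, and the symbols are generic: \alpha is the switching activity, C the switched capacitance, V the supply voltage and f the clock frequency.

```latex
P_{\text{dyn}} \approx \alpha \, C \, V^{2} f
```

At a fixed voltage the dynamic power of the AVX2 units scales roughly linearly with their clock, and if the lower clock also lets the chip run that work at a lower voltage, the saving is larger because of the V^2 term.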

My first attempt was setting the AVX2 Ratio Offset to 14, so that the AVX2 clock would be 2.9GHz, figuring that when the system hits PL2 the whole chip, AVX2 and Core SSE alike, gets dropped to the base clock anyway, so setting AVX2 at the base clock would be optimal. It did improve the score on my nephew’s chip, but I was able to do better. From a mathematical/physics point of view, I seem to remember that 1.2GHz is the optimal frequency for power efficiency in a parallel circuit IIRC, but it turned out that 1.4GHz (an AVX2 Ratio Offset of 29) consistently posted the best synthetic benchmark. My theory is that it is either just the silicon lottery of this specific chip, or that with AVX2 doing 3 times the instructions, 1.4GHz is the closest available value to 1/3 of the Core SSE boost clock (4300/3 = 1433MHz).

Either way, I thought this was an interesting thing to test, especially as Intel treat the motherboard chipset and the CPU itself more like a Pentium Gold than the Core i3-ish i5 it is. Maybe people running much higher LGA1200 setups than my nephew’s will find it also helps them overclock their SSE clock higher for better performance, or someone with a Z class motherboard and a Core i5-10400 CPU will get a better BCLK overclock than the minor boost people report it can get.
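The "closest value to 1/3 of the boost clock" reasoning can be sketched like this: pick a target AVX2 frequency, snap it to the nearest 100MHz step, and convert that to an offset. It assumes a 100MHz BCLK and a 43x boost ratio, and the helper name is made up for illustration. It only reproduces the arithmetic, not the benchmark result.

```python
def offset_for_target_mhz(target_mhz, boost_ratio=43, bclk_mhz=100):
    """AVX2 Ratio Offset whose resulting clock is nearest to target_mhz."""
    # Snap the target to the nearest whole multiplier step.
    target_ratio = round(target_mhz / bclk_mhz)
    # Keep the ratio within 1..boost_ratio so the offset stays valid.
    target_ratio = max(1, min(boost_ratio, target_ratio))
    return boost_ratio - target_ratio

boost_mhz = 43 * 100
one_third = boost_mhz / 3                   # about 1433MHz
print(offset_for_target_mhz(one_third))     # 29, i.e. 1,400MHz
print(offset_for_target_mhz(2900))          # 14, the first attempt at base clock
```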
 

TexMex

Member
[GIF: Understanding]
 

PaintTinJr

Member
Upgrade your cpu.
No, not really, because if the CPU can maintain its boost clock, or thereabouts, it is actually really great for budget gamers to get to basecamp, especially for just £95 in the UK market, which is massively overpriced. 6 cores, 12 threads and good cache levels on a desktop CPU that far exceeds the console CPUs isn't too shabby an entry point.

The next real tier up is £200 more between a Z chipset mobo and a K class processor, plus the additional cost of a cooler, which isn't provided with a K class processor. £760 isn't a budget PC IMHO, not as the term used to mean, and if a kid had that much extra budget they'd want to spend it on a £360 GPU instead of a £160 one IMO.
 

PaintTinJr

Member
This is a $100 gpu, its bound (hehe) to be bound in modern games. No secret sauce
The TimeSpy benchmark improves its GPU score too with the AVX2 downclock, probably because the CPU cache doesn't throttle early at the point where Power Limit 2 would have been hit, meaning the PCIe performance holds for longer too. The final section of the TimeSpy benchmark is a CPU test, so it uses the CPU heavily, and it results in higher CPU-only scores too.
 

PaintTinJr

Member
Just buy the cheapest i5 and (almost)cheapest motherboard . And throw them away every 3-5 years.

There . The secret of pc gaming .
True, but budget PC gamers that can't wait for £1200 in savings before building will have gotten their Warzone 2.0, Valorant, CS:GO, Fortnite, Age of Empires and Minecraft fix for sure - along with many others - from that hardware, no?

The problem is actually worse than that too :) The Windows license is tied to the motherboard, and in 3-5 years a new memory standard, or memory with a vastly faster clock, will be on the market, so that would need replacing too.

It is a system with enough PSU headroom to just about support an RTX 3070 Ti or RX 6700 XT upgrade, and has enough case volume/cooling for that. It can support an NVMe drive to use DirectStorage and an extra memory module for dual channel memory operation, and could still support a better K level CPU, maybe bought cheap 2nd hand in 12 months.

Other than the case and the Windows 10 Pro license, I wouldn't have bought any of the other components for a more expensive PC gaming build. I miss the days before the Pentium Pro, when there was only one CPU tier per release cycle. Today we have about 20 tiers, and this is around the halfway mark.
 