What is the actual benefit of 8 versus 3 cores?

Dsal

it's going to come out of you and it's going to taste so good
I was sitting around waiting for a compile to finish today, and I was doing some reading about Amdahl's law, which can be used to find the benefit of adding extra processors. Click on the Wikipedia link there if you want to check out the math behind it.

So I decided to see what the actual benefit would be of having 8 cores, like what the Cell would have at full use, versus 3 cores like what the 360 will have. The following assumes that the cores function at about the same performance and the only difference is the number of them used. This probably won't actually be the real-world case obviously, but it's interesting to examine.

Anyways, when I graphed the two against each other, it was interesting to see how things stacked up. Of course, the biggest variable in all of this is how much of the theoretical game would benefit from parallelization. What is that exact number? Heck if I know, but here are the results for different values:

At 100% benefitting from parallel, a true fantasy world, the 8core is 167% faster. So at best we're not quite triple the speed.
At 90% benefitting from parallel, the 8core is 88% faster. A big drop from just 10% less.
At 80%: 8core is 56% faster
At 65%: 8core is 31% faster
At 50%: 8core is 19% faster
At 30%: 8core is 8% faster

Maybe this is why Sony is thinking about using less cores? There's obviously diminishing returns for the extra cores if you can't make everything very highly parallel.
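
For anyone who wants to check the figures in the list above, here is a minimal Python sketch of the comparison. It assumes the usual form of Amdahl's law, speedup = 1 / ((1 - p) + p / n), with p the parallel fraction and n the core count, and it treats every core as equally fast, as stated above. The function names are made up for illustration. Results are rounded to the nearest percent, so numbers read off a graph may differ by a point or two.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup over a single core when only part of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def percent_faster(parallel_fraction: float, many: int = 8, few: int = 3) -> float:
    """How much faster `many` cores are than `few` cores, in percent."""
    ratio = amdahl_speedup(parallel_fraction, many) / amdahl_speedup(parallel_fraction, few)
    return (ratio - 1.0) * 100.0


if __name__ == "__main__":
    # Same parallel fractions as in the list above.
    for p in (1.0, 0.9, 0.8, 0.65, 0.5, 0.3):
        print(f"{p:.0%} parallel: 8 cores are {percent_faster(p):.0f}% faster than 3")
```

Changing `many` and `few` lets you try other core counts under the same equal-cores assumption.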
 

Dsal

it's going to come out of you and it's going to taste so good
The theorem is interesting to apply to lots of things. For example, if 75% of your computing speed would benefit from more RAM and you want to double the performance, you have to upgrade to 3 times the RAM, not just twice the amount of RAM. Yet if only 50% of your computing benefits from more RAM, you could upgrade until the cows come home and not quite get to double the speed.

It's very counterintuitive.
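
The RAM example can be worked through with the same formula. This is only a sketch: it assumes, as the example does, that tripling the RAM makes the RAM-bound part of the work three times faster, and the function names are hypothetical.

```python
import math


def overall_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the work gets `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def required_factor(fraction: float, target: float) -> float:
    """Factor the improvable part must speed up by to reach `target` overall.

    Returns math.inf when the target is at or beyond the ceiling 1 / (1 - fraction),
    because the untouched part alone already takes that much of the time budget.
    """
    remaining = 1.0 / target - (1.0 - fraction)
    if remaining <= 0.0:
        return math.inf
    return fraction / remaining


if __name__ == "__main__":
    print(required_factor(0.75, 2.0))   # 75% improvable, want 2x overall
    print(required_factor(0.50, 2.0))   # 50% improvable, want 2x overall
    print(overall_speedup(0.50, 1000))  # a huge upgrade still stays under 2x
```

With 75% improvable the required factor comes out at 3, and with 50% improvable there is no finite factor that reaches 2x.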
 

gofreak

GAF's Bob Woodward
Amdahl's Law does not apply generally. The Sandia Experiments proved you could get a speedup of over 1000x across 1024 processors.

In other words, it depends on what you're doing.

Luckily quite a few things in games are parallelisable, either naturally or by being "cast" to parallel tasks.
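
As far as I know, the Sandia results are usually explained with Gustafson's law, where the problem size grows with the number of processors instead of staying fixed. Here is a rough Python sketch contrasting the two models. The 99.5% parallel fraction is an illustrative assumption, not a figure from the experiments, and the function names are made up.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Fixed problem size: the serial part caps the speedup.

    `parallel_fraction` is measured on the single-processor run.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def gustafson_speedup(parallel_fraction: float, processors: int) -> float:
    """Fixed run time, scaled problem size: the parallel part grows with the machine.

    `parallel_fraction` is measured on the parallel run, so it is not
    the same quantity as the fraction used in Amdahl's law.
    """
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * processors


if __name__ == "__main__":
    p, n = 0.995, 1024  # assumed values, for illustration only
    print(f"Amdahl (fixed size):    {amdahl_speedup(p, n):.0f}x")
    print(f"Gustafson (scaled size): {gustafson_speedup(p, n):.0f}x")
```

Under these assumed numbers the scaled model gives a speedup above 1000x while the fixed-size model stays below 200x, which is the "it depends on what you're doing" point.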
 

Dsal

it's going to come out of you and it's going to taste so good
gofreak said:
Amdahl's Law does not apply generally. The Sandia Experiments proved you could get a speedup of over 1000x across 1024 processors.

In other words, it depends on what you're doing.

Luckily quite a few things in games are parallelisable, either naturally or by being "cast" to parallel tasks.

Yeah, mostly it's just interesting to see how much of a drop off there can be if things aren't parallelized well. Basically, if they can't get almost everything parallel there's less and less benefit to more cores.
 

fart

Savant
this is not what amdahl's law is used for

games are probably best modeled by a fixed time type equation anyways... however, the thing is, the way devs will be using the multicore, none of these equations will be applicable since they all apply to distribution of a single job over multiple cores. most likely what early developers will be doing is distributing multiple jobs over the cores either via a thread interface or some other kind of scheduling/dispatcher (i'm not familiar with how this works on ps3) - so it's more akin to like a heterogeneous grid on chip than a traditional highly parallel machine. this is most likely why they went with a few smaller, more robust cores, rather than a highly parallel chip with a very large number of very simple cores (connection machine anyone?)
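
To make the "multiple jobs over the cores" idea concrete, here is a small Python sketch of a dispatcher handing independent per-frame jobs to a pool of workers. Everything in it is hypothetical: the job names are invented, and it says nothing about how the real console SDKs schedule work. Note also that CPython threads do not run CPU-bound code in parallel, so this only shows the structure, not the speedup.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical, independent per-frame jobs. Each returns a tag so the
# result of the dispatch can be inspected.
def update_ai(frame: int) -> str:
    return f"ai:{frame}"


def step_physics(frame: int) -> str:
    return f"physics:{frame}"


def mix_audio(frame: int) -> str:
    return f"audio:{frame}"


JOBS = (update_ai, step_physics, mix_audio)


def run_frame(frame: int, workers: int = 3) -> list:
    """Submit every job for one frame, then wait for all of them to finish."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(job, frame) for job in JOBS]
        # Results are collected in submission order, not completion order.
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(run_frame(1))
```

The jobs never share a single piece of work, which is why a one-job speedup formula describes this setup poorly.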
 

gofreak

GAF's Bob Woodward
Dsal said:
Yeah, mostly it's just interesting to see how much of a drop off there can be if things aren't parallelized well. Basically, if they can't get almost everything parallel there's less and less benefit to more cores.

Almost everything? I don't know about that at all.

Many of the most computationally expensive game tasks are nicely parallelisable, I think if you can speed them up across as many cores as possible, that's a good thing.

For a games chip I certainly don't think (high) parallelism is a bad idea.
 
gofreak said:
Almost everything? I don't know about that at all.

Many of the most computationally expensive game tasks are nicely parallelisable, I think if you can speed them up across as many cores as possible, that's a good thing.

For a games chip I certainly don't think (high) parallelism is a bad idea.

Just off the cuff, you can imagine quite a lot of parallel code that could be used by almost all game code. And that's without breaking out code into smaller pieces such as spreading out AI or physics onto multiple threads by themselves.

AI, sound, collision detection, physics, graphical support, I/O, & network code.

As an aside, I wonder how much CPU time in next gen consoles could be eaten up by the latest and greatest audio encodings? Given that almost no developers could get the PS2 to encode DDS 5.1, it can't be all that trivial, and then you add in the latest flavors of the Dolby spec to tax the CPU even more...
 

Dsal

it's going to come out of you and it's going to taste so good
fart said:
this is not what amdahl's law is used for

games are probably best modeled by a fixed time type equation anyways... however, the thing is, the way devs will be using the multicore, none of these equations will be applicable since they all apply to distribution of a single job over multiple cores. most likely what early developers will be doing is distributing multiple jobs over the cores either via a thread interface or some other kind of scheduling/dispatcher (i'm not familiar with how this works on ps3) - so it's more akin to like a heterogeneous grid on chip than a traditional highly parallel machine. this is most likely why they went with a few smaller, more robust cores, rather than a highly parallel chip with a very large number of very simple cores (connection machine anyone?)

Ah okay. I was just going off of what I was reading in the entry. It did have a section on multiple processors in there, but if it's only applicable for one job spread out over processors then I can see how it doesn't apply as much. I know devs on the multicore platforms are being encouraged to separate out everything into "applets" or whatever they're calling the small jobber things as much as possible.
 

Dsal

it's going to come out of you and it's going to taste so good
Flo_Evans said:
8 whores are clearly better than 3...

oh wait you said cores.

I think there's a law of diminishing returns applying there...
 
Besides, the Cell chip in the PS3 doesn't have 8 full-fledged cores. It has 1 PPE (PowerPC Processing Element) and 7 SPEs (Synergistic Processing Elements). The PPE and SPEs operate differently and work on different tasks.
 

Apenheul

Member
Thread synchronisation is one of the most difficult techniques I know of. Two threads is doable, three is already pretty difficult but more than three can cause a lot of trouble. I wonder how many programmers will be able to use the PS3's power efficiently. I've worked on thread synchronisation in a game engine before and I've had great difficulty making sure no threads were accessing the same resources at the same time, and no threads were waiting for other threads to send their events for too long.

For the programmer's sake, less cores is better :)
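
For readers who haven't run into this, here is a minimal Python sketch of the basic problem: several threads updating the same resource, with a lock making sure only one of them touches it at a time. The class and function names are hypothetical, and a real engine would have far more state to protect than one counter.

```python
import threading


class SharedScore:
    """A counter that several threads may update at once."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def add(self, amount: int) -> None:
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self._value += amount

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def hammer(score: SharedScore, threads: int = 4, increments: int = 10_000) -> int:
    """Start `threads` workers that each add 1 to `score` `increments` times."""

    def worker() -> None:
        for _ in range(increments):
            score.add(1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()  # wait for every worker before reading the total
    return score.value


if __name__ == "__main__":
    print(hammer(SharedScore()))
```

The lock keeps the total correct, but every thread that waits on it is time spent not working, which is the other half of the difficulty described above.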
 

gofreak

GAF's Bob Woodward
Apenheul said:
Thread synchronisation is one of the most difficult techniques I know of. Two threads is doable, three is already pretty difficult but more than three can cause a lot of trouble. I wonder how many programmers will be able to use the PS3's power efficiently. I've worked on thread synchronisation in a game engine before and I've had great difficulty making sure no threads were accessing the same resources at the same time, and no threads were waiting for other threads to send their events for too long.

For the programmer's sake, less cores is better :)

SCEA's GDC presentation may be of interest to you. For example, you can take advantage of the SPEs even without multi-threading (or at least without explicit multi-threading on the programmer's part), although better use of them may be made with a more obvious multi-threaded/parallel approach.

I think once programmers generally start thinking in multi-threaded and parallel terms, though, that's more than half the battle. For some tasks, scaling up to take advantage of more cores will be trivial. For others it'll be much harder, and you'll have everything in between. The performance win is there if you can do it, though. Getting programmers to switch their mindsets will be hard, no doubt, but it has to be done (not just in games, but everywhere: "the free lunch is over").
 

GaimeGuy

Volunteer Deputy Campaign Director, Obama for America '16
Apenheul said:
Thread synchronisation is one of the most difficult techniques I know of. Two threads is doable, three is already pretty difficult but more than three can cause a lot of trouble. I wonder how many programmers will be able to use the PS3's power efficiently. I've worked on thread synchronisation in a game engine before and I've had great difficulty making sure no threads were accessing the same resources at the same time, and no threads were waiting for other threads to send their events for too long.

For the programmer's sake, less cores is better :)
Tell me about it. When I reached threads in my Java programming class last year as a junior in high school, I was going, "guh...?"

I understand the concept, but actually getting it to work is a whole other story. And Java is pretty easy when it comes to these types of things, from what I hear. :lol
 

Kleegamefan

K. LEE GAIDEN
Dsal said:
I was sitting around waiting for a compile to finish today, and I was doing some reading about Amdahl's law, which can be used to find the benefit of adding extra processors. Click on the Wikipedia link there if you want to check out the math behind it.

So I decided to see what the actual benefit would be of having 8 cores, like what the Cell would have at full use, versus 3 cores like what the 360 will have. The following assumes that the cores function at about the same performance and the only difference is the number of them used. This probably won't actually be the real-world case obviously, but it's interesting to examine.

Anyways, when I graphed the two against each other, it was interesting to see how things stacked up. Of course, the biggest variable in all of this is how much of the theoretical game would benefit from parallelization. What is that exact number? Heck if I know, but here are the results for different values:

At 100% benefitting from parallel, a true fantasy world, the 8core is 167% faster. So at best we're not quite triple the speed.
At 90% benefitting from parallel, the 8core is 88% faster. A big drop from just 10% less.
At 80%: 8core is 56% faster
At 65%: 8core is 31% faster
At 50%: 8core is 19% faster
At 30%: 8core is 8% faster

Maybe this is why Sony is thinking about using less cores? There's obviously diminishing returns for the extra cores if you can't make everything very highly parallel.


You are of course talking about XeCPU vs CELL...


Keep in mind that each of those XeCPU PPC cores is dual-threaded (so up to six active threads) and only the PPE on CELL is dual-threaded....the SPEs are a single thread each, IIRC...

So that leaves you with 9 threads (in theory) on CELL and 6 threads on XeCPU....

Of course only the PPE is directly comparable to one of the XeCPU's PPCs....the SPEs (collectively) will probably be faster than a PPE in some things and not as fast in others...

In other words...comparing PS3 and X360 CPUs is difficult at this time....PS3 should have a CPU edge but benchmarking against X360 would give us a better comparison and we don't have that information yet...
 

gofreak

GAF's Bob Woodward
Kleegamefan said:
Of course only the PPE is directly comparable to one of the XeCPU's PPCs....the SPEs (collectively) will probably be faster than a PPE

...or 2, or 3, or 4, or 5, or 6...(just with some things of course ;)).
 

gofreak

GAF's Bob Woodward
Kleegamefan said:
Are you talking about PPEs???

Do you think 7 SPEs would be faster than 6 dual threaded PPEs??

For some specific tasks?

Yeah, sure. Or to put it another way, the PS3 Cell could be 2-3x as fast at certain things as 3 PPEs (which may be subtly different).
 

fart

Savant
SO MANY TLA'S

the bottom line is that you can speculate all you want but performance is going to depend completely on utilization. flat out you can go combinatorially by FLOPS or whatever your elementary measurement of choice is, but real software systems using these things have much more interesting and unpredictable performance curves
 