
Fast, Faster and IBM's PlayStation 3 Processor

GigaDrive

Banned
http://www.linuxinsider.com/story/34548.html

Fast, Faster and IBM's PlayStation 3 Processor

By Paul Murphy
LinuxInsider
06/17/04 6:38 AM PT





Three years ago, IBM (NYSE: IBM), Sony (NYSE: SNE) and Toshiba announced a partnership aimed at developing a new processor for use in digital entertainment devices like the PlayStation. Since then, the product has seen a billion dollars in development work. Two fabs, one in Tokyo and one in Fishkill, New York, have been custom-built to make the new processor in large volumes. On May 12th, IBM announced that the first commercial workstations based on this processor would become available to game-industry developers late this year.
A lot is known about this processor as planned, but relatively little real information about the product as built has yet leaked. To the extent that performance information has become available, it is characterized by numbers so high that most people simply dismissed the reports. In November of last year, for example, a senior Sony executive told an internal audience that implementations would scale from uniprocessors to 64-way groupings that would deliver in excess of two teraflops -- making it more than 10 times faster than Xeon.

Most of what we know about this machine comes from U.S. patent #6,526,491 as issued to Sony in February 2003 for a "memory protection system and method for computer architecture for broadband networks."

Here's the abstract:

A computer architecture and programming model for high speed processing over broadband networks are provided. The architecture employs a consistent modular structure, a common computing module and uniform software cells. The common computing module includes a control processor, a plurality of processing units, a plurality of local memories from which the processing units process programs, a direct memory access controller and a shared main memory.
A synchronized system and method for the coordinated reading and writing of data to and from the shared main memory by the processing units also are provided. A hardware sandbox structure is provided for security against the corruption of data among the programs being processed by the processing units. The uniform software cells contain both data and applications and are structured for processing by any of the processors of the network. Each software cell is uniquely identified on the network. A system and method for creating a dedicated pipeline for processing streaming data also are provided.


The machine is widely referred to as a cell processor, but the cells involved are software, not hardware. Thus a cell is a kind of TCP packet on steroids, containing both data and instructions and linked back to the task of which it forms part via unique identifiers that facilitate results assembly just as the TCP sequence number does.

Outrageous Performance Claims

The basic processor itself appears to be a PowerPC derivative with high-speed built-in local communications, high-speed access to local memory, and up to eight attached processing units broadly akin to the Altivec short array processor used by Apple (Nasdaq: AAPL). The actual product consists of one to eight of these on a chip -- a true grid-on-a-chip approach in which a four-way assembly can, when fully populated, consist of four core CPUs, 32 attached processing units and 512 MB of local memory.
The per-cycle performance of the core CPU is undocumented but may be expected to be comparable to other PowerPC machines running at high cache hit rates. Specifications for the four or eight attached processors comprising the array are known; these are expected to turn in one floating point operation per cycle or around 32 Gigaflops for the fully populated array at a nominal 4 GHz.

That's where the apparently outrageous performance claims come from; a four-way assembly running at a planned 4 GHz offers 32 x 4 = 128 Gigaflops in potential floating-point execution. A 64-way supergrid made by stacking eight eight-way assemblies would have a total of 512 attached processors and could, therefore, break 2 teraflops if data transportation kept up with the processors.
In practice, however, Apple has never succeeded in getting the bulk of its developers to make effective use of the Altivec, and Sun has had essentially no success getting people outside the military and intelligence communities to use the four-way SIMD capabilities built into its Sparc processors. Grid computing is slowly entering the commercial mainstream, but combining both local-array access with grid computing requires a significant shift in programming paradigm that will not appeal to the mainstream Wintel and IBM customer base.
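
The arithmetic behind these figures can be sketched in a few lines of Python. This is only a back-of-the-envelope check using the numbers quoted above (one floating-point operation per cycle per attached unit, eight units per core, a nominal 4 GHz clock); the function and constant names are made up for illustration, and these are theoretical peaks that assume data transport keeps up with the processors.

```python
# Back-of-the-envelope peak throughput, using only the figures quoted in the article.
# Assumptions (the article's, not measured): 1 floating-point op per cycle per
# attached processing unit (APU), 8 APUs per core, nominal 4 GHz clock.

FLOPS_PER_CYCLE = 1      # per APU, as quoted
APUS_PER_CORE = 8        # fully populated array
CLOCK_GHZ = 4.0          # nominal planned clock


def attached_units(cores: int) -> int:
    """Total attached processing units in an assembly with `cores` core CPUs."""
    return cores * APUS_PER_CORE


def peak_gflops(cores: int) -> float:
    """Theoretical peak in Gigaflops for an assembly with `cores` core CPUs."""
    return attached_units(cores) * FLOPS_PER_CYCLE * CLOCK_GHZ


if __name__ == "__main__":
    # Uniprocessor, four-way assembly, and the 64-way "supergrid".
    for cores in (1, 4, 64):
        print(f"{cores:>2}-way: {attached_units(cores):>3} APUs, "
              f"{peak_gflops(cores):.0f} Gflops peak")
```

The 64-way case comes out at 512 attached units and 2048 Gflops, which is where the "in excess of two teraflops" claim lands.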

Gains Outweigh the Pain

For games developers, however, the potential gains -- up to 50 times the best x86-based processor and graphics board combinations can deliver -- should outweigh the pain. Even minor software change, the kind of thing Adobe does to take advantage of the Altivec in Photoshop, should offer significant advantages to a wider programming community and enable floating-point-intensive applications to run a full order of magnitude more quickly on this machine than on Intel's (Nasdaq: INTC) best.

An important point to bear in mind is that this processor will be inexpensive, and systems built around it even less expensive because no external graphics or network boards will be needed. Both Sony and IBM have been building fabs specifically to make this device. Volumes will be high because Sony will use up to 20 million assemblies in the PlayStation, while 10 million or more that don't quite make the quality cut will get used in its digital televisions and other products.
Very little has been publicly revealed about the operating system for this thing, but it is quite obvious what it has to be and how it has to work. Each core will have its own local Unix kernel, with most just executing cells as they arrive from the dispatch manager and one managing the traffic-coordination hardware. In all likelihood, the kernel used will prove to be both Linux-derived and Linux-compatible -- meaning that most Linux software will run out of the box on the uniprocessor configuration while software adapted for the grid environment will run unchanged on everything from the uniprocessor to configurations with hundreds or even thousands of processor assemblies.

As users of Sun's open-source grid software have found, performance losses on single processes increase as you add processors because data flow and timing control issues increase in complexity nonlinearly with system growth. Fundamentally, what happens is that the larger you make the total machine, whether on one piece of silicon or in a rack, the more cell transit time dominates execution time and the greater the performance cost imposed by the need to coordinate operations.


New Generation of Linux PCs

The patent mentions the use of no-ops (processor nulls) inserted into cells to get around timing problems associated with having components run at different speeds -- with processor coordination initially enforced by setting TTL-like time budgets for cell execution. My guess, however, is that advances in cell isolation and programming for asynchronous event handling have since obsolesced those solutions.
I expect, therefore, that when the real thing appears, it will fully support both the traditional grid format for on-chip work and an asynchronous hypergrid for multi-assembly processes on the model Thinking Machines hoped to achieve with the transputer-based hypercube in 1985 -- and that NSA is rumored to actually have built on 1989's Sparc-SIMD-based CM-5.

Either way, however, the OS for this machine is likely to offer both Linux compatibility at the low end and enormous scalability for those willing to modify their software -- which is why, as I discuss in next week's column, I expect IBM and Toshiba soon to launch a new generation of Linux PCs built around the combination of this CPU with IBM software products like Lotus Workspace for Linux.


making it more than 10 times faster than Xeon.

remember when it was meant to be 100 times faster than a 2.5 GHz Pentium 4?

Xeon is basically a Pentium 4 with more cache and maybe a few tweaks here and there.....
 

P90

Member
At "only" 10x more powerful, that is still very significant. Much, much more than the PS2 to Xbox ratio.

Xbox2 looks to be in a world of hurt, hardware-wise vis a vis PS3 and probably N5.
 
P90 said:
At "only" 10x more powerful, that is still very significant. Much, much more than the PS2 to Xbox ratio.

Xbox2 looks to be in a world of hurt, hardware-wise vis a vis PS3 and probably N5.

The Xenon CPU's gonna be at least 5X more powerful than some of the current Xeons. 10X probably won't be out of the question.
 

DopeyFish

Not bitter, just unsweetened
isn't the ps3 processor going to be handling almost all the processing like audio/gfx, etc? if so when you compare it to Xbox2s rumored specs (3 dual core 2.5+ ghz PPCs and an r600 POSSIBLY dual core) it doesn't sound so impressive. It almost sounds like X2 will be the more powerful gaming system.

Can't wait to see how it pans out
 

kaching

"GAF's biggest wanker"
All of this is in keeping with the numbers batted about at Beyond3d, right cellapu?

If 64 way offers in excess of 2 Teraflops, then that gets 32 Gflops per cell processor. Broadband Engine is supposed to be 4 of these, right? Then there's the visualizer that's still a bit of a question mark last I checked, but could be another 4 cell units.

Where's the quote that said one cell processor is supposed to be 100 times faster than a 2.5 Gig Pentium 4?

And reconcile that against a comment later in the article:

For games developers, however, the potential gains -- up to 50 times the best x86-based processor and graphics board combinations can deliver
 

Lord Error

Insane For Sony
isn't the ps3 processor going to be handling almost all the processing like audio/gfx, etc?
That's a complete unknown, but if the leaked Xbox 2 specs are to be believed, its CPU will actually be handling audio encoding and other such 'lesser' functions, instead of having a dedicated media processor for that. I wouldn't be surprised if the same happens with PS3.
 

DopeyFish

Not bitter, just unsweetened
Marconelly said:
That's a complete unknown, but if the leaked Xbox 2 specs are to be believed, its CPU will actually be handling audio encoding and other such 'lesser' functions, instead of having a dedicated media processor for that.

The MCPX was a great idea :( too bad Nvidia/MS went sour in that respect
 

GigaDrive

Banned
isn't the ps3 processor going to be handling almost all the processing like audio/gfx, etc?

you mean PS3's main CPU processor, the BroadBand Engine? no. it won't be doing all the graphics. that will be done at least in part by PS3's graphics chip or GPU. audio, I don't know. it might be done on the CPU processor, or maybe an 'SPU3' (i.e. PS2 uses SPU2), or audio might be done on the EE or EE+GS (there for backwards compat)

isn't the ps3 processor going to be handling almost all the processing like audio/gfx, etc? if so when you compare it to Xbox2s rumored specs (3 dual core 2.5+ ghz PPCs and an r600 POSSIBLY dual core) it doesn't sound so impressive. It almost sounds like X2 will be the more powerful gaming system.

Can't wait to see how it pans out

from what we know of Xbox 2's CPU configuration, it is 1 chip with 3 CPU cores, not 3 dual cores. that is, not unless the cores are POWER4 or POWER5. It is assumed that the cores are derived from PowerPC 970, each of which has 1 core. although it can process 2 threads, making it look like 2 cores to the OS. a POWER4/POWER5 is true dual core, and POWER5 can process 4 threads. also, for each of Xbox 2's 3 CPU cores, they're said to be running at 3.5 GHz. although that seems outrageous when you realize that the fastest PowerPC 970s are running at 2.0 or 2.5 GHz. they are struggling to reach 3 GHz now.

As for R600 being dual core, I would absolutely love to see such a thing, although I think it's highly doubtful. the most we could reasonably expect is for 2x the number of vertex shaders as the PC version of R600, much like NV2A in Xbox had 2x the vertex shaders of GeForce3/NV20.
 

GigaDrive

Banned
"All of this is in keeping with the numbers batted about at Beyond3d, right cellapu?"

yes and no I guess. depends on whose numbers you believe. ok I suppose it's yes.

"If 64 way offers in excess of 2 Teraflops, then that gets 32 Gflops per cell processor. Broadband Engine is supposed to be 4 of these, right? Then there's the visualizer that's still a bit of a question mark last I checked, but could be another 4 cell units."

32 Gflops per Cell processor (Processing Element) is much lower than 32 Gflops per APU. but yeah, that's basically what Deadmeat has been saying.


"Where's the quote that said one cell processor is supposed to be 100 times faster than a 2.5 Gig Pentium 4?"

it's been in several articles. I'll dig it up soon.

"And reconcile that against a comment later in the article:

Quote:
For games developers, however, the potential gains -- up to 50 times the best x86-based processor and graphics board combinations can deliver "

true.
 

TAJ

Darkness cannot drive out darkness; only light can do that. Hate cannot drive out hate; only love can do that.
>>>if the leaked Xbox 2 specs are to be believed, its CPU will actually be handling audio encoding and other such 'lesser' functions, instead of having a dedicated media processor for that.<<<

Shades of N64 here... ugh. Even going with the same audio solution as this gen (MCPX) would be MUCH better, IMO (and would be ridiculously cheap by the time Xenon hits). Whoever called MCPX "the last word in console audio" wasn't far from the mark.
 

GigaDrive

Banned
doing audio on Xbox 2 CPU is worse than doing audio on PS3 CPU. because obviously PS3 is gonna have more CPU performance than Xbox 2... I like the concept of having a separate processor for each major area of game-processing (gameplay, audio, graphics)
....
 

Nerevar

they call me "Man Gravy".
P90 said:
At "only" 10x more powerful, that is still very significant. Much, much more than the PS2 to Xbox ratio.

Xbox2 looks to be in a world of hurt, hardware-wise vis a vis PS3 and probably N5.


yeah, except did you read the article? I don't pretend to know anything about grid-computing, but I know even at the most basic level multi-processor development is a bitch. Taking full advantage of this grid-based multiprocessor machine will take devs years unless Sony gets some incredible middleware / dev tools out, and fast. Furthermore, this is a more alien design than anything we've seen in the past. It sounds like, after reading the article, the cell chip will be much more useful to IBM to push its Linux software boxes than it will be to Sony, but what do I know?
 
Deano Calver, a programmer at Climax (working on the Sudeki team), basically hinted that the Xenon's CPUs will not be stock PPC stuff in a recent thread on the B3D Forum.
 

D.Cowboys

Neo Member
Some people will believe anything in their favorite company's favor. Neither the PS3 nor the Xbox 2 will probably be 10 times more powerful than the current systems.

Never mind one of the next-gen systems being 10 times faster than another.
 
Uhhh, yeah. Ten times faster? Ain't gonna happen. So say the Xbox has a 733 MHz processor, would its successor have a 7.3 GHz processor?




(I know processing power varies and is not simply just a measurement of MHz, but still. I'm just saying, ya know.)
 

Arcticfox

Member
remember when it was meant to be 100 times faster than a 2.5 GHz Pentium 4?
2.5GHz P4 ~ 10 Gigaflops

If the article is right about the 2 Teraflops figure then the PS3 would be 200 times faster than a 2.5GHz P4. Earlier articles said it would be 1 Teraflop, hence the "100 times faster" quote.
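
For anyone who wants to check the ratio, here is a quick sketch in Python. It assumes the rough ~10 Gigaflops estimate for a 2.5GHz P4 given above, which is a ballpark and not a benchmark, and the function name is just made up for this example. It compares quoted peak numbers only, so it says nothing about real-world performance.

```python
# Ratio of the quoted Cell peak figures to a rough Pentium 4 estimate.
# Assumption: ~10 Gigaflops for a 2.5 GHz P4, the ballpark used in this post.
P4_GFLOPS = 10.0


def times_faster(cell_teraflops: float, baseline_gflops: float = P4_GFLOPS) -> float:
    """How many times the baseline's peak fits into the quoted Cell peak."""
    return cell_teraflops * 1000.0 / baseline_gflops


print(times_faster(2.0))  # the article's 2 Teraflops figure
print(times_faster(1.0))  # the earlier 1 Teraflop figure
```

With the 2 Teraflops figure that gives 200x, and with the older 1 Teraflop figure it gives the 100x from the earlier articles.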


I really doubt the PS3 will be anywhere near that powerful, though. I just don't see how 2 Teraflops could be ~$300 in 2-3 years. Hell, I don't even see it being possible in 10 years.
 
Error Macro said:
Uhhh, yeah. Ten times faster? Ain't gonna happen. So say the Xbox has a 733 MHz processor, would its successor have a 7.3 GHz processor?




(I know processing power varies and is not simply just a measurement of MHz, but still. I'm just saying, ya know.)

I think they're talking about overall performance value.
 

kaching

"GAF's biggest wanker"
Gunsmoke said:
. . . and so it begins. The media hype machine.
Funny, I look at this article, and the particular lines you quoted, and I see a fairly balanced attempt to communicate the scope of the challenge, the potential pitfalls and potential rewards. You're the only one who can be accused of hyping by merit of portraying this article as something larger than it is.
 
Sony will use up to 20 million assemblies in the PlayStation, while 10 million or more that don't quite make the quality cut will get used in its digital televisions and other products.
WTF?

Did anyone else find this disturbing? No wonder their TVs last for 3 years. :(
 

cvxfreak

Member
Meh, I'm not excited. Games these days look realistic enough and whatever new processors come out probably won't destroy this generation's experiences much.
 
MightyHedgehog said:
I wouldn't worry. It's common practice to take lower quality chips and put them into lesser-end products. Celeron is an example of this.
No, Celeron is an example of the dumps that Pentium chips take that turn out to suck ass.
 
CVXFREAK said:
Meh, I'm not excited. Games these days look realistic enough and whatever new processors come out probably won't destroy this generation's experiences much.

Well, I don't expect the average experience to be all that different from this gen's offerings, either. Still, this gen differs very little from last gen, but there's always a reason to get the new system...
 

FriScho

Member
GigaDrive said:
In November of last year, for example, a senior Sony executive told an internal audience that implementations would scale from uniprocessors to 64-way groupings that would deliver in excess of two teraflops -- making it more than 10 times faster than Xeon.

maybe he means with 64-way-grouping a 64 cell cpu group - therefore 2000 gflop/64 = 31 GFlop for each Cell - that's about 3-times the performance of a fast Pentium 4. In 2-3 years...
 

ge-man

Member
Gunsmoke said:
I tell you, the real company getting all the joy from all 3 competitors here is IBM.

Ain't that the truth. The real winners of the next-gen are IBM, with ATi closely behind them.
 

GigaDrive

Banned
Funny, I look at this article, and the particular lines you quoted, and I see a fairly balanced attempt to communicate the scope of the challenge, the potential pitfalls and potential rewards. You're the only one who can be accused of hyping by merit of portraying this article as something larger than it is.

agreed.
 

GigaDrive

Banned
btw here's one of the articles that mentions PS3's Cell CPU would be 100x more powerful than a 2.5 GHz P4.

http://archive.gamespy.com/hardware/january03/playstation3/

Cell, scheduled to hit the market in late 2004 or early 2005, differs notably from current processors. This finely crafted chunk of silicon will contain multiple chips within a single unit, and will be able to perform in excess of one trillion mathematical calculations a second. Put into perspective, that makes it approximately 100 times more powerful than a 2.5 GHz Pentium 4 CPU!

edit: more

http://zdnet.com.com/2100-1103-948493.html
http://news.com.com/Chip+trio+allows+glimpse+into+'Cell'/2100-1001_3-948493.html
 

kaching

"GAF's biggest wanker"
Gunsmoke said:
Yeah, and it's great to be acknowledged with such little effort.
Whatever floats your boat, I'm just glad you tacitly acknowledged that you misrepresented the article.
 