Playstation3 Head Tracking

Blogged under Cell, Consoles, games, Sony, Wii, PlayStation, Events by Barry Minor on Wednesday 5 March 2008 at 11:37 pm

After seeing Johnny Chung Lee’s wildly popular Wii head tracking video, we were highly motivated to add this technology to our iRT ray tracer, so colleague Joaquin Madruga quickly coded the feature and we hit the road for GDC 2008.


Left to Right, Joaquin Madruga (IBM), Johnny Chung Lee (CM), Barry Minor (IBM) 

At the show we demonstrated two infrared (IR) LED tracked displays. The first was a target scene, similar to Johnny’s, that we created in 3dsMax, and the second was a 7 million triangle Chinatown scene created in Maya by our partners at Threshold Studios (Thanks Threshold!!). The target scene was easily ray traced on a single Linux Playstation3, but the Chinatown scene required some real horsepower, so we deployed six QS21 Cell blades and rendered it remotely using a GigE connected BladeCenter.

 iRT Demo Setup GDC 2008

Head tracking produces a unique virtual window effect where the monitor appears to be a portal into a virtual world. The user wears a pair of IR LED equipped safety glasses which are tracked using an IR camera attached to the Playstation3. As the user moves, the view relative to the screen is computed and ray traced in real-time, producing a strong motion parallax 3D effect. The next step for this technology will be passive head tracking using face tracking technology like that demonstrated by Richard Marks in the Sony booth at GDC 2008. What we need now is a passively head tracked 150” plasma with ray traced visuals at 120 frames/sec!!
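For readers curious about the geometry, here is a minimal Python sketch of how a primary ray can follow the tracked head. It is an illustration only, not iRT code. It assumes the physical screen is a rectangle centered at the origin in the z = 0 plane, that the head position is measured in the same units as the screen size, and that z > 0 is in front of the screen. The function name `primary_ray` and its parameters are hypothetical.

```python
import math


def primary_ray(head, pixel, resolution, screen_size):
    """Return (origin, direction) for the ray from the eye through one pixel.

    head        -- (x, y, z) of the viewer's eye, z > 0 in front of the screen
    pixel       -- (column, row), row 0 at the top of the screen
    resolution  -- (columns, rows) of the display
    screen_size -- (width, height) of the physical screen
    """
    hx, hy, hz = head
    i, j = pixel
    nx, ny = resolution
    width, height = screen_size
    if hz <= 0:
        raise ValueError("head must be in front of the screen (z > 0)")
    # Physical location of the pixel center on the screen surface.
    px = (-0.5 + (i + 0.5) / nx) * width
    py = (0.5 - (j + 0.5) / ny) * height
    # The ray starts at the eye and passes through that point on the screen.
    dx, dy, dz = px - hx, py - hy, -hz
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    return (hx, hy, hz), (dx / length, dy / length, dz / length)
```

Because every ray starts at the tracked eye position and passes through a fixed point on the physical screen, moving the head changes the ray directions and the scene appears to shift behind the glass, which is the motion parallax described above.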

iRT Head Tracking Video (YouTube)

iRT Head Tracking Video (Quicktime 28MB) 

Cell and the Boeing 777 at SC07

Blogged under Cell, Consoles, Industry News, PlayStation by Barry Minor on Friday 9 November 2007 at 3:42 pm

The Boeing 777 was the first airliner to be 100 percent digitally designed using 3D computer graphics. The resulting digital mockup is 23,000x more complex than today’s typical digital game assets and therefore requires some serious muscle to render at interactive frame rates. Every subsystem, including wiring harnesses, hydraulics, air-conditioning, and fuel delivery, is modeled in excruciating detail. Prior research has shown that this is truly a supercomputer class problem, which is why we have unleashed a prototype piece of LANL’s Roadrunner system on it at this year’s SC07 conference.

 

In the IBM booth at SC07 the 350M triangle Boeing digital mockup will be rendered real-time at 1080p resolution using a hybrid cluster of Cell processor based QS21 blades and a Ridgeback (AMD Opteron) memory server. The Ridgeback holds the 25GB digital model in its memory and services blade data requests via NFS RDMA over 2GB/sec InfiniBand.  Each blade is responsible for a dynamic region of the screen and therefore only requires a fraction of the digital model to be cached in its local 2GB memory. These regions are further subdivided among the local SPEs, which DMA via software caches from the address space of the Opteron, forming a memory hierarchy that's transparent to the programmer.

(x86 disk) –> (x86 memory, 128GB) –> (Cell memory, 2GB) –> (SPE local store, 256KB) –> (SPE register file)

Bandwidth of each step, left to right: 120MB/sec, 2GB/sec, 25GB/sec, 50GB/sec
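To get a feel for the hierarchy above, the following Python sketch works out how long it would take to push the entire 25GB model across each link at the quoted bandwidth. It is back-of-the-envelope arithmetic only: it assumes decimal units (1GB = 1000MB) and sustained peak bandwidth, and the names `LINKS_GB_PER_SEC`, `transfer_seconds`, and `hierarchy_report` are made up for this example.

```python
MODEL_GB = 25.0  # size of the digital model quoted in the post

# (link, bandwidth in GB/sec) taken from the hierarchy diagram.
LINKS_GB_PER_SEC = [
    ("x86 disk -> x86 memory", 0.120),
    ("x86 memory -> Cell memory", 2.0),
    ("Cell memory -> SPE local store", 25.0),
    ("SPE local store -> SPE register file", 50.0),
]


def transfer_seconds(size_gb, bandwidth_gb_per_sec):
    """Time to move size_gb across a link at a sustained bandwidth."""
    return size_gb / bandwidth_gb_per_sec


def hierarchy_report(size_gb=MODEL_GB):
    """Map each link name to the seconds needed to stream the whole model."""
    return {
        name: transfer_seconds(size_gb, bandwidth)
        for name, bandwidth in LINKS_GB_PER_SEC
    }


if __name__ == "__main__":
    for name, seconds in hierarchy_report().items():
        print(f"{name}: {seconds:.1f} s")
```

Under these assumptions, streaming the full model over the 2GB/sec link would take about 12.5 seconds, which suggests why each blade caches only the fraction of the model that its screen region needs.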

 

IBM’s software ray-traced solution (iRT) has several key advantages:

1) Completely scalable renderer (Frame rate scales linearly with number of blades)

2) Much higher image quality using ambient occlusion

3) Ability to scale to very large scenes while maintaining interactive frame rates

4) High compute density (no power hungry GPUs in the server racks)

 

 Sample frames: 

 

 

 

Many thanks to The Boeing Company and David Kasik for providing us with the 777 digital model. 

Cell vs G80

Blogged under Cell, Consoles, Industry News, Sony, PlayStation by Barry Minor on Wednesday 5 September 2007 at 6:13 pm

I recently ran across an interesting paper, Stackless KD-Tree Traversal for High Performance GPU Ray Tracing, which documented the strides made by GPU based ray-tracing over the last decade and introduced a new way of mapping acceleration structure traversal to modern GPUs, namely Nvidia's new G80. The paper was authored by Philipp Slusallek's talented computer graphics group at Saarland University in Germany. Our own Cell iRT ray-tracer was based on papers written by Philipp's students so we have great respect for their work. It was interesting to see the great lengths researchers are willing to go through in order to harvest a fraction of the floating point potential locked away in these black boxes.   

From 10,000 feet here's how the Cell processor stacks up to Nvidia's new G80 GPU:

 

Both parts are compared at 90-nanometre.  

As you can see, the G80 is twice as big, which is a good indication it requires twice the power, and produces twice the floating point power on paper.  However, when we ran one of the benchmarks discussed in the paper, the Stanford Bunny, we found that the Cell processor combined with the iRT produces significantly better performance (we don't have access to the other datasets listed in the paper):

  

 

Left to Right:  

2.6 GHz AMD Opteron - Saarland Ray-tracer

Nvidia GeForce 8800 GTX - Saarland Ray-tracer

Sony Playstation3 (partial 3.2 GHz Cell processor running Linux) - IBM iRT

3.2 GHz Cell Processor - IBM iRT

IBM QS20 Blade (Two 3.2 GHz Cell Processors) - IBM iRT  

In fact, one Cell processor is four to five times faster at ray-tracing the Stanford Bunny than the G80, and the Cell QS20 blade, which has comparable floating point power on paper, is eight to eleven times faster.  Both the G80 and Cell crush the AMD Opteron at ray-tracing, which is arguably the most popular production rendering processor today. It's also interesting to note that secondary rays are less costly on Cell, which is where ray-tracing becomes interesting.  Primary ray cast is only interesting from an academic perspective. The real issue is secondary rays, and GPUs have traditionally had problems with these due to their incoherent nature. When you factor power into the equation it gets even more interesting, given that Cell is half the size of the G80 and produces five times the ray-tracing performance.  

Things are starting to get interesting and Intel is hot on the trail with their Larrabee part which is said to be designed for ray-tracing.  

Only time will tell….

Interactive Ray-tracer (iRT) Available for Download

Blogged under Cell, Consoles, Industry News, Sony, PlayStation by Barry Minor on Wednesday 5 September 2007 at 5:23 pm

We have now released a standalone version of the iRT for the Cell processor.  The downloadable Linux binary runs on both the Sony Playstation3 (PS3) and the IBM QS20 blade.

http://www.alphaworks.ibm.com/tech/irt


 

This demonstration program shows both the ray-tracing potential of the Cell processor and the scalability of code written using the Cell SDK.  Under Linux, the PS3 only has access to 6 of Cell's 8 SPEs and has no access to the RSX graphics processing unit. Despite this, the iRT can software ray-trace a 333,000 triangle car at interactive frame rates and can spin the 69,000 triangle Stanford bunny around in 720p at better than 40 frames per second. The iRT is also highly scalable: the IBM QS20 blade runs 2.5 times faster than the Linux PS3, and performance continues to scale linearly as additional QS20 blades are added.
 
On the alphaWorks site you will find the demonstration program plus two data sets, have fun!
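As a rough sanity check on the 2.5x figure quoted above, the Python sketch below compares it with the SPE counts. It assumes that a QS20 blade exposes all 8 SPEs on each of its two Cell processors (16 in total), that the Linux PS3 exposes 6, and that SPE count is the dominant factor. Those are assumptions for the arithmetic, not measurements, and the function names are hypothetical.

```python
def ideal_speedup(blade_spes, ps3_spes):
    """Speedup expected if performance scaled perfectly with SPE count."""
    return blade_spes / ps3_spes


def scaling_efficiency(observed_speedup, blade_spes=16, ps3_spes=6):
    """Fraction of the ideal SPE-count speedup that was actually observed."""
    return observed_speedup / ideal_speedup(blade_spes, ps3_spes)


if __name__ == "__main__":
    print(f"ideal speedup: {ideal_speedup(16, 6):.2f}x")
    print(f"efficiency at 2.5x: {scaling_efficiency(2.5):.1%}")
```

Under those assumptions the ideal ratio is about 2.67x, so the reported 2.5x would correspond to roughly 94% scaling efficiency.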

Want to run a game on Big Iron?

Blogged under Cell, MMOG, Industry News by David Berger on Thursday 26 April 2007 at 7:59 am

So, what do you get when you cross a Cell processor with an industry-leading mainframe system?

 The answer is a whole bunch of "wow."

 No doubt we'll be talking lots more about the "gameframe."  And keep an eye on our friends at the Mainframe blog for more.  

 

The power of Cell, the power of Slashdot, the power of an idea…

Blogged under Cell, Site news, Industry News by David Berger on Friday 6 April 2007 at 3:37 pm

Barry Minor's post and video on Cell-based raytracing has created something of a stir. After being "Slashdotted" the video has more than 40,000 views on YouTube. He's clearly struck a powerful nerve within the graphics programming community. I can't wait to see what's next.

PS3 Clusters

Blogged under Cell, Consoles, games, Industry News, Sony, PlayStation, Higher Education by Barry Minor on Tuesday 3 April 2007 at 7:40 am

The open side of the PS3 is a good way to get access to Cell technology as a programmer. Just head down to Toys-R-Us and toss 200 gigaflops into your cart. Programs like Stanford’s PS3 version of Folding@home are showing that today’s game consoles can form very potent compute clusters. In the video below (sorry about the serpent-like soundtrack) we show our IBM developed iRT ray-tracer running on a small PS3 cluster. This car model is 75x more complex than those used in today's games, and ray-tracing is a class of rendering algorithm only deployed by the film industry, yet PS3s, when clustered together, handle this problem with ease. Our code was written using the Cell SDK, so the same binary that was developed for the QS20 blade runs fine on the PS3 with no changes. We just grabbed our Yellow Dog DVD, installed Linux on the PS3s, copied over the iRT binaries, and in minutes we had a very low cost 600 gigaflop cluster. While it's no match for LANL's massive Roadrunner system, the same code can be run on both clusters.

 

Cell Power at GDC 2007

Blogged under Cell, Consoles, Industry News, Companies, Sony, PlayStation, Events by Barry Minor on Wednesday 7 March 2007 at 1:32 am

This week at Game Developers Conference IBM will show a Linux based PS3 real-time rendering a complex (3 million triangle) urban landscape, at 1080p resolution, using only software rendering techniques (iRT).

Even though the PS3’s RSX is inaccessible under Linux the smart little system will reach out across the network and leverage multiple IBM QS20 blades to render the complex model, in real-time, with software based ray-tracing.  Using IBM’s scalable iRT rendering technology, the PS3 is able to decompose each frame into manageable work regions and dynamically distribute them to blades or other PS3s for rendering.  These regions are then further decomposed into sub-regions by the blade’s Cell processors and dynamically dispatched to the heavy lifting SPEs for rendering and image compression.  Finished encoded regions are then sent back to the PS3 for Cell accelerated decompression, compositing, and display.
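The dynamic distribution described above can be sketched in a few lines of Python. This is only an illustration of the general pattern (a shared queue of screen regions that workers pull from as they become free), not the iRT framework itself. Threads stand in for networked blades, the compression and compositing steps are left out, and `split_into_regions`, `render_frame`, and the `render_region` callback are hypothetical names.

```python
import queue
import threading


def split_into_regions(width, height, tile):
    """Decompose a frame into (x, y, w, h) work regions of at most tile x tile."""
    return [
        (x, y, min(tile, width - x), min(tile, height - y))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


def render_frame(width, height, tile, worker_names, render_region):
    """Let each worker pull regions from a shared queue until none are left."""
    work = queue.Queue()
    for region in split_into_regions(width, height, tile):
        work.put(region)

    finished = {}
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                region = work.get_nowait()
            except queue.Empty:
                return  # no work left for this worker
            result = render_region(region)
            with lock:
                finished[region] = (name, result)

    threads = [threading.Thread(target=worker, args=(n,)) for n in worker_names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return finished
```

Because workers take a new region only when they finish the previous one, a faster worker naturally ends up rendering more of the frame, which is the load-balancing property the post relies on.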

Here is a resolution reduced (30MB) Quicktime movie of the demo.

Mark Nutter, Joaquin Madruga, and I will be on hand in the IBM booth to run the demonstration, so stop by, introduce yourself, and swap some Cell programming stories.  Even though much has been made in the press about how difficult the Cell processor is to program, our team of three started with a couple of white papers and in only three months created this renderer, the 3dsMax to BVH tree output tool chain, the display client, and the blade distribution framework using only the tools provided in the Cell SDK.  Actually, we spent as much time trying to figure out how to preserve our 3dsMax models during export and create a good BVH tree as we did writing the Cell code.

Cell Interactive Ray-tracer (iRT) at SC06

Blogged under Cell, Consoles, games, Industry News by Barry Minor on Sunday 12 November 2006 at 8:54 pm

This week, in the Los Alamos National Laboratory (LANL) booth at SC06, IBM will demonstrate newly developed interactive ray-tracing technology. The iRT will be running on a hybrid system consisting of four IBM QS20 Cell blades and an AMD Opteron based client. By dynamically balancing work across the four Cell blades’ 1.6 Tflops, the iRT renders high definition images at interactive frame rates using advanced techniques such as BRDF shaders and ambient occlusion. This mini-Roadrunner is approximately 1/2000th of LANL’s monster 1.6 petaflop system. I hope they invite me back to run the iRT on their finished system.

Quicktime Movie of real-time iRT output
(Resolution reduced and H.264 encoded but still 27MB so be patient)

Notes from the PS3 media day now underway…

Blogged under Cell, Industry News, Sony, PlayStation by David Berger on Thursday 2 November 2006 at 2:04 pm

A source at Sony’s invitation-only PlayStation3 media day (now underway at a gallery in SoHo) phones in with an update:

  • 15 titles are being previewed, from both Sony and 3rd-party developers
  • The titles are being shown on 42″ HD plasma screens, at 1080p
  • Some of the most striking titles include Resistance: Fall of Man, NHL 2K7, NBA 2K7, and Lair
  • Great media buzz at the event
  • When asked how the titles looked on the HD screen, my source simply said (speaking of the NBA game) “I swear, it looks like live television.”

Sounds like a tough assignment! :)


Mike Acton from the Austin Game Conference

Blogged under Cell, games, Industry News, Events by Mike Acton on Saturday 9 September 2006 at 2:20 am

My First Austin Game Conference

This was my first time at AGC and I have to say that it was a bit smaller than I expected. The conference only needed a small corner of the (rather large) convention center for the expo area and the meeting rooms. As a matter of fact, it was literally a longer walk from the nearest entrance of the convention center to the action than it was from my hotel to the entrance!

However, contrary to what one might expect from such a small conference, it was surprisingly professional. The expo area was well organized, the meeting rooms were kept neat, and all the equipment seemed to work. So all in all it was a pretty smooth experience.

I had not originally planned to attend, however. Noel Llopis, another Sr. Architect at High Moon Studios had planned what would surely have been a great presentation on Agile Game Development but Noel was called away on more pressing business (But if you are interested in more information, you might want to check out agilegamedevelopment.com). So, I was asked to substitute at the last minute. Although I am a proponent of many Agile methods and Scrum in particular, I wouldn’t be able to do the topic justice on such short notice. So after some discussion we decided that I would present something which I can speak on endlessly with very little notice - my current passion: Programming the Cell processor.

At Vivendi’s High Moon Booth

Along with the local Austin professional developers, it turns out that there are quite a few students and recent graduates that attend the conference. And since we are actively recruiting top talent at all levels, it was a great opportunity to talk to people and promote our studio - we were able to spend some quality time with quite a few applicants and really get into the details of why the culture at High Moon is unique.
Vivendi's High Moon booth at AGC 06

Quite a few of those students were prepared with resumes and demo reels and it was really great to feel the enthusiasm for the industry and our studio in particular. I did spend some time helping the students with their resumes, actually. Apparently there is a common format that 90% of them are using which made it difficult to tell them apart. I suggested each person forget about using off-the-shelf formats and write something that is a little more unique - or at least slightly different. Here are a few other problems I saw on resumes and my suggestions for fixing them:

  • Being too wordy. Especially at a convention where there’s a very limited time to read a resume. Say things simply. Don’t use 100 words when you can use 10.
  • Make it clear what you do. Are you a programmer? An artist? It seems obvious, but put that at the top. If you haven’t narrowed down what you can offer to a studio at least to a basic skill, you’re probably not going to get anywhere.
  • Microsoft Word is not a skill. I suppose that there are jobs for which it is not assumed that applicants can use basic office applications, but this is not one of them. This is especially true for programmers - it just makes your resume look silly. Photoshop and Maya are probably relevant though.
  • Put the links to your stuff on the web. A few programmers mentioned that they had demos or sample code on the web but there were no links to that information in their resumes. If you have something special, make sure it’s easy to find when your resume is evaluated again later.
  • Don’t overstate your strengths. I don’t expect kids fresh out of school to know everything, honestly - it’s not a problem. But if you are going to say that 3D math is your main strength at least be able to answer a couple of basic math questions. Or if you’ve listed x86 assembly as a strength be prepared to talk shop - I love programming in assembly and if you then can’t even carry a basic conversation about it, it’s a little disappointing. If you’ve just dabbled in something or have only worked with higher level APIs - that’s OK, just be honest about it.

Tapping the Cell

As it turns out, the right people were not informed that I was substituting for Noel. When I arrived, I was not on the list and didn’t have a badge. The staff did a great job of handling the situation quickly though and within minutes I was on my way with a custom hand-written name tag. But none of the schedules or door signs were changed. Rob Vawter from SCEA was gracious enough to mention my presentation during his, and our guys at High Moon really went above and beyond and helped me out by printing a session description and handing them out at the booth - that was really nice.

Overall, I think the presentation went well. I tried to respond to quite a few of the comments I received from my interview with PSINext. Specifically how high-level strategies for Cell programming are applicable to cross-platform titles and the impact of my suggestions on engine design. If you’re interested in the details of what I presented you can get: Tapping the Cell (Slides)

There were a couple of interesting questions that I can manage to remember:

“The basic philosophies between Agile development and the type of data-first design [I’m] espousing seem to share some similarities - is that a coincidence?”

I think the answer to that is both yes and no. Yes it is a coincidence in that any similarities are not there by design. But no, I think the similarities are there because both methodologies are based on the basic premise of knowing what the most important elements are and being prepared to adapt and change them to get practical benefits. I think knowing what’s both real and practical is more important than policy and procedure in programming and the Agile methods are similar in regard to development in general.

“How would you teach these approaches to an established programming team?”

This is a tough question that I still don’t have a great answer to. At the moment, I think the most realistic method is to work with one programmer at a time and demonstrate the real benefits that can be gained from changing their approach and perspective on programming. In general, programmers find it harder to argue with immediate results but can argue about “design philosophies” until they run out of breath.

And the obligatory…

No game development conference would be complete without: Obligatory Booth Babes

Was it worth it?

Yes. At the very least, as was pointed out to me, it is an opportunity to get to know better those we may work with but never get the chance to really spend time with. And the Austin Game Conference has been one of the best experiences I’ve had with regard to being able to spend time connecting not just with old friends and colleagues from other studios, but with those from my own too. It really was the environment and the people that made this a worthwhile trip.

The race to 1,000,000,000,000,000

Blogged under Cell, Industry News by Catherine Helzerman on Friday 8 September 2006 at 12:12 pm

IBM and the U.S. government have announced a deal to build the first supercomputer using the Cell processor, the chip behind Sony’s PlayStation video game.

Its unusual “hybrid design” will combine the Cell Broadband Engine chip with AMD’s Opteron microprocessors to achieve a sustained speed of up to one quadrillion calculations per second.

Contributed by Sandra Dressel
The supercomputer, code-named Roadrunner, is being built in collaboration with the Department of Energy’s Los Alamos National Laboratory in New Mexico, and will handle a wide range of scientific and commercial applications.

Planned for completion in 2008, the supercomputer is expected to perform at a peak level of 1.6 petaflops, or 1.6 quadrillion calculations per second. Petaflop computing is the next grand challenge in high performance computing, spurring a worldwide race among companies and research organizations to reach the milestone.

The fastest supercomputer today is the IBM Blue Gene/L supercomputer housed at the Lawrence Livermore National Laboratory which is capable of more than 280 teraflops, or 280 trillion calculations per second. Besides Blue Gene, other well-known petaflop projects include systems at the Oak Ridge National Laboratory (Cray) and at Japan’s Institute of Physical and Chemical Research, called RIKEN.

Hybrid supercomputing

Roadrunner represents a new design in supercomputers that calls for using a mix of traditional commercial chips with accelerator chips to speed performance while keeping reduced power consumption and floor space in mind.  Hybrid designs are seen as a critical development to building faster systems. Traditional methods to increase chip performance by shrinking features to add more capabilities are hitting a physical barrier.

In this design, 16,000 AMD Opteron processors running on IBM’s System x3755 will handle routine computer processes, such as file input/output and communication activity.  Meanwhile, 16,000 Cell chips running on IBM’s slender blade design will handle the more complex graphically and mathematically intensive problems.  The system will run the open source Linux operating system.

The trick is in the application software.  Scientists at IBM, Los Alamos and AMD will work collaboratively to figure out how to best split the application code and map it to the right processors to optimize the performance of the application.  Typically, an application runs on only one type of processor, so there is no need to divide the code based on its function.

Once scientists learn how to optimize the application, this innovation could be transferred to a wide range of servers to create hybrid supercomputers, large or small.

Who needs increased performance?

Hybrid designs focus on programming tools and frameworks to help distribute the data so that applications scale better. As a result these designs could be of value across many commercial segments including financial engineering, seismic computing, digital and rich media, and information based medicine to name a few.

Additionally, petaflop computing itself could revolutionize some areas where additional computational capabilities are needed.  Those areas include drug designs, modeling of environmental pollution and long term climate changes, and real time nuclear magnetic resonance imaging during surgery.

Beyond gaming… PS3 in the fight against Cancer

Blogged under Cell, Consoles, Industry News, Sony, PlayStation by Catherine Helzerman on Monday 28 August 2006 at 11:00 am

Via PS3land.com

“According to an IGN report, Sony has signed a partnership with the Folding@home distributed computing project which will allow the development of a client to “allow idle Cell Processors to turn their considerable computational power from crunching the polygons that make up curvaceous videogame breasts to crunching the math of folding proteins that hold the secret to curing cancer”. And instead of purchasing super-computers which run on the Cell, Folding@home will be using 10,000 PlayStation 3s.

According to the IGN article, “The Cell Processor is expected to perform calculations for Folding@home on the scale of 100 gigaflops”, which across 10,000 consoles translates to a quadrillion floating point operations a second - “enough so that project leaders are now considering expanding their simulations to study and s and other forms of cancer.”


Interactive Rendering in The Post-GPU Era

Blogged under Cell, Consoles, Industry News, XBox by Barry Minor on Monday 14 August 2006 at 5:20 pm

This is the title of an upcoming keynote speech from Graphics Hardware 2006. Matt Pharr is the speaker and his background at both Pixar and Nvidia makes the topic even more interesting.

On a similar note the August issue of Scientific American has an article entitled “A Great Leap in Graphics” where they also point to a quantum jump forward in computer graphics from both a shift to ray-tracing and multi-core CPUs like Cell.

Procedurally Generated Content

Blogged under Cell, Consoles, games by Barry Minor on Wednesday 9 August 2006 at 11:02 pm



The above image is a frame from a Quake-like game called kkrieger that has a disk footprint of 97KB. No, not 97MB: the creators of this game squeezed everything into a meager 97,280 bytes. In an age when state of the art PC games consume more than 1GB of disk space this is quite shocking. Where this game differs is that all of the art assets are generated procedurally at run time instead of created by teams of artists and stored on your hard drive. Many of the techniques used in this area can be attributed to Ken Perlin’s noise functions and Loren Carpenter’s work with fractals. Kkrieger generates the art assets at run time as part of the loading process, turning the stored mathematical descriptions into megabytes of memory resident textures and 3D geometries. The next step in this area is to generate all these art assets on the fly as they are needed, in a resolution independent way, thereby dramatically reducing the memory footprint of the game and its off chip memory bandwidth requirements, and finally removing those annoying fuzzy low resolution textures that are visible when you walk up close to an object. Next generation processors like Cell were designed to excel at these techniques. SPEs are great noise generators that can churn out gigabytes of dynamic textures and procedurally generated geometry on demand. Moving such techniques from load time to run time will dramatically improve the visual quality of games and produce dynamic ever changing worlds that can be different every time you experience them.

Ray-tracing Receives New Focus

Blogged under Cell, Consoles, Industry News by Barry Minor on Monday 19 June 2006 at 1:24 am

Ray-tracing has always been the algorithm of choice for photorealistic rendering. Simple and mathematically elegant, ray-tracing has always generated lots of interest in the software community but its computationally intensive nature has limited its success in the interactive/real-time gaming world. However, while the rendering time of traditional polygon rasterization techniques scales linearly with scene complexity, ray-tracing scales logarithmically. This is becoming increasingly important as gamers demand larger more complex virtual worlds. Ray-tracing also scales very well on today’s multi-core “scale-out” processors like Cell. It falls into the category of “embarrassingly parallel” and therefore scales linearly with the number of compute elements. The graphics community has taken notice of these facts and is pulling together a conference to share ideas.

The 2006 IEEE Symposium on Interactive Ray-tracing

We plan to participate as we feel this topic is very important to the future of gaming and graphics in general.
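The linear versus logarithmic scaling claim above can be illustrated with a little arithmetic. The Python sketch below uses an idealized model in which a ray descends a perfectly balanced binary hierarchy over the triangles, so the per-ray work grows with the depth of the tree. Real traversal visits more nodes than this, so treat the numbers as a trend rather than a prediction. The function name `bvh_depth` is hypothetical.

```python
import math


def bvh_depth(triangle_count, leaf_size=1):
    """Depth of a perfectly balanced binary hierarchy over the triangles."""
    leaves = max(1, math.ceil(triangle_count / leaf_size))
    return math.ceil(math.log2(leaves))


if __name__ == "__main__":
    for triangles in (69_000, 333_000, 350_000_000):
        print(f"{triangles:>11,} triangles -> depth {bvh_depth(triangles)}")
```

In this model, going from a 69,000 triangle scene to a 350 million triangle scene (about 5,000 times more geometry) only deepens the tree from 17 to 29 levels, whereas a renderer that touches every triangle would do about 5,000 times more work.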

Cell Can’t Texture?

Blogged under Cell, Consoles by Barry Minor on Friday 24 March 2006 at 12:14 pm

Much has been said about Cell’s presumed inability to texture map well. Given the small (256KB) local stores and DMA memory access, the SPEs were relegated by many to only handle nice streaming geometry type workloads. This seemed like an issue ripe for a little prototyping.

First, colleague Mark Nutter implemented a software cache abstraction layer for the SPE, giving us the ability to both hide the complexity of DMAs and benefit from transparent data reuse. Next, given the lessons learned from this paper, we tiled our textures, optimized our access patterns, and implemented several cache replacement policies. We then rewrote the shader in the Quaternion Julia Set Raytracer to add five cubemap texture lookup passes - 3 refraction lookups, a reflection lookup, plus a background lookup. These five texture lookups were then blended together with a Fresnel calculation and modulated with the base lighting computation to form the final sample color.
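Why tiling helps can be shown with a short Python sketch. It compares a plain row-major texel layout with a tiled one and counts how many cache lines a small square of texels touches. The 8×8 tile and the 128 byte line size are assumptions chosen so that one tile of 16 bit texels fills exactly one line. The post does not state the actual tile or line sizes used, and the function names are hypothetical.

```python
def linear_offset(u, v, width):
    """Texel index in a plain row-major layout."""
    return v * width + u


def tiled_offset(u, v, width, tile=8):
    """Texel index when the texture is stored as consecutive tile x tile blocks."""
    tiles_per_row = width // tile
    tile_index = (v // tile) * tiles_per_row + (u // tile)
    return tile_index * tile * tile + (v % tile) * tile + (u % tile)


def lines_touched(offset_fn, u0, v0, size, width, texel_bytes=2, line_bytes=128):
    """Count distinct cache lines read when sampling a size x size texel block."""
    lines = set()
    for v in range(v0, v0 + size):
        for u in range(u0, u0 + size):
            lines.add(offset_fn(u, v, width) * texel_bytes // line_bytes)
    return len(lines)


if __name__ == "__main__":
    print("row-major:", lines_touched(linear_offset, 0, 0, 8, 1024), "lines")
    print("tiled:    ", lines_touched(tiled_offset, 0, 0, 8, 1024), "lines")
```

With these assumed sizes, an aligned 8×8 neighborhood of a 1024 texel wide texture spans 8 lines in row-major order but only 1 line when tiled, so nearby lookups are far more likely to hit data that is already cached.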

The results were very pleasing.

Sample Frame

Quicktime H.264 movie (16MB so be patient)

We found that even with small 4-way set associative software cache sizes (8 KB), miss rates for this renderer were a low 7% and hit access times were only 12 SPE cycles.
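For readers unfamiliar with the term, the Python sketch below models a 4-way set associative cache of the size mentioned above. It is a toy simulation for counting hits and misses, not the SPE software cache itself. The 128 byte line size and the least recently used replacement policy are assumptions (the post says several replacement policies were tried but does not name them), and the class name `SoftwareCache` is hypothetical.

```python
class SoftwareCache:
    """Toy model of a set associative cache with LRU replacement."""

    def __init__(self, size_bytes=8 * 1024, line_bytes=128, ways=4):
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (line_bytes * ways)
        # Each set lists its cached line numbers, least recently used first.
        self.sets = [[] for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Record an access to a byte address; return True on a hit."""
        line = address // self.line_bytes
        entries = self.sets[line % self.num_sets]
        if line in entries:
            entries.remove(line)
            entries.append(line)  # mark as most recently used
            self.hits += 1
            return True
        if len(entries) == self.ways:
            entries.pop(0)  # evict the least recently used line
        # A real implementation would issue a DMA here to fetch the line.
        entries.append(line)
        self.misses += 1
        return False

    def miss_rate(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0
```

With the assumed 128 byte lines, an 8 KB cache holds 64 lines arranged as 16 sets of 4 ways. Feeding a renderer's texel address stream through `access` and reading `miss_rate()` is one way to experiment with how cache size and access pattern interact.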

Graph

Using only seven 3.2 GHz SPEs we were able to raytrace 15 frames per second with a frame resolution of 1024×1024. The texture buffer held a cubemap with 1024×1024×16-bit texel faces resulting in a 12.5 MB texture buffer in XDR system memory. The performance penalty for using the five pass texture shader vs the lighting only shader was just 13%.

Our miss handler was implemented as a blocking function and we still have ideas pending to further reduce the 12 cycle software cache hit access time so we believe the 13% performance gap between the two shaders will continue to close.

IBM at the Game Developers Conference this week

Blogged under Cell, Industry News, Events by Catherine Helzerman on Thursday 23 March 2006 at 1:13 pm

IBM is at the Game Developers Conference this week. Come by and see us at booth #1230.
One of the things we are showing is a demo of the RapidMind Development Platform and Cell BE. From the RapidMind handout available at the booth:
“The RapidMind Development Platform allows developers to use standard C++ programming to easily create applications targeted for high performance processors including the Cell BE, GPUs and multi-core CPUs. In the case of the Cell BE, the RapidMind platform distributes processing across the SPEs without any explicit reference by the developer to the Cell BE. The platform provides a simple computational model that can be targeted by programmers and then maps this model onto any available computational resources in a system. Code can be written once then run in parallel on any of the processors that RapidMind supports.
What you will see at the demo: To demonstrate the performance acceleration available on the Cell BE processor when using the RapidMind Development Platform, RapidMind has created a world in which the behaviors of thousands of interacting characters are simulated.
In the demonstration (photo below) the Simulation Application is built in C++ using the RapidMind Development Platform. RapidMind in turn leverages the power of two Cell BE processors on an IBM Cell blade to perform the simulation calculations. The state of each character (in this case chickens!) is streamed to the Visualization Application where RapidMind is used to map the state of each character onto visualization and to implement the shaders on the GPU.”

Among the executives at the booth are:

IBM: Hina Shah, Director, Cell Ecosystem & Solutions Development; Bruce D’Amora, Cell Digital Media Solutions Architect; Tanaz Sowdagar, Marketing Manager, Emerging Technologies; and Michael Perrone, IBM Research Manager, Cell Applications Group, IBM Master Inventor
RapidMind: Ray DePaul, President & CEO; Stefanus Du Toit, Vice President, Development; Michael McCool, Chief Scientist; and Matthew Monteyne, Vice President, Sales and Marketing.

Visit RapidMind at: http://www.rapidmind.net

Forbes: A supercomputer in your living room

Blogged under Cell, 3gui, Consoles, games, Industry News by David Berger on Friday 13 January 2006 at 1:21 pm

The new Forbes cover story explores the power of the Cell processor. Here’s how it begins:

IBM’s radical Cell processor, to debut in Sony’s PlayStation 3, could reshape entertainment and spark the next high-tech boom.

Later this year millions of homes will get a new supercomputer for the living room. Or maybe the playroom. Sony’s long-awaited PlayStation 3 game console, a slender yet muscular machine the size of a DVD player, performs a mind-boggling 2 trillion calculations per second. This kind of power, once reserved for seismic exploration and nuclear-weapons design, will let programmers create videogames that look as realistic as film.

Some techies say PlayStation 3, which may debut by midyear and could end up in 100 million homes in five years, will usher in the next microchip revolution. The Sony system owes its prowess to a microprocessor called Cell, which was cooked up by chip wizards at IBM (with help from Sony and Toshiba) at a cost of $400 million over five years. The Cell chip, based on a design inspired by supercomputers, runs at least ten times as fast as Intel’s most powerful Pentium. More important, Cell boasts a staggering fiftyfold advantage in handling graphics-intensive applications that will define the next generation of visual entertainment–blindingly fast and seductively immersive games, virtual-reality romps, wireless downloads, real-time video chat, interactive TV shows with multiple endings and a panoply of new services yet to be dreamed up.

The whole article is a must-read.

GPUs vs Cell

Blogged under Cell by Barry Minor on Wednesday 30 November 2005 at 7:39 pm

Recently I came across a link on www.gpgpu.org that I found interesting. It described a method of ray-tracing quaternion Julia fractals using the floating point power in graphics processing units (GPUs). The author of the GPU code, Keenan Crane, stated that “This kind of algorithm is pretty much ideal for the GPU - extremely high arithmetic intensity and almost zero bandwidth usage”. I thought it would be interesting to port this Nvidia Cg code to the Cell processor, using the public SDK, and see how it performs given that it was ideal for a GPU.

First we directly translated the Cg code line for line to C + SPE intrinsics. All the Cg code structures and data types were maintained. Then we wrote a Cg framework to execute this shader on Cell, including a backend image compression and network delivery layer for the finished images. To our surprise (well, not really) we found that, using only 7 SPEs for rendering, a 3.2 GHz Cell chip could outrun an Nvidia 7800 GT OC card at this task by about 30%. We reserved one SPE for the image compression and delivery task.

Furthermore, the way Cg structures its SIMD computation is inefficient, as it causes a large percentage of the code to execute in scalar mode. This is due to the way it lays out its vector data: array of structures (AOS) rather than structure of arrays (SOA). Converting this Cg shader from AOS to SOA form made SIMD utilization much higher, which resulted in Cell outperforming the Nvidia 7800 by a factor of 5 - 6x using only 7 SPEs for rendering. Given that the Nvidia 7800 GT is listed as having 313 GFLOPs of computational power and seven 3.2 GHz SPEs only have 179.2 GFLOPs (7 SPEs x 3.2 GHz x 8 single-precision operations per cycle), this seems impossible, but then again maybe we should start reading more white papers and less marketing hype.
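To make the AOS vs SOA distinction concrete, here is a minimal sketch in plain, portable C. It is not the code we ran on Cell: it uses no SPE intrinsics, it assumes the standard quaternion Julia iteration q <- q*q + c, and every name in it (QuatAoS, QuatSoA, LANES, julia_step_aos, julia_step_soa and the two helpers) is made up for illustration. In the AOS version the four components of one quaternion need different operations, so a 4-wide register cannot be kept full. In the SOA version each statement applies the same operation to four different points, which is the shape a SIMD unit wants.

```c
#include <stddef.h>

#define LANES 4 /* assumed SIMD width: four 32-bit floats per register */

/* AOS: one quaternion per struct (x is the real part). */
typedef struct { float x, y, z, w; } QuatAoS;

/* SOA: each array holds the same component of LANES different quaternions. */
typedef struct { float x[LANES], y[LANES], z[LANES], w[LANES]; } QuatSoA;

/* One Julia iteration q <- q*q + c on a single quaternion (AOS form).
 * q*q = (x*x - y*y - z*z - w*w, 2xy, 2xz, 2xw): the real part is computed
 * differently from the other three, so the work is largely scalar. */
QuatAoS julia_step_aos(QuatAoS q, QuatAoS c)
{
    QuatAoS r;
    r.x = q.x * q.x - q.y * q.y - q.z * q.z - q.w * q.w + c.x;
    r.y = 2.0f * q.x * q.y + c.y;
    r.z = 2.0f * q.x * q.z + c.z;
    r.w = 2.0f * q.x * q.w + c.w;
    return r;
}

/* The same iteration on LANES quaternions at once (SOA form).
 * Every statement in the loop body is identical across lanes, so each one
 * can map onto a single 4-wide SIMD instruction. */
void julia_step_soa(QuatSoA *q, QuatAoS c)
{
    for (size_t i = 0; i < LANES; ++i) {
        float x = q->x[i], y = q->y[i], z = q->z[i], w = q->w[i];
        q->x[i] = x * x - y * y - z * z - w * w + c.x;
        q->y[i] = 2.0f * x * y + c.y;
        q->z[i] = 2.0f * x * z + c.z;
        q->w[i] = 2.0f * x * w + c.w;
    }
}

/* Transpose LANES quaternions from AOS layout into SOA layout. */
QuatSoA aos_to_soa(const QuatAoS in[LANES])
{
    QuatSoA out;
    for (size_t i = 0; i < LANES; ++i) {
        out.x[i] = in[i].x;
        out.y[i] = in[i].y;
        out.z[i] = in[i].z;
        out.w[i] = in[i].w;
    }
    return out;
}

/* Largest absolute component difference between the two layouts. */
float max_abs_diff(const QuatSoA *s, const QuatAoS a[LANES])
{
    float worst = 0.0f;
    for (size_t i = 0; i < LANES; ++i) {
        float d[4] = { s->x[i] - a[i].x, s->y[i] - a[i].y,
                       s->z[i] - a[i].z, s->w[i] - a[i].w };
        for (size_t k = 0; k < 4; ++k) {
            float m = d[k] < 0.0f ? -d[k] : d[k];
            if (m > worst) worst = m;
        }
    }
    return worst;
}
```

A caller would transpose four sample points with aos_to_soa, iterate both versions with the same constant c, and check with max_abs_diff that the two layouts agree. The arithmetic is the same either way; only the data layout, and with it how full the SIMD lanes are, changes.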

The postings on this site solely reflect the personal views of the authors and do not necessarily represent the views, positions, strategies or opinions of IBM or IBM management.
