**Sunday 5th August 2007:** 11.54pm. Things are definitely better
this past week. I have given up smoking, having taken it up again over the
Easter break due to stress, and I am feeling much the better for it. I still can't
quite believe we're into August already - I have my resit in
Economics in just under a month, when I'll also be ending all contact
with the girls permanently now that M- has made her choice.

Late last night at 4am I achieved a major milestone for the work I intend to complete this summer, and the first fruits of all that money I spent on new computer hardware. You will all surely recognise this:

Yes, it's a
Mandelbrot set. However, it's no *ordinary* Mandelbrot set -
the one above is the output from a streaming maths computation program
and it was the testcase for the functionality I have been implementing
this summer.

Are you still thinking "so what?" Well, the above is a *vastly
shrunk* form of the original - it's about 144 times smaller.
Here is something closer to the original - note you'll need to scroll
around a lot with the scrollbars in your web browser to see it. Now get
this - that massive image is *still* shrunk from the original: the real
thing is actually **four times** larger again!

In case you can't quite get your head around it yet, the original is 7168 x 7168, which means there are 51,380,224 points or pixels in total. Each one of those requires up to one hundred iterations (i.e. repetitions of the same calculation) and the average is about fifty, so that gives us around 2,569,011,200 iterations.
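For anyone wondering what one of those "iterations" actually looks like, here is a minimal Python sketch of the standard escape-time loop for a single point. It is not the Brook kernel behind the image: the function name is made up for illustration, the cap of 100 comes from the figure above, and this textbook form may not use the same number of operations as my implementation.

```python
def escape_iterations(cr, ci, max_iter=100):
    """Count iterations of z -> z*z + c before |z| exceeds 2.

    (cr, ci) is the point c in the complex plane; max_iter=100 mirrors
    the "up to one hundred iterations" cap mentioned in the text.
    """
    zr = zi = 0.0
    for n in range(max_iter):
        zr2 = zr * zr
        zi2 = zi * zi
        if zr2 + zi2 > 4.0:          # |z|^2 > 4 means the point has escaped
            return n
        zi = 2.0 * zr * zi + ci      # imaginary part of z*z + c
        zr = zr2 - zi2 + cr          # real part of z*z + c
    return max_iter                  # never escaped: treat as inside the set


if __name__ == "__main__":
    # A point inside the set uses the full budget; one far outside escapes fast.
    print(escape_iterations(0.0, 0.0))
    print(escape_iterations(2.0, 2.0))
```

Points inside the set burn the full hundred iterations while points far outside escape almost immediately, which is why the average lands somewhere in between.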

Each iteration of the Mandelbrot formula requires a minimum of six multiplies and four additions (those are the only operations you need for the Mandelbrot set: additions and multiplies, nothing more complicated - it's amazing you can get such beauty from such simple mathematics). To get the colours, I added another six additions, so that gives us six multiplies and ten additions, or sixteen floating-point operations per iteration. Thus our picture above requires about 41,104,179,200 calculations!
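The arithmetic behind that total can be checked in a few lines of Python. Every number is taken from the text above; only the variable names are my own.

```python
WIDTH = HEIGHT = 7168            # image size in points
AVG_ITERATIONS = 50              # rough average per point
OPS_PER_ITERATION = 6 + 10       # six multiplies plus ten additions

points = WIDTH * HEIGHT
iterations = points * AVG_ITERATIONS
operations = iterations * OPS_PER_ITERATION

print(f"{points:,} points")           # 51,380,224 points
print(f"{iterations:,} iterations")   # 2,569,011,200 iterations
print(f"{operations:,} operations")   # 41,104,179,200 operations
```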

To perform 41 billion calculations takes a while, even on a modern PC. Each processor core of mine can do about 10 billion a second, so that's just over four seconds at best. Fractal calculations are an example of an embarrassingly parallel calculation, whereby each of those 51,380,224 points can be calculated totally independently of the others, and thus entirely in parallel. Here's where the streaming maths computation comes in! A modern graphics card is just such a parallel maths computation device: it will compute as much of the problem in parallel as possible, unlike normal CPUs which do everything serially (i.e. one thing at a time). The current top-end graphics hardware (an NVidia GeForce 8800 GTX, currently costing some £350) can process 350 billion ops a second and thus render the entire Mandelbrot set in less than a fifth of a second, but unfortunately I can't afford such high-end hardware. Instead, I have a bottom-end ATI Radeon x1300 Pro which can at best do about 9.5-14 billion ops/sec, so it's about as fast as my CPU. The next generation of cards should exceed a trillion ops per second and they are expected to double that every year from now on (normal CPUs only double about once every eighteen months).
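Those best-case timings are just the total operation count divided by each device's peak rate. A quick sketch, using only the throughput figures quoted above (the helper name is hypothetical, and real runs will be slower than these ideal numbers):

```python
TOTAL_OPS = 41_104_179_200       # operation count for the 7168 x 7168 image


def best_case_seconds(ops_per_second):
    """Ideal run time if the device sustained its peak rate throughout."""
    return TOTAL_OPS / ops_per_second


# Peak rates as quoted in the text, in operations per second.
devices = {
    "One CPU core": 10e9,
    "GeForce 8800 GTX": 350e9,
    "Radeon x1300 Pro (low estimate)": 9.5e9,
    "Radeon x1300 Pro (high estimate)": 14e9,
}

for name, rate in devices.items():
    print(f"{name}: {best_case_seconds(rate):.2f} secs")
```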

Such cheap & massive computational power is precisely why I am developing a
framework for utilising graphics cards for my proposed Economic model. I have taken
Brook, an aging research project from Stanford University's GPGPU
group, which had extremely
outdated OpenGL support, and upgraded that to the most modern available (i.e.
v2.0). Previously, the above Mandelbrot *wouldn't even compile*
under the ancient ARB OpenGL support within Brook, but with the new GLSL
backend support I have added it runs just fine.

Here are some figures for my ATI Radeon x1300 Pro graphics card for a 7168 x 7168 Mandelbrot (51,380,224 points):

Brook Computation Backend | Time Taken | Operations per second |
---|---|---|
DirectX 9 + PS30 (SM3.0) | 4.55 secs | 9 billion a second |
OpenGL + GLSL | 7.08 secs | 5.8 billion a second |

And for a 4096 x 4096 Mandelbrot (16,777,216 points):

Brook Computation Backend | Time Taken | Operations per second |
---|---|---|
DirectX 9 + PS30 (SM3.0) | 3.86 secs | 3.47 billion a second |
OpenGL + GLSL | 2.10 secs | 6.4 billion a second |

Yeah, I notice the ops per sec increasing as the problem size decreases with OpenGL too! That's the opposite of what it should be. Interestingly, the actual calculation itself is 8.9 billion ops/sec and that's pretty fixed - not too much below the DirectX SM3.0 implementation. The BIG problem is that the ATI drivers are being braindead when it comes to moving data from the graphics card back into the computer memory. It's a driver bug, pure and simple.
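The operations-per-second figures in the tables can be reproduced from the timings, assuming the same fifty iterations of sixteen operations per point as before. The second helper is only a rough estimate of mine: it assumes the kernel really sustains 8.9 billion ops/sec and that everything else in the measured time is overhead such as reading the results back. Both function names are made up for illustration.

```python
OPS_PER_POINT = 50 * 16          # average iterations x operations per iteration
KERNEL_RATE = 8.9e9              # quoted rate of the OpenGL calculation itself


def billions_per_second(side, seconds):
    """Overall throughput for a side x side image, in billions of ops/sec."""
    return side * side * OPS_PER_POINT / seconds / 1e9


def non_compute_seconds(side, seconds):
    """Rough time NOT spent calculating, assuming the kernel runs at KERNEL_RATE."""
    return seconds - side * side * OPS_PER_POINT / KERNEL_RATE


runs = [
    ("DirectX 9 + PS30", 7168, 4.55),
    ("OpenGL + GLSL", 7168, 7.08),
    ("DirectX 9 + PS30", 4096, 3.86),
    ("OpenGL + GLSL", 4096, 2.10),
]

for backend, side, seconds in runs:
    rate = billions_per_second(side, seconds)
    print(f"{backend:17} {side} x {side}: {rate:.2f} billion ops/sec")

for side, seconds in [(7168, 7.08), (4096, 2.10)]:
    extra = non_compute_seconds(side, seconds)
    print(f"OpenGL {side} x {side}: roughly {extra:.2f} secs outside the kernel")
```

On those assumptions, the OpenGL backend spends a noticeably larger share of its time outside the calculation on the bigger image, which fits the readback problem described above.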

The really good news about the new OpenGL + GLSL support is that Brook now has equivalent functionality on Linux and Apple Mac OS X as it does on Windows. Just for your interest, Brook is used to perform protein folding among other things, so with a bit of luck my efforts this summer will contribute to disease breakthroughs. I know a lot of people think I am crazy to "waste" my summers not having a paid job, but hey, I may just have cured your cancer in years to come! And it may well yet pan out that I save your job and your entire future family from starving to death during a massive Economic downturn!

Ok, time for breakfast! Be happy!
