Category Archives: Miscellaneous

Radeon HD 5800 Demos

AMD has posted executables and videos for two new demos for the Radeon HD 5800 series. Both demos require Windows 7 (I guess that means Vista support for DirectX11 isn’t quite here yet).

One of the demos shows order-independent transparency; from the description it sounds like an A-buffer-like approach, which is interesting. The other shows a high-quality depth of field effect.

NVIDIA OptiX ray-tracing API available – kind of

We’ve written about the NVIDIA OptiX ray-tracing API (which used to be called NVIRT) once or twice before. Well, today it is finally available – for free. While it’s very nice of NVIDIA to make this available, there are a few caveats.

We already knew OptiX would only work on NVIDIA hardware (duh), but the system requirements reveal another unwelcome fact: it does not even run on GeForce cards, only on Tesla and Quadro (which are significantly more expensive than GeForce despite being based on exactly the same chips). They say GeForce will be supported on their new Fermi architecture – I call shenanigans.

Award-Winning Architectural Renderings

I don’t know much about architectural renderings; I guess I always thought of them as utilitarian. This page of award-winners proved me very wrong – there is true artistry on display here. The bottom of the page also has a real-time category; of the five nominees in that category, three (including the winner – Shockwave required) are available to view online.

Looping Through Polygon Edges

We mostly avoid coding issues in our book, as our focus is on algorithms, not syntax and compiler vagaries. Still, there’s a coding trick I want to pass on, as it’s handy. Graphics programmers appear to be divided into two groups by this method: those who think it’s intuitively obvious and learnt it on their pappy’s knee, and those who have never seen it before and are glad to find out.

You want to loop through the edges of a polygon. The vertex data is stored in some array vertexData[count], an array of count elements of some sort of Vertex data structure. The headache is attaching the last and first vertices together to make the connecting edge. There are plenty of weak ways to walk through the edges and connect last and first:

  • Double the beginning vertex so it’s added to the end of the array; the final edge is then just another pair of adjacent points. This is perhaps the fastest to actually execute, but it’s generally a hideous solution, adding a copy of a vertex to the array.
  • Form the last edge explicitly, outside the loop. Poor for maintenance, as you then need to copy whatever other code is inside the loop to be called one more time.
  • Use an “if” statement to know if you’re at the end of the loop; if so, then connect the first and last vertices for the last edge. The “if” special case is needed for only one vertex, which is wasteful and we’d like to avoid “ifs”.
  • Use modulo arithmetic on the counter for one of the vertices, so that it loops back to the start.

Modulo isn’t terrible, but is overkill and costs processing speed, as the modulo operation is truly needed for only the very last iteration:

for ( int v = 0; v < count; v++ ) {
   // access vertexData[v] and vertexData[(v+1)%count] for the edge
}

Here’s the solution I prefer:

for ( int v1 = count-1, v2 = 0; v2 < count; v1 = v2++ ) {
   // access vertexData[v1] and vertexData[v2] for the edge
}

The simple trick is that v1 starts at the end of the polygon, so it deals with the tough “bridge” case immediately; v2 counts through the vertices, and v1 follows behind. You can similarly make a pointer-based version, updating the pV1 pointer by copying from pV2, as sketched below. If register space is at a premium, then modulo might be a better fit, but otherwise this loop strikes me as the cleanest solution.
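For the curious, here is roughly what that pointer-based variant could look like; this is just a sketch using the vertexData and count names from above, with Vertex standing in for whatever structure you actually use:

const Vertex *pV1 = &vertexData[count-1];
const Vertex *pV2 = &vertexData[0];
for ( int v = 0; v < count; v++ ) {
   // access *pV1 and *pV2 for the edge; pV1 starts at the last vertex,
   // so the closing "bridge" edge is handled first
   pV1 = pV2++;
}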

This copy approach can be extended to access any number of neighboring vertices per iteration. For example, if you want the two vertices vp and vn, previous and next to a given vertex, it’s simply:

int vp, v, vn;
for ( vp = count-2, v = count-1, vn = 0; vn < count; vp = v, v = vn++ ) {
   // access vertexData[vp], [v], [vn] for the middle vertex v.
}
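As one illustration of where the previous/next form comes in handy, here is a sketch that flags reflex (concave) corners of a counterclockwise 2D polygon; the x and y members of Vertex are my own assumption, not something defined above:

for ( int vp = count-2, v = count-1, vn = 0; vn < count; vp = v, v = vn++ ) {
   // edge vectors into and out of vertex v (assumes Vertex has x,y members)
   float e1x = vertexData[v].x - vertexData[vp].x;
   float e1y = vertexData[v].y - vertexData[vp].y;
   float e2x = vertexData[vn].x - vertexData[v].x;
   float e2y = vertexData[vn].y - vertexData[v].y;
   // for a counterclockwise polygon, a negative cross product means a reflex corner
   bool reflex = ( e1x * e2y - e1y * e2x ) < 0.0f;
   // ... use 'reflex' for vertex v ...
}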

I’ve seen this type of trick in code in Geometric Tools, and Barrett formally presents it in jgt. I mention it here because I think it’s a technique every computer graphics person should know.
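Finally, to show the idiom in a complete (if contrived) setting, here is a small self-contained program that computes the signed area of a 2D polygon with the trailing-index loop; the Vertex struct and the sample square are made up purely for illustration:

#include <cstdio>

struct Vertex { float x, y; };   // minimal stand-in for "some sort of Vertex data structure"

// signed area via the shoelace formula, accumulating one term per edge (v1,v2)
float signedArea( const Vertex vertexData[], int count )
{
   float area = 0.0f;
   for ( int v1 = count-1, v2 = 0; v2 < count; v1 = v2++ ) {
      area += vertexData[v1].x * vertexData[v2].y - vertexData[v2].x * vertexData[v1].y;
   }
   return 0.5f * area;
}

int main()
{
   const Vertex square[4] = { {0,0}, {1,0}, {1,1}, {0,1} };   // counterclockwise unit square
   printf( "area = %f\n", signedArea( square, 4 ) );          // prints 1.000000
   return 0;
}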

First DirectX11 GPU Ships

Today, AMD shipped the Radeon HD 5870, the first GPU to support the DirectX11 feature set. Most of the resources have been doubled in comparison to AMD’s previous top GPU, including the triangle rasterization units (there are now two). The Tech Report has a nice writeup. To help make sense of the various counts of ALUs / “wavefronts” / cores / etc., I recommend reading the slides from Kayvon Fatahalian’s excellent presentation at SIGGRAPH this year.

Fundamentals of Computer Graphics, 3rd Edition

One bit of déjà vu for me at SIGGRAPH this year was another book signing at the A K Peters booth. Last year’s SIGGRAPH had the signing for Real-Time Rendering; this year I was at the book signing for the third edition of Fundamentals of Computer Graphics. My presence at the signing was due to the fact that I wrote a chapter on graphics for games (this edition also has new chapters on implicit modeling, color, and visualization, as well as updates to the existing chapters). As in the case of Real-Time Rendering, I was interested in contributing to this book as a fan of the previous editions. Fundamentals is targeted as a “first graphics book” so it has a slightly different audience than Real-Time Rendering, which is meant to be the reader’s second book on the subject.

At the A K Peters booth I also got to try out the Kindle edition of Fundamentals (the illustrations in Real-Time Rendering rely on color to convey information, so a Kindle edition will have to wait for color devices).  I haven’t jumped on the Kindle bandwagon personally (the DRM bothers me; when I buy something I like to own it), but I know people who are quite pleased with their Kindle (or iPhone Kindle application).

HPG and SIGGRAPH: pix and links

Seven links to keep you busy while we digest HPG and SIGGRAPH:

  • Pictures of HPG and SIGGRAPH – even though just about everyone at these conferences carries a camera of some sort, we almost never take pictures. I decided to try to photograph anyone I recognized this year.
  • Tim Sweeney’s HPG keynote slides – I didn’t attend the keynote, unfortunately, but heard about it. Main takeaway for me is that programming these highly parallel machines is hard, and the more that IHVs can do to ease the burden and remove limitations the more successful they will be.
  • While waiting for our HPG reports, read Steve Worley’s.
  • The course notes for “Advances in Real-Time Rendering in 3D Graphics and Games” will be up in a few weeks, if not sooner. Crytek’s presentation is available at their website.
  • The “Beyond Programmable Shading” course notes are available now. I particularly liked Johan Andersson’s talk, partially for the sheer complexity of it all. The various factors that affect making a game engine fast are a bit mind-boggling.
  • The place to go for interactive ray tracing development information is the ompf.org forum.
  • This was the first year ever that I didn’t attend the Electronic Theater. Well, I did attend the first half-hour (live real-time demos), but then found myself looking at my watch as colorful but meaningless things occurred on the screen. I think the fact that we could attend the E.T. without needing a ticket meant that I could keep putting it off and also wouldn’t feel I lost anything if I missed it. If SIGGRAPH had issued me a ticket for a specific night, I suspect I would have willingly stayed for all of it, not wanting to lose the value of the ticket. Psychology. All that said, the best colorful but meaningless real-time demo I saw was “DT4 Identity SA”, freeware which runs on a Mac and is quite charming.

HPG 2009 – a closer look

When discussing things to do and see at SIGGRAPH, it is important to note the co-located conferences.  This year, SIGGRAPH is co-located with the Eurographics Symposium on Sketch-Based Interfaces and Modeling (SBIM), the Symposium on Computer Animation (SCA), Non-Photorealistic Animation and Rendering (NPAR), and High-Performance Graphics (HPG).  SCA has had good animation papers over the years, and is of interest to many game graphics programmers. NPAR is also a good conference for anyone interested in stylized rendering.  In this post I will focus on HPG, which is a new conference formed out of the merger of the venerable Graphics Hardware conference, and the newcomer Symposium on Interactive Ray Tracing.

HPG is a three-day conference; the first two days are just before SIGGRAPH, and the third overlaps the first day of SIGGRAPH (unfortunately conflicting with the excellent SIGGRAPH course, Advances in Real-Time Rendering in 3D Graphics and Games).

HPG has managed to line up two pretty amazing keynotes. The first one is by Larry Gritz on film production rendering. Larry is a legend in the field; he was with Pixar from the Toy Story days and co-wrote one of the most well-regarded books on RenderMan. He has since worked on several important renderers (BMRT, Gelato), and is now at Sony Pictures Imageworks. The second keynote is by Tim Sweeney, on the future of GPUs. As the outspoken chief architect of Epic’s Unreal Engine, Tim should need no introduction.

At the end of the conference, the two keynote speakers are joined by Steve Parker (NVIDIA) and Aaron Lefohn (Intel) for a panel on high-performance graphics in seven years’ time.

HPG also has posters and “Hot 3D” systems presentations (hardware manufacturers talking about their latest designs).  Inexplicably, although the acceptance deadline for both has long since passed, the content of neither of these is listed on the conference website yet.

I briefly discussed HPG papers in a previous post, but then only paper titles were available, making it hard to judge relevance; now many of the papers have preprints linked from Ke-Sen Huang’s HPG 2009 papers page.

Some of the papers look relevant to current or near-future work. There are two interesting antialiasing papers. Morphological Antialiasing was covered by Eric in a recent post. The other antialiasing paper (A Directionally Adaptive Edge Anti-Aliasing Filter) does not have a preprint, but the title is promising. It is notable that one of the authors of this paper (Jason Yang) is listed as a speaker at the SIGGRAPH Advances in Real-Time Rendering in 3D Graphics and Games course; perhaps he will discuss the paper there. Although the NVIDIA paper Image Space Gathering has no preprint (yet), some information on this technique was disclosed at GDC: it involves rendering reflections and shadows into 2D buffers and then performing cross bilateral filters to mimic glossy reflections and soft shadows. I have seen similar techniques used in games, so it will be interesting to hear NVIDIA’s take on this concept. Another promising paper title: Scaling of 3D Game Engine Workloads on Modern Multi-GPU Systems.

The paper Parallel View-Dependent Tessellation of Catmull-Clark Subdivision Surfaces deals with tessellation using GPGPU methods rather than the DX11 tessellation pipeline; I’m not an expert in this area so it’s hard for me to judge, but it might be of interest for people working in this field.

I’m a bit skeptical of depth peeling techniques in general, but recent work in this area has shown some promise.  The paper Bucket Depth Peeling lacks a preprint at this moment, but I look forward to learning more about it at the conference.

I found the title Data-Parallel Rasterization of Micropolygons With Defocus and Motion Blur promising because I am interested in the REYES micropolygon algorithm, particularly in the way it handles defocus and motion blur effects. The paper presents a GPU-efficient version of the REYES algorithm as well as an alternative algorithm; the alternative appears to be less efficient than the REYES method except in cases with very high velocity and/or defocus, where it is faster. One of the authors has a blog post that gives some interesting context for the paper.

The number of actual graphics hardware papers at the Graphics Hardware conference has been declining for years, which is probably one of the factors that precipitated the conference merger with IRT. This year there is only one paper that is clearly about hardware design: PFU: Programmable Filtering Unit for Mobile Multimedia Applications on Graphics Hardware. It has a fairly self-explanatory title, which is fortunate since it has no preprint available. Texture filtering is the last unassailable bastion of fixed-function hardware; for example, it is the only fixed-function unit in Larrabee. Programmable filtering is an intriguing concept; I look forward to the paper presentation. There is one more paper that may be about hardware (Embedded Function Composition), but the title is a bit opaque and it also has no preprint, so it is hard to be sure.

Despite my claim in the previous blog post, there do indeed appear to be quite a few papers about ray tracing this year: Efficient Ray Traced Soft Shadows using Multi-Frusta Traversal, Understanding the Efficiency of Ray Traversal on GPUs, Faster Incoherent Rays: Multi-BVH Ray Stream Tracing, Accelerating Monte Carlo Shadows Using Volumetric Occluders and Modified kd-Tree Traversal, Selective and Adaptive Supersampling for Real-Time Ray Tracing, Spatial Splits in Bounding Volume Hierarchies, Object Partitioning Considered Harmful: Space Subdivision for BVHs, and A Parallel Algorithm for Construction of Uniform Grids.  Another paper, Hardware-Accelerated Global Illumination by Image Space Photon Mapping, combines image-space, GPU-accelerated methods for the initial bounce and final gather with ray-tracing for a complete photon mapping solution.

There are only three “GPGPU” papers this year; two on GPU stream compaction (copying selected elements of an array into a smaller array): Efficient Stream Compaction on Wide SIMD Many-Core Architectures and Stream Compaction for Deferred Shading, and one on computing minimum spanning trees for graphs (Fast Minimum Spanning Tree for Large Graphs on the GPU).
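For readers who have not run into the term, here is a minimal scalar sketch of what stream compaction means; this just illustrates the concept, while the papers above are about doing it efficiently across wide SIMD many-core hardware:

// scalar stream compaction: copy the elements that pass a predicate into a
// smaller output array, preserving their order; returns the number kept
int compact( const int *in, int count, int *out )
{
   int n = 0;
   for ( int i = 0; i < count; i++ ) {
      if ( in[i] != 0 ) {   // example predicate: keep nonzero elements
         out[n++] = in[i];
      }
   }
   return n;
}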