Predicting the Past

Inspired by Bing (a person, not a search engine) and by the acrobatics I saw tonight in Shanghai, time for a blog post.

So what’s up with graphics APIs? I’ve been working on a project for a fast 3D graphics system for Autodesk for about 4 years now; the base level (which hides the various flavors of DirectX and OpenGL) is used by Maya, Max, AutoCAD, Inventor, and other products. There are various higher-level optimizations we’ve added (and why Microsoft’s fxc effect compiler suddenly got a lot slower is a mystery), with some particularly nice work by one person here in the area of multithreading. Beyond these techniques, minimizing the raw number of calls to the API is the primary way to increase performance. Our rule of thumb is that you get about 1000-1500 calls a frame (CAD isn’t held to a 60 FPS rule, but we still need to be interactive). The usual tricks are to sort by state, and to shove as much geometry and processing as possible into a single draw call and so avoid the small batch problem. So, how silly is that? The best way to make your GPU run fast is to call it as little as possible? That’s an API with a problem.
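To make the "sort by state" trick concrete, here is a minimal sketch in Python rather than anything from our actual codebase. The `DrawItem` fields, the shader and texture names, and the idea that a state change is just a change in the (shader, texture) pair are all assumptions made for illustration; a real system tracks far more state than this.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class DrawItem:
    """One object to render. All field names are hypothetical."""
    shader: str
    texture: str
    mesh: str


def state_key(item):
    """The render state we pretend is expensive to change."""
    return (item.shader, item.texture)


def count_state_changes(items):
    """Count how often the state changes between consecutive draws."""
    changes = 0
    previous = None
    for item in items:
        if state_key(item) != previous:
            changes += 1
            previous = state_key(item)
    return changes


def sort_by_state(items):
    """Order draws so that items sharing a state are adjacent (stable sort)."""
    return sorted(items, key=state_key)


def batch(items):
    """Group state-sorted draws into one batch per distinct state."""
    return [
        (state, [d.mesh for d in group])
        for state, group in groupby(sort_by_state(items), key=state_key)
    ]


scene = [
    DrawItem("phong", "brick", "wall_a"),
    DrawItem("glass", "none", "window_a"),
    DrawItem("phong", "brick", "wall_b"),
    DrawItem("glass", "none", "window_b"),
    DrawItem("phong", "wood", "door"),
]

print(count_state_changes(scene))                 # 5 in submission order
print(count_state_changes(sort_by_state(scene)))  # 3 after sorting
print(len(batch(scene)))                          # 3 batches
```

Each batch is a candidate for a single draw call (say, via instancing or a merged vertex buffer), which is where the real savings come from; the sort alone only removes redundant state changes.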

This is old news; Tim Sweeney railed against API limitations 3 years ago (sadly, the article’s gone poof). I wrote about his ideas here and added my own two cents. So where are we since then? DirectX 11 has been out awhile, adding three more stages to the pipeline for efficient tessellation of higher-order surfaces. The pipeline’s feeling a bit unwieldy at this point, with a lot of (admittedly optional) stages. There are still some serious headaches for developers, like having to somehow manage to put lighting and material shading in the same pixel shader (one good argument for deferred lighting and similar techniques). Forget about optimization; the arcane API knowledge needed to get even a simple rendering on the screen is considerable.

I haven’t heard anything of a DirectX 12 in the works (except maybe this breathless posting, which I feel obligated to link to since I’m in China this month), nor can I imagine what they’d add of any significance. I expect there will be some minor Xbox 720 (or whatever it will be called) related tweaks specific to that architecture, if and when it exists. With the various CPU+GPU-on-a-chip products coming out – AMD’s Fusion family, NVIDIA’s Tegra 2, and similar from other companies (I think I counted 5, all told) – some access costs between the two processors become much cheaper and so change the rules. However, the API still looks to be the bottleneck.

Marketwise, and this is based entirely upon my work in scapulimancy, I see things shifting to mobile. If that isn’t at least the 247th time you’ve heard that, you haven’t been wasting enough time on the internet. But, it has some implications: first, DirectX 12 becomes mostly irrelevant. The GPU pipeline is creaky and overburdened enough right now, PC games are an important niche but not the focus, and mobile (specifically, iPad and other tablets) is fine with the functionality defined thus far by existing APIs. OpenGL ES will continue to evolve, but I doubt we’ll see for a good long while any algorithmically (vs. data-slinging) new elements added to the API that the current OpenGL 4.x and DX11 APIs don’t offer.

Basically, API development feels stalled to me, and that’s how it should be: mobile’s more important, PCs are a (large but slowly evolving) niche, and the current API system feels warped from a programming standpoint, with peculiar constructs like feeding text strings to the API to specify GPU shader effects, and strange contortions performed to avoid calling the API in order to coax the GPU to run fast.

Is there a way out? I felt a glimmer while attending HPG 2011 this year. The paper “High-Performance Software Rasterization on GPUs” by Samuli Laine and Tero Karras was one of my (and many attendees’) favorites, talking about how to efficiently implement a basic rasterizer using CUDA (code’s open sourced). It’s not as fast as dedicated hardware (no surprise there), but it’s at least in the same ball-park, with hardware being anywhere from 1.5x to 8.1x faster for their test cases, median being 3.6x. What I find exciting is the idea that you could actually program the pipeline, vs. it being locked away. They discuss ideas for optimization such as loosening the “first in, first out” rule for triangles currently enforced by all APIs. With its “yet another language” dependency, I can’t say I hope GPGPU is the future (and certainly CUDA isn’t, since it cuts out non-NVIDIA hardware vendors, but from all reports it’s currently the best way to experiment with GPGPU). Still, it’s nice to see that the fixed-function bits of the GPU, while important, are not an insurmountable limit in considering more flexible and general interactive rasterization programming models. Or, ray tracing – always have to stick that in there.
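For a feel of what "a basic rasterizer" means at its core, here is a minimal sketch of the classic edge-function coverage test, written in plain Python for readability. To be clear, this is not Laine and Karras's CUDA pipeline (theirs is tiled, parallel, and far more careful); the function names are mine, and I have skipped the fill rule that real rasterizers use to avoid drawing pixels on shared edges twice.

```python
import math


def edge(ax, ay, bx, by, px, py):
    """Edge function: positive when P is to the left of the directed edge A->B."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize(tri, width, height):
    """Return the set of (x, y) pixels whose centers lie inside a 2D triangle.

    tri holds three (x, y) vertices in either winding order.
    Pixels exactly on an edge count as covered (no top-left fill rule here).
    """
    (x0, y0), (x1, y1), (x2, y2) = tri
    area = edge(x0, y0, x1, y1, x2, y2)
    if area == 0:
        return set()  # degenerate triangle covers nothing
    if area < 0:
        # Flip clockwise triangles so all three edge tests share one sign.
        (x1, y1), (x2, y2) = (x2, y2), (x1, y1)

    # Conservative bounding box, clipped to the render target.
    min_x = max(math.floor(min(x0, x1, x2)), 0)
    max_x = min(math.ceil(max(x0, x1, x2)), width)
    min_y = max(math.floor(min(y0, y1, y2)), 0)
    max_y = min(math.ceil(max(y0, y1, y2)), height)

    covered = set()
    for y in range(min_y, max_y):
        for x in range(min_x, max_x):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            inside = (
                edge(x1, y1, x2, y2, px, py) >= 0
                and edge(x2, y2, x0, y0, px, py) >= 0
                and edge(x0, y0, x1, y1, px, py) >= 0
            )
            if inside:
                covered.add((x, y))
    return covered


pixels = rasterize([(0, 0), (4, 0), (0, 4)], width=4, height=4)
print(len(pixels))  # 10
```

The appeal of owning this loop in software is that every decision in it (sample positions, ordering, what happens on coverage) becomes something you can change, which is exactly what a fixed pipeline hides.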

So it’s “forward to the past”, looking at traditional algorithms like rasterization and ray tracing and how to gain efficiency (both in raw speed and in development time) on various modern architectures. That’s ultimately what it’s about for me, at least: spending lots of time fighting the API, gluing together strings to make shaders, and all the other craziness is a distraction and a time-waster. That said, there’s a cost/benefit calculation implicit in all of this. For example, using C# or Java is way more productive than C++, I’d say about 2x, mostly because you’re not tracking down memory problems like leaks and accesses to uninitialized or non-existent values. But, there’s so much legacy C++ code around that it’s still the language of graphics, as previously discussed here. Which means I expect none of the API weirdness to change for a solid decade, at the minimum. Please do go ahead and prove me wrong – I’d be thrilled!

Oh, and acrobatics? Hover your cursor over the image. BTW, the ERA show in Shanghai is wonderful, unlike current APIs.

AMD CubeMapGen is now Open Source

UPDATE 9/1/2011: ignotion has put the source up on Google Code.

For a long time, I’ve found ATI’s (now AMD’s) CubeMapGen library to be an indispensable tool for creating prefiltered environment maps (important for physically based shading). Many older GPUs (all the ones in current consoles) do not filter across cube faces. CubeMapGen solves this problem and others – details can be found in a GDC presentation and a SIGGRAPH sketch, both from 2005.
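To see why filtering across cube faces is awkward, it helps to look at how a direction gets turned into a face and a 2D coordinate. The sketch below is my own illustration, not CubeMapGen code; the face layout follows the major-axis table in the OpenGL specification as I remember it, so treat the exact signs as an assumption.

```python
def cube_face_uv(x, y, z):
    """Map a direction to (face, s, t), with s and t in [0, 1].

    The face is picked by the component with the largest magnitude.
    The sign conventions are assumed to match the OpenGL cube map table.
    """
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax == 0 and ay == 0 and az == 0:
        raise ValueError("direction must be non-zero")
    if ax >= ay and ax >= az:
        face, ma = ("+x" if x > 0 else "-x"), ax
        sc, tc = (-z if x > 0 else z), -y
    elif ay >= az:
        face, ma = ("+y" if y > 0 else "-y"), ay
        sc, tc = x, (z if y > 0 else -z)
    else:
        face, ma = ("+z" if z > 0 else "-z"), az
        sc, tc = (x if z > 0 else -x), -y
    return face, 0.5 * (sc / ma + 1.0), 0.5 * (tc / ma + 1.0)


# Two nearly identical directions on either side of the +x / +z edge.
a = cube_face_uv(1.0, 0.0, 0.99)
b = cube_face_uv(0.99, 0.0, 1.0)
print(a[0], b[0])  # +x +z
```

The two directions are almost the same, yet they land on different faces, at opposite ends of the s range. Hardware that filters each face as an independent 2D texture has no neighboring texels to blend with at that edge, so low-resolution mip levels show seams unless the edge texels are fixed up in advance.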

Support for CubeMapGen has been spotty for the last few years, and a while ago AMD officially declared its end of life. Since then I’ve been wondering when AMD would open-source this important tool – there is a good precedent in NVIDIA texture tools, which has been open source for several years now.

Speaking of NVIDIA texture tools, a comment on its Google Code website just let me know that AMD has released source to CubeMapGen. A link to the source for version 1.4 can be found on the bottom of the CubeMapGen page. Note that this does not include the DXT compression part of the edge fixup (which was a pretty nifty feature – hopefully someone will reimplement it now that the library is open source).

Looking at the license doc in the zip file, the license appears to be a modified BSD license. This is excellent news – tools like this are far more useful when source is available. Perhaps someone should host the code on Google Code or github, to make it easier to add future improvements – or maybe it could be folded into the nvidia_texture_tools code base (if the license allows).

Advances in RTR Course Notes up

I’m finally back from a nice post-SIGGRAPH vacation in the Vancouver area. Both our computers broke early on in the trip, so it was a true vacation.

I hope to post on a bunch of stuff soon, but wanted to first mention something now available: the slides and videos presented in the popular SIGGRAPH course “Advances in Real-Time Rendering in 3D Graphics”. Find them here, and the page for previous years (well, currently just 2010) here. Hats off to Natalya Tatarchuk and all the speakers for quickly making this year’s presentations available.

A.K. Peters books at SIGGRAPH and beyond

OK, so I like the publisher A.K. Peters, for obvious reasons. They’re also kind/smart enough to send me review copies of upcoming graphics-related books. I’ve received two recently, with one of particular interest:

Practical Rendering and Computation with Direct3D 11, by Jason Zink, Matt Pettineo, and Jack Hoxley

This one’s very nicely produced (especially for the price): hardcover, color throughout, with paper a bit better than the GPU Gems volumes; basically, that level of quality. More important, it covers a topic that is not very well covered at all (from what I’ve seen), neither by Microsoft’s scattershot documentation nor other sources. Well, in fairness there’s Beginning DirectX 11 Game Programming, but that’s indeed for beginners. I don’t see anything about compute shaders, tessellation, or even stream output in the table of contents. These topics and many more are covered in the new book.

Skimming through it, it looks quite good, a book that I want to spend some serious time reading. You might recognize Zink and Hoxley’s names from the free book that never quite made it to publication, Programming Vertex, Geometry, and Pixel Shaders, coauthored by Wolfgang Engel (of ShaderX and GPU Pro fame), Ralf Kornmann, and Niko Suni.

The other book I received was:

Visual Perception from a Computer Graphics Perspective, by William Thompson, Roland Fleming, Sarah Creem-Regehr, and Jeanine Kelly Stefanucci

This book is a survey of visual perception research and how it relates to computer graphics. If you’re a researcher and expect to delve into the field of visual perception, this looks like the place to start. With 68 pages of references, it clearly attempts to give you relevant research in a huge variety of areas. To be honest, I’m not all that interested in reading a whole book on the topic. I picked one topic, motion blur, as a quick test of the book’s usefulness to me. There’s just a brief mention of motion blur on one page, and the computer graphics papers referenced are from the 1980’s (fine papers, but ancient). I tried another: Fresnel – no index entry, half a page, no references. Depth of field: a page and a half, a fair number of references (newest being 2005), none about interactive graphics. So, it’s an extensive survey of the visual perception literature, but don’t expect much depth nor any serious coverage of the area of interactive computer graphics.

Two other books I expect to see at SIGGRAPH are Real-Time Shadows and 3D Math Primer for Graphics and Game Development, 2nd Edition. I got a peek at the latter and it looks to be quite in-depth (and still approachable and informal) – I’m not sure how it differs from the first edition at this point. A micro-review on this blog of the first edition is here, at the end.

There are a lot of other upcoming computer graphics books from A.K. Peters that sound intriguing, e.g. Shadow Algorithms Data Miner – two great tastes now together. Check out the list here or ask at the booth at SIGGRAPH.

Update on SIGGRAPH 2011 Beyond Programmable Shading Course

I have recently been notified by Aaron Lefohn that there have been some changes to the Beyond Programmable Shading course since I last described it here.

The new schedule is below. I’m especially interested to see the presentation by Raja Koduri (former CTO of AMD’s graphics division and now a graphics architect at Apple) – according to Aaron, it’s “an introduction to reasoning about power for rendering researchers”. Power is a very important constraint which is little-understood by most algorithm researchers and software developers. We are not too far from regularly having to take account of power consumption in graphics algorithm design (since an algorithm which causes the GPU to burn too much power may force clock speed reduction, negatively affecting performance). The topic of the closing panel is also an interesting one – graphics APIs have undergone some interesting changes, and I suspect will undergo more profound ones in the near future.

Beyond Programmable Shading I

9:00 Introduction [Aaron Lefohn, Intel]

9:20 Research in Games [Peter-Pike Sloan, Disney Interactive]

9:45 The “Power” of 3D Rendering [Raja Koduri, Apple]

10:15 Real-Time Rendering Architectures [Mike Houston, AMD]

10:45 Scheduling the Graphics Pipeline [Jonathan Ragan-Kelley, MIT]

11:15 Parallel Programming for Real-Time Graphics [Aaron Lefohn, Intel]

11:45 Software rasterization on GPUs [Samuli Laine and Jacopo Pantaleoni, NVIDIA]

Beyond Programmable Shading II

14:00 Welcome and Re-Introduction [Mike Houston, AMD]

14:05 Toward a blurry rasterizer (state of the art) [Jacob Munkberg, Intel]

14:45 Order-independent transparency (state of the art) [Marco Salvi, Intel]

15:15 Interactive global illumination (state of the art) [Chris Wyman, Univ. of Iowa]

15:45 User-defined pipelines for ray tracing [Steve Parker, NVIDIA]

16:30 Panel: “What Is the Right Cross-Platform Abstraction Level for Real-Time 3D Rendering?”

Peter-Pike Sloan, Disney Interactive (Moderator)
David Blythe, Intel (Panelist)
Raja Koduri, Apple (Panelist)
Henry Moreton, NVIDIA (Panelist)
Mike Houston, AMD (Panelist)
Chas Boyd, Microsoft (Panelist)

… and free to veterans and unemployed professionals

Mauricio Vives pointed out that the Autodesk program I mentioned yesterday, where students and educators can get Autodesk products and training for free, also applies to veterans and “displaced professionals.” See this page for the logic. The fine print on the registration page is:

An Autodesk Assistance Program participant is either a veteran or unemployed individual who has (a) previously worked in the architecture, engineering, design or manufacturing industries, has completed the online registration for the Autodesk Assistance Program, and upon request by Autodesk is able to provide proof of eligibility for that program.

This is a nice thing.

All Autodesk software free to students and educators, and betas for everyone

I think I need to pop my head out of my gopher-hole more often and see what my company’s doing. It turns out Autodesk software – including Maya, 3DS Max, Mudbox, AutoCAD, and everything else – is now free to students and educators. Just register and you’re good to go. Wow, this is a big change from the old system, and is definitely great to see.

There are also a number of betas from Autodesk free to anyone: one is 123D, a modeler aimed at the Maker crowd and 3D printing. I’ve installed this but haven’t played with it yet.

Another project is Photofly 2.0, where you upload a number of images and it makes a 3D model from the data (i.e., photogrammetry). This is similar to My3dScanner. I tried both out, along with Photosynth, on a set of photos of a bunch of bananas, some taken with a flash and some without, a hard test case. I definitely didn’t follow the guidelines. My3dScanner threw up its hands, Photosynth’s point cloud was incomprehensible, Photofly gave it a sporting chance, getting a cloud and making a mesh – no magic bullet yet, but fun to try. I’m now even tempted to RTFM, as results were better than I thought.

Photosynth (examine set of photos here):

Photofly’s cubist rendering – it did output an interesting Wavefront OBJ model:

Some Info on the SIGGRAPH 2011 Papers

The SIGGRAPH 2011 papers were recently made available in the ACM Digital Library (see here). Although I recommend using Ke-Sen’s excellent papers links page when possible (it links to the freely accessible author copies and often to additional information provided by the authors), not all authors have chosen to make their papers available in this way. The Digital Library itself is pretty expensive (unless you’re a full-time student – see below), but if all you want is the SIGGRAPH stuff (including other conferences sponsored by SIGGRAPH), then a SIGGRAPH membership can get you access. SIGGRAPH memberships are only $42 ($30 for students, but students can get full Digital Library access with an ACM student membership for $42, which looks like a better deal).

In addition, for the first time ACM has published a document which includes the first page of each paper – kind of a paper version of the SIGGRAPH Papers Fast-Forward. This document is freely accessible here, and should be useful for people who just want to skim the papers program and see which papers to read in full. Be warned that the document is a bit large though (184MB). The video preview of the papers program might also be of interest (note that it only covers a few of the papers).

[Eric chiming in: here’s the link for how to access the Digital Library if you’re a SIGGRAPH member. Note that you can access all issues of ACM TOG as well as all SIGGRAPH-sponsored conference material and journals, not just SIGGRAPH, which is just about everything you’d want for graphics conferences: I3D, HPG, EGSR, NPAR, etc. Also, remember that the cool kids say SIG-GRAPH, not SEE-GRAPH.]

Seven Things for July 26th, 2011

The hard drive on my main computer died, which has the odd effect of making me have more time for blogging (and less for screwing around on random stuff). So, seven things:

First, if you’re going to HPG 2011, I’ll save you five minutes of searching for where it is: it’s at the Goldcorp Centre for the Arts, Google map here. Note also that things don’t start until 1:30 on Friday.
SIGGRAPH parties? I know nothing, except that the official SIGGRAPH reception is 9 to 11 PM Monday at the convention center, and the ACM SIGGRAPH Chapters Party is 8:30 PM to 2 AM on, oh, Monday again. Odd scheduling.
Timothy Lottes cannot be stopped: FXAA 3.11 is out (with improvements for thin lines), and 3.12 will soon appear. Note that the shader has a signature change, so your calling shader code will have to change, too.
At the Motorola developer site there’s a quick summary of various image compression types used for mobile phones and PCs.
Sebastien Hillaire implemented the God Rays effect from GPU Gems 3, showing results and problems. Code and executable available for download.
I’ve been enjoying some worthwhile articles on patents and copyrights lately, both new and old. Worth a mention: Myhrvold madness; a comic (a bit old but useful) on copyright – a good overview; The Public Domain, a free book by a law professor who helped establish Creative Commons; the July 2011 CACM (behind the paywall, though) had a nice article on why the U.S. dropped “opt-in” copyright back in 1989 (blame Europe). Best idea gleaned, from The Public Domain: the length of copyright is meant to motivate people to create works for payment, so a retroactive increase in the length of copyright (e.g., to protect Mickey Mouse) makes no sense – it creates no motivation for works already created.
Polygon Pictures’ office corridor would be a bad place to be if you worked way too many hours. Otherwise, nice!

Seven Things for July 24th, 2011

Eric has done these until now, but I now find myself with a few small things that fit well into such a post.

Older SIGGRAPH Courses often have great material in them, but are tough to track down. This website has a bunch of links to course notes from 1999 to 2007.
The SIGGRAPH Education Committee has a page with links to a few even older courses, going back to 1996. The “Pixel Cinematography” course from 1996 looks especially interesting.
Fabian “ryg” Giesen is doing a great series of posts (as yet unfinished) on his blog, which take the reader on A trip through the graphics pipeline. He recently started reposting a slightly cleaned up version of the series on AltDevBlogADay.
A variant on a previously published paper (video here), Deferred Screen-Space Directional Occlusion by Yuriy O’Donnell has increased performance and plugs relatively easily into deferred shading pipelines.
Emil “Humus” Persson has recently released a demo of his Geometry Buffer Anti-Aliasing technique, which he will also be presenting at an upcoming SIGGRAPH course.
I’ve long been interested in the problem of filtering normals in a way that correctly accounts for surface appearance; we also discuss this in Section 7.8.1 of Real-Time Rendering. Stephen Hill has kicked off his new blog with an excellent post summarizing various solutions to the problem, including his own solution as well as a WebGL demo. The comments to the post are also well worth reading; a lively discussion has developed, with Brian Karis of Human Head Studios describing the solution used on the upcoming game Prey 2.
One of the techniques discussed in the aforementioned post was LEAN mapping and its lighter-weight variant CLEAN mapping. Inspired by that post, Marc Olano (first author on the LEAN mapping paper) has posted some of his own thoughts on those techniques.
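For readers new to the normal-filtering problem mentioned in the last two items, here is a tiny sketch of why it exists and of one well-known fix. Averaging unit normals (which is what mipmapping a normal map does) yields a shorter vector, and Toksvig's trick uses that shortening to lower the specular power. This is my own illustration in Python with a formula written from memory; it is not the code from Stephen Hill's post, and it is not LEAN or CLEAN mapping, which track more statistics.

```python
import math


def average_normal(normals):
    """Component-wise mean of a list of 3D normals (what a box filter does)."""
    n = len(normals)
    return tuple(sum(v[i] for v in normals) / n for i in range(3))


def length(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(c * c for c in v))


def toksvig_power(avg_len, spec_power):
    """Scale a Blinn-Phong specular power by the Toksvig factor.

    avg_len is the length of the averaged (unnormalized) normal, in (0, 1].
    A length of 1 means the surface is flat under the filter footprint.
    """
    ft = avg_len / (avg_len + spec_power * (1.0 - avg_len))
    return ft * spec_power


flat = [(0.0, 0.0, 1.0)] * 4
bumpy = [(0.6, 0.0, 0.8), (-0.6, 0.0, 0.8), (0.0, 0.6, 0.8), (0.0, -0.6, 0.8)]

for name, texels in (("flat", flat), ("bumpy", bumpy)):
    avg_len = length(average_normal(texels))
    print(name, round(avg_len, 3), round(toksvig_power(avg_len, 64.0), 2))
```

The flat patch keeps its power of 64, while the bumpy patch drops to a much broader highlight. Simply renormalizing the averaged normal would instead make the bumpy patch look as smooth and shiny as the flat one, which is the appearance change these techniques are trying to avoid.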