2011 Color and Imaging Conference, Part I: Introduction

A few weeks ago, I attended the 2011 Color and Imaging Conference (CIC). CIC is a small conference (a little under 200 attendees) that nevertheless commands an important role in the fields of color science and digital imaging, similar to SIGGRAPH’s importance to computer graphics. CIC is co-sponsored by the Society for Imaging Science and Technology (IS&T) and the Society for Information Display (SID); it has been held annually in various US locations since 1993.

I attended this conference for the first time last year. In both years I attended, most of the conference attendees were academic color science researchers (the field appears to be dominated by a handful of institutions, most notably the color labs at the Rochester Institute of Technology and the University of Leeds), with the remainder primarily representing the R&D divisions of various camera, printer, display, and mobile phone manufacturers. There are typically also a few color experts from film companies such as Technicolor, ILM, Pixar, and Disney. I didn’t see any other game developers – I hope this will change in future years, as our industry starts paying more attention to this critical area.

Despite its modest attendance numbers, CIC boasted an impressive array of sessions, including courses, papers, short papers, and several keynotes. The content was of very high quality. The conference organizers are currently in the process of posting video of most of the conference content for free streaming and download in a variety of formats – a step which organizers of other conferences (such as SIGGRAPH) would do well to emulate.

I’ll be putting up several other posts with details of the conference content. They will be coming in rapid succession since I’m editing them down from an existing document (a report I did for work).

Do you spell these two words correctly?

We all have dumb little blind spots. As a kid, I thought “Achilles” was pronounced “a-chi-elz” and, heaven knows how, “etiquette” was somehow “eh-teak”. When you say goofy things to other people, someone eventually corrects you. However, if most of the people around you are making the same mistake (I’m sorry, “nuclear” is not pronounced “new-cue-lar”, it just ain’t so), the error never gets corrected. I’ve already mentioned the faux pas of pronouncing SIGGRAPH as “see-graph”, which seems to be popular among non-researchers (well, admittedly there’s no “correct” pronunciation on that one; it’s just that back when the conference was small and mostly researchers, “sih-graph” was the way to say it. If the majority now say “see-graph”, so be it – you then identify yourself as a general attendee or a salesperson and I can feel superior to you for no valid reason, thanks).

Certain spelling errors persist in computer graphics, perhaps because it’s more work to give feedback on writing mistakes. We also see others make the same mistakes and assume they’re correct. So, here are the two I believe are the most popular goofs in computer graphics (and I can attest that I used to make them myself, once upon a time):

Tesselation – that’s incorrect, it’s “tessellation”. By all rules of English, this word truly should have just one “l”: relation, violation, adulation, ululation, emulation, and on and on, they have just one “l”. The only exceptions I could find with two “l”s were “collation”, “illation” (what the heck is that?), and a word starting with “fe” (I don’t want this post to get filtered).

The word “tessellation” is derived from “tessella” (plural “tessellae”), which is a small piece of stone or glass used in a mosaic. It’s the diminutive of “tessera”, which can also mean a small tablet or block used as a ticket or token (but “tessella” is never a small ticket). Whatever. In Ionic Greek “tesseres” means “four”, so “tessella” makes sense as being a small four-sided thing. For me, knowing that “tessella” is from the ancient Greek word for a piece in a mosaic somehow helps me to catch my spelling of it – maybe it will work for you. I know that in typing “tessella” in this post I still first put a single “l” numerous times, that’s what English tells me to do.

Google test: searching on “tessellation” on Google gives 2,580,000 pages. Searching on “tesselation -tessellation”, which returns only pages with the misspelled version, gives 1,800,000 pages. It’s nice to see that the correct spelling still outnumbers the incorrect, but the race is on. That said, this sort of test is accurate to within, say, plus or minus 350%. If you search on “tessellation -tesselation”, which should give a smaller number of pages (subtracting out those that I assume say “‘tesselation’ is a misspelling of ‘tessellation'” or that reference a paper with “tesselation” in the title), you get 8,450,000! How you can get more than 3 times as many pages as just searching on “tessellation” is a mystery. Finally, searching on “tessellation tesselation”, both words on the same page, gives 3,150,000 results. Makes me want to go count those pages by hand. No it doesn’t.

One other place to search is the ACM Digital Library. There are 2,973 entries with “tessellation” in them, 375 with “tesselation”. To search just computer graphics publications, GRAPHBIB is a bit clunky but will do: 89 hits for “tessellation”, 18 hits for the wrong one. Not terrible, but that’s still about one misspelling for every five correct uses.

Frustrum – that’s incorrect, it’s “frustum” (plural “frusta”, which even looks wrong to me – I want to say “frustra”). The word means a (finite) cone or pyramid with the tip chopped off, and we use it (always) to mean the pyramidal volume in graphics. I don’t know why the extra “r” got into this word for some people (myself included). Maybe it’s because the word then sort-of rhymes with itself, the “ru” from the first part mirrored in the second. But “frustra” looks even more correct to me, no idea why. Maybe it’s that it rolls off the tongue better.

Morgan McGuire pointed this one out to me as the most common misspelling he sees. As a professor, he no doubt spends more time teaching about frusta than tessellations. Using the wildly inaccurate Google test, there are 673,000 frustum pages and 363,000 “frustrum -frustum” pages. And, confusingly, again, 2,100,000 “frustum -frustrum” pages, more than three times as many pages as just “frustum”. Please explain, someone. For the digital library, 1,114 vs. 53. For GRAPHBIB I was happy to see 42 hits vs. just 1 hit (“General Clipping on an Oblique Viewing Frustrum”).

So the frustum misspell looks to be less common to begin with, and is almost gone by the time practitioners are publishing articles; the tessellation misspell appears to have more staying power.

Addenda: Aaron Hertzmann notes that the US and Britain double their letters differently (“calliper”? That’s just unnatural, Brits). He also notes the Oxford English Dictionary says about tessellate: “(US also tesselate)”. Which actually is fine with me, except for the fact that Microsoft Word, Google’s spellchecker, and even this blog’s software flag “tesselate” as a misspelling. If only we had the equivalent of the Académie française to decide how we all should spell (on second thought, no).

Spike Hughes notes: “I think the answer for ‘frustrum’ is that it starts out like ‘frustrate’ (and indeed, seems logically related: the pyramid WANTS to go all the way to the eye point, but is frustrated by the near-plane).” This makes a lot of sense to me, and would explain why “frustra” feels even more correct. Maybe that’s the mnemonic aid, like how with “it’s” vs. “its” there’s “It’s a wise dog that knows its own fleas”. You don’t have to remember the spelling of each “its”, just remember that they differ; then knowing “it’s” is “it is” means you can derive that the possessive “its” doesn’t have an apostrophe. Or something. So maybe, “Don’t get frustrated when drawing a frustum”, remembering that they differ. Andrew Glassner offers: “There’s no rum in a frustum,” because the poor thing has the top chopped off, so all the rum we poured inside has evaporated.

Seven Things for 10/13/2011

  • Fairly new book: Practical Rendering and Computation with Direct3D 11, by Jason Zink, Matt Pettineo, and Jack Hoxley, A.K. Peters/CRC Press, July 2011 (more info). It’s meant for people who already know DirectX 10 and want to learn just the new stuff. I found the first half pretty abstract; the second half was more useful, as it gives in-depth explanation of practical examples that show how the new functionality can be used.
  • Two nice little Moore’s Law-related articles appeared recently in The Economist. This one is about how the law looks to have legs for a good number of years yet, and presents a graph showing how various breakthroughs have kept the law going over the past decades. Moore himself thought the law might hold for ten years. This one talks about how computational energy efficiency is doubling every 18 months, which is great news for mobile devices.
  • I used to use MWSnap for screen captures, but it doesn’t work well with two monitors and it hangs at times. I finally found a replacement that does all the things I want, with a mostly-good UI: FastStone Capture. The downside is that it actually costs money ($19.95), but I’m happy to have purchased it.
  • Ray tracing vs. rasterization, part XIV: Gavan Woolery thinks RT is the future, DEADC0DE argues both will always have a place, and gives a deeper analysis of the strengths and weaknesses of each (though the PITA that transparency causes rasterization is not called out) – I mostly agree with his stance. Both posts have lots of followup comments.
  • This shows exactly how far behind we are in blogging about SIGGRAPH: find the Beyond Programmable Shading course notes here – that’s just a mere two months overdue.
  • Tantalizing SIGGRAPH Talk demo: KinectFusion from Microsoft Research and many others. Watch from around 3:11 on for the great reconstruction, and the last minute for fun stuff. Newer demo here.
  • OnLive – you should check it out, it’ll take ten minutes. Sign up for a free account and visit the Arena, if nothing else: it’s like being in a sci-fi movie, with a bunch of games being played by others before your eyes that you can scroll through and click on to watch the player. I admit to being skeptical of the whole cloud-gaming idea originally, but in trying it out, it’s surprisingly fast and the video quality is not bad. Not good enough to satisfy hardcore FPS players – I’ve seen my teenage boys pick out targets that cover like two pixels, which would be invisible with OnLive – but otherwise quite usable. The “no download, no GPU upgrade, just play immediately” aspect is brilliant and lends itself extremely well to game trials.

OnLive Arena

Seven Things for 10/10/11

  • If you can get WebGL running properly on your browser, check out Shader Toy. Coolest thing is that you can edit any shader and immediately try it out.
  • Another odd little WebGL application is a random spaceship maker, with a direct tie-in to Shapeways to buy a 3D version of any model you make.
  • Speaking of Shapeways, I liked their “one coffee cup a day project”. The low-resolution cup is particularly good for computer graphics people, though I’m told that in real life it’s a fair bit more rounded off, due to the way the ceramic sets. Ironic. Also, note that these cups are actually quite small in real life (smaller than even espresso cups), which is too bad. Still, clever.
  • Source code for iOS versions of Castle Wolfenstein and the original DOOM is now available.
  • Patrick Cozzi has a nice rundown of his days at SIGGRAPH this August, with a particular emphasis on OpenGL and mobile. The links for each day are at the bottom of the entry.
  • Nice fractal video generated in near-real time (300 ms/frame) running a GLSL shader using this code. Reddit thread here, about an earlier video that was pulled and is now back online.
  • This site gives a darn long list of educational institutions offering videogame design degrees. It’s at least a place to start, if you’re looking for such things. That said, I’ve heard counterarguments from game company professionals to such specialized degrees, “just learn to program well and we’ll teach you the videogames business”.

Bonus thing: Draw a curve of your data for a number of years and see what it most closely correlates with. Peculiar.

Predicting the Past

Inspired by Bing (a person, not a search engine) and by the acrobatics I saw tonight in Shanghai, time for a blog post.

So what’s up with graphics APIs? I’ve been working on a project for a fast 3D graphics system for Autodesk for about 4 years now; the base level (which hides the various flavors of DirectX and OpenGL) is used by Maya, Max, AutoCAD, Inventor, and other products. There are various higher-level optimizations we’ve added (and why Microsoft’s fxc effect compiler suddenly got a lot slower is a mystery), with some particularly nice work by one person here in the area of multithreading. Beyond these techniques, minimizing the raw number of calls to the API is the primary way to increase performance. Our rule of thumb is that you get about 1000-1500 calls a frame (CAD isn’t held to a 60 FPS rule, but we still need to be interactive). The usual tricks are to sort by state, and to shove as much geometry and processing as possible into a single draw call and so avoid the small batch problem. So, how silly is that? The best way to make your GPU run fast is to call it as little as possible? That’s an API with a problem.
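To make that batching point concrete, here is a minimal CPU-side sketch of the usual approach: sort draw records by a state key, bind each state once, and merge contiguous geometry ranges into a single draw call. All the names here (DrawItem, bindState, drawIndexed, submitSorted) are made up for illustration; they are not from our system or from any particular API.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical per-object draw record: a sort key packing shader, material,
// and texture state, plus the geometry range to draw from a shared buffer.
struct DrawItem {
    uint64_t stateKey;
    int      firstIndex;
    int      indexCount;
};

// Stand-ins for the real API calls; here they only count how often the
// "API" gets touched.
static int g_apiCalls = 0;
void bindState(uint64_t) { ++g_apiCalls; }
void drawIndexed(int, int) { ++g_apiCalls; }

// Sort by state so each state is bound at most once per run, and merge
// items that share state and are contiguous in the index buffer into a
// single draw call.
void submitSorted(std::vector<DrawItem>& items)
{
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) {
                  return a.stateKey < b.stateKey;
              });

    uint64_t lastKey = ~0ull;  // sentinel: nothing bound yet
    for (std::size_t i = 0; i < items.size(); ) {
        if (items[i].stateKey != lastKey) {
            bindState(items[i].stateKey);
            lastKey = items[i].stateKey;
        }
        int first = items[i].firstIndex;
        int count = items[i].indexCount;
        std::size_t j = i + 1;
        while (j < items.size() && items[j].stateKey == lastKey &&
               items[j].firstIndex == first + count) {
            count += items[j].indexCount;  // extend the batch
            ++j;
        }
        drawIndexed(first, count);
        i = j;
    }
}
```

Everything above the two stubbed “API” functions is pure bookkeeping, done for no reason other than to touch the API less often, which is rather the point.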

This is old news, Tim Sweeney railed against API limitations 3 years ago (sadly, the article’s gone poof). I wrote about his ideas here and added my own two cents. So where are we since then? DirectX 11 has been out awhile, adding three more stages to the pipeline for efficient tessellation of higher-order surfaces. The pipeline’s feeling a bit unwieldy at this point, with a lot of (admittedly optional) stages. There are still some serious headaches for developers, like having to somehow manage to put lighting and material shading in the same pixel shader (one good argument for deferred lighting and similar techniques). Forget about optimization; the arcane API knowledge needed to get even a simple rendering on the screen is considerable.
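A brief aside on that lighting-and-material coupling: the appeal of deferred approaches is that the geometry pass writes surface attributes out to buffers, and a separate pass then applies the lights, so material code and lighting code no longer have to be welded into one ever-growing pixel shader. Below is a CPU-side caricature of that second pass; the types and the Lambertian-only lighting are invented purely for illustration and stand in for what would really be render targets and a full-screen shader.

```cpp
#include <algorithm>
#include <vector>

struct Vec3 { float x, y, z; };

static Vec3  operator+(const Vec3& a, const Vec3& b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
static Vec3  operator*(const Vec3& a, const Vec3& b) { return { a.x * b.x, a.y * b.y, a.z * b.z }; }
static Vec3  operator*(const Vec3& v, float s)       { return { v.x * s, v.y * s, v.z * s }; }
static float dot(const Vec3& a, const Vec3& b)       { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Per-pixel surface attributes the geometry pass writes out (the "G-buffer").
// Material shaders fill this in and do nothing else.
struct GBufferPixel {
    Vec3 normal;  // unit surface normal
    Vec3 albedo;  // diffuse color
};

// A simple directional light, for brevity.
struct Light {
    Vec3 direction;  // unit vector toward the light
    Vec3 color;
};

// Lighting pass: one loop over the screen applies every light to every
// pixel using only the stored attributes. Lighting code never needs to know
// which material wrote a pixel, and materials never see the lights.
void lightingPass(const std::vector<GBufferPixel>& gbuffer,
                  const std::vector<Light>& lights,
                  std::vector<Vec3>& frame)
{
    for (std::size_t i = 0; i < gbuffer.size(); ++i) {
        Vec3 result = { 0.0f, 0.0f, 0.0f };
        for (const Light& l : lights) {
            float ndotl = std::max(dot(gbuffer[i].normal, l.direction), 0.0f);
            result = result + gbuffer[i].albedo * l.color * ndotl;  // Lambertian term
        }
        frame[i] = result;
    }
}
```

Adding a new material only changes what gets written into the buffer; adding or changing a light model only touches this loop.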

I haven’t heard anything of a DirectX 12 in the works (except maybe this breathless posting, which I feel obligated to link to since I’m in China this month), nor can I imagine what they’d add of any significance. I expect there will be some minor tweaks related to the Xbox 720 (or whatever it will be called), specific to that architecture, if and when it exists. With the various CPU+GPU-on-a-chip products coming out – AMD’s Fusion family, NVIDIA’s Tegra 2, and similar from other companies (I think I counted 5, all told) – some access costs between the two processors become much cheaper and so change the rules. However, the API still looks to be the bottleneck.

Marketwise, and this is based entirely upon my work in scapulimancy, I see things shifting to mobile. If that isn’t at least the 247th time you’ve heard that, you haven’t been wasting enough time on the internet. But, it has some implications: first, DirectX 12 becomes mostly irrelevant. The GPU pipeline is creaky and overburdened enough right now, PC games are an important niche but not the focus, and mobile (specifically, iPad and other tablets) is fine with the functionality defined thus far by existing APIs. OpenGL ES will continue to evolve, but I doubt we’ll see for a good long while any algorithmically (vs. data-slinging) new elements added to the API that the current OpenGL 4.x and DX11 APIs don’t offer.

Basically, API development feels stalled to me, and that’s how it should be: mobile’s more important, PCs are a (large but slowly evolving) niche, and the current API system feels warped from a programming standpoint, with peculiar constructs like feeding text strings to the API to specify GPU shader effects, and strange contortions performed to avoid calling the API in order to coax the GPU to run fast.
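In case the “text strings” complaint sounds abstract, this is roughly the dance every OpenGL program does to get a pixel shader onto the GPU; Direct3D’s D3DCompile path is much the same. The snippet assumes a GL 2.0+ context is already created and current (and, on most platforms, that an extension loader has been set up), and the shader itself is a trivial stand-in.

```cpp
#include <GL/gl.h>   // core GL types; most platforms also need a loader
                     // (GLEW, GLAD, ...) to get glCreateShader and friends
#include <cstdio>

// The "effect" is literally a string of source code handed to the driver,
// which compiles it at run time.
static const char* kFragmentSource =
    "#version 120\n"
    "void main() {\n"
    "    gl_FragColor = vec4(1.0, 0.5, 0.0, 1.0);\n"  // flat orange
    "}\n";

GLuint compileFragmentShader()
{
    GLuint shader = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(shader, 1, &kFragmentSource, nullptr);  // feed the text in
    glCompileShader(shader);                               // driver compiles it

    GLint ok = GL_FALSE;
    glGetShaderiv(shader, GL_COMPILE_STATUS, &ok);
    if (!ok) {
        char log[1024];
        glGetShaderInfoLog(shader, sizeof(log), nullptr, log);
        std::fprintf(stderr, "shader compile failed:\n%s\n", log);
    }
    return shader;
}
```

Source code shipped as a run-time string, compiled by the driver, with errors handed back as another string: it works, but it is an odd interface to build an industry on.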

Is there a way out? I felt a glimmer while attending HPG 2011 this year. The paper “High-Performance Software Rasterization on GPUs” by Samuli Laine and Tero Karras was one of my (and many attendees’) favorites, talking about how to efficiently implement a basic rasterizer using CUDA (code’s open sourced). It’s not as fast as dedicated hardware (no surprise there), but it’s at least in the same ball-park, with hardware being anywhere from 1.5x to 8.1x faster for their test cases, median being 3.6x. What I find exciting is the idea that you could actually program the pipeline, vs. it being locked away. They discuss ideas for optimization such as loosening the “first in, first out” rule for triangles currently enforced by all APIs. With its “yet another language” dependency, I can’t say I hope GPGPU is the future (and certainly CUDA isn’t, since it cuts out non-NVIDIA hardware vendors, but from all reports it’s currently the best way to experiment with GPGPU). Still, it’s nice to see that the fixed-function bits of the GPU, while important, are not an insurmountable limit in considering more flexible and general interactive rasterization programming models. Or, ray tracing – always have to stick that in there.
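To give a flavor of what “programming the pipeline” could mean, here is a toy scalar version of a rasterizer’s innermost idea, the edge-function coverage test; work like Laine and Karras’s parallelizes essentially this over pixels, tiles, and triangles on the GPU. To be clear, this is my own sketch, not their code, and it skips clipping, depth, attribute interpolation, and fill-rule details.

```cpp
#include <algorithm>
#include <vector>

struct Vec2 { float x, y; };

// Signed area term for edge a->b and point p: positive when p lies to the
// left of the edge. This "edge function" is what rasterizers are built on.
static float edge(const Vec2& a, const Vec2& b, const Vec2& p)
{
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Fill 'frame' (width*height, row major) with 'color' wherever the pixel
// center lies inside the counterclockwise triangle v0, v1, v2.
void rasterizeTriangle(const Vec2& v0, const Vec2& v1, const Vec2& v2,
                       std::vector<unsigned>& frame, int width, int height,
                       unsigned color)
{
    // Bounding box of the triangle, clamped to the framebuffer.
    int minX = std::max(0, (int)std::min({ v0.x, v1.x, v2.x }));
    int maxX = std::min(width - 1, (int)std::max({ v0.x, v1.x, v2.x }));
    int minY = std::max(0, (int)std::min({ v0.y, v1.y, v2.y }));
    int maxY = std::min(height - 1, (int)std::max({ v0.y, v1.y, v2.y }));

    for (int y = minY; y <= maxY; ++y) {
        for (int x = minX; x <= maxX; ++x) {
            Vec2 p = { x + 0.5f, y + 0.5f };  // test the pixel center
            // Inside if the center is on the inner side of all three edges.
            if (edge(v0, v1, p) >= 0.0f &&
                edge(v1, v2, p) >= 0.0f &&
                edge(v2, v0, p) >= 0.0f)
                frame[y * width + x] = color;
        }
    }
}
```

Once this loop is software, every rule it bakes in (sample placement, traversal order, when triangles are allowed to complete) becomes something you can change, which is exactly the appeal.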

So it’s “forward to the past”, looking at traditional algorithms like rasterization and ray tracing and how to gain efficiency (both in raw speed and in development time) on various modern architectures. That’s ultimately what it’s about for me, at least: spending lots of time fighting the API, gluing together strings to make shaders, and all the other craziness is a distraction and a time-waster. That said, there’s a cost/benefit calculation implicit in all of this. For example, using C# or Java is way more productive than C++, I’d say about 2x, mostly because you’re not tracking down memory problems like leaks and accesses of uninitialized or non-existent values. But, there’s so much legacy C++ code around that it’s still the language of graphics, as previously discussed here. Which means I expect none of the API weirdness to change for a solid decade, at the minimum. Please do go ahead and prove me wrong – I’d be thrilled!

Oh, and acrobatics? Hover your cursor over the image. BTW, the ERA show in Shanghai is wonderful, unlike current APIs.

AMD CubeMapGen is now Open Source

UPDATE 9/1/2011: ignotion has put the source up on Google Code.

For a long time, I’ve found ATI’s (now AMD’s) CubeMapGen library to be an indispensable tool for creating prefiltered environment maps (important for physically based shading). Many older GPUs (all the ones in current consoles) do not filter across cube faces. CubeMapGen solves this problem and others – details can be found in a GDC presentation and a SIGGRAPH sketch, both from 2005.
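For anyone who has not run into the term, “prefiltered” here means that each texel of the output cube map stores the environment already convolved with a specular lobe around that texel’s direction, so a single texture fetch at run time stands in for an entire glossy reflection integral. A brute-force sketch of that convolution, using a cosine-power (Phong-style) lobe and treating the environment as a flat list of direction/radiance samples, and ignoring the solid-angle weighting and cube-face seam handling that make CubeMapGen actually worth using, might look like this:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

static Vec3  operator+(const Vec3& a, const Vec3& b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
static Vec3  operator*(const Vec3& v, float s)       { return { v.x * s, v.y * s, v.z * s }; }
static float dot(const Vec3& a, const Vec3& b)       { return a.x * b.x + a.y * b.y + a.z * b.z; }

// One environment sample: a unit direction and the radiance arriving from it.
struct EnvSample {
    Vec3 dir;
    Vec3 radiance;
};

// Prefiltered value for a single output direction R: a cosine-power
// (Phong lobe) weighted average of every environment sample. A real tool
// evaluates this per texel, per face, with proper solid-angle weights.
Vec3 prefilter(const Vec3& R, const std::vector<EnvSample>& env, float specPower)
{
    Vec3  sum = { 0.0f, 0.0f, 0.0f };
    float totalWeight = 0.0f;
    for (const EnvSample& s : env) {
        float w = std::pow(std::max(dot(R, s.dir), 0.0f), specPower);
        sum = sum + s.radiance * w;
        totalWeight += w;
    }
    return totalWeight > 0.0f ? sum * (1.0f / totalWeight) : sum;
}
```

A real tool evaluates this for every texel of every face, typically once per roughness level, which is also why you want it done offline.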

Support for CubeMapGen has been spotty for the last few years, and a while ago AMD officially declared its end of life. Since then I’ve been wondering when AMD would open-source this important tool – there is a good precedent in NVIDIA texture tools, which has been open source for several years now.

Speaking of NVIDIA texture tools, a comment on its Google Code website just let me know that AMD has released source to CubeMapGen. A link to the source for version 1.4 can be found on the bottom of the CubeMapGen page. Note that this does not include the DXT compression part of the edge fixup (which was a pretty nifty feature – hopefully someone will reimplement it now that the library is open source).

Looking at the license doc in the zip file, the license appears to be a modified BSD license. This is excellent news – tools like this are far more useful when source is available. Perhaps someone should host the code on Google Code or github, to make it easier to add future improvements – or maybe it could be folded into the nvidia_texture_tools code base (if the license allows).

Advances in RTR Course Notes up

I’m finally back from a nice post-SIGGRAPH vacation in the Vancouver area. Both our computers broke early on in the trip, so it was a true vacation.

I hope to post on a bunch of stuff soon, but wanted to first mention something now available: the slides and videos presented in the popular SIGGRAPH course “Advances in Real-Time Rendering in 3D Graphics”. Find them here, and the page for previous years (well, currently just 2010) here. Hats off to Natalya Tatarchuk and all the speakers for quickly making this year’s presentations available.

A.K. Peters books at SIGGRAPH and beyond

OK, so I like the publisher A.K. Peters, for obvious reasons. They’re also kind/smart enough to send me review copies of upcoming graphics-related books. I’ve received two recently, with one of particular interest:

Practical Rendering and Computation with Direct3D 11, by Jason Zink, Matt Pettineo, and Jack Hoxley

This one’s very nicely produced (especially for the price): hardcover, color throughout, with paper a bit better than the GPU Gems volumes; basically, that level of quality. More important, it covers a topic that is not very well covered at all (from what I’ve seen), neither by Microsoft’s scattershot documentation nor by other sources. Well, in fairness there’s Beginning DirectX 11 Game Programming, but that’s indeed for beginners. I don’t see anything about compute shaders, tessellation, or even stream output in its table of contents. These topics and many more are covered in the new book.

Skimming through it, it looks quite good, a book that I want to spend some serious time reading. You might recognize Zink and Hoxley’s names from the free book that never quite made it to publication, Programming Vertex, Geometry, and Pixel Shaders, coauthored by Wolfgang Engel (of ShaderX and GPU Pro fame), Ralf Kornmann, and Niko Suni.

The other book I received was:

Visual Perception from a Computer Graphics Perspective, by William Thompson, Roland Fleming, Sarah Creem-Regehr, and Jeanine Kelly Stefanucci

This book is a survey of visual perception research and how it relates to computer graphics. If you’re a researcher and expect to delve into the field of visual perception, this looks like the place to start. With 68 pages of references, it clearly attempts to give you relevant research in a huge variety of areas. To be honest, I’m not all that interested in reading a whole book on the topic. I picked one topic, motion blur, as a quick test of the book’s usefulness to me. There’s just a brief mention of motion blur on one page, and the computer graphics papers referenced are from the 1980’s (fine papers, but ancient). I tried another: Fresnel – no index entry, half a page, no references. Depth of field: a page and a half, a fair number of references (newest being 2005), none about interactive graphics. So, it’s an extensive survey of the visual perception literature, but don’t expect much depth nor any serious coverage of the area of interactive computer graphics.

Two other books I expect to see at SIGGRAPH are Real-Time Shadows and 3D Math Primer for Graphics and Game Development, 2nd Edition. I got a peek at the latter and it looks to be quite in-depth (and still approachable and informal) – I’m not sure how it differs from the first edition at this point. A micro-review on this blog of the first edition is here, at the end.

There are a lot of other upcoming computer graphics books from A.K. Peters that sound intriguing, e.g. Shadow Algorithms Data Miner – two great tastes now together. Check out the list here or ask at the booth at SIGGRAPH.

Update on SIGGRAPH 2011 Beyond Programmable Shading Course

I have recently been notified by Aaron Lefohn that there have been some changes to the Beyond Programmable Shading course since I last described it here.

The new schedule is below. I’m especially interested to see the presentation by Raja Koduri (former CTO of AMD’s graphics division and now a graphics architect at Apple) – according to Aaron, it’s “an introduction to reasoning about power for rendering researchers”. Power is a very important constraint that is little understood by most algorithm researchers and software developers. We are not too far from regularly having to take power consumption into account in graphics algorithm design (since an algorithm that causes the GPU to burn too much power may force a clock speed reduction, negatively affecting performance). The topic of the closing panel is also a timely one – graphics APIs have undergone some interesting changes, and I suspect will undergo more profound ones in the near future.

Beyond Programmable Shading I

9:00 Introduction [Aaron Lefohn, Intel]

9:20 Research in Games [Peter-Pike Sloan, Disney Interactive]

9:45 The “Power” of 3D Rendering [Raja Koduri, Apple]

10:15 Real-Time Rendering Architectures [Mike Houston, AMD]

10:45 Scheduling the Graphics Pipeline [Jonathan Ragan-Kelley, MIT]

11:15 Parallel Programming for Real-Time Graphics [Aaron Lefohn, Intel]

11:45 Software rasterization on GPUs [Samuli Laine and Jacopo Pantaleoni, NVIDIA]

Beyond Programmable Shading II

14:00 Welcome and Re-Introduction [Mike Houston, AMD]

14:05 Toward a blurry rasterizer (state of the art) [Jacob Munkberg, Intel]

14:45 Order-independent transparency (state of the art) [Marco Salvi, Intel]

15:15 Interactive global illumination (state of the art) [Chris Wyman, Univ. of Iowa]

15:45 User-defined pipelines for ray tracing [Steve Parker, NVIDIA]

16:30 Panel: “What Is the Right Cross-Platform Abstraction Level for Real-Time 3D Rendering?”

  • Peter-Pike Sloan, Disney Interactive (Moderator)
  • David Blythe, Intel (Panelist)
  • Raja Koduri, Apple (Panelist)
  • Henry Moreton, NVIDIA (Panelist)
  • Mike Houston, AMD (Panelist)
  • Chas Boyd, Microsoft (Panelist)