Category Archives: Miscellaneous

Computer-modeled stop motion in “Coraline”

This is a bit old, but still cool: the film Coraline (based on a great book by one of my favorite authors) was stop-motion, not CG.  However, the facial animation was extremely smooth, due to a unique process.  The expressions were modeled by computer, exactly as they would have been for a CG animated feature.  Then for each frame, the faces were “printed” out on a “3D printer” (rapid prototyping machine).  These tweened “frame faces” were swapped in for each stop-motion frame.  CGSociety has an article describing the production pipeline for this film, an intriguing combination of CG and stop-motion animation.

Left-Handed vs. Right-Handed Viewing

In my previous post I talked about how I think about left-handed vs. right-handed world coordinate systems. The basic idea is simply that there is an underlying reality, and the coordinate system choice is yours.

I compared notes with Jeff Weeks, a topologist friend (who wrote the cool book The Shape of Space, which is not full of math symbols but is just the opposite – approachable and fun, and what you should read after Flatland and Sphereland). Happily, he agrees. He also introduced me to another word for handedness: chirality, coined by Lord Kelvin. Jeff notes an interesting distinction:

You can ask whether the object’s own intrinsic coordinate system agrees with the ambient world space coordinate system (same chirality) or disagrees with the ambient world space coordinate system (different chirality), but you can’t meaningfully call either one of them “left-handed” or “right-handed”.

In other words, the only time you can meaningfully introduce the words “left-handed” and “right-handed” into the discussion is when a human being enters the picture.  Once an object (or a space) gets drawn into the human’s physical space, then “left-handed” means “same chirality as the human’s left thumb-index finger-middle finger” and “right-handed” means “same chirality as the human’s right thumb-index finger-middle finger”.

So in particular, data sitting in a file on disk has no intrinsic chirality. What it has is the author’s intention that it be drawn into the human’s physical space with one chirality or the other (so that, for example, the steering wheel in a car model appears on the left-hand side of the vehicle, as perceived by the human viewing it).

OK, so for viewing you must also know the handedness of the data. We also know it’s fine to have local coordinates that are RH or LH inside a world that is LH or RH, e.g., you can create the left half of a frog model by mirroring the right half with a mirror matrix, but the world itself is one or the other. So far so good.

Where things get tricky in computer graphics is when we talk about something like a right-handed coordinate world and a left-handed viewing system. Right-handed coordinates are pretty common: all Autodesk applications I know use them. It’s a natural extension of a 2D Cartesian plane to make the Z axis point upwards – an extruded 2D floor-plan would be thought of as extending upwards from the ground, for example. Also, given right-handed world data, a view matrix that keeps you in a right-handed space has a positive determinant; one that flips you into a left-handed space has a negative determinant.

However, a left-handed viewing system is used by default by DirectX: the viewer’s X axis goes to the right, the Y axis is up, and Z goes into the screen. This also feels natural, as you label the screen’s X and Y axes from the lower left corner of the display and Z increases with depth into the screen. OpenGL’s viewing system is right-handed by default, the difference being that +Z goes towards the viewer. This negation leads to a fair bit of confusion in documentation about what near and far mean, but it’s consistent with right-handed world coordinates.

So what if you want to use a left-handed viewing system with right-handed data, or vice versa? All it means is that there must be a conversion from one to the other; otherwise, expect mirrored images. As my first post notes, the world coordinate system chosen is arbitrary: your RH system or the mole men’s LH system are both fine, you just have to decide which one to use. However, once you choose, that’s it – the view itself ultimately has to be working in the same system, one way or another. It is honestly meaningless to say “I want to use LH viewing with RH world coordinates” if you want to do so “without conversion”.

Some transform has to be done to go from RH to LH, but which one? Any mirroring transform, through any arbitrary plane, will convert from one chirality to the other. Mirroring along the world’s Z axis is one common solution, i.e., set Z’ = -Z for vertices, normals, light positions, etc. If the scene’s data is converted once to LH space, case closed: your LH camera will work fine.
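Here’s a minimal sketch of that one-time conversion (plain Python over hypothetical lists of xyz tuples, just to make the idea concrete). For an axis-aligned reflection the inverse-transpose is the matrix itself, so normals get their Z negated exactly the same way positions do:

    def mirror_z(points):
        # Negate Z on a list of (x, y, z) tuples: positions, normals, light positions, etc.
        return [(x, y, -z) for (x, y, z) in points]

    # Hypothetical right-handed scene data, converted once to left-handed space:
    positions = [(1.0, 2.0, 3.0), (0.0, 1.0, -4.0)]
    normals = [(0.0, 0.0, 1.0), (0.7071, 0.0, 0.7071)]

    lh_positions = mirror_z(positions)   # (1, 2, -3), (0, 1, 4)
    lh_normals = mirror_z(normals)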

However, say you don’t want to touch the data. Microsoft gives a way to convert from RH to LH, which boils down to again mirroring along the world’s Z axis, this time as the first transform on the view matrix, i.e., done every frame. No time is lost, since the mirroring is simply part of a different view matrix. The funny bit is that you have to deal with the world in RH and the view in LH, as far as defining where the camera goes and how it is oriented. A common way to define a view is by giving a camera position, a target it is looking at, and an up direction. From these you can form a view basis, a matrix to transform from world to view. By Microsoft’s method, you have to make sure to use LH positions and vectors (with Z negated) for forming your camera, vs. the RH coordinates you use for your data. Confusing.
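To make that concrete, here’s a rough numpy sketch of one way to read that setup; the helpers and the column-vector convention are mine, not D3DX’s actual row-vector API. The camera is described with Z-negated values, and the Z mirror rides along as the first transform in the view matrix, so the RH vertex data itself is never touched:

    import numpy as np

    def z_reflection():
        # Reflection through the world's XY plane (Z' = -Z); its determinant is -1.
        m = np.identity(4)
        m[2, 2] = -1.0
        return m

    def negate_z(v):
        # Negate the Z component of a 3-vector.
        return np.asarray(v, dtype=float) * [1.0, 1.0, -1.0]

    def look_at_lh(eye, target, up):
        # A conventional left-handed look-at matrix (column-vector convention):
        # the view-space Z axis points from the eye toward the target.
        eye = np.asarray(eye, dtype=float)
        z = np.asarray(target, dtype=float) - eye
        z /= np.linalg.norm(z)
        x = np.cross(np.asarray(up, dtype=float), z)
        x /= np.linalg.norm(x)
        y = np.cross(z, x)
        m = np.identity(4)
        m[0, :3], m[1, :3], m[2, :3] = x, y, z
        m[:3, 3] = -(m[:3, :3] @ eye)
        return m

    # Camera described using the right-handed world's coordinates...
    eye_rh, target_rh, up_rh = [3.0, 2.0, 5.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0]

    # ...but the LH look-at is fed Z-negated camera values, while the Z mirror is
    # folded into the view matrix so the RH vertex data itself is never touched:
    view = look_at_lh(negate_z(eye_rh), negate_z(target_rh), negate_z(up_rh)) @ z_reflection()

Since the reflection is baked into the view matrix, it costs nothing extra per vertex beyond the usual transform, which is the “no time is lost” point above.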

There’s another somewhat sensible place to perform mirroring: go from RH to LH at the end of the transforms, after projection to clip space. You’re given a right-handed camera position, target, and up vector and you use these to make a left-handed view matrix – there’s nothing stopping you. None of these coordinates need to be transformed, they can be defined to be the same in RH and LH spaces. However, what will happen is that the image produced will be mirrored along the vertical axis. That is, left and right sides will be switched, since you didn’t compensate for the difference in chirality. Lots of models are symmetric, in fact, so this sort of mistake is not often immediately noticeable. By adding a simple mirror matrix with X’ = -X you can swap the left and right of the screen. This comes down to negating the upper-left value in the projection matrix.
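And here’s the back-end alternative spelled out, in the same hedged numpy style (perspective() is a hypothetical helper, not any particular API’s): build an ordinary perspective projection for a view space with Z going into the screen, then negate the upper-left entry to swap the screen’s left and right:

    import numpy as np

    def perspective(fovy_deg, aspect, near, far, z_zero_to_one=False):
        # Conventional perspective projection for a view space where Z increases
        # into the screen. With z_zero_to_one=True, depth maps to [0,1]
        # (DirectX-style NDC); otherwise to [-1,1] (OpenGL-style NDC).
        f = 1.0 / np.tan(np.radians(fovy_deg) / 2.0)
        m = np.zeros((4, 4))
        m[0, 0] = f / aspect
        m[1, 1] = f
        if z_zero_to_one:
            m[2, 2] = far / (far - near)
            m[2, 3] = -far * near / (far - near)
        else:
            m[2, 2] = (far + near) / (far - near)
            m[2, 3] = -2.0 * far * near / (far - near)
        m[3, 2] = 1.0  # clip-space w is the view-space depth
        return m

    proj = perspective(60.0, 16.0 / 9.0, 0.1, 100.0)
    proj[0, 0] = -proj[0, 0]  # X' = -X in clip space: swaps screen left and right

The z_zero_to_one flag is only there to show the [0,1] vs. [-1,1] depth ranges mentioned further down; it has nothing to do with the left/right fix.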

By using this mirror matrix at the end, you’ve made your left-handed coordinate system into a right-handed one. Each time you move the camera, you’re essentially defining a new and (usually) different mirroring plane in world space, one that passes through your eye and target and contains the up vector. This mirror matrix will then not affect the eye, target, and up directions, since they lie in this plane. Maybe that’s fine for your system. However, this solution can also mess up basic operations and queries. Say you want to pan the camera to the left. So you query your camera for which direction it thinks is left. It hands you one, you move the camera in that direction, but the view shifts the other way. This is because you have a mismatch in chirality: your camera’s basis is in LH, but you correct it at the back end to be RH. The view matrix returned a left vector that was also in LH, and needs to be converted to RH. Also confusing.

The whole point of the camera is to transform some chunk of visible space into clip coordinates, which then convert (by dividing by W) to NDC coordinates (visible space being -1 to +1 in X and Y, -1 to +1 in Z for OpenGL, 0 to +1 in Z for DirectX). Neither OpenGL nor DirectX in any way requires LH or RH coordinates, you can feed in whatever view transforms you like. Just make world and view chirality match and be done with it.

That’s about it – just make them match, and do so by setting the view (lookat) matrix to match the world’s chirality. At least, that’s my take, this week (which sounds flip, but I’ve been trying to get a good handle on this area for a long time; I like to think I now have one, as of today). If there are other ways to think about this area, and especially if I’ve made any mental errors, please do comment.

By the way, if you want an enjoyable book about handedness and symmetry, get Martin Gardner’s The New Ambidextrous Universe. It talks about some surprising stuff, such as experiments where parity in the universe is not conserved (so if you’re ever mirrored a few times through the fourth dimension on your way to a distant planet, you’ll know how to test which way you came out).

Oh, and also check out Jeff Weeks’ site – some fun and beautiful OpenGL math applications there for free, Mac and Windows, with source. I should note that he uses LH throughout for his OpenGL applications, the rebel. There’s no “correct” coordinate system, so choose your own.

Fast and Furious

Given my last links post referenced “The Fast and the Furious”, I might as well call this post by the 4th movie in that series. Which is bizarrely titled by simply removing two “the”s from the original title. So the 5th movie will just be “Fast Furious”? I can imagine this subtract-a-the for other movies: “Fellowship of Ring”, “Silence of Lambs”, “Singin’ in Rain”, “Back to Future”. Anyway, the goal of this post is to whip through the rest of my links backlog.

I’m still catching up with reading the post-GDC flurry of resources and blog posts and whatnot – you’re on your own. Well, mostly. One or two things: watch the last half of the Unreal 3 new features demo – some nice-looking stuff. Also, the GDC tutorials are available for download; the first set of 7 are what you want. Lots of DirectX 10 and 11 material, from my quick skim. The third talk, by the DICE guys, looked to have some interesting things to say about cascaded shadow maps. Here’s another older presentation, about the Frostbite rendering engine, parallelism, software occlusion culling, ray tracing, and other nice tidbits. What’s also interesting about this one is that it uses a slide hosting site, SlideShare, to hold the presentation. Speaking of slidesets, there are also these from the parallel graphics computing course at SIGGRAPH Asia 2008. 

But say you find you’re required to attend a conference between GDC and SIGGRAPH (if so, I want your job). In that case, EGSR 2009 is coming up at the end of June, in Girona, Spain. This is the conference for rendering research in general. There’s still a week before abstracts are due, so get cracking.

In my last links post I asked for open source that loaded and exported a variety of model types and allowed mesh manipulation. Two people answered back: MeshLab. The blog about this package is also worth skimming through.

Also in the previous links article I mentioned the server-side graphics computation model presented by OnLive.  I should also mention AMD’s Fusion Render Cloud project, in the same space. Hmmm, maybe this really could work, with compression, and if you don’t mind some lag.

On Gamasutra there’s a “sponsored” article, but a good one, on Intel’s Threading Building Blocks. I can attest that this component truly does help you take advantage of multiple cores. Knowing at least a bit about TBB is worth your while. Also on Gamasutra is part two of the data alignment article.

There’s a nice rundown of Killzone 2’s graphical features on Brian Karis’ blog.

The sIBL site has some HDR environment maps and manipulation software for download.

Paul Merrell has made plugins available for Max and Blender for his city synthesis procedural modeling research at UNC Chapel Hill.

This diagram of Windows’ graphics makes me think, “it’s just that easy.”

NVScale is an OpenGL-based SDK that lets you use up to four GPUs to store and render extremely large models. It’s nice to see NVIDIA supporting this (non-gaming) area of rendering.

Well-produced tutorial on volume rendering, along with demo code, by Kyle Hayward: part 1, part 2.

There are lots of articles about XNA graphics programming being posted at Ziggyware.

Nothing to do with computer graphics, but this seems like the best computer science class ever.

When nerds and lace-making meet: fractal doilies.


Left-Handed vs. Right-Handed World Coordinates

Two years ago I read a blog entry by Pete Shirley about left-handed vs. right-handed coordinates. I started to have a go at explaining these as simply as I could, but kept putting it off, to avoid saying something stupid or confusing. Having just dealt with this issue yet again at work, it’s time to write down my mental model.

One problem in thinking about this area is that there are two places where handedness matters: world coordinates (where stuff is in space) and view coordinates (what we use for the view and perspective matrices). So this first post will be just about world coordinates. Basic, but let’s get it down to begin.

The way I think about RH vs. LH for world space is that there’s an objective reality out there. You are trying to define where stuff is in that reality. You stand on a plane and decide that looking East is the X+ axis and looking North is the Y+ axis (typical Cartesian coordinates). For the Z+ axis you decide that altitudes are positive numbers. That’s an RH coordinate system, and that’s why it’s the one used by most modeling packages, AFAIK (please do let me know if there are any LH modelers). We all likely know how the right hand is used to explain the counterclockwise twist the three axes form, the right-hand rule. I was also happy to see on this same page how to label your thumb and first two fingers to show the coordinate system.

You meet with Marvin the Moleman. He likes Y+ North, X+ East, just like you, but Z+ for him is downwards; his numbers increase as he digs his holes. He’s LH. So he hands you a model of his mole-lair, fully modeled in 3D. Fine, the transform to the RH space you like is a Z axis reflection, i.e., negate the Z coordinate and, as needed, the normal values. He also gives you a 2D textured rectangle showing the floor plan, a 2D object. Viewing his dataset from above, your and his XYZ coordinates (and UV coordinates) happen to exactly match; the Z flip does nothing to these coordinates, since Z is 0. You flip only the rectangle’s normal direction.
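A tiny illustration in plain Python (the specific numbers are made up, and I’m assuming the floor’s normal should point toward the sky, i.e., along Marvin’s -Z): negating Z leaves points in the Z = 0 plane untouched, while the stored normal still flips sign:

    def negate_z(v):
        x, y, z = v
        return (x, y, -z)

    # Marvin's 2D floor plan lies in his Z = 0 plane; negating Z changes nothing here:
    floor_lh = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 5.0, 0.0), (0.0, 5.0, 0.0)]
    floor_rh = [negate_z(p) for p in floor_lh]   # identical coordinates

    # ...but the stored normal does change sign: sky-facing is (0, 0, -1) in his
    # Z-down system and becomes (0, 0, +1) in your Z-up system.
    normal_rh = negate_z((0.0, 0.0, -1.0))       # (0.0, 0.0, 1.0)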

There are an infinite number of ways to transform between LH and RH, not just negating Z. A plane with any orientation and location can be used to mirror the vertices of the model; some planes are just more useful and convenient than others. A quick rule is that negating one axis or swapping two axes transforms from one coordinate system to the other.
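The quick rule is easy to sanity-check with determinants; here’s a small numpy sketch. A single axis negation or a swap of two axes has determinant -1, so it flips chirality, while combining two such operations (determinant +1) puts you back in a system of the same handedness:

    import numpy as np

    flip_z = np.diag([1.0, 1.0, -1.0])          # negate one axis
    swap_xy = np.array([[0.0, 1.0, 0.0],
                        [1.0, 0.0, 0.0],
                        [0.0, 0.0, 1.0]])       # swap two axes

    print(np.linalg.det(flip_z))                # -1: flips handedness
    print(np.linalg.det(swap_xy))               # -1: flips handedness
    print(np.linalg.det(swap_xy @ flip_z))      # +1: two flips cancel, handedness preserved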

One interesting property of such conversions is that, even though the normals get flipped along some plane (or perhaps I should say, because the normals get flipped), clockwise order of the vertices is not affected by conversion between these two model coordinate systems. Which is counterintuitive, on the face of it: if, for example, you do a planar mirror transform of a model so that you can render it again as an object reflected in a mirror (p. 386 on in RTR3), the mirroring matrix most definitely does change the ordering of the vertices. A clock seen in a mirror is reversed in the direction its hands travel.

The point is simply this: LH and RH are indeed just two ways of describing the same underlying world space. Conversion between the two does not change the world. A clock in the real world will always have its hands move in a clockwise direction, regardless of how you describe that world. I find this “conversion that does not modify clockwise” operation to be like the Zen koan, “It is not the wind that moves. It is not the flag that moves. It is your mind that moves.”

One last bit I found interesting: latitude/longitude. Typically we describe a location on the earth as lat/long/altitude, with North positive for latitude, East positive for longitude. So lat/long/altitude is left-handed, assigning them in this XYZ order. But, I’ve also seen such coordinates listed in longitude/latitude order, e.g., TerraServer USA uses this order. Which is right-handed, since the X and Y coordinates are swapped. In this case all the values are the same, but it’s simply the ordering that changes the handedness.

My next posting on this subject will be about LH vs. RH for viewing vs. world coordinates, which is where the real confusion comes in.

Connections: Larrabee, Michael Abrash, Intel, Dr. Dobb’s and me

There has been a spate of Larrabee information during the last two weeks.  Two GDC talks (slides near the bottom of this page), a prototype library, and an article by Michael Abrash on the Dr. Dobb’s website.

Dr. Dobb’s Journal has been out of print since February, but for many years it was one of the leading software publications.  When initially published in 1976 (as Dr. Dobb’s Journal of Computer Calisthenics & Orthodontia) it was the first journal focusing on software development for microcomputers.  Michael Abrash wrote many articles for Dr. Dobb’s over the years, including a series on the Quake software renderer in the mid 90’s.  This series made a great impression on me; when it was published I was considering a career change from microprocessor design to graphics programming.  At the time, I was working on Intel’s P55C processor, publicly known as “Pentium with MMX Technology”.  This chip was notable both for being the first X86 processor with a SIMD (single instruction multiple data) instruction set, and for being the last CPU to use the in-order Pentium micro-architecture.

When Michael Abrash wrote the Quake articles, game rendering was 100% software, mostly written in assembly language.  Abrash was the uber-game programmer, having worked on DOOM, written the Quake renderer, and published (in addition to his Dr. Dobb’s articles) many influential books about graphics programming, assembly, and optimization (the last of which is available online).

Within a few years (around the time I finally made the jump from CPU design to game graphics programming), it seemed to many that graphics hardware and compiler improvements had made software rendering and hand-coded assembly obsolete.  This was mirrored by my own experience; I was hired to my first game industry job on the strength of a software rasterization demo (written mostly in assembler) and by the time the game shipped, it required graphics hardware and contained very little assembly (none written by me).  Abrash started applying his considerable skills to what he saw as the next unsolved hard problem: natural language processing.

But he couldn’t stay away from graphics for long; when Microsoft started working on the Xbox console he got involved in its design.  In the early 2000s, he figured out that there was a market for software renderers after all, mostly due to the mess of caps bits, unorthogonal feature support, and flaky compliance that characterized low-end graphics hardware at the time (Intel was among the greatest offenders; compounding the problem, its graphics chips sold very well, so there were a lot of them out there).  With Mike Sartain (another Xbox designer), he wrote Pixomatic, a software renderer published by RAD Game Tools (until then mostly known for the Miles sound library, perhaps the most widely-used middleware in the games industry).  Of course, he published another series of articles in Dr. Dobb’s about the experience, where he discussed how he made use of SIMD instruction sets such as MMX and SSE when optimizing Pixomatic.

I found this particularly interesting due to my personal involvement with these instruction sets.  After working on the first MMX hardware implementation I helped define its successor, which was twice as wide (128 bits instead of 64) and added support for floating-point SIMD.  This instruction set was at first called MMX2, then VX, and finally split into two separate instruction sets: SSE and SSE2.  By this time SIMD instruction set extensions were becoming quite popular; AMD had their own version called 3DNow!, and PowerPC had the AltiVec instruction set.  Intel kept on adding new SIMD extensions: SSE3, SSSE3, SSE4.1, SSE4.2, and AVX.

As Abrash details in the Larrabee article, Larrabee got started when he decided to talk to Intel about some ideas for SIMD instructions to accelerate software rasterization.  As a result, Larrabee includes a powerful set of SIMD instructions.  Much wider than previous instruction sets (512 bits instead of 128, or 256 in the case of AVX), Larrabee’s instruction set contains several instructions tailored to software rasterization.  It is also general enough to allow for automatic code vectorization of a wide variety of loops.  Abrash had a key role in the design of the instruction set, bringing software rasterization back into the mainstream.

Besides a good instruction set, Larrabee also needed an efficient hardware design with a large number of cores.  Each of these cores needed to be very efficient in terms of performance-per-Watt and per-transistor.  Since the Larrabee team started out as a skunkworks, they couldn’t afford to design a brand-new core, so they looked at previous Intel cores, and the old in-order Pentium core (almost the same one I used in the P55C) was the one chosen.

What I find fascinating about this story is that Abrash managed to follow rasterization all the way around the Wheel of Reincarnation.  This term refers to the common process where a piece of computing functionality is first implemented in software, then moves to special-purpose hardware which gradually becomes more general until it rivals a CPU in complexity, at which point the functionality is folded back into software.  It was coined in a 1968 article by T. H. Myer and Ivan Sutherland (the latter is widely considered the father of computer graphics).

ACMR and ATVR

ACMR is Average Cache Miss Ratio, which is used to measure how effective a vertex reordering scheme is. That is, the GPU keeps a certain number of transformed vertices in its post-transform vertex cache. This cache will be more or less effective, depending on the order you feed triangles into the GPU. ACMR is the total number of times vertices must be transformed and loaded into the cache, divided by the number of triangles in the mesh. Under perfect conditions it is 0.5: in a (non-bizarre) mesh each vertex is shared by at most about 6 triangles on average and each triangle has 3 vertices, giving 3/6 = 0.5.

Ignacio Castaño has an excellent point: a better measure is to divide not by the number of triangles in the mesh, but by the number of vertices. He calls this the ATVR (average transform to vertex ratio) of a scheme. The problem with dividing by the number of triangles is that the resulting optimum varies with the mesh’s topology. Optimal ACMR, vertices over triangles, gives a sense of the amount of shared data in a mesh. ATVR is a better measure of cache performance, as it provides a number that can be judged by itself: 1.0 is always the optimum, so if your caching scheme is giving an ATVR of 1.05, you’re doing pretty well. The worst ATVR can get is 6.0 (or just shy of 6.0).
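As a concrete (if simplified) sketch, here’s a tiny FIFO cache model in Python that reports both numbers for an indexed triangle list. Real GPU post-transform caches differ in size and replacement policy, so treat this as illustrative only:

    from collections import deque

    def acmr_atvr(indices, cache_size=16):
        # indices: a flat list of vertex indices, three per triangle.
        cache = deque(maxlen=cache_size)   # FIFO post-transform vertex cache
        transforms = 0
        for idx in indices:
            if idx not in cache:           # cache miss: vertex is (re)transformed
                transforms += 1
                cache.append(idx)
        num_tris = len(indices) // 3
        num_verts = len(set(indices))
        return transforms / num_tris, transforms / num_verts

    # Two quads sharing an edge, as an indexed triangle list:
    indices = [0, 1, 2,  2, 1, 3,  2, 3, 4,  4, 3, 5]
    acmr, atvr = acmr_atvr(indices)
    print(acmr, atvr)   # 1.5 and 1.0: all six vertices fit in the cache, each transformed once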

I think the reason ACMR is used is that we evolved from triangle strips to general indexed meshes. Individual triangles have an ACMR of 3.0. Triangle strips have an optimum ACMR of 1.0, since each vertex can be shared with a maximum of 3 triangles. The way to think about triangle strips is that they increase the sharing of vertices among triangles. Gluing strips together in a certain order allowed a better ACMR, since the vertices in one tristrip could then be reused by the next tristrip(s). So ACMR as an optimum measure was a way to show why general meshes were worthwhile. But once general meshes came to the fore and became the norm, this vertex/triangle ratio became less meaningful. At this point ACMR just makes it difficult to compare algorithms, as you always need to know the optimum ACMR, which changes from mesh to mesh. The optimum ATVR is always 1.0, so different meshes can be compared on the same scale.
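To see how much the optimal ACMR moves around with topology (which is exactly the annoyance just described), a little arithmetic, assuming a cache large enough that every vertex is transformed only once:

    # Optimal ACMR = vertices / triangles, when each vertex is transformed exactly once.
    def optimal_acmr(num_vertices, num_triangles):
        return num_vertices / num_triangles

    print(optimal_acmr(3, 1))                     # isolated triangle: 3.0
    print(optimal_acmr(100 + 2, 100))             # one strip of 100 triangles: 1.02
    n = 100
    print(optimal_acmr((n + 1) ** 2, 2 * n * n))  # n-by-n quad grid: ~0.51, approaching 0.5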

That said, this comparison of different meshes using just ATVR can be a bit bogus: if a “mesh” was actually a set of disconnected triangles, no sharing, the ATVR would be 1.0 no matter what, since each vertex would always be transformed once and only once. ACMR would always be 3.0. So, the optimum ACMR is also good to know: a lower optimum ACMR means more sharing is possible and can be exploited.

Ignacio has a followup article on optimal grid rendering, showing the ACMR and ATVR for various schemes.

Romanesco Broccoli

While you anxiously await the next links roundup, how about a snack?

This is just cool, a vegetable that looks like it was invented by a computer graphics hacker: Romanesco broccoli. Thanks to Morgan McGuire for pointing this thing out (which he calls a “cone-flake”), and for his photo below (appropriately enough, snapped at a farmer’s market in Sunnyvale, CA). Click on the photo for the full image.

I3D 2009 Papers Listed

I3D 2009, the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, will be in Boston from February 27 to March 1. Along with Morgan McGuire, I’m a papers cochair this year. Naty served on the papers committee along with 81 others, plus external reviewers; over 300 reviews were written. We were happy to see a large number of submissions: 87, up from 57 last year. 28 papers were accepted. The papers to be presented are now listed at http://kesen.huang.googlepages.com/i3d2009Papers.htm.

Heh, I just noticed Naty also posted this site; well, a little duplication won’t kill you. Amazing to me, Ke-Sen had already tracked down 9 of the papers accepted at I3D – acceptance notices went out on the 5th. I sent the list of all papers accepted to Ke-Sen, as this became open knowledge on the 15th. Ke-Sen’s listing is about as official as it will get until the final program is published at the I3D site.

I should also note that the Posters deadline is just a few days away, on December 19. Posters are a great way to present an idea or a demo and get feedback from the community, without having to spend the time and effort of writing a full formal paper.

Ray Tracing News v. 21 n. 1 is out

I’ve put out the Ray Tracing News for more than 20 years now. New issues come maybe once a year, but there you have it. There’s a little overlap with this blog, but not that much. Find the latest issue here. Now that I’m finally done with this issue I can imagine blogging again (it wasn’t just I3D that was holding me back).

Best Conference Ever

I’ve been busy with papers cochairing for I3D 2009 (he said casually, knowing he’ll probably never get an opportunity to do so again, being a working stiff and not an academic), but hope to get back to blogging soon. In the meantime, here’s the best conference ever: “Foundations of Digital Games“. I3D’s the end of February in Boston, vs. April on a cruise ship between Florida and the Bahamas. Why don’t I get invited to help out at conferences like this?