In my previous post I talked about how I think about left-handed vs. right-handed world coordinate systems. The basic idea is simply that there is an underlying reality, and the coordinate system choice is yours.
I compared notes with Jeff Weeks, a topologist friend (who wrote this cool book The Shape of Space that is not full of math symbols, but is just the opposite – approachable and fun, and what you should read after Flatland and Sphereland). Happily, he agrees. He also introduced me to another word for handedness: chirality, coined by Lord Kelvin. Jeff notes an interesting distinction:
You can ask whether the object’s own intrinsic coordinate system agrees with the ambient world space coordinate system (same chirality) or disagrees with the ambient world space coordinate system (different chirality), but you can’t meaningfully call either one of them “left-handed” or “right-handed”.
In other words, the only time you can meaningfully introduce the words “left-handed” and “right-handed” into the discussion is when a human being enters the picture. Once an object (or a space) gets drawn into the human’s physical space, then “left-handed” means “same chirality as the human’s left thumb-index finger-middle finger” and “right-handed” means “same chirality as the human’s right thumb-index finger-middle finger”.
So in particular, data sitting in a file on disk has no intrinsic chirality. What it has is the author’s intention that it be drawn into the human’s physical space with one chirality or the other (so that, for example, the steering wheel in a car model appears on the left-hand side of the vehicle, as perceived by the human viewing it).
OK, so it is only for viewing that you must also know the intended handedness of the data. We also know it’s fine to have local coordinates that are RH or LH inside a world that is LH or RH. For example, you can create a full frog model by mirroring the right half of the frog with a mirror matrix. The world itself, though, is one or the other. So far so good.
Where things get tricky in computer graphics is when we talk about something like a right-handed coordinate world and a left-handed viewing system. Right-handed coordinates are pretty common: all Autodesk applications I know use them. It’s a natural extension of a 2D Cartesian plane to make the Z axis point upwards: a 2D floor plan that is extruded would be thought to extend upwards from the ground, for example. Also, with the view axes expressed in right-handed world coordinates, the determinant of a right-handed view basis is positive; of a left-handed one, negative.
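To make the determinant remark concrete, here is a minimal Python sketch. It assumes a hypothetical camera in a right-handed world looking down -Z with Y up, and writes the viewer's axes as the rows of a 3x3 matrix; the names `det3`, `rh_view_basis`, and `lh_view_basis` are mine, not from any API.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Viewer axes written in right-handed world coordinates (assumed setup).
right = (1.0, 0.0, 0.0)
up = (0.0, 1.0, 0.0)
back = (0.0, 0.0, 1.0)      # from the target toward the viewer
forward = (0.0, 0.0, -1.0)  # into the screen

rh_view_basis = (right, up, back)     # OpenGL-style: +Z toward the viewer
lh_view_basis = (right, up, forward)  # DirectX-style: +Z into the screen

print(det3(rh_view_basis))  # positive: same chirality as the world
print(det3(lh_view_basis))  # negative: chirality is flipped
```

The sign of the determinant is the useful part here; its magnitude is 1 only because the axes are orthonormal.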
However, a left-handed viewing system is used by default by DirectX: the viewer’s X axis goes to the right, the Y axis is up, and Z goes into the screen. This also feels natural, as you label the screen’s X and Y axes from the lower left corner of the display and Z increases with depth into the screen. OpenGL’s viewing system is right-handed by default, the difference being that +Z goes towards the viewer. This negation leads to a fair bit of confusion in documentation with what near and far mean, but it’s consistent with right-handed world coordinates.
So what if you want to use a left-handed viewing system with right-handed data, or vice versa? All it means is that there must be a conversion from one to the other, or you should expect mirrored images. As my first post notes, the world coordinate system chosen is arbitrary: your RH system or the mole men’s LH system are both fine, you just have to decide which one to use. However, once you choose, that’s it: the view itself ultimately has to be working in the same system, one way or another. It is honestly meaningless to say “I want to use LH viewing with RH world coordinates”, if you want to do so “without conversion”.
Some transform has to be done to go from RH to LH, but which one? Any mirroring transform, through any arbitrary plane, will convert from one chirality to the other. Mirroring along the world’s Z axis is one common solution, i.e., set Z’ = -Z for vertices, normals, light positions, etc. If the scene’s data is converted once to LH space, case closed, your LH camera will work fine.
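As a small check that Z’ = -Z really flips chirality, the sketch below mirrors three vectors and compares their scalar triple product before and after. The vectors and the helper names (`mirror_z`, `triple`) are made up for illustration.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mirror_z(v):
    """Mirror along the world's Z axis: Z' = -Z."""
    return (v[0], v[1], -v[2])

def triple(a, b, c):
    """Scalar triple product; its sign encodes the handedness of (a, b, c)."""
    return dot(cross(a, b), c)

# Three arbitrary, non-coplanar vectors (illustrative values only).
a, b, c = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, 0.5, 2.0)

before = triple(a, b, c)
after = triple(mirror_z(a), mirror_z(b), mirror_z(c))
print(before, after)  # same magnitude, opposite sign
```

The same function would be applied to every vertex, normal, and light position if the data is converted once up front.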
However, say you don’t want to touch the data. Microsoft gives a way to convert from RH to LH, which boils down to again mirroring along the world’s Z axis as the first transform on the view matrix, i.e., done every frame. No time is lost, since the mirroring is simply folded into a different view matrix. The funny bit is that you have to deal with the world in RH and the view in LH when defining where the camera goes and how it is oriented. A common way to define a view is by giving a camera position, a target it is looking at, and an up direction. From this you can form a view basis, a matrix to transform from world to view. By Microsoft’s method, you have to make sure to use LH positions and vectors for forming your camera (with Z negated), vs. the RH coordinates you use for your data. Confusing.
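The following is a rough sketch of that idea, not Microsoft's code: it builds a left-handed look-at matrix from Z-negated camera parameters and multiplies in a Z mirror as the first transform applied to the data. It assumes a column-vector convention (view = V * p), and `look_at_lh`, `MIRROR_Z`, and the camera values are all hypothetical.

```python
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)
def mirror_z(v): return (v[0], v[1], -v[2])

def look_at_lh(eye, target, up):
    """World-to-view matrix with +Z into the screen (column-vector convention)."""
    z = normalize(sub(target, eye))
    x = normalize(cross(up, z))
    y = cross(z, x)
    return [[x[0], x[1], x[2], -dot(x, eye)],
            [y[0], y[1], y[2], -dot(y, eye)],
            [z[0], z[1], z[2], -dot(z, eye)],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, p):
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))

MIRROR_Z = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, -1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

# Camera described in the right-handed world (illustrative values).
eye_rh, target_rh, up_rh = (0.0, 0.0, 5.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)

# The camera parameters are mirrored into LH; the scene data stays RH and is
# mirrored by the matrix itself, since MIRROR_Z is applied to points first.
view = matmul(look_at_lh(mirror_z(eye_rh), mirror_z(target_rh), mirror_z(up_rh)),
              MIRROR_Z)

point_rh = (1.0, 0.0, 0.0)  # to the viewer's right in the RH world
view_point = transform(view, point_rh)
print(view_point)  # x > 0 (still on the right), z > 0 (in front, LH depth)
```

The confusing part the text mentions shows up in the `mirror_z` calls on the camera parameters: forget one of them and the camera ends up on the wrong side of the scene.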
There’s another somewhat sensible place to perform mirroring: go from RH to LH at the end of the transforms, after projection to clip space. You’re given a right-handed camera position, target, and up vector and you use these to make a left-handed view matrix – there’s nothing stopping you. None of these coordinates need to be transformed, they can be defined to be the same in RH and LH spaces. However, what will happen is that the image produced will be mirrored along the vertical axis. That is, left and right sides will be switched, since you didn’t compensate for the difference in chirality. Lots of models are symmetric, in fact, so this sort of mistake is not often immediately noticeable. By adding a simple mirror matrix with X’ = -X you can swap the left and right of the screen. This comes down to negating the upper-left value in the projection matrix.
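Here is a minimal sketch of that situation, under the assumption of a hypothetical camera on the +Z axis of a right-handed world. It feeds the unmodified RH camera parameters into a left-handed view basis, shows that a point on the viewer's right lands on the left, and then negates the X scale (standing in for the upper-left projection entry) to repair it. Only the X part of the projection is modeled, and all names are illustrative.

```python
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def lh_view_coords(eye, target, up, p):
    """View-space coordinates of p using a basis built the LH way (+Z into screen)."""
    z = normalize(sub(target, eye))
    x = normalize(cross(up, z))
    y = cross(z, x)
    d = sub(p, eye)
    return (dot(x, d), dot(y, d), dot(z, d))

def project_x(view_p, x_scale):
    """X part of a perspective projection: clip x divided by clip w (w = view z)."""
    return x_scale * view_p[0] / view_p[2]

# Right-handed camera parameters, used as-is (illustrative values).
eye, target, up = (0.0, 0.0, 5.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
point = (1.0, 0.0, 0.0)  # on the viewer's right in the RH world

v = lh_view_coords(eye, target, up, point)
x_scale = 1.0  # hypothetical upper-left value of the projection matrix

ndc_x_naive = project_x(v, x_scale)   # negative: point shows up on the left
ndc_x_fixed = project_x(v, -x_scale)  # positive: X' = -X puts it back on the right
print(ndc_x_naive, ndc_x_fixed)
```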
By using this mirror matrix at the end, you’ve made your left-handed coordinate system into a right-handed one. Each time you move the camera, you’re essentially defining a new and (usually) different mirroring plane in world space, one that passes through your eye, target, and up vector. This mirror matrix will then not affect the eye, target, and up directions, since they lie in this plane. Maybe that’s fine for your system. However, this solution can also mess up basic operations and queries. Say you want to pan the camera to the left. So you query your camera for which direction it thinks is left. It hands you one, you move the camera that direction, but the view shifts the other way. This is because you have a mismatch in chirality: your camera’s basis is in LH, but you correct it at the back-end to be RH. The view matrix returned a left-vector which was also in LH, and needs to be converted to RH. Also confusing.
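The panning problem can be sketched in a few lines. This assumes the same hypothetical setup as before (right-handed world, camera looking down -Z, Y up) and compares the “left” a left-handed camera basis would report with the left the viewer actually sees once the back-end mirror is in place.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def negate(v):
    return tuple(-x for x in v)

up = (0.0, 1.0, 0.0)
forward = (0.0, 0.0, -1.0)  # view direction in the right-handed world

# X axis of the camera basis when it is built the left-handed way.
lh_x_axis = cross(up, forward)
reported_left = negate(lh_x_axis)  # what a naive "which way is left?" query returns

# What the viewer sees as right/left after the mirror at the back-end.
true_right = cross(forward, up)
true_left = negate(true_right)

agreement = dot(reported_left, true_left)
print(agreement)  # negative: panning "left" moves the camera to the right

# Mirroring the reported vector through the eye/target/up plane fixes it.
corrected_left = negate(reported_left)
print(dot(corrected_left, true_left))  # positive
```

In this particular setup the mirror plane is the YZ plane, so the correction is a plain negation of the vector; for a general camera only the component along the camera's X axis would flip.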
The whole point of the camera is to transform some chunk of visible space into clip coordinates, which then convert (by dividing by W) to NDC (visible space being -1 to +1 in X and Y, -1 to +1 in Z for OpenGL, 0 to +1 in Z for DirectX). Neither OpenGL nor DirectX in any way requires LH or RH coordinates; you can feed in whatever view transforms you like. Just make world and view chirality match and be done with it.
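For reference, the divide by W and the two default Z ranges can be written out as below. This is only a sketch with made-up clip coordinates; the constant and function names are mine.

```python
OPENGL_Z_MIN = -1.0   # default OpenGL NDC depth range is [-1, +1]
DIRECTX_Z_MIN = 0.0   # DirectX NDC depth range is [0, +1]

def clip_to_ndc(clip):
    """Perspective divide: (x, y, z, w) in clip space to (x/w, y/w, z/w)."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)

def inside_ndc(ndc, z_min):
    """True if the point lies in the visible NDC box for the given depth range."""
    x, y, z = ndc
    return -1.0 <= x <= 1.0 and -1.0 <= y <= 1.0 and z_min <= z <= 1.0

clip = (1.0, -2.0, -1.0, 4.0)  # illustrative clip-space point
ndc = clip_to_ndc(clip)
print(ndc)
print(inside_ndc(ndc, OPENGL_Z_MIN))   # visible with the OpenGL depth range
print(inside_ndc(ndc, DIRECTX_Z_MIN))  # rejected with the DirectX depth range
```

Nothing in this step cares about handedness, which is the point: the chirality decision is made entirely by the transforms fed in before it.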
That’s about it – just make them match, and do so by setting the view (lookat) matrix to match the world’s chirality. At least, that’s my take, this week (which sounds flip, but I’ve been trying to get a good handle on this area for a long time; I like to think I now have one, as of today). If there are other ways to think about this area, and especially if I’ve made any mental errors, please do comment.
By the way, if you want an enjoyable book about handedness and symmetry, get Martin Gardner’s The New Ambidextrous Universe. It talks about some surprising stuff, such as experiments where parity in the universe is not conserved (so if you’re ever mirrored a few times through the fourth dimension on your way to a distant planet, you’ll know how to test which way you came out).
Oh, and also check out Jeff Weeks’ site – some fun and beautiful OpenGL math applications there for free, Mac and Windows, with source. I should note that he uses LH throughout for his OpenGL applications, the rebel. There’s no “correct” coordinate system, so choose your own.
I was surprised not to see any mention of winding direction here. The order in which our vertices are specified for a triangle has a big impact on whether RH or LH coordinates are the right choice. It all comes down to the definition of the cross product. If our vertices are wound CCW when looking at the front side of a triangle, then translating so that one vertex is at the origin and taking the cross product between the other two vertices, in order, gives us the triangle’s normal. If we evaluate the scalar triple product of the vertices, in order, of each triangle in a closed mesh and sum, then we get six times the volume of the mesh. If our triangles were wound CW instead, then the normals and volumes would be negative if we calculated the same products.
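A minimal sketch of the volume claim, using a unit tetrahedron whose faces are wound CCW when seen from outside in right-handed coordinates. The mesh and function names are assumptions for illustration.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def signed_volume(vertices, triangles):
    """Sum of scalar triple products over all triangles, divided by six."""
    total = 0.0
    for i, j, k in triangles:
        total += dot(vertices[i], cross(vertices[j], vertices[k]))
    return total / 6.0

# Unit tetrahedron: origin plus one vertex on each axis.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# Faces wound CCW as seen from outside (right-handed convention).
ccw_triangles = [(1, 2, 3), (0, 2, 1), (0, 1, 3), (0, 3, 2)]

# Same faces with reversed (CW) winding.
cw_triangles = [(k, j, i) for i, j, k in ccw_triangles]

print(signed_volume(vertices, ccw_triangles))  # positive, 1/6 for this mesh
print(signed_volume(vertices, cw_triangles))   # same magnitude, negative
```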
Right-handed coordinate systems are preferred because the cross product was chosen to be right-handed at some point in time. This choice was arbitrary — each component of the cross product could be negated, and everything would still be algebraically consistent. The universe has no preference. The reason a choice has to be made lies in the fact that the cross product is really an abuse of a coincidence that occurs in 3D space for a more general product called the wedge product. (And this abuse forced physicists to make an unnecessary distinction between what they call polar vectors and pseudovectors.) The wedge product between two vectors produces something called a bivector, and it so happens that in 3D, a bivector has 3 algebraically independent components whose magnitudes are exactly those of the components of the cross product. (In 4D, a bivector has 6 components, and in 2D, a bivector only has 1 component.) The tensor representation of a bivector is an antisymmetric matrix containing both positive and negative versions of each cross-product component, so no sign choice needs to be made. It’s only when we want to represent a bivector with 3 numbers instead of the 6 nonzero entries of the tensor that we need to decide on RH or LH.
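Written out, the relationship described above looks like this. The notation is mine: a and b are 3D vectors, B is the antisymmetric matrix of their wedge product, and c is the conventional right-handed cross product a × b.

```latex
B_{ij} = a_i b_j - a_j b_i, \qquad
B =
\begin{pmatrix}
0 & a_1 b_2 - a_2 b_1 & a_1 b_3 - a_3 b_1 \\
-(a_1 b_2 - a_2 b_1) & 0 & a_2 b_3 - a_3 b_2 \\
-(a_1 b_3 - a_3 b_1) & -(a_2 b_3 - a_3 b_2) & 0
\end{pmatrix}
=
\begin{pmatrix}
0 & c_3 & -c_2 \\
-c_3 & 0 & c_1 \\
c_2 & -c_1 & 0
\end{pmatrix},
\qquad
c = a \times b = (B_{23},\, B_{31},\, B_{12}).
```

Picking (B_23, B_31, B_12) rather than (B_32, B_13, B_21) as the three numbers to keep is exactly the sign choice being discussed.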
Hi, Eric, and thanks for commenting. Yes, I left winding direction out of it, as the article was getting long-winded as it was. I agree that if a model is in RH with CCW winding order and you convert it to LH, you’ll need to change the loop order to CW (i.e., reverse the order) to maintain the proper culling sense. I disagree that the order in which vertices are specified affects RH vs. LH being “the right choice”. In any graphics API I know you can always reverse the culling sense, so winding order can be treated as a related, but independent, variable. I believe your point is that RH with CCW makes various properties, such as volume, come out as positive numbers, which as humans we tend to favor.
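To illustrate the loop-order point, this sketch mirrors a single triangle with Z’ = -Z and computes its normal with the usual cross-product formula, once with the original vertex order and once reversed. The triangle and helper names are made up.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def mirror_z(v):
    return (v[0], v[1], -v[2])

def face_normal(tri):
    """Unnormalized normal from the vertex order: (v1 - v0) x (v2 - v0)."""
    v0, v1, v2 = tri
    return cross(sub(v1, v0), sub(v2, v0))

triangle = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
normal = face_normal(triangle)

# Where the surface normal should point after the whole model is mirrored.
expected = mirror_z(normal)

mirrored = tuple(mirror_z(v) for v in triangle)
same_order = face_normal(mirrored)          # points the opposite way
reversed_order = face_normal(mirrored[::-1])  # matches the mirrored normal

print(expected, same_order, reversed_order)
```

Flipping the culling sense in the API instead of reversing the indices achieves the same visible result, which is why the two choices can be treated independently.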
I once read a great little article in SGI’s now-defunct “Iris Universe” which basically said, “just try your code and negate as needed”. You can sit and think and scribble for a long time, trying to figure out whether a particular rotation will go clockwise or counter-clockwise, or whether the culling sense is right. This article (author forgotten) pointed out that, once you have the underlying math right, most of these decisions are boolean; it’s either one way or the other, and the difference is just a minus sign. Rather than spending time with paper and pencil and figuring out exactly what the ordering should be, the author said, “just try it”. Having this permission to do so has saved me hours of worrying about sign problems while programming. The ultimate goal, for me, is a picture on the screen; if a minus sign needs to be added here or there along the way, no problem.
You mention the wedge product (aka the outer product), from geometric algebra. I can appreciate the elegance of GA, but can’t say I understand much of it beyond a hand-waving level. It has a clean way of manipulating various objects and computing intersections between them (though given that GA can’t fully represent a triangle, I see its use as somewhat limited for computer graphics). I’m hoping someone someday writes a book on the topic that doesn’t immediately double back on itself with terminology. I was disappointed to find the promising-sounding “Geometric Algebra for Computer Science” book (sample chapter here) assumes the reader already knows about all there is to know about homogeneous coordinate space and projection, quaternions, Pluecker coordinates, etc. I suspect you’d understand it, but this book (and others on GA) have so far left me in the dust. “Geometric Algebra for Dummies” has yet to be published, unfortunately.