Free Editing Tools and Tips

Takeaways from this article: Use Capitalize My Title to determine exactly which characters to capitalize. Use my free chex_latex Perl script to find common goofs in your .tex and plain-text files (for a .docx, just copy its text into a text file). Also try pasting your .tex into Microsoft Word to see what it turns up. Jump to the end of this post for more info. And if you want a great technical copy editor or typesetter (she did both for Real-Time Rendering 4th edition and copy editing for Ray Tracing Gems), hire Charlotte Byrnes. No, really, do it; she copy edits papers, too.

Yesterday we finished putting together Ray Tracing Gems, turning the files over to our publisher. Tomas and I agree that we’re done with making books for a decade or so, at least. “Editing, it’ll be easier than writing,” said the naive editors. And, yes, we changed every “naïve” with the diaeresis (two dots that look like an umlaut – a form that seems popular with authors from Germany) to “naive.”

There’s a lot we learned about process – basically, think through and establish good processes early on, for everything – but I want to focus on grammar and style-nerd stuff, since it’s SIGGRAPH paper-writing season. What follows are my fairly-well-informed-but-I-could-be-convinced-otherwise opinions. I think of myself as maybe a B+ English student, on a good day.

Gender-neutral: Instead of “he/she,” or alternating “he” and “she” in a passage, you can use “they” as a singular pronoun. This usage dates back to 1375.

“Data is” is fine, according to The Guardian, which also notes that The Wall Street Journal is coming around. To quote: “It’s like agenda, a Latin plural that is now almost universally used as a singular. Technically the singular is datum/agendum, but we feel it sounds increasingly hyper-correct, old-fashioned and pompous to say ‘the data are’.”

Please spell out any acronym the first time you use it, e.g., “the bounding volume hierarchy (BVH) for the scene.”

Use the ACM Digital Library (DL) for references in your .bib files. On an article’s page, in the upper right, is “Export Formats: bibtex” – click it. Even then, the DL is sometimes not quite correct, e.g., “Proceedings of High Performance Graphics” should have a hyphen: “Proceedings of High-Performance Graphics,” as that’s truly the conference name. But they’re at least close. Here are some variants I spotted from authors:

High-Performance Graphics (nice and short, but a little hard to tell if this is a book or something else when reading the bibliographic entry in the paper)
High Performance Graphics
Proceedings of High Performance Graphics (ACM DL form, needs a hyphen)
Proceedings of the Conference on High Performance Graphics
High Performance Graphics 2009
Proceedings of the 5th High-Performance Graphics Conference
Proceedings of the ACM SIGGRAPH Symposium on High-Performance Graphics
Proceedings of the Fourth ACM SIGGRAPH / Eurographics Conference on High-Performance Graphics (wins for longest)

The DL bibtex entry sometimes puts the article title into lowercase even when the published title is properly capitalized. I’m for using the capitalized version the authors intended. BTW, the Google Scholar Button is handy for finding references.
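If you export a pile of entries from the DL, a throwaway script can patch up a recurring problem such as the missing hyphen in the HPG proceedings name. Here is a minimal sketch of that idea – the script name and its single substitution are my own invention for this post, not part of chex_latex – so adjust the fixes to taste:

```perl
#!/usr/bin/perl
# bibfix.pl (hypothetical example): normalize a venue name in a .bib file.
# Usage: perl bibfix.pl references.bib > references_fixed.bib
use strict;
use warnings;

while ( my $line = <> ) {
    # The DL export leaves out the conference name's hyphen; add it back.
    $line =~ s/High Performance Graphics/High-Performance Graphics/g;
    print $line;
}
```

Relatedly, if your bibliography style insists on lowercasing titles, you can protect capitalization in the .bib entry itself by wrapping words in braces, e.g., {BVH}.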

Hyphens are tricky. For a word that can be written either with or without a hyphen, e.g., “non-planar”/“nonplanar,” we had a few rules of thumb:

  • Check Merriam-Webster. For example, “nonplanar” is an accepted spelling, so we left the hyphen out. Similarly, “nearfield,” “dataset,” and “retroreflection” are in the dictionary. You still must poke around, e.g., “retroreflect” is not.
  • If it’s not in the dictionary, consider the literature. So, “microfacet” and “framebuffer” are OK, since they are commonly used in computer graphics.
  • Otherwise, use a hyphen, e.g., “re-render,” “micro-detail.”

For phrases, we used these hyphenation rules. For example, “world-space coordinates,” “world space,” and “coordinates in world space” are all proper. Also, it’s never “Monte-Carlo” – you wouldn’t say “faster than a New-York minute.” Oh, and it’s never “physically-based,” as the rules note.

Our one exception to these sorts of rules was that we never hyphenated “ray tracing” in Ray Tracing Gems. For starters, we’d have to consider changing the title to Ray-Tracing Gems, which takes emphasis off ray tracing (“Hey, this book’s a rip-off, I thought it was about properly rendering gemstones”). It’s also the more popular adjective form in previous publications (though I found only a few articles to back up this claim). To confuse things a bit, Microsoft calls their DXR API “DirectX Raytracing” – one word for “ray tracing.” We use it this way whenever we spell it out for DXR. We also resisted using “any-hit shader” and “closest-hit shader,” as Microsoft does not hyphenate, no matter how much I wish they did. This quotidian stuff takes up a surprising amount of time, but luckily I’ve folded the various common hyphenation tests into chex_latex. If you disagree, it’s easy to change or remove the tests from the script (even if you don’t know Perl, you do know how to use the “delete” key).
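To give a flavor of what such a test looks like, here is a minimal sketch of the idea – my own toy illustration for this post, not the actual code in chex_latex:

```perl
#!/usr/bin/perl
# hyphen_check.pl (toy illustration, not the real chex_latex.pl).
# Usage: perl hyphen_check.pl paper.tex
use strict;
use warnings;

while ( my $line = <> ) {
    next if $line =~ /% chex_latex/;    # skip lines marked to be ignored
    print "$ARGV:$.: 'physically-based' should be 'physically based'\n"
        if $line =~ /physically-based/i;
    print "$ARGV:$.: 'Monte-Carlo' should be 'Monte Carlo'\n"
        if $line =~ /Monte-Carlo/;
}
continue {
    close ARGV if eof;    # reset the line counter for each input file
}
```

In a sketch like this, each test is just a regular expression and a message, which is why deleting a test you disagree with is painless.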

I like to avoid too many uses of “this” without a noun after it. A high-school teacher of mine never allowed a “this” without a noun following, but I think that’s overkill. Still, it’s good to think about: if it’s not obvious what the noun should be after the “this,” then add it. I’ve also found that “This” at the start of a sentence can often be changed to “Doing so,” e.g., “We add up the probabilities. This simplifies the subsequent…” becomes “We add up the probabilities. Doing so simplifies the subsequent…” The second version is a bit more active and engaging.

Avoid “very” when you can, as it’s either fluff or a lazy way to increase the impact. Mark Twain didn’t say (but it’s still a great quote): “substitute ‘damn’ every time you’re inclined to write ‘very’; your editor will delete it and the writing will be just as it should be.” There are lists out there, such as this one, giving synonyms, e.g., “massive” for “very big.” That said, I haven’t yet found good replacements for “very close” and “very time consuming.” Another word in this category of fluff is “really.” Please never use that one in formal writing; even “very” is better.

At the start of a sentence, “In order to” can usually be replaced by “To.” Shorter is more concise and sounds more authoritative.

The word “only” can often get moved to the right. Placement matters: “I only cook hot dogs but don’t eat them” and “I cook only hot dogs, not hamburgers” have “only” modifying the proper words. So “We only render the opaque objects” is less precise than “We render only the opaque objects.” Both work – we’re used to mentally shoving the “only” to be with the objects in the first version – but why not say it right to begin with?

The abbreviations “i.e.” and “e.g.” always have commas after them in the U.S., e.g., as I just did in this sentence. Same thing with the phrases “for example” and “that is” – commas after. Also, this is my own hypersensitivity, but please reword to avoid “et al.’s” – it’s fine in some lawless parts of the world, but you drive a stake through my Latin teacher’s heart when you mix Latin and English in this way.

The Chicago Manual of Style says “toward” is preferred in the U.S. over “towards,” “forward” over “forwards,” and so on. I’ve also seen this in spell checkers, with “towards” being flagged. Most sites say either is fine for the U.S., though the English prefer “towards.” I decided on the “remove the ‘s’” versions for this U.S. book (and “color,” not “colour,” etc.) in order to be consistent throughout and to avoid spell-checker catches.

OK, that’s way more than enough, and I haven’t started in on “that” vs. “which” or how semicolons get misused…

If I could change just one thing in U.S. English, it would be punctuation placement with quotes and double-quotes. In the U.S. we put commas and periods inside the double-quotes, “like this.” In the U.K. it’s “like this”. Note that all other punctuation – colon, question mark, semicolon – goes outside the quotes in the U.S., unless you’re quoting a phrase with these in it, e.g., “Quis custodiet ipsos custodes?” I wish we’d all switch to the U.K. model; it’s more consistent and puts my computer science heart at ease. Oh well. That said, it’s legal to keep the punctuation outside the quotes if you’re explaining syntax itself, e.g., “To declare floating-point numbers you can use ‘double’ or ‘float’.”

To conclude, I’ll expand on what I said at the start about the free tools you should consider using:

  • Use Capitalize My Title to determine exactly which characters to capitalize in titles and section headers. There seems to be a tendency in Europe to not capitalize the word after a hyphen. For example, the often-cited (and good!) paper “An Inexpensive BRDF Model for Physically-based Rendering” has a lowercase “b” in “based” (there also should not be a hyphen). It’s also visible in such things as the First-tier Tribunal. For U.S. publications the rule is to capitalize the word after the hyphen, unless that word would normally be lowercase, e.g., “Texture Level-of-Detail Strategies for Real-Time Ray Tracing.”
  • Microsoft Word or another spell and style checker is worth your while. Copy the contents of your PDF or .tex file, paste it into Word, and see what it turns up. If you do this simple trick as a reviewer, you look like some copy-editing genius by catching obscure errors. Beat reviewers to the punch by doing this to your own submissions. I did so with this article and it turned up six suggestions, four of which I used.
  • Give chex_latex.pl a run on your .tex or text file (e.g., copy your text from your Word document to a .txt file). Yes, you must install Perl, how old-school, but ten minutes of messing with this program will likely turn up a few things in your paper that are wrong or at least worth considering changing. It has hundreds of little tests, but my favorite is the one I stole from here, which finds accidentally doubled words, e.g., “the the” or “a a” – these occur surprisingly often (see the short sketch after this list). Perl scripts are trivial to edit, so if you don’t like a test, just delete or modify it. Or you can put “% chex_latex” at the end of any line to have the tool ignore it completely. I ran this tool on this article and it found two little goofs.
  • For longer works I find an interactive spell checker leads to madness, as you hit author name after author name after author name when you check, along with variable and procedure names. I finally wrote a tiny back-end program, aspell_sorter.pl, for the standalone command-line spell checker “aspell.” Now, with three commands, I concatenate all the .tex files together, spell-check them all, and then sort by word and get a count for each “misspell” (a minimal version of that step is sketched after this list). For example, on my latest run, in 23,083 lines of .tex in Ray Tracing Gems, “aspell” found 7744 questionable terms. That’s about one flag every 3 lines – insane. The “aspell_sorter” turns this into a list of 1599 questionable terms – almost a 5x reduction. You can also set “$spellcount = 1;” to show only unique misspellings, things “misspelled” only once. Doing so will miss consistently misspelled words, but the list to skim is now down to just 696 entries – a more than 11x culling of potential problems. Of these, I found three true misspellings: ROUGNESS, vextex, and identifer (and these were found after three editors had already read each of the chapters and I had similarly spell checked most of the text a month earlier). I had to wade through a lot of false positives, 693 of them, but it’s better than wading through 7741.
  • I’ve heard good things about Grammarly and Hemingway, but haven’t gotten into them myself.
  • For citation managers, Zotero and Mendeley get a lot of thumbs up, though I haven’t used them, since we use our own goofy .bib format.
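As promised above, here are minimal sketches of two of those ideas – my own toy versions for this post, not the actual chex_latex.pl or aspell_sorter.pl code (the script names below are made up). The first flags accidentally doubled words in a .tex or text file:

```perl
#!/usr/bin/perl
# doubled.pl (toy illustration): find accidentally doubled words.
# Usage: perl doubled.pl paper.tex
use strict;
use warnings;

while ( my $line = <> ) {
    # Flag the same word appearing twice in a row, e.g., "the the" or "a a".
    while ( $line =~ /\b(\w+)\s+\1\b/gi ) {
        print "$ARGV:$.: doubled word \"$1\"\n";
    }
}
continue {
    close ARGV if eof;    # reset the line counter for each input file
}
```

The second shows the sort-and-count step applied to “aspell” output; the real aspell_sorter.pl has more conveniences, such as the $spellcount option mentioned above:

```perl
#!/usr/bin/perl
# aspell_count.pl (toy illustration): count how often each questionable
# term appears, then list the terms alphabetically with their counts.
# Usage: cat *.tex | aspell --mode=tex list | perl aspell_count.pl
use strict;
use warnings;

my %count;
while ( my $word = <STDIN> ) {
    chomp $word;
    $count{$word}++;
}
for my $word ( sort keys %count ) {
    print "$count{$word}\t$word\n";
}
```

Both print one complaint per line, which keeps the output easy to skim or to pipe through other filters.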

Bonus info: it’s not about grammar, more about English in general, but I’m enjoying Bill Bryson’s The Mother Tongue – it’s a great bathroom read. The amount of useless knowledge per page is impressive. For example, I now know where the “r” sound in the word “colonel” comes from.

Atomic Age Photon Mapping

Well, not exactly Atomic Age, which I think of as the late ’40s through mid-’60s, but close – what would you call it when you’re rendering on a plotter and drawing “+” signs for shading?

Arthur Appel’s paper from 1968 is considered the first use of ray casting for rendering, for eye rays and shadow rays.

One surprising part is that the paper includes the idea of shooting rays from the light and depositing the results on the surfaces – early photon mapping, or radiance caching, or something (details aren’t clear).

Update: an anonymous source let me know, “When I worked briefly at the IBM TJ Watson lab, I made a point of seeking out Art Appel. He was friendly and nice. As I recall, he said that he was working as technical support at the time of the ray tracing work. As he described it, most of his day was spent hanging around, waiting to answer the phone and help people with computer issues (this may have been a self-effacing description of a very different or important job, for all I know). He said that he had lots of free time during the day, and he was interested in using the computer to make images, so his ray tracing work was basically a hobby!”


A Regret

One regret Tomas and I have about the back cover of “Real-Time Rendering, 4th Ed.” is not going with this initial response: “‘The second-best book on rendering there is.’ — Matt Pharr” It would have been hilarious.

Print Quality

So, I don’t think my work on Real-Time Rendering, 4th Edition will ever be done, even though the thing’s published…

First, yes, we also were unhappy with how some of the images printed in the first printing of the book. And yes, I’m happy to report the second printing is better quality; we wish the first had been, as well, and we are all sorry it wasn’t. The main sources of problems are atoms and the printing process. There’s a limit to how thick a book can be before its spine breaks, or before it can’t be read without crushing your chest. The publisher increased the paper thickness for the second printing, which improved the readability of some figures (I’m ordering one now, after having seen it, so I can compare). Fingers crossed that the binding holds up. It’s why we published two chapters and two appendices online – a 1350+ page book is basically unprintable, let alone readable.

All diagrams and some of the images (those we have rights or permission for) are available for viewing and download on our website – about two-thirds of the total – and I added more today. If we are missing any that would help you understand the text better, please let me know and I’ll see if I can clear the rights for them to be displayed there – my email’s at the bottom of the website. Just as important, we hope to provide slightly improved images to the publisher for the far-distant day when a third printing might be made.

Which printing do you have? I think you can tell by thickness, with the first printing being about 4.8 cm (1-7/8″) thick overall and the second printing being around 5.4 cm (2-1/8″), maybe more.

(Image: thickness comparison of the first and second printings.)

I was interested to read that Matt Pharr had the same problems with the latest edition of “Physically Based Rendering.” He writes:

It was a terrible feeling, having put endless hours into the book, putting in all of our own best efforts to make the very best thing we could, and then having something awful being put in readers’ hands. We had no control of all sorts of details beyond the text itself

Yup, I can relate. The last thing we see is the print-ready PDF, and after that it’s hope for the best. At least no sections of the book were deleted, duplicated, printed upside-down, or in inverse colors.

I see about 18 out of 676 figures that are noisier or darker or have less contrast than I would like. I am interested in which figures you find difficult to understand. Personally, I’d love it if the electronic version and paper version were “bound together” and always came as a set – the electronic version, though not as high-resolution as I’d like (unsurprisingly, Amazon didn’t want to provide the 2.2 GB PDF that we get when we compile the book ourselves), does have more readable figures.

I have two personal takeaways: don’t try to print images showing noise, as they’ll just about never print well, and don’t try to print images that are somewhat dark, as they’ll always seem to get darker when printed. The first is a new lesson learned; the second is one I’ve seen in previous editions, though I always hold out hope that it’ll be better this time around.

Best Birthday Evar

This last week I’ve been working a bit on stuffing Sphereflake into Chris Wyman’s sphere intersection demo for DXR. See my gallery of real-time ray tracing experiments for eye candy, statistics, and commentary.

This is my first DXR program (really, just a modification of Chris’s), and two things impressed me:

  • Sheer speed and size of what can be rendered in real time now vs. 32 years ago: 60 FPS vs. 60+ minutes per frame (~216Kx speedup there alone), 48 million spheres vs. 7k spheres, 1920 x 1110 vs. 512 x 512. And this is on a currently available GPU that will be considerably surpassed in ten days with the release of the RTX 2080 Ti and friends.
  • Programmability: add depth of field? Just jitter the eye ray’s origin. Want soft shadows? Jitter the shadow ray directions. Adding soft shadows took me about 15 minutes this morning, as a birthday treat to myself.

In all fairness, depth of field and soft shadows and whatnot are noisy, since I initially cast a single ray per pixel. I don’t use denoising, which is critical for acceptable results in real-time ray tracing whenever such effects are used. The images I show are what I see after a minute (or whenever I happened to do the screen capture; after a few seconds things usually looked good).

All that said, playing with the renderer is a lot of fun now. Add some path tracer functionality here or there and you have a new effect – no hours of hacky “rasterize and then do some funky post-processing effects.” I see this as a huge boon to CAD and pre-visualization packages that want to quickly add new effects or have users rapidly try out variants. It’s dangerously addictive, as I now want to add glossy reflection…

It Takes a Village…

It clearly takes a village to write the book Real-Time Rendering. Ola Olsson pointed out this entertaining bit on Google Scholar:

The fourth edition appears to be Volume 19 Issue 2. The article mentioned does exist, but is the very last reference in the book (#1978). The number of authors on the paper is impressive, quite an increase from the original three for this article. Ansel Adams, among others, gets listed three times as an author – excellent CV padding. My favorite, though, is the description of the article, a quote by Billy Zelsnack used at the beginning of our chapter “The Future.”

I poked around a bit more and found some alternate reality listings, such as this:

In both 4th editions our three new authors don’t show up. More disturbing is that in one universe’s edition Tomas Akenine-Möller also no longer exists (sad, since he’s listed six times as an author in the first image). And a strange universe it is, where the book has 40 citations despite being out for less than 6 weeks. The prescience of some authors citing it is impressive, with one article published in the year 2000 referencing this fourth edition. Research must be wonderfully accelerated there, with developers able to read about future breakthroughs that they can then write up.


Ray Tracing Monday, Again

I enjoyed the previous Ray Tracing Monday two weeks ago, so let’s do it again.

Since it’s been two weeks, two resources:

  • “Real-Time Ray Tracing” – this is a chapter (number 26) of Real-Time Rendering that we just finished, free for download. It’s 46 pages, with a focus on DXR but also including a detour through everyone’s favorite topic, spatial data structures for efficient ray tracing. Tomas, Angelo, and Seb did the heavy lifting on this one over the past 3 months; I helped edit and added bits where I could.
  • Ray Tracing Resources Page – along the way we ran across lots of resources. With the chapter finished as of today, I was inspired to make a page of links, mostly things I don’t want to lose track of myself or that I found of historical interest. Please do feel free to send me awesome resources to add.

One entertaining thing I ran across was an image in the gallery for Chunky, a quite-good custom path tracer for Minecraft worlds:

Cornell blox

CC0 – Public Domain for the World

Steve Hollasch mentioned that there’s a “new” (well, new to me – it’s 9 years old) Creative Commons instrument, CC0. Their website has an explanation of the problem with trying to put something you made into the public domain, and how CC0 solves it. Open Knowledge International (no, I had never heard of them, either) recommends it, which I’ll take as a good sign. I didn’t know of this CC0 beast, and suspect readers here don’t, either, so now you do. It’s mostly not a license; it’s a “dedication,” a way to ensure that something you created is considered unowned and free to reuse in every country.

If you want to make sure your code is properly credited to you, use something such as the MIT license instead, or some other (often more restrictive) choice. Creative Commons recommends not using their other licenses for code, but rather some common source code license. I’m assuming (it’s not super-clear) that CC0 is also fine for code you’re putting into the public domain. Update: aha, Steve Hollasch sent this follow-up link – CC0 can be applied to code, and that link shows you how to do it.

Update: I received a number of interesting responses from my tweet of this post. David Williams points out that CC0 is approved by the Free Software Foundation for putting code in the public domain, as it has a fallback license for countries where public domain is not recognized. Arvid Gerstmann notes that dual-licensing with CC0 and the MIT license may be an even better option, for those companies where the lawyers haven’t approved the lesser-known CC0 but have approved use of code with the MIT license.

I know this all sounds like “it doesn’t matter, I’m never going to enforce this,” but it does, sadly. With Graphics Gems we made up a license long ago (basically, “don’t be a jerk”) because someone was trying to sell his company to a larger firm, which had some software testing firm test the smaller company’s code for copyright infringements, and various bits of Graphics Gems code popped up. If CC0 had existed, or maybe even the questionable (and rude) WTFPL, we would have gone with that. Happily, there is now the CC0. (tweet)