Your Action Needed to Protect Open Access!

If you care about open access to research (and you should), there are several actions (some quite time-critical) that you can take to protect it.

First, some background (if you’re already familiar with this issue and just want to know what to do about it you can skip to the “1,2,3” list at the end and read the rest later).

In 2008, legislation was passed in the United States requiring all National Institutes of Health (NIH) funded researchers to submit their papers to an openly available repository within a year of publication. The (perfectly reasonable) logic was that since the American public had paid for the research with their taxes, they had a right to see it without going through paywalls. If anything, the flaws in the legislation were that it did not cover all Federally-funded research, and that it still allowed publishers to lock papers up for one year.

Of course, scientific publishers (with a “researchers do all the work, we take possession of the results and sell them back to researchers” business model that resembles nothing so much as the “the sun grows the food, the ants pick the food, the grasshoppers eat the food” motto from Pixar’s film “A Bug’s Life”) hated this and immediately tried to stop it. They were unable to do so, which is very fortunate since the open access repository, PubMed Central, was a huge boon to everyone from researchers, to physicians, to patients trying to keep up with research into their diseases.

About a year later, the US Government started a “Request For Information” (RFI) process to figure out if this policy should be expanded to other Federally-funded research. Of course, for-profit scientific publishers like Elsevier filed lengthy letters against this. One would think that non-profit professional organizations like the Association for Computing Machinery (ACM) would not have such a short-sighted, rent-seeking position. Surely they would put the advancement of human knowledge ahead of their revenue streams? Well, no. Perhaps not so surprising, given their previous actions.

Fast forward to January 2012, when another legislative attack on Federal open access mandates was launched – the Research Works Act. In the charming bought-and-paid-for tradition of US legislation, this was written by the Association of American Publishers (AAP), a lobbying group whose members have made large contributions to the campaigns of the two U.S. Representatives introducing the bill – a fact that I am sure had no influence whatsoever on their support. This bill makes it illegal for the government to mandate open access; it would shut down PubMed Central (sorry, cancer patients! we’ve got revenue streams to protect!) as well as making any similar initiatives impossible. The timing of this bill was especially suspect, since it was launched a few days before the deadlines for another set of RFIs regarding open access. This odious bill launched a well-deserved internet shitstorm; our blog is relatively late to this party.

Sadly (but not surprisingly), it turns out that the ACM is a member of the AAP. One might hope that this was merely a case of the AAP doing something that some of its member organizations disagree with, but the ACM seems to like the Research Works Act just fine. You’ll like that last link; it’s one of the finest examples of disingenuous and circular reasoning I’ve seen in a while. Just to put a cherry on top of this shit sundae, it turns out that the AAP is also a supporter of SOPA (I’m now afraid to hear ACM’s own position on SOPA).

At this point, you’re most likely reading through a red veil of righteous rage. Fortunately, there are things you can do about this; some need to be done now.

  1. If you are a researcher or someone who uses research, email responses to the two RFIs from the White House Office of Science and Technology Policy concerning access to Federally-funded research (one regarding peer-reviewed scholarly publications and one regarding research data). The deadline is in just three days. Although these are US government RFIs, my understanding is that you don’t have to be a US citizen or reside in the USA to respond. Harvard’s RFI response is worth reading for reference, though it is quite long.
  2. If you are a US Citizen, let your representatives know how you feel about this legislation. The Alliance for Taxpayer Access has the information you need to do so.
  3. If you are an ACM member, let the ACM know how you feel about their support for this act and the ACM’s membership of the AAP; be polite! The ACM bureaucracy is complex, but as far as I can tell the most appropriate people to contact are: Alain Chesnais, ACM President (achesnais@acm.org), Bernard Rous, ACM Director of Publications (rous@acm.org), and Cameron Wilson, ACM Director of Public Policy (cameron.wilson@acm.org). If you are a member of some other professional organization that belongs to the AAP, contact it as well.

It’s time to let the scientific publishers know that things are going to change. From now on, the ants pick the food, the ants eat the food, and the grasshoppers leave!

2011 Color and Imaging Conference Roundup

For convenience (using the CIC 2011 tag works but shows posts in reverse order), this post combines all the links to my CIC 2011 posts. Note that video for many of the presentations is available online, and many of the papers are also available on the various authors’ home pages.

  1. Introduction
  2. Courses A
  3. Courses B
  4. Featured Talks
  5. Papers
  6. Special Session (on Color Spaces)

Next year, CIC will be in Los Angeles, between November 12 and November 16. If you are local, I warmly recommend attending at least some of the courses, especially the two-day “fundamentals” course.

Two recommended (albeit somewhat expensive) resources for people interested in further study of color topics:

2011 Color and Imaging Conference, Part VI: Special Session

This last post on CIC 2011 covers a special session that took place the day after the conference. Although not strictly part of the conference (it required a separate registration fee), it covered closely related topics.

The special session “Revisiting Color Spaces” was jointly organized by the Inter-Society Color Council (ISCC), the Society for Imaging Science and Technology (IS&T), and the Society for Information Display (SID) to mark the 15th anniversary of the publication of the sRGB standard. It included a series of separate talks, all related to color spaces:

sRGB – Work in Progress

This presentation was given by Ricardo Motta, a Distinguished Engineer at NVIDIA. Mr. Motta developed the first colorimetrically calibrated CRT display for his Master’s thesis at RIT, helped develop much of HP’s color imaging tech as their first color scientist, and was one of the original authors of the sRGB spec. Now he has responsibility for NVIDIA’s mobile imaging technology and roadmap.

The presentation started with some history on the development of sRGB. The story actually began with an attempt by HP and Adobe in 1989 to get the industry to standardize on CIELAB as a device-independent color space. Their first attempts at achieving industry consensus didn’t go well: Gary Starkweather at Apple insisted that full spectral representations (highly impractical at the time) were the right direction, and initial agreement by Microsoft to standardize on CIELAB was scotched when Nathan Myhrvold insisted on 32-bit XYZ (also infeasible) instead. After these setbacks, the people at HP and Adobe who were working on this started realizing that RGB could actually work pretty well as a device-independent color space. They wrote drivers for the Mac first, and ported them when Windows got color capability. PC monitors and televisions at the time all used the same CRT designs (the PC market was as yet too small to justify custom designs), so HP characterized the typical CRT in their RGB drivers – first as an internal HP standard (“HP RGB”), and later in collaboration with Kodak and Microsoft as part of the FlashPix standard (“NIF RGB”). In 1996, HP presented NIF RGB to Microsoft as a proposed standard, culminating in the sRGB standard proposal exactly 15 years ago.

Why does sRGB work? RGB tristimulus values by themselves are not enough to describe color appearance. The effects of viewing conditions and white balance on the appearance of self-luminous displays are not fully understood. Colorimetry mostly focuses on surface colors, not self-luminous aperture colors. Also, the limited gamut of displays makes low-CCT white balances impractical.

By standardizing the assumed viewing conditions and equipment (display with near 2.2 gamma, Rec.709 primaries, D65 white point at 80 nits (cd/m²), 200 lux D50 ambient, 1% flare), the RGB data fully implies appearance with little processing needed. Also, daylight-balanced displays tend to remain constant in appearance over a wide range of viewing conditions (D65 is consistently perceived as neutral in the absence of other adapting illumination), so the results are robust in practice.
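As an aside, the “near 2.2 gamma” encoding curve the sRGB spec actually standardized is piecewise – a linear toe near black plus a 2.4 power segment that together approximate a plain 2.2 gamma. A minimal sketch of both directions (the constants are those of the sRGB spec):

```python
def srgb_to_linear(v):
    """Decode an sRGB-encoded value in [0, 1] to linear light.

    The curve is piecewise: a linear segment near black (avoiding an
    infinite-slope toe) plus a 2.4 power segment that together
    approximate a plain 2.2 gamma.
    """
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4

def linear_to_srgb(x):
    """Encode linear light in [0, 1] to an sRGB value (inverse of above)."""
    if x <= 0.0031308:
        return x * 12.92
    return 1.055 * x ** (1 / 2.4) - 0.055

# Encoded mid-gray (0.5) is only about 21% linear light.
print(srgb_to_linear(0.5))  # ≈ 0.214
```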

In 1996 the strength of sRGB was that these viewing conditions and equipment were common and widely used. 15 years later, this strength has become a limiting factor in some scenarios.

If self-luminous display colors are not very close to the correct scene surface colors, there is a perceptual “snap” as the image suddenly appears as a glowing rectangle instead of a 3D scene (this is similar to the “uncanny valley” problem). Current standard display primaries fail to match large classes of surface colors due to their limited gamut; newer developments (AMOLED, LED backlights) enable a much wider color gamut.

In addition, displays have been getting much brighter – every decade, LED brightness has consistently increased by at least 20X. The newest LCD tablets achieve 500 nits with over 1000:1 contrast ratio (CR), enough to exactly match reflected colors in most conditions. Daylight equivalence requires 6,400 nits; by the end of this decade, portable displays should be able to show actual surface colors under all lighting situations.

The sRGB approach is no longer valid in the mobile space – with highly variable viewing conditions and displays that can directly match reflective colors, we need to move from a “tristimulus + viewing conditions” encoding to an “object properties” encoding (still tristimulus-based).

OSA-UCS System: Color-signal Processing from Psychophysical to Psychometric Color

This presentation was given by Prof. Claudio Oleari from the Department of Physics at the University of Parma.

Psychophysical color specification is based on color matching under arbitrary viewing conditions. Psychometric color specification is based on quantifying perceived color differences and realizing uniform scales of perceived colors under controlled conditions (comparison of color samples on a uniform achromatic background under a chosen illuminant). Under these conditions, the only appearance phenomena are instantaneous color constancy and lightness contrast.

Between 1947 and 1974 the Optical Society of America (OSA) had a committee working on a uniform psychometric color scale; their goal was a lattice of colors in Euclidean color space where equal distances between points correspond to equal visual differences. However, they eventually concluded that this is not possible – the human color system does not work this way. The resulting system (OSA-UCS) had only approximately uniform color scales and many scientists considered this to be a failure. However, OSA-UCS has a very strong property which is not shared by any other color space – it is spanned by a net of perceived geodesic lines. These are scales of colors which define the shortest perceptual path between colors, ordered so that the difference between each pair of adjacent colors equals one just noticeable difference (jnd).

Prof. Oleari has published an algorithm linking the cone activations (psychophysical color) to the OSA-UCS coordinates (“Color Opponencies in the System of the Uniform Color Scales of the Optical Society of America”, 2004).

Another of Prof. Oleari’s papers (“Euclidean Color-Difference Formula for Small-Medium Color Differences in Log Compressed OSA-UCS Space”, 2009) defines a Euclidean color-difference formula based on a logarithmically compressed version of OSA-UCS (like other such formulas, it is only applicable to small color differences since a globally uniform space does not exist). This formula has only two parameters, but performs as well as the CIEDE2000 formula which has many more. Generalizing the formula to arbitrary illuminants and observers provides a matrix which is useful for color conversion of digital camera images between illuminants. Prof. Oleari claims that this matrix provides results that are clearly better for this purpose than other chromatic adaptation transforms (“Electronic Image Color Conversion between Different Illuminants by Perfect Color-Constancy Actuation in a Color-Vision Model Based on the OSA-UCS System”, 2010).

Design and Optimization of the ProPhoto RGB Color Encodings

This presentation was given by Dr. Geoff Wolfe, Senior Research Manager at Canon Information Systems Research Australia. However, the work it describes was done while he was at Kodak Research Laboratories, in collaboration with Kevin Spaulding and Edward Giorgianni.

ProPhoto RGB was created at a time (late 1990s – early 2000s) when the photographic world was in massive upheaval. In 2000 film sales were around 1 billion rolls/year; this decreased to 20 million by 2010, with an ongoing 20% year-on-year volume reduction. Digital cameras were just starting to become decent: in 1998 most consumer cameras had sensors under 1 megapixel, by 1999 most had 2-megapixel sensors, and resolution continued to increase rapidly. Another interesting trend was digital processing for film; in 1990 the PhotoCD system scanned film to 24-megapixel images which were processed digitally and then printed out to analog film. ProPhoto RGB was intended for a system which took this one step further: optically scanning negatives and then processing as well as printing digitally (today, of course, imaging is digital from start to finish).

During the mid to late 90s there was an increasing awareness that images could exist in different “image states”, characterized by different viewing environments, dynamic ranges, colorimetric aims and intended uses. The simplest example is to classify images as either scene-referred (unrendered) or output-referred (rendered picture or other reproduction). On one hand, scenes are very different from pictures – scenes have 14 stops or so of dynamic range vs. 6-8 stops in a picture, and pictures are viewed in an adaptive viewing environment with a certain white point, luminance, flare, and surround, all of which affect color appearance. On the other hand, a scene and its picture are obviously closely related: the picture should convey the scene appearance. Memory and preference also play a part: people often assess an image against their memory of the scene appearance, which tends to be different (for example, more saturated) from the original scene. Even if they have never seen the original, people tend to prefer slightly oversaturated images.

There are several issues regarding the rendering of the scene into the display image. The first is the dynamic range problem – which 6-8 stops from the scene’s 14 should we keep? The adaptive viewing environment also poses some issues. An “accurate” reproduction of the scene colorimetry looks flat and dull compared to a “pleasing” rendition with adjustments to account for the viewing environment’s effect on perception.

ProPhoto RGB was designed as a related family of encodings allowing both original scenes and rendered pictures to be encoded. The encodings should facilitate rendering from scenes to picture with: common primaries for both scene and picture encoding, suitable working spaces for digital image processing, direct and simple relationships to CIE colorimetry and the ICC profile color space (PCS), and fast and simple transformations to commonly used output color encodings such as sRGB or Adobe RGB.

Since the desire was to have the same primaries for both scene and picture image states, choosing the right primaries was critical. The primaries needed to enable a gamut wide enough to cover all real world surface colors, and all output devices. On the other hand, making the gamut too wide could cause quantization errors (given a fixed bit depth and encoding curve, quantization gets worse with increasing gamut size). The primaries needed to yield the desired white point (D50) when present in equal amounts, and avoid objectionable hue distortions under tonescale operations (more on that below). However, the primaries did not need to be physically realizable; they could be outside the spectral locus.
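The quantization trade-off mentioned above is easy to see numerically: with a fixed bit depth and encoding curve, every widening of the encoded range coarsens the spacing between adjacent code values. A toy illustration (hypothetical linear encoding and made-up gamut widths, just to show the effect):

```python
def quantize(x, lo, hi, bits=8):
    """Round x to the nearest of 2**bits levels spanning [lo, hi]
    and return the reconstructed value."""
    levels = 2 ** bits - 1
    code = round((x - lo) / (hi - lo) * levels)
    return lo + code / levels * (hi - lo)

x = 0.3333
err_narrow = abs(quantize(x, 0.0, 1.0) - x)  # 256 codes over a 1.0-wide range
err_wide = abs(quantize(x, -0.5, 1.5) - x)   # same 256 codes, 2.0-wide range
print(err_narrow < err_wide)  # True: wider range, coarser steps
```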

Regarding tonescale operations: a common image processing operation is to put each channel through a nonlinear curve, for example an S-shaped contrast enhancement curve. Such operations are fast, convenient and generally well-behaved; they also are guaranteed to not go out of the color space’s gamut. However, in the general case, tonescale operations are not hue-preserving, and can result in noticeable hue shifts in natural “highlight to shadow” gradients. These hue shifts are particularly objectionable in skin tones, especially if they shift towards green.
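To see why per-channel curves shift hue, one can push a skin-tone-like RGB triple through a smoothstep-style contrast curve (a stand-in for the kind of tonescale the talk describes; the color values here are made up) and compare hue angles before and after:

```python
import colorsys

def s_curve(x):
    """Smoothstep-style per-channel contrast enhancement curve."""
    return x * x * (3.0 - 2.0 * x)

# A skin-tone-like color (made up for illustration).
rgb = (0.80, 0.55, 0.45)
curved = tuple(s_curve(c) for c in rgb)

hue_before = colorsys.rgb_to_hsv(*rgb)[0] * 360.0    # ≈ 17.1 degrees
hue_after = colorsys.rgb_to_hsv(*curved)[0] * 360.0  # ≈ 19.1 degrees
print(hue_after - hue_before)  # the hue drifts by about 2 degrees,
                               # even though only "contrast" was applied
```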

All these constraints were fed into Matlab and an optimization process was performed to find the final primaries. The hue rotations could not be eliminated, but they were reduced overall and minimized for especially sensitive areas such as skin tones. The final set of ProPhoto primaries was much better in this regard than the sRGB/Rec.709 or Adobe RGB (1998) primaries. Two of the resulting primaries were imaginary (outside the spectral locus), with the third (red) right on the spectral locus.

Besides the primaries and D50 white point, a nonlinear encoding (1/1.8 power with a linear toe segment) was added to create the ROMM (Reference Output Medium Metric) RGB color space, intended for display-referred data. A corresponding RGB space for scene-referred data was also defined: RIMM (Reference Input Medium Metric). RIMM had the same primaries as ROMM but a different encoding (same as Rec.709 but scaled to handle scene values up to 2.0, where 1.0 represents a perfect white diffuse reflector in the scene). An extended dynamic range version of RIMM (ERIMM) was defined as well. ERIMM has a logarithmic encoding curve with a linear toe segment, and can handle scene values up to 316.2 (relative to a white diffuse reflector at 1.0). All spaces can be encoded at 8, 12 or 16 bits per channel, but for ERIMM at least 12 bits are recommended.
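The ROMM nonlinearity described above can be sketched as follows; the constants (a slope-16 linear toe meeting the 1/1.8 power segment at Et = 16^(1.8/(1−1.8)) ≈ 0.00195) are my reading of the ROMM RGB spec, so treat this as a sketch:

```python
ET = 16.0 ** (1.8 / (1.0 - 1.8))  # ≈ 0.00195: where the toe meets the power curve

def romm_encode(x):
    """ROMM RGB nonlinearity: slope-16 linear toe, then a 1/1.8 power.
    Input is linear reflectance, clipped to [0, 1]."""
    x = min(max(x, 0.0), 1.0)
    if x < ET:
        return 16.0 * x
    return x ** (1.0 / 1.8)

# The two segments meet continuously at ET:
print(16.0 * ET, ET ** (1.0 / 1.8))  # both ≈ 0.03125
```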

The original intended usage for this family of color spaces was as follows. First, the negative is scanned and a representation of the original scene values is created in RIMM or ERIMM space. This is known as “unbuilding” the film response – a complex process that needs to account for capture system flare, the distribution of exposure in the different color layers, crosstalk between layers and the film response curve. Digital rendering of the image puts it through a tone scale and goes to the ROMM output space, and is finally turned into a printed picture or displayed image.

Digital cameras tend to have much simpler unbuilding processes – it is straightforward to get scene linear values from the camera RAW sensor values. For this reason, Dr. Wolfe thinks that camera RAW can be an effective replacement for scene-referred encodings such as RIMM/ERIMM, which he claims are now effectively redundant. On the other hand, he found that ROMM / ProPhoto RGB is still used by many photography professionals (and advanced amateurs) for its ability to capture highly saturated objects (such as iridescent bird feathers) and ease of tweaking in Photoshop.

During the Q&A period, several people in the audience challenged Dr. Wolfe’s statement that scene-referred encodings are no longer needed. The Academy of Motion Picture Arts and Sciences (AMPAS) uses a scene-referred encoding in their Image Interchange Format (IIF) because their images come from a variety of sources, including different film stocks as well as various digital cameras. Even for still cameras, a scene-referred type of encoding is needed at least as the internal reference space (e.g. ICC PCS) even if the consumer never sees it.

Adobe RGB: Happy Accidents

This presentation was given by Chris Cox, a Senior Computer Scientist at Adobe Systems who has been working on Photoshop since 1996. It covered the history of the “Adobe RGB (1998)” color space.

In 1997-1998, Adobe was looking into creating ICC profiles that their customers could use with Photoshop’s new color management features. Not many applications had ICC color management at this point, so operating systems didn’t ship with them yet.

Thomas Knoll (the original creator – with his brother John – of Photoshop) was looking for relevant standards and ideas to build ICC profiles around; one of the specifications he found documentation for was the SMPTE 240M standard, which was the precursor to Rec.709. SMPTE 240M looked interesting – its gamut was wider than sRGB’s but not huge, and tagging existing content with it didn’t result in horrid colors. The official standards weren’t available online, and Adobe couldn’t wait to have a paper copy mailed since Photoshop 5 was about to ship, so they got the information from a somewhat official-looking website.

Adobe got highly positive feedback from their customers about the “SMPTE 240M” profile. Users loved the wide gamut and found that color adjustments looked really good in that space and that conversions to and from CMYK worked really well. A lot of books, tutorials and seminars recommended using this profile.

A while after Photoshop 5 shipped, people familiar with the SMPTE 240M spec contacted Adobe and told them that they got it wrong. It turns out that the website they used copied the values from an appendix to the spec which contained idealized primaries, not the actual standard ones. The real SMPTE-240M is a lot closer to sRGB (which Photoshop users didn’t like as a working space). Even worse, Thomas Knoll made a typo copying the red primary chromaticity values so the primaries Photoshop 5 shipped with weren’t even the correct ones from the appendix.

What to do? The profile was wrong in at least two different ways, but the customers REALLY liked it! Adobe tried to improve on the profile in various ways, and built test code to evaluate CMYK conversion quality (which was something the customers especially liked about the “SMPTE 240M” profile) in the new “fixed” profiles.

But no matter what they tried – correcting the red primary, changing the white point from D65 to the theoretically more prepress-friendly D50, widening the primaries, moving the green primary to cover more gamut, etc. – every change made CMYK conversion worse than the “incorrect” profile.

In the end, Adobe decided to keep the profile but change the name. They picked “Adobe RGB” so they wouldn’t have to do a trademark search or get legal approval. The date was added to the profile name since they were sure they would be bringing out a better version soon, and the “Adobe RGB (1998)” profile was shipped in a Photoshop 5 dot release. Adobe kept experimenting, but was never able to improve on the profile. After a while they stopped trying.

Some time later, Kodak visited them to talk about ProPhoto RGB and how it was designed to minimize hue shifts under nonlinear tonescale operations (see previous talk). Adobe realized they had lucked into a color space that just happened to have good behavior in that regard, explaining the good CMYK conversions (which typically suffer from the same issue). Kodak assumed that Adobe had designed their color space like that on purpose.

Recent Work on Archival Color Spaces

This session was presented by Dr. Robert Buckley, formerly a Distinguished Engineer at Xerox, now a scientist at the University of Rochester.

It describes work done in collaboration between the CIE Technical Committee TC8-09 (of which Dr. Buckley is chair) and the Still Image Working Group of the Federal Agencies Digitization Initiative (FADGI).

TC8-09 did a recent study where they sent a set of test pieces to participating institutions to digitize with their usual procedures. The test pieces included four original color prints and three standard targets: X-Rite Digital ColorChecker SG, Image Engineering Universal Test Target (UTT) and the Library of Congress Digital Image Conformance Evaluation (DICE) Object Target. Special sleeves were made for the prints with holes to identify specific regions of interest (ROIs) for measurement. The technical committee members measured CIELAB values for the print ROIs and the standard target patches for later comparison with the results produced by the participating institutions.

Each institution used their usual scanning equipment and procedures; some used digital cameras, others used scanners; they used various profiles (manufacturer or custom) and some post-processed the resulting images.

The best agreement between the institutions’ captures and the measured values was in the cases where digital cameras were used with custom profiles. In general the agreement was better for the targets than for the originals, which isn’t surprising since calibration uses similar targets. The committee concluded that better results would be obtained if the capture devices were calibrated to targets containing colors more representative of the content being captured (which is not the case for the standard targets).

Besides evaluating the various capture protocols, TC8-09 also wanted to establish which color space is best to use for image archiving. The gamut should of course include all the colors in the archived documents, but it should not be larger than necessary to avoid quantization artifacts. Specifically, if 8 bits per channel are used (which is common) then the gamut shouldn’t be much wider than sRGB. In practice, most of the material (with a few exceptions, such as a color plate in a book on gems) fit easily in the sRGB gamut.

Modern Display Technologies: Is sRGB Still Relevant?

This session was presented by Tom Lianza, “Corporate Free Electron” at X-Rite and Chair of the International Color Consortium (ICC).

One of sRGB’s main strengths is the fact that the primary chromaticities are the same as Rec.709 (and the two tone reproduction curves, while not identical, do have similarities). These similarities have led to the easy mixing of motion and still images in many different environments. The Rec.709 primaries were based on CRT primaries – at the time it was not clear whether they could be realized in flat-panel displays, but the standard pushed the manufacturers to make sure they did.

One of the goals of any color space is to reproduce the Pointer gamut of real-world surface colors. Unfortunately, there are cyans in this gamut that will be a problem for pretty much any physically realizable RGB system.

An output referred color space will always require some specification of ambient conditions. This is needed for effective perceptual encoding.

A missing element in many color spaces is a hard definition of black (Adobe RGB is one of the few that does have an encoding specification for black). The lack of this definition leads to interoperability issues, and to non-uniform rendering in practice. The ICC is now moving black point compensation into ISO to be considered as a standard, which would allow more vendors to use it (Adobe currently has an algorithm which its products use).
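For readers unfamiliar with black point compensation: it is essentially a linear rescaling that maps the source black point to the destination black point while leaving white fixed. A sketch of that general shape (illustrative luminance values normalized so white = 1.0; this is the general idea, not Adobe’s exact published algorithm):

```python
def bpc(y, src_black=0.005, dst_black=0.02, white=1.0):
    """Map source black to destination black, keeping white fixed.
    The black-point values here are made up for illustration."""
    scale = (white - dst_black) / (white - src_black)
    return scale * y + white * (1.0 - scale)

print(bpc(0.005))  # source black lands on the destination black, 0.02
print(bpc(1.0))    # white is unchanged: 1.0
```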

All commonly used display technologies (including the iPhone screen, which has a really small gamut) encompass the Bartleson memory colors (“Memory Colors of Familiar Objects”, 1960). This explains why people find them all acceptable, although they vary greatly in gamut size and none of them covers the Pointer gamut completely.

Viewing conditions for sRGB are well defined but the assumptions of low-luminance displays viewed in low ambient lighting do not reflect how people view images today.

Cameras are not (and should not be) colorimeters. They do not use sRGB as a precise encoding curve (most cameras reproduce images with a relative gamma of 1.2-1.3 vs. the sRGB encoding curve, to account for low viewing luminance). Instead, cameras are designed to produce good images when viewed on an sRGB display – having a common target guides the different manufacturers to similar solutions. As an example, Mr. Lianza showed a scene with highly out-of-gamut colors, photographed with automatic white balancing on cameras from different vendors. There is no standard for handling out-of-gamut colors, but nevertheless all the cameras produced very similar images. This is because the critical visual evaluations of these cameras’ algorithms were all done on the same (sRGB) displays.

Browsers have various issues with color management. ICC has a test page which can be used to see if a browser handles ICC version 4 profiles properly. Chrome does not have color management and shows the entire page poorly. Firefox shows the ICC version 2 profile test correctly, but not the ICC version 4 test. Safari has good color management and shows all images well, but not when printing.

Conclusions: sRGB is robust and can be used to reproduce a wide range of real-world and memory colors. The existence of the specification, coupled with physically realizable displays, makes the application of the spec quite uniform in the industries that use it. The lack of a black point specification and the low-luminance assumption have caused manufacturers to apply compensation to the images which may not work well at the higher luminances encountered in mobile environments. It may be possible to tweak the spec for higher-luminance situations, but any wholesale changes would have a very bad effect on the marketplace due to the huge amount of legacy content. The challenge to sRGB in the 21st century comes from disruptive display technologies and implementations that allow for simultaneous display of sRGB and wide-gamut images on the same media at high luminance and high ambient conditions.

Question from the audience: most mobile products don’t have color management, and this is a core issue now. Answer: ICC is splitting into three groups. ICC version 4 is staying stable to address current applications, the “ICC Labs” open-source project is intended for advanced applications, and there will be a separate project to establish a solution for the web and mobile (there is a current discussion regarding adding a new working group for mobile hardware).

Device-Independent Imaging System for High-Fidelity Colors

This session was presented by Dr. Akiko Yoshida from SHARP. It describes the same system that SHARP presented at SIGGRAPH 2011 (there was a talk about the system, and the system itself was shown in Emerging Technologies).

The system comprises a wide-gamut camera (which colorimetrically captures the entire human visual range of colors) and a 5-primary display with a gamut that includes 99% of Pointer’s real-world surface colors.

The camera they developed has sensor sensitivities that satisfy the Luther-Ives condition: the sensitivity curves are a linear combination of cone fundamentals (or equivalently, of the appropriate color-matching functions). This is the first digital camera to satisfy this condition. It is fully colorimetric, measuring the Macbeth ColorChecker chart with an accuracy of about 0.27 ΔE.
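The Luther-Ives condition can be checked numerically: a sensor satisfies it if each sensitivity curve lies in the linear span of the color-matching functions, which a least-squares fit reveals via its residual. A sketch with synthetic curves (a real test would use measured sensitivities and the CIE CMFs; `satisfies_luther_ives` is a hypothetical helper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for color-matching functions: 3 curves sampled at 31 wavelengths.
cmfs = rng.random((31, 3))

def satisfies_luther_ives(sensitivities, cmfs, tol=1e-8):
    """True if every sensor curve lies (to tolerance) in the linear
    span of the color-matching functions."""
    # Least-squares fit of a 3x3 mixing matrix M: cmfs @ M ≈ sensitivities.
    M, *_ = np.linalg.lstsq(cmfs, sensitivities, rcond=None)
    return np.linalg.norm(cmfs @ M - sensitivities) < tol

good = cmfs @ np.array([[0.9, 0.1, 0.0],
                        [0.05, 0.8, 0.1],
                        [0.0, 0.1, 1.0]])  # an exact linear combination
bad = good + 0.05 * rng.random((31, 3))    # perturbed away from the span

print(satisfies_luther_ives(good, cmfs))  # True
print(satisfies_luther_ives(bad, cmfs))   # False
```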

Today’s display systems cannot display many colors found in daily life, as can be seen by comparing their gamuts to the Pointer surface color gamut (“The Gamut of Real Surface Colors”, 1980). Although the Pointer gamut is relatively small compared to the gamut of human vision, it cannot be efficiently covered with three RGB primaries. SHARP set a goal to reproduce real-world surface colors faithfully and efficiently with a five-primary system (“QuintPixel”) including RGB plus yellow and cyan. QuintPixel actually has six subpixels for each pixel – the red subpixel appears twice. This was necessary to get adequate coverage of reds. This display can efficiently reproduce 99.9% of Pointer’s gamut.

Why not just extend the three primaries? Mitsubishi has rear-projection laser TVs with really wide RGB gamuts. The reason SHARP didn’t take this approach is efficiency – the gamut is much larger than it needs to be. Another advantage of adding primaries is color reproduction redundancy, which can be exploited to have brighter reproduction at the same power consumption, lower power consumption with the same brightness, or improved viewing angle. The larger number of sub-pixels can also be used to greatly increase resolution (similarly to Microsoft’s “ClearType” technology). These advantages can be realized without losing the wide gamut.

The camera sends 10-bit XYZ signals at 30Hz to the display via the CameraLink protocol. The display does temporal up-conversion from 30 to 60 Hz as well as interpreting the XYZ signal.

Q&A Session:

Question: Is the colorimetric camera available for purchase? Answer: yes, for 1M yen (about $13,000).

Question: 10 bits are not enough for XYZ, are they planning to address this? Answer: yes, they do plan to increase the bit-depth.

Question: what is the display resolution? Answer: They use a 4K panel and combine two pixels into one, cutting the resolution in half.

Is There Really Such a Thing As Color Space? Foundation of Uni-Dimensional Appearance Spaces

This talk was presented by Prof. Mark D. Fairchild, from the Munsell Color Science Laboratory at the Rochester Institute of Technology.

Color is an attribute of visual sensation – not physical values. Color scientists seldom question the 3D nature of color space, but Prof. Fairchild thinks that it is more correct to think about color as a series of one-dimensional appearance spaces or scales, and not to try to link them together.

Color vision is only part of the visual sense, which is itself just one of five senses. Only in color vision is a multidimensional space commonly used to describe perception. All the other senses are described with multiple independent dimensions as appropriate, not with multi-dimensional Euclidean differences.

For example, taste has at least five separate scales: sweet, bitter, sour, salty, and umami. But there is no definition of “delta-Taste” which collapses taste differences into a single number. Smell has about 1000 different receptor types, and some have tried to reduce the dimensionality to about six such as flowery, foul, fruity, spicy, burnt, and resinous. Hearing is spectral – our ears can perceive the spectral power distribution of the sound. Touch might well be too complex to summarize in a single sentence.

Why should color vision be different? Perhaps researchers have been misled by certain properties of color vision such as low-dimensional color matching and simple perceptual relationships such as color opponency. The 3×3 linear transformations between color matching spaces really reinforce the feeling of a three-dimensional color space, but they have nothing to do with perception. Color scientists have spent a lot of effort looking for the “holy grail” of a global 3D color appearance space with Euclidean differences, to no avail.

Perhaps this is misguided and efforts should focus on a set of 1D scales instead. There have been examples of such scales in color science. The Munsell system has separate hue, value and chroma dimensions. Similarly, Guth’s ATD model of visual perception was typically described in terms of independent dimensions. Color appearance models such as CIECAM02 were developed with independent predictors of the perceptual dimensions of brightness, lightness, colorfulness, saturation, chroma, and hue. This was compromised by requests for rectangular color space dimensions which appeared as CIECAM97s evolved to CIECAM02. The NCS system treats hue separately from whiteness-blackness and chromaticness, though it does plot the latter two as a two dimensional space for each hue.

This insight leads to the hypothesis that perhaps color space is best expressed as a set of 1D appearance spaces (scales), rather than a 3D space, and that difference metrics can be effective on these separate scales (but not on combinations of them). The three fundamental appearance attributes for related colors are lightness, saturation, and hue. Combined with information on absolute luminance, colorfulness and brightness can be derived from these and are important and useful appearance attributes. Lastly, chroma can be derived from saturation and lightness if desired as an alternative relative colorfulness metric.

Prof. Fairchild has derived a set of color appearance dimensions following these principles. The first step is to apply a chromatic adaptation model to compute corresponding colors for reference viewing conditions (D65 white point, 315 cd/m2 peak luminance, 1000 lux ambient lighting). Then the IPT model is used to compute a hue angle (h) and then a hue composition (H) can be computed based on NCS. For the defined hue, saturation (S) is computed using the classical formula for excitation purity applied in the u’v’ chromaticity diagram. For that chromaticity, G0 is defined as the reference for lightness (L) computations that follow a “power plus offset” (sigmoid) function. Brightness (B) is Lightness (L) scaled by the Stevens and Stevens terminal brightness factor. Colorfulness (C) is Saturation (S) scaled by Brightness (B), and Chroma (Ch) is Saturation (S) times Lightness (L).
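The talk did not give the full formulas (those are promised in the forthcoming detailed formulation), but the relationships among the derived attributes are simple products. A sketch encoding just the stated relationships; the function name and the stevens_factor placeholder argument are my own:

```python
def derived_attributes(L, S, stevens_factor):
    """Derive brightness, colorfulness and chroma from lightness (L)
    and saturation (S), per the relationships stated in the talk.
    stevens_factor stands in for the Stevens and Stevens terminal
    brightness factor, which depends on the absolute luminance."""
    B = L * stevens_factor   # Brightness: lightness scaled by the factor
    C = S * B                # Colorfulness: saturation scaled by brightness
    Ch = S * L               # Chroma: saturation times lightness
    return B, C, Ch
```

The point of the structure is that only L, S, and hue are fundamental; the remaining attributes fall out once absolute luminance is known.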

Prof. Fairchild plans to present his detailed formulation soon, and do testing and refinement afterwards.

HDR and UCS: Do HDR Techniques Require a New UCS Space?

This session was presented by Prof. Alessandro Rizzi from the Department of Information Science and Communication at the University of Milan. There was some overlap between this session and the “HDR Imaging in Cameras, Displays and Human Vision” course which Prof. Rizzi presented earlier in the week.

Colorimetry ends in the retinal cone outer segments; color appearance is at the other end of the human visual system. Appearance incorporates all the spatial processing of all the color responsive neurons. Thus color vision can be analyzed in two ways: bottom-up starting from the color matching response of retinal receptors accounting for pre-retinal absorption and glare (going through color matching tests, e.g. the CIE 1931 observer) or top-down starting from the color appearance generated by the entire human visual system (asking observers to describe the apparent distances between hues, chromas and lightnesses, e.g. the Munsell color space).

Recent work (“A Quantitative Model for Transforming Reflectance Spectra Into the Munsell Color Space Using Cone Sensitivity Functions and Opponent Process Weights”, 2003) has linked the two, solving for the 3-D color space transform that places LMS cone responses in the color-space positions measured for the Munsell Book of Color. The process includes a correction for veiling glare inside the eye, which causes the image on the retina to be different than the original scene intensities entering the cornea. The cone response is proportional to the logarithm of the retinal intensities, which (because of glare) is proportional to the cube root of scene intensities. This glare also limits the dynamic range of the retinal image. The link between cone responses and Munsell colors also involves a strong color-opponent process (creating signals differentiating opponent colors such as red-green or yellow-blue).

CIE L*a*b* also has a cube root response and opponent channel mechanism. L*a*b* handles the lightness component of HDR scenes with a two-component compression curve – the first component is a cube-root function in both lightness and chroma for high and medium light levels, and the second is a linear function for low light levels (the two components connect seamlessly). The sRGB and Rec.709 transfer functions are similarly constructed. CIE L*a*b* normalizes each of X, Y and Z to its maximum value over the image before further processing; this is equivalent to the way human vision effectively normalizes L, M and S cone responses (it processes differentials/ratios and not absolute values, as in Retinex theory). After normalization, the compression curve scales the large range of possible radiances into a limited range of appearances – 99% of possible lightnesses correspond to the top 1000:1 range of scene radiances – all remaining radiances (darker than 1/1000 of the white point) correspond to the bottom 1% of possible perceived lightness values. sRGB has similar behavior.
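The two-component L* curve is easy to write down, and it directly illustrates the 1000:1 claim: an input at 1/1000 of the white point lands at roughly L* = 0.9, i.e. in the bottom 1% of the lightness range. A minimal sketch using the standard CIELAB constants:

```python
def lab_lightness(Y, Yn=1.0):
    """CIE L* lightness: cube-root branch for medium/high levels and a
    linear branch below t = (6/29)^3, joined with matching value and
    slope so the two components connect seamlessly."""
    t = Y / Yn
    delta = 6.0 / 29.0
    if t > delta ** 3:
        f = t ** (1.0 / 3.0)            # compressive cube-root branch
    else:
        f = t / (3.0 * delta ** 2) + 4.0 / 29.0   # linear branch
    return 116.0 * f - 16.0
```

For example, lab_lightness(1.0) is 100, while lab_lightness(0.001) is about 0.9 – everything darker than 1/1000 of white is squeezed into that last 1%.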

Given these considerations, Prof. Rizzi does not believe that new uniform color spaces (UCSs) are needed for HDR imaging; existing spaces can handle the range that the human eye can perceive in a single scene (note that this analysis does not relate to intermediate images, such as HDR IBL – UCSs are only used to describe the perceived colors in the final viewed image).

Digital HDR Color Separations

This session was presented by John McCann, an independent color and imaging consultant since 1996. Previously he led Polaroid’s Vision Research Laboratory for over 30 years, working on topics including Retinex theory, color constancy, very large-format photography, and perceptually-guided color reproduction. John is a co-author of the recently published book “The Art and Science of HDR Imaging”.

Many applications (HDR exposure bracketing, various computer vision and spatial image processing algorithms) need linear light scene values. The JPEGs produced by cameras are very far from linear light; they are images created with the intention of creating a preferred rendering of the scene, which looks pleasing and is not colorimetrically accurate. Regular color print & negative film were designed with a similar intent and produce similar results.

Although the sRGB standard specifies an encoding from scene values, and camera manufacturers follow some aspects of the sRGB standard in producing JPEGs, the processing differs in important ways from the sRGB encoding spec. The algorithms that perform demosaicing, color balance, color enhancement, tone scale, and the post-LUT for display and printing create discrepancies between the sRGB output in practice and an idealized conversion of scene radiances to sRGB space.

Together with Vassilios Vonikakis (Democritus University of Thrace, Greece), John McCann did an experiment to measure these discrepancies. Images of a Macbeth ColorChecker chart were taken under varying exposures using three methods: digitally scanned traditional color separation photographs, standard JPEG images from a commercial camera, and “RAW* separations” from the same camera. Traditional color separation photographs use R, G and B filters and panchromatic black and white film to create separate single-channel R, G and B images that are combined into a single color image. “RAW* separations” is the authors’ term for linear RGB values that were generated from partially processed RAW camera data (read with LibRaw’s “unprocessed” function). This data does not even include demosaicing – it is a black and white image with the mosaic pattern (e.g., Bayer) in it. The authors did their own, carefully calibrated processing on these images to create normalized, linear RGB data.

The photographic separations were most correct – the chromaticity of the Macbeth chart squares remained very stable across all the exposure values. The JPEG image had the largest chroma errors – the chromaticities of the colored Macbeth squares varied greatly with exposure – this is part of the “preferred rendering” performed by these cameras to make the resulting image look good. The RAW* separations were similar to film (slightly less stable chromaticities, but close).

The conclusion is that for any algorithm that needs linear scene data, it is important to use RAW data where most of the processing has been turned off and do carefully calibrated processing.

2011 Color and Imaging Conference, Part V: Papers

Papers are the “main event” at CIC. Unlike the papers at computer science conferences (which are indistinguishable from journal papers), CIC papers appear to be focused more on “work in progress” and “inspiring ideas”. This stands in contrast to the work published in color and imaging journals such as the Journal of Imaging Science and Technology or the Journal of Electronic Imaging. This distinction is actually the norm in most fields – computer science is atypical in that respect.

Note that since CIC is single-track, I was able to see (and describe in this post) all the papers, including some that aren’t as relevant to readers of this blog.

Root-Polynomial Colour Correction

Images from digital cameras need to be color-corrected, since they typically have sensors which cannot be easily mapped to device-independent color-matching functions.

The simplest mapping is a linear transform (matrix), which can be obtained by taking photos of known color targets. However this assumes that the camera spectral sensitivities are linear combinations of the device-independent ones, which is not the case.

Polynomial color correction is another option which can reduce the error of the linear mapping by extending it with additional polynomials of increasing degree. However, polynomial color correction is not scale-independent – there is a chromaticity shift when intensity changes (e.g. based on lighting). This shift can be quite dramatic in some cases.

This paper proposes a new method: root-polynomial color correction. It is very straightforward: simply take the nth root of each nth order term in the extended polynomial vector. Besides restoring scale-independence, the vector also becomes smaller since some of the terms now become the same (e.g. sqrt(r*r) = r).

Experiments showed that with fixed illumination, root-polynomial color correction performed similarly to higher-order polynomial correction. It performed much better if the illumination level changes, even slightly. A large improvement is achieved by adding only three terms to the linear model, so this technique provides very good bang for buck.
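A sketch of the second-order case, illustrating both the collapsed term count (six terms rather than the nine of a full second-order polynomial expansion, since sqrt(r*r) = r folds back into the linear terms) and the scale-independence property. The function names and the least-squares fitting step are my own illustration, not the paper’s code:

```python
import numpy as np

def root_poly_expand(rgb):
    """Second-order root-polynomial expansion of a camera RGB triplet.
    Each degree-2 term is taken to the 1/2 power, which restores scale
    independence: expand(k * rgb) == k * expand(rgb)."""
    r, g, b = rgb
    return np.array([r, g, b,
                     np.sqrt(r * g), np.sqrt(g * b), np.sqrt(r * b)])

def fit_correction(camera_rgbs, target_values):
    """Least-squares fit of a 3x6 correction matrix mapping expanded
    camera responses to reference values (e.g. a measured chart)."""
    A = np.array([root_poly_expand(c) for c in camera_rgbs])   # (N, 6)
    M, *_ = np.linalg.lstsq(A, np.array(target_values), rcond=None)
    return M.T   # (3, 6)
```

The scale-independence is what keeps chromaticity stable when the illumination level changes, which plain polynomial correction cannot guarantee.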

Tone Reproduction and Color Appearance Modeling: Two Sides of the Same Coin?

This invited paper was written and presented by Erik Reinhard (University of Bristol), who has done some very influential work on tone mapping for computer graphics and has also co-authored some good books on HDR and color imaging.

Tone mapping or tone reproduction typically refers to luminance compression (often sigmoidal), intended to map high-dynamic range images onto low-dynamic range displays. This can be spatially varying or global over the image. However, tone mapping typically does not take account of color issues – most tone mapping operators work on the luminance channel and the final color is reconstructed via various ad-hoc methods – the two most popular ones are by Schlick (“Quantization Techniques for Visualization of High Dynamic Range Pictures”, 1994) and Mantiuk (“Color Correction for Tone Mapping”, 2009). They do not take account of the various luminance-induced appearance phenomena that have been identified over the years: the Hunt effect (perceived colorfulness increases with luminance), the Stevens effect (perceived contrast increases with luminance), the Helmholtz-Kohlrausch effect (perceived brightness increases with saturation for certain hues), and the Bezold-Brücke effect (perceived hue shifts based on luminance).

Color appearance models attempt to predict the perception of color under different illumination conditions. They include chromatic adaptation, non-linear range compression (often sigmoidal), and other features used to compute appearance correlates. They are designed to take account of effects such as the ones mentioned in the previous paragraph, but most of them do not handle high dynamic range images (there are some exceptions, such as iCAM and the model presented in the 2009 SIGGRAPH paper “Modeling Human Color Perception Under Extended Luminance Levels”).

Tone mapping and color appearance models appear to have important functional similarities, and their aims partially overlap. The paper was written to show opportunities to construct a combined tone reproduction and color appearance model that can serve as a basis for predictive color management under a wide range of illumination conditions.

Tone mapping operators tend to range-compress luminance and ignore color. Color appearance models tend to identically range-compress individual color channels (typically in a sharpened cone space) and do separate chromatic adaptation. A recent color appearance model by Erik and others (“A Neurophysiology-Inspired Steady-State Color Appearance Model”, 2009) combines chromatic adaptation and range compression into the same step (basically doing different range compression on each channel), which Erik sees as a step towards unifying the two approaches.

Another recent step towards unifying the two can be seen in HDR extensions to color spaces (“hdr-CIELAB and hdr-IPT: Simple Methods for Describing the Color of High-Dynamic Range and Wide-Color-Gamut Images”, 2010) which replace the compressive power function with sigmoid curves. A similar approach was taken for HDR color appearance modeling (“Modeling Human Color Perception Under Extended Luminance Levels”, 2009). Image appearance models such as iCAM and iCAM06 incorporate HDR in a different way, taking account of spatial adaptation.

Some of the most successful tone mapping operators are based on neurophysiology, but put the resulting “perceived” values into a frame buffer. This is theoretically wrong, but looks good in practice. Color appearance models instead run the model in reverse from the perception correlates to display intensities (with the display properties and viewing conditions). This is theoretically more correct, but in practice tends to yield poor tone mapping since the two sigmoid curves (one run forward, one in reverse) tend to cancel out, undoing a lot of the range compression. An ad-hoc way to combine the strengths of both approaches (the color management of color appearance models and the range compression of tone mapping operators) is to run a color appearance model on an HDR image, then reset the luminance to retain only the chromatic adjustments and compress luminance via a tone mapping operator. However, it is hoped that the recent work mentioned above (combining chromatic adaptation & range compression, sigmoidally compressed HDR color spaces, and HDR color appearance models such as iCAM) can be built upon to form a more principled unification of tone mapping and color appearance modeling.

Real-Time Multispectral Rendering with Complex Illumination

Somewhat unusually for this conference, this paper was about a computer graphics real-time rendering system. The relevance comes from the fact that it was a multispectral real-time rendering pipeline.

RGB rendering is used almost exclusively in industry applications; however, it is an approximation. Although three numbers are enough to describe the final rendered color, they are not enough in principle to compute light-material interactions, which can be affected by metameric errors.

The authors wanted their pipeline to support complex real world illumination (image-based lighting – IBL), while still allowing for interactive (real-time) rendering. They used Filtered Importance Sampling (see “Real-time Shading with Filtered Importance Sampling”, EGSR 2008) to produce realistic (Ward) BRDF interactions with IBL.

The implementation was in OpenGL, using 6 spectral channels so they could use pairs of RGB textures for reflectance and illumination, two RGB render targets, etc. After rendering each frame, the 6-channel data was transformed first to XYZ and then to the display space, optionally using a chromatic adaptation transform.

The reflectance data was taken from spectral reflectance databases and the spectral IBL was captured by removing the IR filter from a Canon 60D camera and taking bracketed-exposure images of a stainless steel sphere with two different spectral filters.

The underlying mathematical approach was to use a set of six spectral basis functions and multiply their coefficients for light-material interactions, as in the work of Drew and Finlayson (“Multispectral Processing Without Spectra”, 2003). However, the authors found a new set of optimized basis functions (primaries), optimized to minimize error for a set of illuminants and reflectances.
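The per-pixel shading math reduces to a component-wise multiply of basis coefficients followed by a projection to the display. A hedged sketch (the coefficient vectors and the 3×6 projection matrix are placeholders; the real system derives them from the optimized basis functions, and runs this in OpenGL rather than on the CPU):

```python
import numpy as np

def shade_six_channel(reflectance6, illumination6, M_6_to_xyz):
    """Light-material interaction in a 6-coefficient spectral basis:
    multiply reflectance and illumination coefficients per channel
    (as in Drew and Finlayson's approach), then project the result
    to XYZ for conversion to display space.

    reflectance6, illumination6: arrays of 6 basis coefficients.
    M_6_to_xyz: (3, 6) projection matrix derived from the basis."""
    interacted = reflectance6 * illumination6   # component-wise multiply
    return M_6_to_xyz @ interacted
```

The choice of six channels is also what makes the GPU mapping convenient: the coefficients pack exactly into pairs of RGB textures and render targets.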

The authors compared the analysis of their results with best-of-class three-channel methods such as the one described in the EGSR 2002 paper “Picture Perfect RGB rendering using Spectral Prefiltering and Sharp Color Primaries”. The results of the six-channel method were visibly closer to the ground truth (the RGB rendering had quite noticeable color errors in certain cases).

Choosing Optimal Wavelengths for Colour Laser Scanners

Monochrome laser scanners are widely used to capture geometry but are incapable of capturing color information. Color laser scanners are a popular choice since they capture geometry and color at the same time, avoiding the need for a separate color capturing system as well as the registration issues involved in combining disparate sources of data. These scanners scan three lasers (red, green, blue) to simultaneously obtain XYZ coordinates as well as RGB reflectance.

However, laser scanners are effectively point-sampling the spectral reflectance at three wavelengths, which is known to be a highly inaccurate method, prone to metamerism. Also, the three wavelengths typically used (635nm, 532nm, and 473nm for the Arius scanner – similar wavelengths for other scanners) are chosen for reasons unrelated to colorimetric accuracy.

The authors of this paper did a brute-force optimization process to find the three best wavelengths for minimizing colorimetric error in color laser scanners. They found that the same three wavelengths (460nm, 535nm, and 600nm) kept popping up, regardless of the reflectance dataset, the difference metric, or any other variation in the optimization process. The errors using these wavelengths were much lower than with the wavelengths currently used by the laser scanners – the color rendering index (CRI) improved from 48 to 75 (out of a 0-100 scale). Interestingly, adding a fourth and fifth wavelength gave no improvement at all.

Since these wavelengths are independent of the color space, difference metric and sample set, they must be associated with a fundamental property of human vision. These wavelengths are very close to the ‘prime colors’ (approximately 450nm, 530nm, and 610nm) identified in 1971 by Thornton (“Luminosity and Color-Rendering Capability of White Light”) as the wavelengths of peak visual sensitivity. These wavelengths were later shown (also by Thornton) to have the largest possible tristimulus gamut (assuming constant power), and are therefore optimal as the dominant wavelengths of display primaries. The significance of these wavelengths can be understood by applying Gram-Schmidt orthonormalization to the human color-matching functions (with the luminance function as the first axis) – the maxima and minima of the two chromatic orthonormal color matching functions line up along these three wavelengths. In other words, these wavelengths produce the maximal excitation of the opponent color channels in the retina.
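The orthonormalization step described above can be sketched as follows (assuming cmfs is a 3×N array of sampled color-matching functions with the luminance function in row 0 so it becomes the first axis; loading actual CMF data is left out):

```python
import numpy as np

def orthonormal_cmfs(cmfs):
    """Gram-Schmidt orthonormalization of color-matching functions,
    shape (3, N), taking the rows in order so the luminance function
    (row 0) becomes the first axis.  The extrema of the two remaining
    (chromatic) rows mark the 'prime color' wavelengths."""
    basis = []
    for row in cmfs.astype(float):
        v = row.copy()
        for b in basis:
            v -= (v @ b) * b          # remove projection onto earlier axes
        basis.append(v / np.linalg.norm(v))
    return np.array(basis)
```

With real CMF data, np.argmax / np.argmin on rows 1 and 2 of the result would locate the wavelengths of maximal opponent-channel excitation.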

These results are applicable not just to laser scanners but also to regular (broadband-filter) cameras and scanners, in guiding the dominant wavelengths of the spectral sensitivity functions.

(How) Do Observer Categories Based on Color Matching Functions Affect the Perception of Small Color Differences?

The CIE 2° and 10° standard observers that underlie a lot of color science are well-understood to be averages; people with normal color vision are expected to deviate from these to some extent. There is even a CIE standard as to the expected variation (the somewhat amusingly-named CIE Standard Deviate Observer). However, this does not say how human observers are distributed – are variations essentially random, or are people grouped into clusters defined by their color vision? In last year’s conference, a paper was presented which demonstrated that humans can be classified into one of seven well-defined color vision groups. This paper is a follow-on to that work, which attempts to discover if observers’ ability to detect small color differences depends on the group they belong to.

It turns out that it does, which opens up some interesting questions. Does it make sense to customize color difference equations and uniform color spaces to each category? Modern displays with their narrow-band primaries tend to exaggerate observer differences, so it might be a good time to explore more precise modeling of observer variation.

A Study on Spectral Response for Dichromatic Vision

Dichromats are people who suffer from a particular kind of color blindness; they only have two types of functional cones. Previous work has dealt with projections from 3D to 2D space but didn’t deal with spectral analysis; this work aims to remedy that. The study looked at three types of dichromats (each missing a different cone type), classified visible and lost spectra for each, and validated certain previous work.

Saliency as Compact Regions for Local Image Enhancement

The goal of this paper is to improve the subjective quality of photographs (taken by untrained photographers) by finding and enhancing their most salient (visually important) features.

It was previously found that people highly prefer images with high salience (prominence), where a region is highly distinct from its background. However untrained photographers often capture images without salient regions. It would be desirable to find an automated way to increase salience, but salience is very difficult to predict for general images.

This paper sidesteps the problem by finding an easier-to-measure correlate – spatial compactness (a certain property is spatially compact if it is concentrated in a relatively small area). The idea is to look at the distribution of pixels with certain low-level attributes such as opponent hues, luminance, sharpness, etc. If the distribution is highly compact (peaked), then there is probably high saliency there and enhancing that attribute will make the photograph look better. There are a few additional tweaks (small objects are filtered out, and regions closer to the center of the screen are considered more important) but that is the gist. The enhancements they did are relatively modest (5-10% increase in the most salient attribute). The results were surprisingly strong: 91% of people preferred the modified image (which is quite an achievement in the field of automatic image enhancement).

The Perception of Chromatic Noise on Different Colors

Pixel size on CMOS sensors is steadily decreasing as pixel count increases, and appears set to continue doing so based on camera manufacturer roadmaps. This increases the likelihood of noisy images; noise reduction filters (e.g. bilateral filters) are becoming more important. Tuning these filters correctly depends on a good model for noise perception. Previous work has shown that the perception of chromatic noise (noise which does not vary luminance) depends on patch color; this study was done to further explore this and to attempt an explanation.

It was found that the perception of chromatic noise was weakest when the noise was added to a grey patch, and strongest when the noise was added to a purple, blue, or cyan patch. Orange, yellow and green patches were in the middle.

Further experiments implied that these differences could be due to the Helmholtz-Kohlrausch (H-K) effect, which causes chromatic stimuli of certain hues to appear brighter than white stimuli of the same luminance. Due to this effect, the chromatic noise on certain patches was partially perceived as brightness noise, which has higher spatial visual resolution.

Predicting Image Differences Based on Image-Difference Features

Image-difference measures are important for estimating (as a guide to reducing) distortions caused by various image processing algorithms. Many commonly used measures only take into account the lightness component, which makes them useless for applications such as gamut mapping where color distortions are critical. This paper takes a new approach, by combining many simple image-difference features (IDFs) in parallel (similar to how the human visual cortex works). The authors took a large starting set of IDFs, and (using a database of training images) isolated a combination of IDFs that best matched subjective assessments of image difference.

Comparing a Pair of Paired Comparison Experiments – Examining the Validity of Web-Based Psychophysics

Paired comparison experiments are fairly common in color science, but it is difficult to get enough observers. Some attempts have been made to do experiments over the web; this could greatly increase observer count, but has several issues (uncalibrated conditions, varying screen resolutions, applications like f.lux that vary color temperature as a function of time, etc.).

This paper describes a “meta-experiment” meant to determine the accuracy of web experiments vs. those conducted in a lab.

The correlation between web and lab experiments appears to be poor. That’s not to say that the data gained is not useful; when working on consumer applications, results are typically viewed in uncontrolled conditions. The web experiment performed for this paper ended up not having many participants and had a few other issues (bad presentation design, etc.)

The authors are now doing a second web experiment which has had a lot more participants and better correlation to the lab experiment. They hope to come back next year with a paper on why this second experiment was more successful.

Recent Development in Color Rendering Indices and Their Impacts in Viewing Graphic Printed Materials

Background information on color rendering indices can be found in the “Lighting: Characterization & Visual Quality” course description in my previous blog post. The current CIE standard (CIE-Ra) has several acknowledged faults (use of obsolete metrics such as the von Kries chromatic adaptation transform and the U*V*W* color space, low saturation of the test samples). A CIE technical committee (TC 1-69) was started in late 2006 to investigate methods that would work with new light sources including solid state/LED; this paper reports on the current status of their work.

There have been several proposals for color rendering indices. The current front-runner is based on the CAM02-UCS uniform color space (itself based on the CIECAM02 color appearance model). Various test sample sets were evaluated. The committee currently has a set of 273 samples primarily selected from the University of Leeds dataset (which contains over 100,000 measured reflectance spectra), and is working on reducing it to around 200. The color difference weighting method and scaling factors were also adjusted. Finally, the new index was compared with several others in a typical graphic art setting (common CMY ink set and 58 different D50-simulating lighting sources), and was found to perform well.

Memory Color Based Assessment of the Color Quality of White Light Sources

Although color rendering indices such as the one discussed in the previous paper are needed for professional applications where color fidelity is important, for home and retail lighting color fidelity is not necessarily the most desirable lighting property; instead, lights that make colors appear more “vibrant” or “natural” may be preferred. Recognizing this, very recently (July 2011) a new CIE technical committee (TC1-87) was formed to investigate an assessment index more suitable for home and retail applications.

Many such metrics have been proposed over the years, most of which use a Planckian or daylight illuminant as an optimal reference. However, some light sources produce more preferred color renditions than these reference illuminants. This paper focuses on an attempt to define color quality without the need for a reference illuminant.

The approach is based on “memory colors” – the colors that people remember for certain familiar objects. The theory is that if a light source renders familiar objects close to their memory colors, people will prefer it. Experiments were performed in which the apparent color of 10 familiar objects was varied; observers selected the preferred color, and the effect of deviating from it in different directions was also measured (e.g., whether changing saturation relative to the preferred color is perceived as worse than changing the brightness). This data was fit to bivariate Gaussians in IPT color space to produce individual metrics for each object. The geometric mean of these was rescaled to a 0-100 range, with the F4 illuminant at 50 (which is also its score in the CIE-Ra metric) and D65 at 90 (D65 is a reference illuminant in CIE-Ra, but was found to be non-optimal for memory color rendition).
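
The construction described above can be sketched numerically as follows; the object prototypes, Gaussian parameters, and the linear rescaling function are all hypothetical stand-ins for the paper's fitted values:

```python
import math

def similarity(ab, mean, cov):
    """Bivariate-Gaussian similarity of a rendered color to a memory-color
    prototype (coordinates would be chroma-plane values in IPT)."""
    da, db = ab[0] - mean[0], ab[1] - mean[1]
    (c11, c12), (c21, c22) = cov
    det = c11 * c22 - c12 * c21
    # Mahalanobis distance squared, using the closed-form 2x2 inverse covariance
    m2 = (c22 * da * da - (c12 + c21) * da * db + c11 * db * db) / det
    return math.exp(-0.5 * m2)

def memory_color_index(rendered, prototypes):
    """Geometric mean of per-object similarities. `prototypes` maps each
    familiar object to its (mean, covariance) pair - placeholders for the
    paper's 10 fitted objects."""
    sims = [similarity(rendered[k], mean, cov)
            for k, (mean, cov) in prototypes.items()]
    return math.exp(sum(math.log(s) for s in sims) / len(sims))

def rescale(g, g_f4, g_d65):
    """Rescale so F4 scores 50 and D65 scores 90; a linear map is one
    plausible choice, not necessarily the paper's actual function."""
    return 50 + (g - g_f4) * (90 - 50) / (g_d65 - g_f4)
```

A rendered color sitting exactly on an object's memory-color prototype scores a similarity of 1.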

The authors did a large study to validate the new metric and found that it matched observers’ judgments of visual appreciation better than the other metrics. For future work, they are planning to study how cultural differences affect memory colors.

Appearance Degradation and Chromatic Shift in Energy-Efficient Lighting Devices

During the next few years, many countries will mandate replacement of the incandescent lamp technology which has served humanity’s lighting needs faithfully since 1879. Incandescent lamps, being blackbody radiators, appear “natural” to consumers – they have very good color rendering and remain on the Planckian locus over their lifetime (albeit shifted in color temperature). The general CIE color rendering index (CIE-Ra) is not a sufficient metric – the speaker showed three lights, all with CIE-Ra of 85 and correlated color temperature (CCT) of 3000K; they didn’t look alike at all.

Most consumers inherently recognize the difference between incandescent and energy-efficient lamps. The lighting from the latter just doesn’t “look natural” to them. When focus groups are asked about important lighting considerations, they first mention appearance issues: color quality (color rendering), color temperature (warm, normal, or cool white), form factor (shaped like a bulb, a tube, or other), dimmability (will a household triac dimmer work with it?), and glare. Appearance issues are followed by efficiency, brightness, lifetime, environmental friendliness, and instant on/off.

The two types of energy efficient lights in common use today are compact fluorescent lamps (CFL) and light emitting diode (LED). In CFLs, a mercury vapor UV light excites phosphors, which emit light in the visible spectrum. White light LEDs have a blue LED which excites phosphors. Both of these light types are characterized by two-stage energy conversion. There are other energy-efficient lighting devices (HIR, HID, OLED, hybrids), but these are not practical for residential lighting.

Since both LEDs and CFL phosphors are operated at high energy densities, heat causes them to degrade over time. Since white is obtained via multiple phosphors, the differential degradation (between phosphors or between phosphors and LED) causes a chromatic shift during usage.

The authors measured aging for all three types of lamp over 5000 hours. The incandescent barely changed. In the CFL, some phosphor types degraded a lot and others somewhat, causing a shift toward green. The LED lamp had huge degradation in the phosphors and almost none in the blue LED – since the light comes from both, the color shifted quite a bit towards blue. Both energy-efficient lamps started with bad color rendition (CRI), and it got a lot worse over time; luminous efficacy (lumens/Watt) also decreased.

In theory, UV sources combined with trichromatic phosphors that age uniformly could solve the problem, but that challenge has not yet been solved. Emerging energy-efficient lamp types (ESL and others) are supposed to help but aren’t ready yet, which is worrying since the transition has already started.

During Q&A, the speaker stated that he doesn’t think color rendering indices are useful at all; instead he uses color rendering maps that show the color rendering for various points on the color gamut simultaneously. These color rendering maps can show which colors are most affected. Since computers are now fast enough to compute such a map in less than a second, why use a single number? Also, the CIE CRI in common use is overly permissive – it will give high scores to some pretty bad-looking illuminants. Of course, for this very reason the light manufacturers will fight against changing it.

Meta-Standards for Color Rendering Metrics and Implications for Sample Spectral Sets

Like the previously presented paper “Recent Development in Color Rendering Indices and Their Impacts in Viewing Graphic Printed Materials”, this is also a report on the work of CIE technical committee TC 1-69 on proposals for a new standard color rendering index, but by a different subcommittee. Neither paper appears to represent a consensus; presumably one of these approaches (or a different one) will eventually be selected.

In a recent meeting, the technical committee recommended selecting a reflectance sample set for the new CRI that is simple and as “real” as possible. This paper will talk about potential “meta-standards” by which to select the new CRI standard, and what this means for the reflectance sample set.

Their approach is based on the idea that the CRI should be equally sensitive to perturbations in light spectra regardless of where in the visible spectrum the perturbation occurs. This implies that the average curvature of the reflectance sample set should be uniform, since an area with higher curvature will be more sensitive to perturbations in light spectra. The average curvature of the 8 current CRI samples is very non-uniform, unsurprising due to the low sample count.
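
The uniformity criterion can be checked with a simple curvature profile, for example the average absolute second difference of the sample spectra at each wavelength. A sketch, assuming uniformly spaced wavelength samples:

```python
def curvature_profile(spectra):
    """Average absolute second difference of a set of reflectance spectra,
    per wavelength sample. A flat profile means the sample set is equally
    sensitive to spectral perturbations anywhere in the visible range;
    peaks mark wavelengths where the set is disproportionately sensitive."""
    n = len(spectra[0])
    avg = [0.0] * (n - 2)
    for s in spectra:
        for i in range(1, n - 1):
            avg[i - 1] += abs(s[i - 1] - 2 * s[i] + s[i + 1])
    return [v / len(spectra) for v in avg]
```

A perfectly flat (spectrally neutral) sample contributes zero curvature everywhere; a sample with a sharp spectral feature contributes a localized peak.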

At first they tried to select 1000 samples from the University of Leeds sample set (which includes over 100,000 reflectance spectra). The samples were picked to be roughly equally spaced throughout color space. The average curvature was still highly non-uniform, since many of the materials share the same small set of basic dyes and pigments. Generating completely random synthetic spectra would solve this problem, but then there would be no guarantee that they would be “natural” in the sense of having similar spectral features and frequency distributions. The authors settled on a “hybrid” solution in which segments of reflectance spectra from the Leeds database were “stitched” together and shifted up or down in wavelength. This resulted in a set of 1000 samples with a much smoother curvature distribution while keeping the individual spectra “natural”.

1000 samples may be too high for some applications, so the authors attempted to generate a much smaller set of 17 mathematically regular spectra which yield similar results to the set of 1000 hybrid samples. The subcommittee is proposing this set (named “HL17”) to the full technical committee for consideration.

Image Fusion for Optimizing Gamut Mapping

There are various methods for mapping colors from one gamut to another (typically reduced) gamut. Each method works well in some circumstances and less well in others. Previous work applied different gamut mapping algorithms to an image and automatically selected the one that generated the best image based on some quality measure. The authors of this paper tried to see if this can be done locally – if different parts of the same image can be productively processed with different gamut mapping algorithms, and if this produces better results than using the same algorithm for the whole image.

Their approach involved mapping the original with every gamut mapping algorithm in the set, and generating structural similarity maps for each algorithm. This was followed by generation of an index map for the highest similarity at each pixel. Each pixel was mapped with the best algorithm, and the results were fused into one image.
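
The pixel-wise selection step might look like this (a sketch; the quality maps stand in for the per-algorithm structural similarity maps):

```python
def fuse_by_quality(candidates, quality_maps):
    """Per-pixel fusion: at each pixel, pick the gamut-mapped candidate
    whose quality map (e.g. structural similarity vs. the original) scores
    highest there. This is only the naive pixel-wise step; the paper then
    smooths the result via segmentation or bilateral filtering, since raw
    per-pixel switching between algorithms causes artifacts."""
    h, w = len(candidates[0]), len(candidates[0][0])
    index_map = [[max(range(len(candidates)),
                      key=lambda k: quality_maps[k][y][x])
                  for x in range(w)] for y in range(h)]
    fused = [[candidates[index_map[y][x]][y][x] for x in range(w)]
             for y in range(h)]
    return fused, index_map
```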

Simple pixel-based fusion results in artifacts, so the authors tried segmentation and bilateral filtering. Bilateral fusion produced the best results, followed by segmented fusion, then picking the best overall algorithm for each image, and finally the individual algorithms. So the fusion approach is promising in terms of visual quality, but computation costs are high. They plan to improve this work as well as to apply it to other imaging problems such as locally optimized image enhancement and tone mapping operators.

Image-Adaptive Color Super-Resolution

The problem is to take multiple low-resolution images and estimate a high-resolution image. There has been work in this area, but challenges remain, especially correct handling of color images. The authors treated this as an optimization problem with simple constraints (individual pixel values must lie in the 0 to 1 range, warping and blurring must preserve energy over the image, as well as some assumptions on the possible properties of blurring and warping). They add a novel chrominance regularization term to handle color edges properly. The results shown appear to be better than those achieved by previous work.

Two-Field Color Sequential Display

Color-sequential displays mix primaries in time rather than in space as most displays do. Since the color filters are removed (replaced by a flashing red-green-blue backlight), the power efficiency is increased by a factor of three. However, very high frame rates are needed (problematic with LCD displays) and the technique is prone to color “breakup” artifacts.

This paper proposes a display composed of two temporal fields instead of three, to reduce flicker. Optimal pairs of backlight colors are found for each screen block to reduce color “breakup”. This is implemented via an LCD system with local RGB LED backlights. The authors built a demonstration system and experimented with various images. Most natural images are OK, but some man-made objects look bad. The number of segments can be increased, reducing the errors but not eliminating them. They were able to achieve reasonable results with 576 blocks.

Efficient Computation of Display Gamut Volumes in Perceptual Spaces

This paper discussed fast methods to compute gamut volume – the motivation is for use in optimizing display primaries (three or more). I’m not sure how important it is to do this fast, but that is the problem they chose to solve.

Computing gamut volume of three-primary displays in additive spaces is very easy (just the magnitude of the determinant of the primary matrix). The authors want to compute the gamut volume in CIELAB space, which is more perceptually uniform but has non-linearities which complicate volume computation. They found a way to refactor the math into a relatively simple form based on certain assumptions on the properties of the perceptual space. For three-primary displays in CIELAB this reduces to a simple closed-form expression.
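
For the three-primary additive case, the volume before any perceptual-space refactoring is just the determinant mentioned above; a minimal sketch:

```python
def det3(m):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def additive_gamut_volume(primaries):
    """Gamut volume of a three-primary additive display, in whatever
    linear space the primaries are expressed in (e.g. CIE XYZ). The gamut
    is the parallelepiped spanned by the three primary vectors (rows),
    whose volume is the absolute determinant."""
    return abs(det3(primaries))
```

The paper's contribution is doing the analogous computation after the non-linear mapping into CIELAB, where no such one-line formula exists without their refactoring.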

Computing gamut volume for multi-primary displays is more complex. The authors represent the gamut as a tessellation of parallelepipeds. To determine the total volume in CIELAB space they solve a numerical problem in a way similar to Taylor series.

Appearance-Based Primary Design for Displays

LED-backlit LCD displays have recently entered the market. They have many advantages over traditional LCD displays: higher dynamic range, high frame rate, wider color gamut, thinner, more environmentally friendly, etc. There are two main types of such displays. RGB-LED-based LCD displays can potentially deliver more saturated primaries (and thus wider color gamuts) due to the narrow spectral width of the LEDs used, while white-LED-based LCD displays might provide high brightness and contrast but smaller gamuts by using high efficiency LEDs in combination with the LCD-panel RGB filters.

The choice between the two is primarily a tradeoff between saturation and brightness. However, the two are linked due to the Hunt effect, which causes perceived colorfulness to increase with luminance. The Stevens effect (perceived contrast increases with brightness) is also relevant. Could these effects lead to a win-win (increased perceived saturation and contrast, as well as actual brightness) even if actual saturation is sacrificed?

The authors investigated two possible designs. One adds a white LED to an RGB LED backlight (RGBW LED backlight). The other keeps the RGB LED backlight, and adds a white subpixel to the LCD (RGBW LCD). The RGBW LED backlight design proved to work best, with added white of up to 40% providing increased colorfulness as well as brightness. The RGBW LCD white-subpixel design always decreased perceived colorfulness, regardless of the amount of white.

This was determined via a paired comparison experiment. It is interesting to note that neither CIELAB nor CIECAM02 models predicted the result for the RGBW LED backlight – CIELAB predicted that colorfulness would decrease, while CIECAM02 predicted it would increase but not the right amount. In the case of the RGBW LCD subpixel design, both CIELAB and CIECAM02 predicted the results.

HDR Video – Capturing and Displaying Dynamic Real World Lighting

This paper (by Alan Chalmers, WMG, University of Warwick) described the HDR video pipeline under development at the University of Warwick. It includes a Spheron HDRv camera (capable of capturing 20 f-stops of exposure at full HD resolution and 30 fps), NukeX and custom dynamic IBL (image-based lighting) software for post-production, various HDR displays (including a 2×2 “wall” of Brightside DR37-P HDR displays), and a specialized HDR video compression algorithm (for which they have spun off a company, goHDR).

Prof. Chalmers made the case that the 16 f-stops which traditional film can acquire is not sufficient, and showed various examples where capturing 20 f-stops produced better results. He also discussed the recently begun European Union COST (Cooperation in Science and Technology) Action IC1005-7251 “HDRi” which focuses on coordinating European HDR activity and proposing new standards for the HDR pipeline.

High Dynamic Range Displays and Low Vision

This paper was presented by Prof. James Ferwerda from the Munsell Color Science Lab at the Rochester Institute of Technology. Low vision is the preferred term for visual impairment. It is defined as the uncorrectable loss of visual function (such as acuity and visual fields). Low vision (caused by trauma, aging, and disease) affects 10 million people in the USA, and 135 million people worldwide.

HDR imaging offers new opportunities for understanding low vision. This paper describes two projects: simulating low vision in HDR scenes, and using HDR displays to test low vision.

The framework behind tone reproduction operators (which simulate on an LDR display what an observer would have seen in the HDR scene) can be adapted to simulate an impaired scene observer instead of a normal-vision one. Aging effects (such as increased bloom and slower adaptation) can also be simulated.

The importance of using HDR displays to test vision comes from the fact that people with low vision have problems in extreme (light, dark) lighting situations, such as excessive glare or adaptation issues. In addition, there are theories that changes in adaptation time can be good early predictors of retinal disease. However, standard vision tests use moderate light levels so they are not capable of identifying adaptation or other extreme-lighting-induced issues.

Before experiments could be started on the use of HDR displays for vision testing, the NIH (very reasonably) wanted to ensure that these displays could not cause any damage to the test subjects’ vision. Damage caused by light is called “phototoxicity” and can be related to either extremely high light levels in general, or more moderate levels of UV or even blue light. Blue light has recently been identified as a hazard, especially to people with retinal disease. The International Committee on Non-Ionizing Radiation Protection (ICNIRP) has established guidelines for safe light exposure levels, including blue light.

The authors estimated the phototoxicity potential of HDR displays, using the Brightside/Dolby DR37-P as a test case. At maximum brightness, they computed the amount of light which would reach the retina, with the ICNIRP “blue light hazard” spectral filter applied. The result was 4 micro-Watts; since the ICNIRP limit for unrestricted viewing is 200 micro-Watts of blue light, there appears to be no phototoxicity issue with HDR displays. Another way of looking at this: to reach the ICNIRP limit, the display would have to produce the same luminance as a white paper in bright sunlight: 165,000 cd/m2 (for comparison, the DR37-P peak white is about 3000 cd/m2 and the current Dolby HDR monitor – the PRM-4200 – peaks at 600 cd/m2).

Appearance at the Low-Radiance End of HDR Vision: Achromatic & Chromatic

This paper (by John J. McCann, McCann Imaging) studies how human vision works at the low end, close to the absolute threshold of visibility. In particular, does spatial processing change? There are a lot of physiological differences between rods and cones – spatial distribution, wiring, etc. – so it might be expected that spatial processing would differ between scotopic and photopic vision. A series of achromatic tests designed to demonstrate various features of spatial vision processing was repeated in extreme low-light conditions. The results were exactly the same as in normal light conditions – it appears that spatial processing does not change.

The authors also did experiments with low-light color vision. Although rods by themselves cannot see color (which requires at least two different detector types with distinct spectral sensitivity curves), they can be used for color vision when combined with at least one cone type. In particular, light which has enough red to activate the L cones (but not S or M) and enough light in the right wavelengths to activate the rods will enable dichromatic color vision using the rods and L cones (firelight, a 2000 K blackbody radiator, has the best balance of spectral light for this). This enabled comparing the spatial component of color vision in low-light and normal-light conditions. As before, the observers saw all the same effects, showing that spatial processing was the same in both cases.

Hiding Patterns with Daylight Fluorescent Inks

This paper describes the use of daylight fluorescent inks (which absorb blue & UV light and emit green light, in addition to reflecting light as normal inks do) to create patterns inside arbitrary images which are invisible under normal daylight but appear with other illuminants.

The authors looked at different combinations of regular and fluorescent inks and calculated the gamut for each one. The areas of the gamut that are metameric (under D65) with regular inks can be used to hide patterns. They also calculated proper ink coverage amounts needed to match the fluorescent and regular inks under D65.

Optimizing HANS Color Separation: Meet the CMY Metamers

The Halftone Area Neugebauer Separation (HANS) approach presented at last year’s CIC offered opportunities for optimizing various aspects of the printing process. This year’s paper further explores some of those possibilities.

Regular color halftoning works by controlling the coverage of each colorant, e.g. cyan, yellow, and magenta in a CMY system. HANS extends this to controlling the coverage of each possible combination of colorants (these are called the Neugebauer primaries). For example, a CMY system has 8 Neugebauer primaries: white (bare paper), cyan, magenta, yellow, blue (combination of cyan & magenta), green (combination of cyan & yellow), red (combination of magenta & yellow), and black (combination of all three colorants). Trichromatic color printing (e.g. CMY) has only one halftone pattern for each color in the available gamut. Extending this to more primaries (as HANS does) allows for metamers – different halftone patterns that can obtain the same color, and thus optimization opportunities.
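
The Neugebauer primaries, and the classical coverage model that HANS generalizes, can be sketched as follows (the Demichel product model shown is the conventional one; HANS instead controls the primary areas directly, which is where the metamer freedom comes from):

```python
from itertools import product

def neugebauer_primaries(inks):
    """All on/off overprint combinations of an ink set: 2**n primaries.
    For CMY this yields white (no ink), C, M, Y, CM (blue), CY (green),
    MY (red), and CMY (black)."""
    return [frozenset(ink for ink, on in zip(inks, combo) if on)
            for combo in product((0, 1), repeat=len(inks))]

def demichel_areas(coverages):
    """Classical Demichel model: assuming statistically independent ink
    layers, the fractional area of each Neugebauer primary is a product
    of per-ink coverages (or their complements). Under this model a CMY
    color has exactly one halftone; HANS drops the model and chooses the
    primary areas freely, enabling metameric halftones."""
    inks = list(coverages)
    areas = {}
    for combo in product((0, 1), repeat=len(inks)):
        a = 1.0
        for ink, on in zip(inks, combo):
            a *= coverages[ink] if on else 1.0 - coverages[ink]
        areas[frozenset(i for i, on in zip(inks, combo) if on)] = a
    return areas
```

With 50% coverage of each CMY ink, each of the 8 primaries occupies exactly 1/8 of the area, and the areas always sum to 1.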

With CMY inks, the authors found that the ink use varied quite a bit depending on the metamers used, indicating that a significant amount of ink could be saved even with such a limited ink set. HANS can save even more ink when used with more typical ink sets which include four or more inks.

Local Gray Component Replacement Using Image Analysis

Gray Component Replacement (GCR) refers to the practice of saving ink in a CMYK printing system by replacing amounts of CMY by similar amounts of K. GCR advantages include deeper blacks, ink savings, and increased sharpness of small details. But it does have one large drawback – it can cause excessive graininess (visible noise) in certain cases. This causes most printer manufacturers to use it very lightly, if at all.

This paper seeks to exploit the fact that noise perception depends on content activity. Noise is quite visible in smooth areas, not so visible in “active” areas. A GCR scheme that adapts to the content of the image has the potential to realize significant GCR benefits without causing noticeable noise.

One problem with this approach is that existing methods to find the “active” areas of the image either do not take account of the properties of the human visual system, require too much computation, or both. The authors’ insight was that cosine-based compression schemes have been heavily designed to exploit the properties of the human visual system, and can be adapted to this application. They do a DCT (discrete cosine transform) of the image and run it through a weighting matrix originally designed for JPEG quantization. The authors put the result through a mapping table (values based on experimentation) to find the desired black ink amount.
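
A minimal sketch of such an activity measure, assuming an orthonormal 8x8 DCT and a JPEG-style weighting table; the authors' experimentally derived mapping from activity to black-ink amount is not reproduced here:

```python
import math

def dct8x8(block):
    """Orthonormal 2-D type-II DCT of an 8x8 block of pixel values."""
    def alpha(u):
        return math.sqrt(1 / 8) if u == 0 else math.sqrt(2 / 8)
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = sum(block[y][x]
                    * math.cos((2 * x + 1) * v * math.pi / 16)
                    * math.cos((2 * y + 1) * u * math.pi / 16)
                    for y in range(8) for x in range(8))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def activity(block, weights):
    """Perceptually weighted AC energy: coefficients are attenuated by a
    JPEG-style quantization table (`weights`, larger = less visible), so
    a high score marks a 'busy' region that can tolerate more GCR noise.
    The DC term is excluded since overall brightness is not 'activity'."""
    coeffs = dct8x8(block)
    return sum((coeffs[u][v] / weights[u][v]) ** 2
               for u in range(8) for v in range(8) if (u, v) != (0, 0))
```

A flat block scores zero activity (apply GCR cautiously there); a high-frequency block scores high (GCR graininess will be masked).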

A Study on Perceptually Coherent Distance Measures for Color Schemes

This is related to the NPAR 2011 paper “Towards Automatic Concept Transfer”, which allowed for transferring color palettes associated with a concept (such as “earthy”) to images. This paper attempts to find an automated way to assess similarity between color palettes so that different palettes associated with the same concept can be explored.

Most of the existing color metrics either require the compared palettes to have the same number of colors, are dependent on color ordering, or both. The authors came up with a metric called “Color-Based Earth-Mover’s Distance”, which performed well.
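
For the special case of equal-size, equal-weight palettes, an Earth-Mover's-style distance reduces to a minimum-cost matching, brute-forced in this sketch; the paper's metric also handles palettes of different sizes, which this simplification does not:

```python
from itertools import permutations
import math

def palette_distance(p1, p2):
    """Minimum average color distance over all one-to-one matchings of two
    equal-size palettes. Order-independent by construction, which is the
    key property existing order-dependent metrics lack. Brute force over
    permutations is fine for the ~5-color palettes typical of concepts."""
    n = len(p1)
    assert n == len(p2)
    return min(sum(math.dist(p1[i], p2[perm[i]]) for i in range(n))
               for perm in permutations(range(n))) / n
```

Two palettes containing the same colors in different order have distance zero, as a perceptually coherent metric requires.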

Effects of Skin Tone and Facial Characteristics on Perceived Attractiveness: A Cross-Cultural Study

The aim is to study the impact of observers’ cultural background on the perception of faces. Earlier studies showed that slightly more colorful skin tones are considered more attractive than measured ones, and that larger eyes are rated more attractive. Some cultural background effects were found as well.

The authors took a set of face images, and manipulated skin color, eye size, distance between eyes, and nose length. The resulting images were shown to sets of both British and African observers.

Conclusions: Observers were more sensitive to changes in facial characteristics for faces of their own ethnic group than to those of other ethnic groups. British observers preferred skin colors of higher chroma, with a hue angle of about 41 degrees. African observers had preferences for more reddish, higher chroma faces.

Color Transfer with Naturalness Constraints

The motivation of this work (done at Hewlett-Packard) was to make it easier for untrained users to make pleasing collages by assembling photographs on top of a themed background. In many cases the colors of the selected photographs do not match each other or the background. The photographs may come from different cameras, and some might be downloaded from the web. Some of the images might even be drawings or paintings.

The idea is to use color transfer to make the various images match each other, as well as the background, more closely. There has been previous work in color transfer, but none of it fit the constraints of this specific application. The users often know what the original photo looked like and will not accept drastic changes. They are also not willing to do extensive manual tinkering to get good results.

Naturalness is important – colors in each image should be modified as a whole, familiar object colors (esp. skin) should remain “plausible”, and the white point should not change too drastically. The authors used a color adaptation model, constraining the color changes to those consistent with adapted appearance under natural illuminants. For each image, they find an estimated illuminant (defined by its CCT and luminance). This is not white point balancing or illuminant estimation; most consumer photos are white-balanced already. The idea is to describe the collective characteristics of the image’s color with parameters that are amenable to transfer to another image through an illuminant adaptation model. A simple Bayesian formulation was used to find the CCT and luminance, based on a set of 81 simulated illuminants (9 CCTs and 9 log-spaced luminance levels), applied to 170 object surfaces from Vrhel’s natural surface reflectance database (ftp://ftp.eos.ncsu.edu/pub/eos/pub/spectra/). Once a CCT and luminance are found for each image, color transfer is handled as a chromatic adaptation process. Any model would work; the authors used RLAB since they found it best suited to their needs.
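
The adaptation-based transfer step can be illustrated with the much simpler von Kries model; the authors actually used RLAB, which builds a fuller appearance model on the same basic idea, and the white values below are hypothetical:

```python
def von_kries_adapt(lms, src_white, dst_white):
    """Von Kries chromatic adaptation: scale each cone (LMS) response by
    the ratio of the destination white to the source white. Transferring
    an image from its estimated illuminant to a target illuminant this
    way changes colors 'as a whole', which is what keeps them natural."""
    return [c * wd / ws for c, ws, wd in zip(lms, src_white, dst_white)]
```

By construction, the source white maps exactly onto the destination white, and all other colors shift consistently with it.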

If the photos differ greatly in tone as well as color, they can also do a “tone transfer” process using similar methods.

The results were very effective in improving the user-created collage while still keeping the individual photographs natural-looking and recognizable.

The Influence of Speed and Amplitude on Visibility and Perceived Subtlety of Dynamic Light

Modern light sources (such as LEDs) enable continuously varying color illumination. Users have expressed a desire for this, but prefer slow/small changes over fast/large ones. For residential lighting, it’s important that the dynamic lighting not distract or be obtrusive. In other words, the changes need to be subtle.

This work explores what “subtle” means in the context of dynamic lighting. Do people understand subtlety in the same way? Can you measure a subtlety threshold that is distinct from visibility?

The authors tried dynamic lighting with different amplitude and speed of changes, and asked people whether they considered the lighting subtle, and whether they could see it happening at all.

The dynamics were considered to be subtle if they were slow, smooth, and over a narrow hue range. People seem to agree on what “subtle” means; a subtlety threshold is distinct from the visibility threshold and can be measured.

Blackness: Preference and Perception

In practice, printed black patches vary in hue. This study attempted to determine, among a range of blacks of varying hues, which blacks are most preferred and which are considered to be “pure black”.

The authors used various hues from the Munsell color system – three neutrals (with Value 0, 1 and 2 – N0, N1, N2) and the “blackest” version (Value 1 and Chroma 2) of the midpoint of each Munsell hue (5R, 5YR, 5Y, 5GY, 5G, 5BG, 5B, 5PB, 5P, 5RP), for a total of 13 patches. The patches were presented to the observers in pairs against a neutral grey background.

Blackness: N0 (true black) was perceived to be closest to pure black. N1 and 5PB were in second and third places. 5R, 5RP and 5Y were considered to be the least pure black. There was little difference found between UK and Chinese observers, or between genders.

Preference: On average 5B, 5PB and 5RP were most preferred. 5GY, 5Y and 5YR were the least preferred. Here there were some differences between Chinese and UK observers, and quite large differences based on gender.

Summary: blackness perception is not strongly linked to nationality and gender, but preference among black colors is. Observers appeared to have a strong preference for bluish blacks and purplish blacks over achromatic and even pure blacks.

Evaluating the Perceived Quality of Soft-Copy Reproductions of Fine Art Images With and Without the Original

This study was done as part of a project to evaluate the perceived image quality of fine art reproductions. The reproductions were printed on the same printer but based on digital scans done at different institutions using different methods. The goal was to see how the presence or absence of the original for comparison affects how people rank the different reproductions.

Two experiments were performed. One under controlled conditions in a laboratory, ranking the different reproductions of each artwork both with and without the original for comparison. The second study was done via the web, in uncontrolled conditions and without the originals.

For the controlled experiment with the original, the subjects were shown the images in pairs and asked to click on the one more closely matching the original. The experiments without the original (both in the lab and on the web) were also based on pairs, but the subjects were asked to click on the image they preferred.

In the case of the experiment with the original, the subjects’ rankings corresponded closely to measured color differences between the original and reproductions. In the two experiments done without the original, there was no such correlation.

There were low correlations between the results of the controlled experiments done with and without the originals, implying that preference is not strongly linked to fidelity vs. the original. However, there was a strong correlation between the controlled and web experiments done without the originals, implying that testing conditions do not significantly impact the preference judgment of images.

In the web-based experiment, the subjects were also asked to click on the parts of the picture that most influenced their choice. Users tended to click on specific objects, especially faces.

Scanner-Based Spectrum Estimation of Inkjet Printed Colors

Fast and accurate color measurement devices are useful for the printing process, but such devices are expensive. Scanners are cheap and some printing presses even have them integrated into the output paper path, but they are not accurate measuring devices.

This paper describes a method for using knowledge about both the printing process and the scanner characteristics to estimate spectral reflectances of printed material based on scanner output. The scanner response is estimated by scanning various patches with known spectra. Then the spectra of the printed materials are inferred from their scanned pixel values, the scanner response, and a physical model of the printing technology. The method yielded fairly accurate results.
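
A toy sketch of the inversion idea, assuming a linear scanner model; the `toy_printer` model, coverage grid, and sensitivity curves used in the usage example are hypothetical placeholders for the paper's physical printing model and estimated scanner response:

```python
def predict_scan(spectrum, sensitivities):
    """Linear scanner model: each channel integrates the reflectance
    spectrum against that channel's (estimated) spectral sensitivity."""
    return [sum(s * r for s, r in zip(sens, spectrum))
            for sens in sensitivities]

def estimate_spectrum(scan_rgb, printer_model, coverage_grid, sensitivities):
    """Brute-force inversion: search the printer model's coverage space
    for the candidate whose predicted scanner response best explains the
    scanned values, then report that candidate's spectrum. Constraining
    candidates to printable spectra is what makes the underdetermined
    RGB-to-spectrum problem tractable."""
    def err(c):
        pred = predict_scan(printer_model(c), sensitivities)
        return sum((p - m) ** 2 for p, m in zip(pred, scan_rgb))
    return printer_model(min(coverage_grid, key=err))
```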

Evaluating Color Reproduction Accuracy of Stereo One-shot Six-band Camera System

Multi-band imaging can be a good solution for accurate color reproduction. Most such systems are time-sequential and cannot capture moving objects or handle camera motion. Several multi-band video systems have been developed, but they all require expensive optical equipment.

The proposed system uses a stereo six-band camera system to acquire depth information and multi-band color information (for accurate color reproduction) simultaneously for moving scenes. The system consists of two consumer-model digital cameras, one of which is fitted with an interference filter which chops off half of each of the sensor R, G and B spectral sensitivity curves. Care needs to be taken during processing since the parallax between the two camera images may affect color reproduction. There are also issues relating to gloss, shade, lighting direction, etc. that need to be resolved.

The authors use a thin-plate spline model to deform the images to each other for registration purposes. When a corresponding point cannot be found, they use only the unfiltered image for color reproduction of that pixel. The authors evaluated color accuracy with a Macbeth ColorChecker chart. The colorimetry was accurate and they even got some information on the reflectance spectra.

Efficient Spectral Imaging Based on Imaging Systems with Sensor Adaptation Using Tunable Color Pixels

Current multispectral cameras come in two main types.

  1. Time multiplexing multispectral camera – for static scenes only (high quality capture can take up to 30 seconds).
  2. Color filter array (CFA) with 6 or more filter types, many cameras do this today (Canon Expo 50 megapixel camera). But this causes a large loss of resolution.

This paper discusses the use of tunable imaging sensors with spectral sensitivity curves that can be dynamically varied on a pixel-by-pixel basis. One such sensor is the transverse field detector (TFD), which exploits the fact that the penetration depth of photons into silicon depends on their wavelength. The TFD uses a transverse electrical field to collect the charge generated at various penetration depths.

The authors simulated a TFD-based system with several selectable per-pixel sensor capture states. The idea is to analyze where the primary spectrum transition is happening and ensure the sensor has a sensitivity peak there. The system has an initial stage where the derivatives of a preview image are fed to a support vector machine (SVM) that has been trained on a set of images for which the ground truth reflectance spectra were known.

For evaluation, the authors simulated spectral capture systems of all three types – time multiplexing, CFA, and tunable sensor. They found a big improvement in the tunable sensor's results when going from only one possible state per pixel to two, but no further improvement from additional states. For scientific applications, the tunable system slightly outperformed the other two in terms of accuracy, and gave similar results for consumer images (the authors suspect that a better choice of sensor states would make an improvement possible here too). More importantly, the tunable sensor technique doesn't suffer from the primary drawbacks of the other two (reduced resolution in the case of CFA, the multiple-shot requirement in the case of time multiplexing).

A New Approach for the Assessment of Allergic Dermatitis Using Long Wavelength Near-Infrared Spectral Imaging

Hyperspectral imaging can help with diagnosis of allergic dermatitis, which is one of the most common skin diseases. Near infra-red (IR) can penetrate under the skin and show what is happening there. The author’s system showed early stages of both disease and treatment success before visible changes were apparent. It could also clearly discriminate irritation from allergic reaction (which is very difficult from visual inspection), even distinguishing between different types of allergic reactions.

Saliency-Based Band Selection for Spectral Image Visualization

Visualizing multispectral data on an RGB display always involves some data loss, but the goal is to show the most important data while keeping a somewhat natural-looking image. The authors of this paper used saliency (visual importance) maps, previously used to predict where people spend the most time looking in an image, to find the three most important channels.
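A minimal sketch of the band-selection idea (the per-band saliency measure here is a crude gradient-based stand-in of my own, not the saliency model the authors used):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 8-band image cube, 32x32 pixels; band 5 is given strong
# spatial structure so it should be judged the most salient.
cube = rng.random((8, 32, 32))
cube[5] += np.linspace(0, 20, 32)

def band_saliency(band):
    # Crude saliency proxy: local contrast, measured as the mean
    # absolute gradient magnitude over the band.
    gy, gx = np.gradient(band)
    return np.abs(gy).mean() + np.abs(gx).mean()

scores = np.array([band_saliency(b) for b in cube])
top3 = np.argsort(scores)[-3:][::-1]   # three most salient bands -> R, G, B
print(int(top3[0]))                    # 5: the structured band wins
```

The three winning bands would then be mapped to the R, G and B display channels.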

Spectral Estimation of Fluorescent Objects Using Visible Lights and an Imaging Device

Many everyday materials (for reasons of safety, fashion or others) contain fluorescent substances. The principle of fluorescence is that the material absorbs a photon, goes from ground state to a highly excited state, slowly drops to a less excited state and finally jumps back to ground state, releasing a less-energetic (lower-frequency) photon.

Standard fluorescent measurements involve either two monochromators (expensive and only usable in laboratory setups) or UV light and a spectro-radiometer (hard to estimate accuracy; also, the use of UV lights poses a safety problem).

The authors of this paper propose a method for estimating the spectral radiance factor of fluorescent objects by using visible lights and a multispectral camera. Measurements assume that the fluorescence is emitted equally across all wavelengths longer than the excitation wavelength, which is a pretty good assumption for most fluorescent materials. Analysis of the results showed them to be of high accuracy.

2011 Color and Imaging Conference, Part IV: Featured Talks

CIC typically has several featured talks such as keynotes and an “evening lecture” – these are invited talks about topics of interest to attendees:

Keynote: Color Responses of the Human Brain Explored with fMRI

The first keynote of the conference was given by Kathy Mullen from McGill Vision Research at the McGill University Department of Ophthalmology.

In this keynote, Prof. Mullen (who also taught the course “The Role of Color in Human Vision” this year) discussed research into human vision that uses fMRI (functional magnetic resonance imaging) to measure BOLD (blood oxygen level dependent response). This takes advantage of the fact that oxyhemoglobin increases in venous blood during neuronal activity, which results in an increase in the intensity of the BOLD signal after a time delay (2-3 seconds). BOLD is imaged volumetrically at a resolution of about 3mm cubed, which is typical for fMRI.

At a high level, visual information flows from the optic nerve via the thalamus (relayed through the lateral geniculate nucleus – LGN) to the visual cortex in the back of the head. Then it splits into two primary streams – the dorsal stream, thought to be associated with motion, and the ventral stream, thought to be associated with objects. The BOLD experiments attempt to localize particular aspects of human perception more precisely. These experiments involve showing volunteers specific stimuli which are carefully designed to isolate certain visual processing areas.

A few different fMRI studies were discussed; for example, it was found that the different opponency signals (blue-yellow, red-green, and achromatic) have widely differing intensities in the LGN (corresponding roughly to the differing proportions of the cones and opponency neurons driving them), but the cortex responds to all three roughly equally – this implies that selective amplification is occurring between the LGN and the cortex. This amplification also appears to depend on temporal frequency – it does not occur for signals cycling at 8 Hz or faster.

In general, fMRI appears to be a bit of a blunt instrument but it can tell us where in the brain certain things happen, making it a useful complement to psychophysical data and low-level (single neuron) experiments on monkeys.

Keynote: The Challenge of Our Known Knowns

This keynote was given by Robert W. G. Hunt (Michael R. Pointer was supposed to be presenting part of it but couldn’t make it to the conference due to illness).

Dr. Hunt is a titan in the field of color science, following up 36 years at Kodak Research (where his accomplishments included the design of the first commercial subtractive color printer) with 30 years as an independent color consultant. He has written over 100 papers (including several highly influential ones) on color vision, color reproduction, and color measurement, as well as two highly-regarded textbooks: “The Reproduction of Colour” (now in its 6th edition) and “Measuring Colour” (now in its 4th edition). He has won many accolades for his work in the field, including appointment as an Officer of the Order of the British Empire (OBE) in 2009. He has been a constant presence at CIC over the years, teaching courses and giving many keynotes.

This keynote speech focused on factors which are known to have a (sometimes profound) effect on the appearance of colored objects, but for which agreed quantitative measures are not yet available.

Successive Contrast

This is the phenomenon of adaptation to previous images affecting the current image. This needs to be accounted for in (e.g.) motion picture film editing, but there is no measure for it. We need a quantitative representation of successive contrast as a function of the luminance, chromaticity, and time of exposure of the adapting field.

Simultaneous Contrast

The appearance of a color can be greatly affected by the presence of other colors around it. This phenomenon has been known since the mid-19th century. Dark surround is known to make colors look lighter, and light surround makes them look darker. CIECAM02 and other proposed measures do account for simultaneous contrast, but not for the fact that the strength of the effect depends on the extent of the contact between the color and its surround (for example, the perceived color of a thin “X” embedded in the surround will be affected a lot more than that of a rectangular patch). A comprehensive quantitative representation would have to be a function of the luminance factor and chromaticity of adjacent areas, extent of their contact, and include allowance for cognitive effects.

Assimilation

When stimuli cover small angles in the field of view, the opposite of simultaneous contrast can occur. The colors appear to be more, rather than less, like their surroundings, an effect called assimilation. The likely causes of the effect include scattering of light in the eye, and the fact that the color difference signals in the visual system have lower resolution than that of the achromatic signal. A quantitative representation of assimilation would have to be a function of the luminance factor and chromaticity of the adjacent areas, the extent of their contact, and the angular subtense of the elements.

Gloss

The surfaces of most objects have some gloss, and the appearance of their colors is affected by the geometry of the lighting. Colors of glossy objects can appear very different in lighting that is diffuse vs. directional. Current methods of measuring gloss do so by measuring the ratio of magnitude of specularly reflected light to incident light at certain angles.

However, such measurements do not account for other factors that affect apparent gloss – geometry of illumination, roughness of objects, and amount of non-specular (diffuse) reflection.

There appear to be two major perceptual dimensions to gloss: contrast gloss (a function of the specular and diffuse reflectance factors) and distinctness of image (a function of specular spread). A quantitative representation of gloss might have to be a function or functions of not just specular reflectance factor, diffuse reflectance factor, and spread of specular reflection, but also of the geometry of illumination.

Translucency

Translucency is very important in the apparent quality of foodstuffs. Like gloss, it also appears to have two main perceptual dimensions: clarity (the extent to which fine detail can be perceived through the material) and haze (the extent to which objects viewed through the material appear to be reduced in contrast). Clarity and haze can be measured with the right apparatus. A quantitative representation of translucency might have to be separate functions of clarity and haze, but the relationship between these requires further research.

Surface Texture

Pattern is a fundamental attribute belonging to a surface; texture is a parameter relating to the perception of that pattern, which will, among other variables, be a function of the viewing distance. Surface textures can be characterized in terms of structure (structured-unstructured), regularity (irregular-regular) and directionality (directional-isotropic). The measurement of texture is still in its infancy. A quantitative representation of texture might have to be a function of structure, regularity and directionality. Some early research into this area is being done by people working in machine vision.

Summary

Are color, gloss, translucency, and surface texture all independent phenomena? They all derive from the optical properties of materials. Gloss can have a large effect on color (light reflections decrease saturation). Surface texture has been found to have a large effect on gloss. All these phenomena are known, but quantitative measures are lacking – many industries would benefit.

Evening Lecture: Exploring the Fascinating World of Color Beneath the Sea

The lecture was given by David Gallo, Director of Special Projects at the Woods Hole Oceanographic Institution. David Gallo is a prominent undersea explorer and oceanographer who has used manned submersibles and robots to map the ocean world with great detail. Among many other expeditions, he has co-led expeditions exploring the RMS Titanic and the German battleship Bismarck.

The talk was very interesting and included a lot of stunning imagery, but is hard to summarize. David Gallo emphasized the importance of the ocean, and the fact that humans have only explored about 5% of it. Images were shown of underwater lakes, rivers, and waterfalls (formed of denser and saltier water than the surrounding ocean), as well as a variety of bioluminescent creatures from both mid and deep waters, including species living around poisonous hydrothermal vents. Acquiring high-quality color images in the deep ocean is a difficult challenge, ably met by David's colleague William Lange at the Woods Hole Advanced Imaging and Visualization Lab.

Keynote: The Human Demosaicing Algorithm

This keynote was given by Prof. David Brainard, from the Department of Psychology at the University of Pennsylvania.

Almost all color cameras use interleaved trichromatic sampling – not all channels are sampled at all pixels, instead there is a mosaic (e.g. Bayer Mosaic). The output of this mosaic must be processed (“demosaiced”) into a trichromatic image. There is information loss, often resulting in artifacts such as magenta-green fringing. Algorithms are constantly being improved to try to reduce the artifacts but they can never be completely eliminated.
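For illustration, here is the simplest possible demosaicing scheme – bilinear interpolation over an RGGB Bayer mosaic (a sketch of mine; real camera pipelines use far more sophisticated, edge-aware algorithms, and it is precisely at sharp edges that this naive averaging produces the fringing mentioned above):

```python
import numpy as np

def bayer_mosaic(rgb):
    """Sample a full RGB image through an RGGB Bayer mosaic."""
    h, w, _ = rgb.shape
    mask = np.zeros((h, w, 3))
    mask[0::2, 0::2, 0] = 1   # R at even rows, even cols
    mask[0::2, 1::2, 1] = 1   # G
    mask[1::2, 0::2, 1] = 1   # G
    mask[1::2, 1::2, 2] = 1   # B at odd rows, odd cols
    return (rgb * mask).sum(axis=2), mask

def bilinear_demosaic(raw, mask):
    """Fill each channel by averaging the sampled neighbors in a 3x3 window."""
    h, w, _ = mask.shape
    vals = raw[:, :, None] * mask
    pad_v = np.pad(vals, ((1, 1), (1, 1), (0, 0)))
    pad_m = np.pad(mask, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, w, 3))
    counts = np.zeros((h, w, 3))
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            out += pad_v[dy:dy+h, dx:dx+w]
            counts += pad_m[dy:dy+h, dx:dx+w]
    return out / np.maximum(counts, 1)

flat = np.full((8, 8, 3), 0.5)   # a flat gray image survives intact
raw, mask = bayer_mosaic(flat)
rebuilt = bilinear_demosaic(raw, mask)
print(np.allclose(rebuilt, 0.5))   # True
```

A flat field reconstructs perfectly; feed in an image with a sharp luminance edge and the averaged channels disagree along the edge, which is exactly the magenta-green fringe.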

The human retina has the same interleaved design; there is only one cone (long – L, medium – M, or short – S) at each location. S cones are sparse, and the L & M cones are arranged in a quasi-random pattern. The same ambiguities exist, but we very rarely see these chromatic fringes. High enough frequencies do reach the retina to cause such artifacts in theory, so some very clever algorithm must be at work.

Making the problem more complicated, there are very large differences between observers in proportions of L, M and S cones, even among observers that test with normal color vision. Some people have a lot more M than L, or the other way around. Yet somehow this does not affect their color perception.

There are two functional questions:

  1. How does the human visual system (HVS) process the responses of an interleaved cone mosaic to approximate full trichromacy?
  2. How do individuals with very different mosaics perceive color in the same way?

Prof. Brainard uses a Bayesian approach to analyze cases where sensory data is ambiguous about physical variables of interest. He picked this approach because it has simple underlying principles, provides an optimal performance benchmark, and is often a good choice for the null hypothesis about performance.

The basic Bayesian recipe is: model the sensory system as a likelihood, express statistical regularities of the environment as a prior distribution, and apply Bayes' rule.
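The recipe can be shown in the simplest possible setting – a one-dimensional Gaussian prior and Gaussian sensor noise (a toy example of mine, not one from the talk):

```python
# Toy version of the recipe: the "scene" value has a Gaussian prior
# (regularities of the environment), the sensor adds Gaussian noise
# (the likelihood), and Bayes' rule yields a posterior that pulls the
# noisy measurement toward the prior mean.
prior_mean, prior_var = 0.5, 0.04
noise_var = 0.01

def posterior(measurement):
    # Standard conjugate-Gaussian update.
    w = prior_var / (prior_var + noise_var)
    mean = w * measurement + (1 - w) * prior_mean
    var = prior_var * noise_var / (prior_var + noise_var)
    return mean, var

m, v = posterior(0.9)
print(round(m, 3))   # 0.82: the measurement is shrunk toward the prior
```

The same machinery, scaled up to image-sized vectors with priors learned from natural scenes, is what resolves the ambiguity left by the cone mosaic.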

Prof. Brainard gave a simple example to illustrate the use of Bayesian priors and posteriors, and then scaled it up to the systems used in his work.

The Bayesian system was able to correctly predict many facets of human color vision, including the demosaicing, and how people with very different cone distributions are able to have similar color vision. To stress-test the predictive power of the model, they examined an experiment by Hofer et al. in 2005 which used corrective optics to image spots of light on the retina smaller than the distance between adjacent cones. The results predicted by the Bayesian system matched Hofer's results.

For this method to work, the visual system must process the cone output with ‘knowledge’ of the type of each cone at a fine spatial scale. It appears that the brain needs to learn the cone types, since there doesn’t seem to be a biochemical marker. Another experiment was done to determine how well cones can be typed via unsupervised learning, and it was found that this is indeed possible – about 2500 natural images are sufficient for a system to learn the type of each input in the mosaic.

None of this proves that the HVS works in this way, but it does show that it is possible and correctly predicts all the data. In the future, Prof. Brainard plans to explore engineering applications of these methods.

2011 Color and Imaging Conference, Part III: Courses B

This post covers the rest of the CIC 2011 courses that I attended; it will be followed by posts describing the other types of CIC content (keynotes, papers, etc.).

Lighting: Characterization & Visual Quality

This course was given by Prof. Françoise Viénot, Research Center of Collection Conservation, National Museum of Natural History, Paris, France.

During the 20th century, the major forms of light were incandescent (including tungsten-halogen) and discharge (fluorescent as well as compact fluorescent lamps – CFL). Incandescent lights glow from heat, and discharge lamps include an energetic spark or discharge which emits a lot of UV light, which is converted by a fluorescent coating to visible light. LEDs are relatively new as a lighting technology; they emit photons at a frequency based on the bandgap between semiconductor quantum energy levels. LEDs are often combined with fluorescent phosphors to change the light color.

Correlated Color Temperature (CCT) is used to describe the color of natural or “white” light sources. CCT is defined as the temperature of the blackbody (an idealized physical object which glows due to its heat) whose color is nearest the color of the tested illuminant. Blackbody colors range from reddish at around 1000 K through yellow, white and finally blue-white (for temperatures over 10,000 K). The CCT is only defined for illuminants with colors reasonably near one of these, so it is meaningless to talk about the CCT of, e.g., a green or purple light. “Nearest color” is defined on a CIE uv chromaticity diagram. Reciprocal CCT (one over CCT) is also sometimes used – reciprocal CCT lines are spaced very nearly at equal uv distances, which is a useful coincidence (for example, “color temperature” interface sliders should work proportionally to reciprocal CCT for intuitive operation).
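As a concrete illustration of the slider remark, interpolating in reciprocal-CCT (“mired”) space rather than directly in CCT (the 2000–10,000 K range below is an arbitrary choice of mine):

```python
def cct_to_mired(cct_kelvin):
    # Reciprocal CCT, conventionally scaled: mired = 1e6 / CCT.
    return 1e6 / cct_kelvin

def mired_to_cct(mired):
    return 1e6 / mired

def slider_cct(t, cct_lo=2000.0, cct_hi=10000.0):
    """Map a 0..1 slider position to a CCT by interpolating linearly in
    mired space, so equal slider steps feel perceptually even."""
    m = (1 - t) * cct_to_mired(cct_lo) + t * cct_to_mired(cct_hi)
    return mired_to_cct(m)

print(round(slider_cct(0.5)))   # 3333: the midpoint lands near 3333 K, not 6000 K
```

A slider interpolating directly in kelvin would cram all the perceptually distinct warm tones into its bottom edge.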

Perhaps confusingly, in terms of psychological effect low CCT corresponds to “warm colors” or “warm ambience” and high CCT corresponds to “cool colors” or “cool ambience”. Desirable interior lighting is about 3000K CCT.

Light manufacturers do not reproduce the exact spectra of real daylight or blackbodies, they produce metamers (different spectra with the same perceived color) of the white light desired. For example, four different lights could all match daylight with CCT 4500K in color, but have highly different spectral distributions. Actual daylight has a slightly bumpy spectral power distribution (SPD), incandescent light SPDs are very smooth, discharge lamps have quite spiky SPDs, and LED SPDs tend to have two somewhat narrow peaks.

Since LEDs are a new technology, they are expected to be better than or at least equal to existing lighting technologies. Expectations include white light, high luminous efficacy (converting a large percentage of the energy consumed into visible light rather than wasting it on UV or IR), low power consumption, long lifetime, high values of flux (emitted light quantity), innovations such as dimmability and addressability, and high visual quality (color rendition, comfort & well-being). LED light is clustered into peaks that are not quite monochromatic – they are “quasi-monochromatic” with a smooth but narrow peak (spectral width around 100nm).

Most white light LEDs are “phosphor-converted LEDs” – blue LEDs with fluorescent powder (phosphor) that captures part of the blue light and emits yellow light, creating two peaks (blue and yellow) which produce an overall white color. By balancing the two peaks (varying the amount of blue light captured by the fluorescent powder), LED lights with different CCTs can be produced. It is also possible to add a second phosphor type to create more complex spectra. New LED lights are under development that use a UV-emitting LED coupled with 3 phosphor types.

An alternative approach to producing white-light LEDs is to create “color-mixed LEDs” combining red, green, and blue LEDs. There are also hybrid mixtures with multiple LEDs as well as phosphors. This course focused on phosphor-converted LEDs. They have better color rendition and good luminous efficacy, and are simple to control. On the other hand, RGB color-mixed LEDs have the advantage of being able to vary color dynamically.

Regarding luminous efficacy, in the laboratory cool white LED lamps can achieve very high values – about 150 lumens per Watt (steady-state operation). Commercially available cool white LED lamps can reach a bit above 100 lm/Watt, commercial warm white ones are slightly lower. US Department of Energy targets are for commercial LED lights of both types to approach 250 lm/Watt by 2020.

Intensity and spectral width strongly depend on temperature (cooling the lamp makes it brighter and “spikier”, heating does the opposite). Heat also reduces LED lifetime. As LEDs age, their flux (light output) decreases, but CCT doesn’t typically change. The rate of flux reduction varies greatly with manufacturer.

One way to improve LED lifetime is to operate the LED in short pulses (pulse-width modulation). This is done at a frequency between 100 and 2000 Hz, and of course reduces the flux produced.
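The flux cost is easy to quantify – the time-averaged output simply scales with the fraction of each cycle the LED is on (a trivial sketch):

```python
def pwm_average_flux(peak_flux_lm, duty_cycle):
    """Average luminous flux under pulse-width modulation: the LED is on
    for only a fraction (the duty cycle) of each 100-2000 Hz period, so
    the time-averaged output scales with that fraction."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty cycle must be between 0 and 1")
    return peak_flux_lm * duty_cycle

# Driving a 1000 lm (peak) lamp at 70% duty yields 700 lm on average.
print(round(pwm_average_flux(1000, 0.7), 6))
```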

Heat dissipation is the primary problem in scaling LED lights to high-lumen applications (cost is also a concern) – they top out around 1000 lumens.

The Color Rendering Index (CRI) is the method recommended by the CIE to grade illumination quality. The official definition of color rendering is “effect of an illuminant on the colour appearance of objects by conscious or subconscious comparison with their colour appearance under a reference illuminant”. The instructor uses “color rendition”, for which she has a simpler definition: “effect of an illuminant on the colour appearance of objects”.

CIE’s procedure for measuring the general color rendering index Ra consists of comparing the color of a specific collection of eight samples when illuminated by the tested light vs. a reference light. This reference light is typically a daylight or blackbody illuminant with the same or similar CCT as the tested light (if the tested light’s color is too far from the blackbody locus to have a valid CCT a rendering index cannot be computed; in any case such an oddly-colored light is likely to have very poor color rendering). The process includes a Von Kries chromatic adaptation to account for small differences in CCT between the test and reference light sources. After both sets of colors are computed, the mean of the chromatic distances between the color pairs is used to generate the CIE color rendering index. The scaling factors were chosen so that the reference illuminant itself would get a score of 100 and a certain “known poor” light source would get a score of 50 (negative scores are also possible). For office work, a score of at least 80 is required.
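Schematically, the scoring step works like this (a simplified sketch: the real CIE procedure specifies the eight test-color samples, the U*V*W* color space, and the Von Kries adaptation; only the 100 - 4.6*dE scaling below is the standard one, and the per-sample color differences here are made up):

```python
# Schematic of the CRI scoring step only. Inputs are hypothetical
# per-sample color differences between the test and reference
# renderings of the eight chips.
def general_cri(delta_es):
    # CIE scaling: each special index is Ri = 100 - 4.6 * dE_i, and the
    # general index Ra is their mean. Scores can go negative.
    special = [100 - 4.6 * de for de in delta_es]
    return sum(special) / len(special)

perfect = general_cri([0.0] * 8)                              # reference light itself
office = general_cri([3.0, 4.0, 5.0, 2.0, 6.0, 3.5, 4.5, 4.0])
print(perfect, round(office, 1))   # 100.0 81.6: this lamp would pass for office work
```

The 4.6 factor is exactly the scaling mentioned above that pins the reference illuminant at 100 and the chosen “known poor” source at 50.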

Various problems with CRI have been identified over the years, and alternatives have been proposed. The Gamut Area Index (GAI) is an approach that describes the absolute separation of the chromaticities of the eight color chips, rather than their respective distances vis-à-vis a reference light. Incandescent lights tend to get low scores under this index. Another alternative metric, the Color Quality Scale (CQS) was proposed by the National Institute of Standards and Technology (NIST). It is similar to CRI in basic approach but contains various improvements in the details. Other approaches focus on whether observers find the colors under the tested light to be natural or vivid.

In general, there are two contradicting approaches to selecting lights. You can either emphasize fidelity, discrimination and “naturalness”, or colorfulness enhancement and “beautification” – you can't have both. Which is more desirable will depend on the application. For everyday lighting situations, full-spectrum lights are likely to provide the best combination of color fidelity and visual comfort.

There are also potential health issues – lights producing a high quantity of energy in the blue portion of the spectrum may be harmful for the vision of children as well as adults with certain eye conditions. In general, the “cooler” (bluer) the light source, the greater the risk, but there are other factors, such as brightness density. Looking directly at a “cool white” LED light is most risky; “warm white” lights of all types as well as “cool white” frosted lamps (which spread brightness over the lamp surface) are more likely to be OK.

The Role of Color in Human Vision

This course was taught by Prof. Kathy T. Mullen from the Vision Research Unit in the McGill University Dept. of Ophthalmology.

Prof. Mullen started by stating that primates are the only trichromats (having three types of cones in the retina) among mammals – all other mammals are dichromats (have two types of cones in the retina). One of the cone types mutated into two different ones relatively recently (in evolutionary terms). There is evidence that other species co-evolved with primate color vision (e.g. fruit colors changed to be more visible to primates).

The Role of Color Contrast in Human Vision

Color contrast refers to the ability to see color differences in the visual scene. It allows us to better distinguish boundaries, edges, and objects.

Color contrast has 4 roles.

Role 1: Detection of objects that would otherwise be invisible due to being seen against a dappled background – for example, seeing red berries among semi-shadowed green foliage.

Role 2: Segregation of the visual field into elements that belong together – if an object’s silhouette is split into several parts by closer objects, color enables us to see that these are all parts of the same object.

Role 3: Helps tell the difference between variations in surface color and variations in shading. This ability depends on whether color and achromatic contrasts coincide spatially or not. For example, a square with chrominance stripes (stripes of different color but the same luminance) at 90 degrees to luminance stripes (stripes that only change luminance) is strongly perceived as a 3D shaded object. If the chrominance and luminance stripes are aligned, then the object appears flat.

Role 4: Distinguishing between otherwise similar objects. This leads into color identification. If after distinguishing objects by color, we can also identify the colors, then we can infer more about the various objects' properties.

Color Identification and Recognition

Color identification and recognition is a higher, cognitive stage of color vision that also involves color naming. It requires an internalized “knowledge” of what the different colors are. There is a (very rare) condition called “color agnosia” where color recognition is missing – people suffering from this condition perform normally on (e.g.) color-blindness vision tests, but they can't identify or name colors at all.

Color is an object property. People group, categorize and name colors using 11 basic color categories: Red, Yellow, Green, Blue, Black, Grey, White, Pink, Orange, Purple, and Brown (there is some evidence that Cyan may also be a fundamental category).

Psychophysical Investigations of Color Contrast’s Role in Encoding Shape and Form

For several decades, vision research was guided by an understanding of color’s role which Prof. Mullen calls the “coloring book model”. The model holds that achromatic contrast is used to extract contours and edges and demarcates the regions to be filled in by color, and color vision has a subordinate role – it “fills in” the regions after the fact. In other words, color edges have no role in the initial shape processing occurring in the human brain.

To test this model, you can perform experiments that ask the following questions:

  1. Does color vision have the basic building blocks needed for form processing: spatially tuned detectors & orientation tuning?
  2. Can color vision extract contours and edges from the visual scene?
  3. Can color vision discriminate global shapes?

The coloring book model would predict that the answer to all of these questions is “no”.

Prof. Mullen then described several experiments done to determine the answers to these questions. These experiments relied heavily on “isoluminant colors” – colors with different chromaticity but the same luminance. The researchers needed extremely precise isolation of luminance, so they had to find individual isoluminant color pairs for each observer. This was done via an interesting technique called “minimum motion”, which relies on the fact that color vision is extremely poor at detecting motion. The researchers had observers stare at the center of an image of a continually rotating wheel with two alternating colors on the rim. The colors were varied until the rim appeared to stop turning – at that point the two colors were recorded as an isoluminant pair for that observer.

The experiments showed that color vision can indeed extract contours and edges from the scene, and discriminate global shapes, although slightly less well than achromatic (luminance) vision. It appears that the “coloring book” model is wrong – color contrast can be used in the brain in all the same ways luminance contrast can. However, color vision is relatively low-resolution, so very fine details cannot be seen without some luminance contrast.

The Physiological Basis of Color Vision

Color vision has three main physiological stages:

  1. Receptoral (cones) – light absorption – common to all daytime vision
  2. Post-receptoral 1 – cone opponency extracts color but not color contrast
  3. Post-receptoral 2 – double cone opponency extracts color contrast

The retina has three types of cone cells used for daytime (non-low-light) vision. Each type is sensitive to a different range of wavelengths – L cones are most sensitive to long-wavelength light, M cones are most sensitive to light in the middle of the visual spectrum, and S cones are most sensitive to short-wavelength light.

Post-receptoral 1: There are three main types of neurons in this layer, each connected to a local bundle of differently-typed cones. One forms red-green color vision from the opponent (opposite-sign) combination of L and M cones. The second forms blue-yellow color vision from the opponent combination of S with L and M cones. These two types of neurons are most strongly excited (activated) by uniform patches of color covering the entire cone bundle (some of them serve a different role by detecting luminance edges instead). The third type of neuron detects the luminance signal, and is most strongly excited by a patch of uniform luminance covering the entire cone bundle.

Post-receptoral 2: these are connected to a bundle of neurons from the “post-receptoral 1” phase – of different polarity; for example, a combination of “R-G+” neurons (that activate when the color is less red and more green) and “R+G-” neurons (that activate when the color is more red and less green). Such a cell would detect red-green edges (a similar mechanism is used by other cells to detect blue-yellow edges). These types of cells are only found in the primate cortex – other types of mammals don't have them.
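As a toy illustration of these two post-receptoral stages (the signs and weights below are schematic simplifications of mine, not physiological constants):

```python
def opponent_stage1(L, M, S):
    # Schematic cone-opponent combinations ("post-receptoral 1"):
    red_green = L - M              # R-G channel from opposing L and M
    blue_yellow = S - (L + M) / 2  # B-Y channel from S vs. L+M
    luminance = L + M              # achromatic channel
    return red_green, blue_yellow, luminance

def double_opponent_edge(left, right):
    # A "post-receptoral 2" cell responds to a chromatic edge: it
    # compares the R-G signal on one side of its receptive field
    # with the R-G signal on the other side.
    rg_left = opponent_stage1(*left)[0]
    rg_right = opponent_stage1(*right)[0]
    return rg_left - rg_right

# A uniform field gives no edge response; a red/green boundary does.
print(double_opponent_edge((0.6, 0.4, 0.2), (0.6, 0.4, 0.2)))        # 0.0
print(abs(double_opponent_edge((0.7, 0.3, 0.2), (0.3, 0.7, 0.2))) > 0)  # True
```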

Introduction to Multispectral Color Imaging

This course was presented by Dr. Jon Y. Hardeberg from the Norwegian Color Research Laboratory at Gjøvik University College.

Metamerism (the phenomenon of different spectral distributions which are perceived as the same color) is both a curse and a blessing. Metamerism is what enables our display technologies to work. However, two surfaces with the same appearance under one illuminant may very well have a different appearance under another illuminant.

Besides visual metamerism, you can also have camera metamerism – a camera can generate the same RGB triple from two different spectral distributions. Most importantly, camera metamerism is different from human metamerism. For the two to be the same, the sensor sensitivity curves of the camera would have to be linearly related to the human cone cell sensitivity curves. Unfortunately, this is not true for cameras in practice. This means that cameras can perceive two colors as being different when humans would perceive them to be the same, and vice versa.

Multispectral color imaging is based on spectral reflectance rather than ‘only’ color; the number of channels required is greater than the three used for colorimetric imaging. Multispectral imaging can be thought of as “the ultimate RAW” – capture the physics of the scene now, make the picture later. Applications include fine arts / museum analysis and archiving, medical imaging, hi-fi printing and displays, textiles, industrial inspection and quality control, remote sensing, computer graphics, and more.

What is the dimensionality of spectral reflectance? This relates to the number of channels needed by the multispectral image acquisition system. In theory, spectral reflectance has infinite dimensionality, but objects don’t have arbitrary reflectance spectra in practice. Various studies have been done to answer this problem, typically using PCA (Principal Component Analysis). However, these studies tend to produce a wide variety of answers, even when looking at the same sample set.

For the Munsell color chip set, various studies have derived dimensionalities ranging from 3 to 8. For paint/artwork from 5 to 12, for natural/general reflectances from 3 to 20. Note that these numbers do not correspond to a count of required measurement samples (regularly or irregularly spaced), but to the number of basis spectra required to span the space.

Dr. Hardeberg gave a short primer on PCA. Plotting the singular values can let you know when to “cut off” further dimensions. He proposed defining the effective dimensionality as the number of dimensions needed for the accumulated energy to reach 99% of the total – accumulated energy sounds like a good measure for PCA.
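The 99% accumulated-energy criterion is easy to sketch with an SVD. This is my own minimal illustration (the synthetic data and function name are assumptions, not from the course):

```python
import numpy as np

def effective_dimensionality(spectra, energy_threshold=0.99):
    """Estimate the dimensionality of a set of reflectance spectra via PCA.
    `spectra` is an (n_samples, n_wavelengths) array. Returns the number of
    principal components needed for the accumulated energy (cumulative sum
    of squared singular values) to reach the threshold."""
    centered = spectra - spectra.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(energy, energy_threshold) + 1)

# Synthetic check: 200 spectra that are random mixtures of 3 basis spectra
# should need at most 3 components to reach 99% accumulated energy.
rng = np.random.default_rng(0)
basis = rng.random((3, 31))        # 3 basis spectra, 31 wavelength samples
weights = rng.random((200, 3))
spectra = weights @ basis
```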

Dr. Hardeberg next discussed his own work on dimensionality estimation. He analyzed several reflectance sets:

  • MUNSELL: 1269 chips with matte finish, available from the University of Joensuu in Finland.
  • NATURAL: 218 colored samples collected from nature, also available at Joensuu.
  • OBJECT: 170 natural and man-made objects, online courtesy Michael Vrhel.
  • PIGMENTS: 64 oil pigments used in painting restoration, provided to ENST by National Gallery under the VASARI project (not available online)
  • SUBLIMATION: 125 equally spaced patches of a Mitsubishi S340-10 CMY sublimation printer

Based on the 99% accumulated energy criterion, he found the following dimensionalities for the various sets: 18 for MUNSELL, 23 for NATURAL, 15 for OBJECT, 13 for PIGMENTS, 10 for SUBLIMATION. The results suggest that 20 dimensions is a reasonable general-purpose number, but the optimal number will depend on the specific application.

The finding of 10 dimensions for the SUBLIMATION dataset may be viewed as surprising, since only three colorants (cyan, magenta, and yellow ink) were used. This is due to the nonlinear nature of color printing. A nonlinear model could presumably use as few as three dimensions, but a linear model needs 10 dimensions to reach 99% accumulated energy.

Multispectral color image acquisition systems are typically based on a monochrome CCD camera with several color filters. There are two variants – passive (filters in the optical path) and active (filters in the light path). Instead of multiple filters it is also possible to use a single Liquid Crystal Tunable Filter (LCTF). Dr. Hardeberg gave brief descriptions of several multispectral acquisition systems in current use, ranging from 6 to 16 channels.

Getting spectral reflectance values out of the multichannel measured values requires some work – Dr. Hardeberg detailed a model-based approach that takes a mathematical model of the acquisition device (how it measures values based on spectral input) and inverts it to generate spectral reflectance from the measured values.

There is work underway to find spectral acquisition systems that are cheaper, easier to operate, and faster while still generating high-quality reflectance data. One of these is happening in Dr. Hardeberg’s group, based on a Color-Filter Array (CFA) – similar to the Bayer mosaics found in many digital cameras, but with more channels. This allows capturing spectral information in one shot, with one sensor. Another example is a project that takes a stereo camera and puts different filters on each of the lenses, processing the resulting images to get stereoscopic spectral images with depth information.

Dr. Hardeberg ended by going over various current research areas for improving multispectral imaging, including a new EU-sponsored project by his lab which is focusing on multispectral printing.

Fundamentals of Spectral Measurements for Color Science

This course was presented by Dr. David R. Wyble, Munsell Color Science Lab at Rochester Institute of Technology.

Colorimetry isn’t so much measuring a physical value, as predicting the impression that will be formed in the mind of the viewer. Spectral measurements are more well-defined in a physical sense.

Terminology: Spectrophotometry measures spectral reflectance, transmittance or absorptance of a material as a function of wavelength. The devices used are spectrophotometers, which measure the ratio of two spectral photometric quantities, to determine the properties of objects or surfaces. Spectroradiometry is more general – measurement of spectral radiometric quantities. The devices used (spectroradiometers) work by measuring spectral radiometric quantities to determine the properties of light sources and other self-luminous objects. Reflectance, transmittance, absorptance are numerical ratios; the words “reflection”, “transmission”, and “absorption” refer to the physical processes. Most spectrophotometers measure at 10nm resolution, and spectroradiometers typically at 5-10nm.

Spectrophotometers

Spectrophotometers measure a ratio with respect to a reference, so no absolute calibration is needed. For reflectance we reference a Perfect Reflecting Diffuser (PRD) and for transmittance we use air. A PRD is a theoretical device – a Lambertian diffuser with 100% reflectance. Calibration transfer techniques are applied to enable the calculation of reflectance factor from available measured data.

Reflectance is the ratio of the reflected flux to the incident flux (problem – measuring incident flux). Reflectance Factor is the ratio of the flux reflected from the sample to the flux that would be reflected from an identically irradiated PRD (problem – where’s my PRD?).

The calibration equation (or why we don’t need a PRD): a reference sample (typically white) is provided together with Rref(λ) – the known spectral reflectance of the sample (λ stands for wavelength). This sample is measured to provide the “reference signal” iref(λ). In addition, “zero calibration” (elimination of dark current, stray light, etc.) is performed by measuring a “dark signal” idark(λ). Dark signal is measured either with a black reference sample or “open port” (no sample in the device). The calibration equation combines Rref(λ), iref(λ) and idark(λ) with the measured sample intensity isample(λ) to get the sample’s spectral reflectance Rsample(λ):

Rsample (λ) = Rref (λ) * (isample (λ) – idark (λ)) / (iref (λ) – idark (λ))
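In code, the calibration equation at a single wavelength is just a ratio of dark-corrected signals scaled by the known reference reflectance (a direct transcription, with my own parameter names):

```python
def sample_reflectance(i_sample, i_ref, i_dark, r_ref):
    """Apply the calibration equation at one wavelength:
    Rsample = Rref * (isample - idark) / (iref - idark)."""
    return r_ref * (i_sample - i_dark) / (i_ref - i_dark)

# Sanity check: a sample that reads exactly like the reference sample
# must come out with the reference's known reflectance.
r = sample_reflectance(i_sample=0.90, i_ref=0.90, i_dark=0.05, r_ref=0.95)
```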

Note that Rref(λ) was similarly measured against some other reference, and so on. So you have a pedigree of standards, ultimately leading to some national standards body. For example, if you buy a white reference from X-Rite, it was measured by X-Rite against a white tile they have that was measured at the National Institute of Standards and Technology (NIST).

A lot of lower-cost spectrophotometers don’t come with a reflectance standard – Dr. Wyble isn’t clear on how those work. You can always buy a reflectance standard separately and do the calibration yourself, but that is more risky – if it all comes from the same manufacturer you can expect that it was done properly.

Transmittance is the ratio of transmitted flux to incident flux. At the short path lengths in these devices, air is effectively a perfect transmitter for visible light, so a “transmittance standard” is not needed – the incident flux can be measured directly by measuring “open port” (no sample). For liquids you can measure an empty container, and when measuring specific colorants dissolved in a carrier fluid you can measure a container full of clean carrier fluid.

Calibration standards must be handled, stored and cleaned with care according to manufacturer instructions, otherwise incorrect measurement will result. A good way to check is to measure the white standard and check the result just before measuring the sample.

A spectrophotometer typically includes a light source, a sample holder, a diffraction grating (for separating out spectral components) and a CCD array sensor, as well as some optics.

Measurement geometry refers to the measuring setup; variables such as the angles of the light source and sensor to the sample, the presence or absence of baffles to block certain light paths, the use (or not) of integrating hemispheres, etc. Dr. Wyble went into a few examples, all taken from the CIE 15:2004 standards document. Knowledge of which measurement geometry was used can be useful, e.g. to estimate how much specular reflectance was included in a given measurement (different geometries exclude specular by different degrees). Some special materials (“gonio-effects” pigments that change color based on angle, fluorescent, metallic, retroreflective, translucent, etc.) will break the standard measurement geometries and need specialized measuring methods.

Spectroradiometers

Similar to spectrophotometers, but have no light source or sample holder. The light from the luminous object being measured goes through some optics and a dispersing element to a detector. There are no standard measurement geometries for spectroradiometry.

Some spectroradiometers measure radiance directly emitted from the source through focused optics (typically used for measuring displays). Others measure irradiance – the light incident on a surface (typically used for measuring illuminants). Irradiance measurements can be done by measuring radiance from a diffuse white surface, such as pressed polytetrafluoroethylene (PTFE) powder.

Irradiance depends on the angle of incident light and the distance of the detector. Radiance measured off diffuse surfaces is independent of angle to the device. Radiance measured off uniform surfaces is independent of distance to the device.

Instrument Evaluation: Repeatability (Precision) and Accuracy

Repeatability – do you get similar results each time? Accuracy – is the result (on average) close to the correct one? Repeatability is more important since repeatable inaccuracies can be characterized and corrected for.

Measuring repeatability – the standard deviations of reflectance or colorimetric measurements. The time scale is important: short-term repeatability (measurements one after the other) should be good for pretty much any device. Medium-term repeatability is measured over a day or so, and represents how well the device does between calibrations. Long-term repeatability is measured over weeks or months – the device would typically be recalibrated several times over such an interval. The most common measure of repeatability is Mean Color Difference from the Mean (MCDM). It is measured by making a series of measurements of the same sample (removing and replacing each time to simulate real measurements), calculating L*a*b* values for each, calculating the mean, calculating ΔE*ab between each value and the mean, and finally averaging the ΔE*ab values to get the MCDM. The MCDM will typically be about 0.01 (pretty good) to 0.4 (really bad). Small handheld devices commonly have around 0.2.
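The MCDM procedure above is short enough to write out directly. A minimal sketch, using the simple Euclidean ΔE*ab (CIE76) for the color difference:

```python
import numpy as np

def mcdm(lab_measurements):
    """Mean Color Difference from the Mean. `lab_measurements` is an
    (n, 3) array of L*a*b* values from repeated measurements of the
    same sample; ΔE*ab here is the plain Euclidean distance."""
    lab = np.asarray(lab_measurements, dtype=float)
    mean = lab.mean(axis=0)              # mean L*a*b* of the series
    delta_e = np.linalg.norm(lab - mean, axis=1)  # ΔE*ab to the mean
    return delta_e.mean()

# Three repeated readings of one sample, scattering around (50, 10, 10):
readings = [[50.0, 10.0, 10.0],
            [50.1, 10.1,  9.9],
            [49.9,  9.9, 10.1]]
```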

Quantifying accuracy – typically done by measuring the spectral reflectance of a set of known samples (e.g. BCRA tiles) that have been previously measured at high-accuracy laboratories: NIST, NRC, etc. The measured values are compared to the “known” values and the MCDM is calculated as above. Once the inaccuracy has been quantified, this can be used to correct further measurement with the device (using regression analysis). When applied to the test tile values, the correction attempts to match the reference tile values. When applied to measured data, the correction attempts to predict reflectance data as if the measurements were made on the reference instrument. Note that the known values of the samples have uncertainties in them. The best uncertainty you can get is the 45:0 reflectometer at NIST, which is about 0.3%-0.4% (depending on wavelength) – you can’t do better than that.
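One simple form such a regression correction can take is a per-wavelength gain and offset fitted by least squares over the known tiles. This is my own sketch of the idea – actual instrument-correction procedures may use higher-order models:

```python
import numpy as np

def fit_correction(measured, reference):
    """Fit a per-wavelength linear correction (gain, offset) mapping this
    instrument's readings onto the reference instrument's, via least
    squares over a set of known tiles.
    measured, reference: (n_tiles, n_wavelengths) arrays."""
    n_tiles, n_wl = measured.shape
    gains, offsets = np.empty(n_wl), np.empty(n_wl)
    for w in range(n_wl):
        A = np.column_stack([measured[:, w], np.ones(n_tiles)])
        (gains[w], offsets[w]), *_ = np.linalg.lstsq(A, reference[:, w], rcond=None)
    return gains, offsets

def apply_correction(measured, gains, offsets):
    """Predict what the reference instrument would have read."""
    return measured * gains + offsets

# Synthetic check: a device that reads 2% high with a constant offset
# should be fully corrected by the fitted gain/offset.
rng = np.random.default_rng(2)
measured = rng.random((5, 4))            # 5 tiles, 4 wavelengths
reference = 1.02 * measured + 0.01
gains, offsets = fit_correction(measured, reference)
```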

Using the same procedure, instead of aligning your instruments with NIST, you can align a corporate “fleet” of instruments (used in various locations) to a “master” instrument.

2011 Color and Imaging Conference, Part II: Courses A

CIC traditionally includes a strong course program, with a two-day course on fundamentals (a DVD of this course presented by Dr. Hunt can be purchased online) and a series of short courses on more specialized topics. Since I attended the fundamentals course last year, this year I only went to short courses. This blog post will detail three of these courses, with the others covered by a future post.

Color Pipelines for Computer Animated Features

The first part of the course was presented by Rod Bogart. Rod is the lead color science expert at Pixar, and worked on color-related issues at ILM before that.

The animated feature pipeline has many steps, some of which are color-critical (underlined) and some which aren’t: Story, Art, Layout, Animation, Shading, Lighting, Mastering, and Exhibition. The people working on the underlined stages are the ones with color-critical monitors on their desks. Rod’s talk went through the color-critical stages of the pipeline, discussing related topics on the way.

Art

In this stage people look at reference photos, establish color palettes, and do look development. Accurate color is important. Often, general studies are done on how exteriors, characters, etc. might look. This is mostly done in Photoshop on a Mac.

Art is the first stage where people make color-critical images. In general, all images made in animated feature production exist for one of two reasons – for looking at directly, or to be used for making more images (e.g., textures). The requirements for image processing will vary depending on which group they belong to. During the Art stage the images generated are intended for viewing.

Images for viewing can be quantized as low as 8 bits per channel, and even (carefully) compressed. Pixel values tend to be encoded to the display device (output referred). In the absence of a color management system, the encoding just maps to frame buffer values, which feed into a display response curve. However, it is better to tag the image with an assumed display device (ICC tagging to a target like sRGB; other metadata attributes can be stored with the image as well). It’s important to minimize color operations done on such images, since they have already been quantized and have no latitude for processing. These images contain low dynamic range (LDR) data.

During the Art phase, images are typically displayed on RGB additive displays calibrated to specific reference targets. Display reference targets include specifications for properties such as the chromaticity coordinates of the RGB primaries and white point, the display response curve, the display peak white luminance and the contrast ratio or black level.

Shading

Shading and antialiasing operations need to occur on linear light values – values that are proportional to physical light intensity. Other operations that require linear values include resizing, alpha compositing, and filtering. Rendered buffers are written out as HDR values and later used to generate the final image.
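Why linear light matters can be shown with the simplest possible operation – averaging two pixels, as a 2x downsize would. A minimal sketch, assuming a pure 2.2 power curve rather than the exact sRGB transfer function:

```python
def decode(v, gamma=2.2):
    """Display-encoded value -> linear light (illustrative pure power curve)."""
    return v ** gamma

def encode(v, gamma=2.2):
    """Linear light -> display-encoded value."""
    return v ** (1.0 / gamma)

a, b = 0.2, 0.9                               # two gamma-encoded pixel values
wrong = (a + b) / 2                           # averaging encoded values directly
right = encode((decode(a) + decode(b)) / 2)   # averaging in linear light
# Averaging encoded values under-weights the bright pixel, giving a
# result that is too dark compared to the physically correct average.
```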

Lighting

Lighting is sometimes done with special light preview software, and sometimes using other methods such as “light soloing”. “Light soloing” is a common practice where a buffer is written out for the contribution of each light in the scene (all other lights are set to black) and then the lighters can use compositing software to vary individual light colors and intensities and combine the results.
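Recombining solo light buffers works because light transport is linear in the light intensities: scaling each light's buffer and summing matches what a re-render with rescaled lights would produce. A minimal sketch (buffer shapes and the per-light RGB gains are my own assumptions):

```python
import numpy as np

def relight(solo_buffers, gains):
    """Recombine per-light 'solo' render buffers with new light colors.
    solo_buffers: list of (H, W, 3) linear HDR arrays, one per light.
    gains: per-light RGB multipliers applied to that light's contribution."""
    out = np.zeros_like(solo_buffers[0])
    for buf, gain in zip(solo_buffers, gains):
        out += buf * np.asarray(gain, dtype=buf.dtype)
    return out

# Two solo buffers; tint one light red and the other green:
key = np.ones((2, 2, 3))
fill = np.ones((2, 2, 3))
result = relight([key, fill], [(2.0, 0.0, 0.0), (0.0, 3.0, 0.0)])
```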

For images such as these “solo light buffers” which are used to assemble viewable images, Pixar uses the OpenEXR format. This format stores linear scene values with a logarithmic distribution of numbers – each channel is a 16-bit half-float. The range of possible values is -65504.0 to +65504.0. The positive normalized range can be thought of as 30 stops (powers of 2) of data, with 1024 steps in each of the stops.
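The "1024 steps per stop" property of half-floats can be checked directly by counting the distinct float16 values inside one stop, e.g. [1.0, 2.0):

```python
import numpy as np

# Half-float bit patterns 0x3C00..0x3FFF cover exactly [1.0, 2.0).
stop = np.arange(0x3C00, 0x4000, dtype=np.uint16)
values = stop.view(np.float16)        # reinterpret bits as half-floats
steps_per_stop = len(np.unique(values))

half_max = float(np.finfo(np.float16).max)   # largest representable half
```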

After images are generated, they need to be viewed. This is done in various review spaces: monitors (CRT or calibrated LCD) on people’s desks, as well as various special rooms (review rooms, screening rooms, grading suites) where images are typically shown on DLP projectors. In review rooms the projector is usually hooked up directly to a workstation, while screening rooms use special digital cinema playback systems or “dailies” software. Pixar try not to have any monitors in the screening rooms – screening rooms are dark and the monitors are intended (and calibrated) for brighter rooms.

Mastering

The mastering process includes in-house color grading. This covers two kinds of operations: shot-to-shot corrections and per-master operations. An example of a shot-to-shot correction: in “Cars” in one of the shots the grass ended up being a slightly different color than in other shots in the sequence – instead of re-rendering the shot, it was graded to make the grass look more similar to the other shots. In contrast, per-master operations are done to make the film fit a specific presentation format.

Mastering for film: film has a different gamut than digital cinema projection. Neither is strictly larger – each has colors the other can’t handle. Digital is good for bright, saturated colors, especially primary colors – red, green, and blue. Film is good for dark, saturated colors, especially secondary colors – cyan, magenta, and yellow. Pixar doesn’t generate any film gamut colors that are outside the digital projection gamut, so they just need to worry about the opposite case – mapping colors from outside the film gamut so they fit inside it, and previewing the results during grading.

Mapping into the film gamut is complex. Pixar try to move colors that are already in-gamut as little as possible (the ones near the gamut border do need to move a little to “make room” for the remapped colors). For the out-of-gamut colors, Pixar first tried a simple approach – moving to the closest point on the gamut boundary. However, this method doesn’t preserve hue. An example of the resulting problems: in the “Cars” night scene where Lightning McQueen and Mater go tractor-tipping, the closest-point gamut mapping made Lightning McQueen’s eyes go from blue (due to the night-time lighting) to pink, which was unacceptable. Pixar figured out a proprietary method which involves moving along color axes. This sometimes changes the chroma or lightness quite a bit, but tends to preserve hue and is more predictable for the colorist to tweak if needed.

For film mastering Pixar project the content in the P3 color space (originally designed for digital projection), but with a warmer white point more typical of analog film projection.
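The hue problem with closest-point mapping can be demonstrated in a toy setting. The sketch below uses the unit RGB cube as a stand-in "gamut": per-channel clipping (the closest-point mapping for a cube) shifts hue, while moving the color toward its own gray value until it fits does not. This is only an illustration of the principle, not Pixar's actual method:

```python
import colorsys  # stdlib; used only to read back the HSV hue

def clip_rgb(rgb):
    """Closest-point mapping to the unit RGB cube (per-channel clip)."""
    return tuple(min(max(c, 0.0), 1.0) for c in rgb)

def desaturate_rgb(rgb):
    """Hue-preserving alternative: scale the color toward its own gray
    value just enough to land inside the cube. Assumes the gray value
    itself lies strictly inside (0, 1)."""
    gray = sum(rgb) / 3.0
    t = 0.0  # fraction of the way toward gray we must move
    for c in rgb:
        if c > 1.0:
            t = max(t, (c - 1.0) / (c - gray))
        elif c < 0.0:
            t = max(t, (0.0 - c) / (gray - c))
    return tuple(gray + (1.0 - t) * (c - gray) for c in rgb)

out_of_gamut = (1.4, 0.2, 0.6)   # red channel exceeds the cube
```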

Mastering for digital cinema: color grading for digital cinema is done in a tweaked version of the P3 color space – instead of using the standard P3 white point (which is quite greenish) they use D65, which is the white point people have been using on their monitors while creating the content. Finally a Digital Cinema Distribution Master (DCDM) is created – this stores colors in XYZ space, encoded at 12 bits per channel with a gamma of 2.6.

Mastering for HD (Blu-ray and HDTV broadcast): color grading for HD is done in the standard Rec.709 color space. The Rec.709 green and red primaries are much less saturated than the P3 ones; the blue primary has similar saturation to the P3 blue but is darker. The HD master is stored in RGB, quantized to 10 bits. Rod talked about the method Pixar use for dithering while quantizing – it’s an interesting method that might be relevant for games as well. The naïve approach would be to round to the closest quantized value. This is the same as adding 0.5 and rounding down (truncating). Instead of adding 0.5, Pixar add a random number distributed uniformly between 0 and 1. This gives the same result on average, but dithers away a lot of the banding that would otherwise result.
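The random-rounding dither is a one-line change from naïve rounding. A minimal sketch on a shallow gradient (the 10-bit level count and the test gradient are my own choices):

```python
import numpy as np

rng = np.random.default_rng(1)

def quantize_naive(x, levels=1023):
    """Round to nearest level: equivalent to adding 0.5 then truncating."""
    return np.floor(x * levels + 0.5) / levels

def quantize_dithered(x, levels=1023):
    """Add uniform [0, 1) noise instead of 0.5 before truncating. Same
    result on average, but shallow gradients dither instead of banding."""
    return np.floor(x * levels + rng.random(x.shape)) / levels

# A very shallow gradient spanning about one quantization step: naive
# rounding collapses it into flat bands, dithering breaks the bands up
# while preserving the mean level.
gradient = np.linspace(0.4, 0.401, 10000)
```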

Exhibition

Exhibition for digital cinema: this uses a Digital Cinema Package (DCP) in which each frame is compressed using JPEG2000. The compression is capped to 250 megabits per second – this limit was set during the early days of digital cinema, and any “extra features” such as stereo 3D, 4K resolution, etc. still have to fit under the same cap.

Exhibition for HD (Blu-ray, HDTV broadcast): the 10-bit RGB master is converted to YCbCr, chroma subsampled (4:2:2) and further quantized to 8 bits. This is all done with careful dithering, just like the initial 10-bit quantization. MPEG4 AVC compression is used for Blu-ray, with a 28-30 megabits per second average bit rate, 34 megabits per second peak.

Disney’s Digital Color Workflow – Featuring “Tangled”

The second part of the course was presented by Stefan Luka, a senior color science engineer at Walt Disney Animation Studios. Disney uses various display technologies, including CRT, LCD and DLP projectors. Each display has a gamut that defines the range of colors it can show. Disney previously used CRT displays, which have excellent color reproduction but are unstable over time and have a limited gamut. They now consider LCD color reproduction to finally be good enough to replace CRTs (several in the audience disputed this), and primarily use HP Dreamcolor LCD monitors. These are very stable, can support wide gamuts (due to their RGB LED backlights), and include programmable color processing.

Disney considered using Rec.709 calibration for the working displays, but the artists really wanted P3-calibrated displays, mostly to see better reds. Rec.709’s red primary is a bit orangish – P3’s red primary is very pure, essentially on the spectral locus. Disney calibrate the displays with P3 primaries, a D65 white point, and a 2.2 gamma (which Stefan says matches the CRTs used at that time). The viewing environment in the artists’ rooms is not fully controlled, but the lighting is typically dim.

Disney calibrate their displays by mounting them in a box lined with black felt in front of a spectroradiometer. They measure the primaries and ramps on each channel to build lookup tables. For software Disney use a custom-tweaked version of a tool from HP called “Ookala” (the original is available on SourceForge). When calibrating they make sure to let the monitor warm up first, since LEDs are temperature dependent. The HP DreamColor has a temperature sensor which can be queried electronically, so this is easy to verify before starting calibration. A spectroradiometer is needed for this calibration – Stefan said that colorimeters are generally not good enough to calibrate a display like this, though perhaps the latest one from X-Rite (the i1Display Pro) could work. Only people doing color-critical work have DreamColor monitors – Disney couldn’t afford to give them to everyone. People with non-color-critical jobs use cheaper displays.

During “Tangled” production, the texture artists painted in display-encoded RGB, saved as 16-bit (per channel) TIFF or PSD. They used sRGB encoding (managed via ICC or external metadata/LUT) since it makes the bottom bits go through better than a pure power curve. Textures were converted to linear RGB for rendering. Rendering occurred in linear light space; the resulting images had a soft roll-off applied to the highlights and were written to 16-bit TIFF (if they had been saving to OpenEXR – which they plan to do for future movies – they wouldn’t have needed to roll off the highlights). Compositing inputs and final images were all 16-bit TIFFs.

During post production final frames are conformed and prepared for grading. The basic grade is done for digital cinema, with trim passes for film, stereoscopic, and HD.

The digital cinema grade is done in a reference room with a DLP projector using P3 primaries, D65 white point, 2.2 gamma, and 14 foot-Lamberts reference white. The colorist uses “video” style RGB grading controls, and the result is encoded in 12-bit XYZ space with 2.6 gamma, dithered, and compressed using JPEG2000.

For the film deliverable, Disney adjust the projector white point and view the content through the same film gamut mapping that Pixar uses. They then do a trim pass. White point compensation is also needed; the content was previously viewed at D65 but needs to be adjusted for the native D55 film white point to avoid excessive brightness loss. A careful process needs to be done to bridge the gap between the two white points. At the output, film gamut mapping as well as an inverse film LUT is applied to go from the projector-previewed colors to values suitable for writing to film negative. Finally, Disney review the content at the film lab and call printer lights.

Stereo digital cinema – luminance is reduced to 4.5 foot-Lamberts (in the field there will be a range of stereo luminances, Disney make an assumption here that 4.5 is a reasonable target). They do a trim pass, boosting brightness, contrast, and saturation to compensate for the greatly reduced luminance. The colorist works with one stereo eye at a time (working with stereo glasses constantly would cause horrible headaches). Afterwards the result is reviewed with glasses, output & encoded similarly as the mono digital cinema deliverable.

HD mastering – Disney also use a DLP projector for HD, but view it through a Rec.709 color-space conversion and with reference white set to 100 nits. They do a trim pass (mostly global adjustments needed due to the increase in luminance), output and bake the values into Rec.709 color space. Then Disney compress and review final deliverables on an HD monitor in a correctly set up room with proper backlight etc.

After finishing “Tangled”, Disney wanted to determine whether it was really necessary for production to work in P3; could they instead work in Rec.709 and have the colorist tweak the digital cinema master to the wider P3 gamut? Stefan said that this question depends on the distribution of colors in a given movie, which in turn depends a lot on the art direction. Colors can go out of gamut due to saturation, or due to brightness, or both. Stefan analyzed the pixels that went out of Rec.709 gamut throughout “Tangled”. Most of the out-of-gamut colors were due to brightness – most importantly flesh tones. A few other colors went out of gamut due to saturation: skies, forests, dark burgundy velvet clothing on some of the characters, etc.

Stefan showed four example frames on a DreamColor monitor, comparing images in full P3 with the same images gamut-mapped to Rec.709. Two of the four barely changed. Of the remaining two, one was a forest scene with a cyan fog in the background which shifted to green when gamut-mapped. Another shot, with glowing hair, had colors out of Rec.709 gamut due to both saturation & brightness.

At the end of the day, the artists weren’t doing anything in P3 that couldn’t have been produced at the grading stage, so Stefan doesn’t think doing production in P3 had much of a benefit. P3 was mostly used to boost brightness, so working in 709 space with additional headroom (e.g. OpenEXR) would be good enough.

After “Tangled”, Disney moved from 16-bit TIFFs to OpenEXR, helped by their recent adoption of Nuke (which has fast floating-point compositing – “Tangled” was composited on Shake). They also eliminated the sRGB encoding curve, and now just use a 2.2 gamma without any LUTs. Disney no longer need to do a soft roll off of highlights when rendering since OpenEXR can contain the full highlight detail. They are doing some experiments with HDR tone mapping, especially tweaking the saturation. Disney have also moved to working in Rec.709 instead of P3 for production (for increased compatibility between formats) and are using non-wide-gamut monitors (still HP, but not DreamColor).

In the future, Disney plan to do more color management throughout the pipeline, probably using the open-source OpenColorIO library. They also plan to investigate improvements in gamut mapping, including local contrast preservation (taking account of which colors are placed next to each other spatially, and not collapsing them to the same color when gamut mapping).

Color in High-Dynamic Range Imaging

This course was presented by Greg Ward. Greg is a major figure in the HDR field, having developed various HDR image formats (LogLuv TIFF and JPEG-HDR, as well as the first HDR format, RGBE), the first widely-used HDR rendering system (RADIANCE), and the first commercially available HDR display, as well as various pieces of software relating to HDR (including the Photosphere HDR image builder and browsing program). He’s also done important work on reflectance models, but that’s outside the scope of this course.

HDR Color Space and Representations

Images can be scene-referred (data encodes scene intensities) or output-referred (data encodes display intensities). Since human visual abilities are (pretty much) known, and future display technologies are mostly unknown, scene-referred images are more useful for long-term archival. Output-referred images are useful in the short term, for a specific class of display technology. Human perceptual abilities can be used to guide color space encoding of scene-referred images.

The human visual system is sensitive to luminance values over a range of about 1:10^14, but not in a single image. The human simultaneous range is about 1:10,000. The range of sRGB displays is about 1:100.

The HDR imaging approach is to render or capture floating-point data in a color space that can store the entire perceivable gamut. Post-processing is done in the extended color space, and tone mapping is applied for each specific display. This is the method adopted in the Academy Color Encoding Specification (ACES) used for digital cinema. Manipulation of HDR data is much preferred because then you can adjust exposure and do other types of image manipulation with good results.

HDR imaging isn’t new – black & white negative film can hold at least 4 orders of magnitude, while the final print holds much less. Much of the talent of photographers like Ansel Adams was darkroom technique – “dodging” and “burning” to bring out the dynamic range of the scene on paper. The digital darkroom provides new challenges and opportunities.

Camera RAW is not HDR; the number of bits available is insufficient to encode HDR data. A comparison of several formats which are capable of encoding HDR follows (using various metrics, including error on an “acid test” image covering the entire visible gamut over a 1:10^8 dynamic range).

  • Radiance RGBE & XYZE: a simple format (three 8-bit mantissas and one 8-bit shared exponent) with open source libraries. Supports lossless (RLE) compression (20% average compression ratio). However, it does not cover the visible gamut, the large dynamic range comes at the expense of accuracy, and the color quantization is not perceptually uniform. RGBE had visible error on the “acid test” image; XYZE performed much better but still had some barely perceptible error.
  • IEEE 96-bit TIFF (IEEE 32-bit float for each channel) is the most accurate representation, but the files are enormous (even with compression – 32-bit IEEE floats don’t compress very well).
  • 16-bit per channel TIFF (RGB48) is supported by Photoshop and TIFF libraries including libTIFF. 16 bits each of gamma-compressed R, G, and B; LZW lossless compression is available. However, it does not cover the visible gamut, and most applications interpret the maximum value as “white”, turning it into a high-precision LDR format rather than an HDR format.
  • SGI 24-bit LogLuv TIFF Codec: implemented in libTIFF. 10-bit log luminance, and a 14-bit lookup into a ‘rasterized human gamut’ in CIE (u’,v’) space. It just covers the visible gamut and range, but the dynamic range doesn’t leave headroom for processing and there is no compression support. Within its dynamic range limitations, it had barely perceptible errors on the “acid test” image (but failed completely outside these limits).
  • SGI 32-bit LogLuv TIFF Codec: also in libTIFF. A sign bit, 16-bit log luminance, and 8 bits each for CIE (u’,v’). Supports lossless (RLE) compression (30% average compression). It had barely perceptible errors on the “acid test” image.
  • ILM OpenEXR Format: 16-bit float per primary (sign bit, 5-bit exponent, 10-bit mantissa). Supports alpha and multichannel images, as well as several lossless compression options (2:1 typical compression – compressed sizes are competitive with other HDR formats). Has a full-featured open-source library as well as massive support by tools and GPU hardware. The only reasonably-sized format (i.e. excluding 96-bit TIFF) which could represent the entire “acid test” image with no visible error. However, it is relatively slow to read and write. Combined with CTL (Color Transformation Language – a similar concept to ICC, but designed for HDR images), OpenEXR is the foundation of the Academy of Motion Picture Arts & Sciences’ IIF (Image Interchange Framework).
  • Dolby’s JPEG-HDR (one of Greg’s projects): backwards-compatible JPEG extension for HDR. A tone-mapped sRGB image is stored for use by naïve (non-HDR-aware) applications; the (monochrome) ratio between the tone-mapped luminance and the original HDR scene luminance is stored in a subband. JPEG-HDR is very compact: about 1/10 the size of the other formats. However, it only supports lossy encoding (so repeated I/O will degrade the image) and has an expensive three-pass writing process. Dolby will soon release an improved version of JPEG-HDR on a trial basis; the current version is supported by a few applications, including Photoshop (through a plugin – not natively) and Photosphere (which will be detailed later in the course).
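The shared-exponent idea behind the Radiance RGBE entry above can be sketched in a few lines of Python. This is a simplified illustration of the concept only – the real Radiance library adds run-length encoding, header handling, and more careful rounding:

```python
import math

def rgbe_encode(r, g, b):
    """Pack three non-negative floats into (r8, g8, b8, e8) bytes
    with one exponent shared by all three mantissas."""
    m = max(r, g, b)
    if m < 1e-32:
        return (0, 0, 0, 0)
    # frexp returns a mantissa in [0.5, 1) and the binary exponent
    mantissa, exponent = math.frexp(m)
    scale = mantissa * 256.0 / m
    return (int(r * scale), int(g * scale), int(b * scale), exponent + 128)

def rgbe_decode(r8, g8, b8, e8):
    """Recover approximate floats from an RGBE quadruple."""
    if e8 == 0:
        return (0.0, 0.0, 0.0)
    f = math.ldexp(1.0, e8 - (128 + 8))
    return (r8 * f, g8 * f, b8 * f)
```

The shared exponent is what buys the large dynamic range from only 32 bits, and also what costs accuracy: the two smaller channels are quantized on the largest channel’s scale, so relative error grows when the channels differ greatly – one reason for the visible error noted above.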

HDR Capture and Photosphere

Standard digital cameras capture about 2 orders of magnitude in sRGB space. Using multiple exposures enables building up HDR images, as long as the scene and camera are static. In the future, HDR imaging will be built directly into camera hardware, allowing for HDR capture with some amount of motion.

Multi-exposure merge works by using a spatially-variant weighting function that depends on where the values sit within each exposure. The camera’s response function needs to be recovered as well.
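The merge step can be sketched as follows – a minimal illustration assuming a known, linear camera response (so radiance is proportional to pixel value divided by exposure time) and a simple hat-shaped weighting function; real implementations such as Photosphere also recover the response curve:

```python
def hat_weight(v):
    """Trust mid-range values most; distrust values near 0 (noise)
    and near 1 (clipping)."""
    return max(0.0, 1.0 - abs(2.0 * v - 1.0))

def merge_exposures(values, exposure_times):
    """Merge one pixel's normalized values (0-1) across exposures
    into a single relative radiance estimate."""
    num = 0.0
    den = 0.0
    for v, t in zip(values, exposure_times):
        w = hat_weight(v)
        num += w * (v / t)  # linear response assumed
        den += w
    return num / den if den > 0 else 0.0
```

For example, a pixel reading 0.5 at 1 s and 0.25 at 0.5 s yields the same radiance estimate (0.5) from both exposures, and the weighted merge returns exactly that.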

The Photosphere application (available online) implements the various algorithms discussed in this section. Exposures need to be aligned – Photosphere does this by generating median threshold bitmaps (MTBs), which are constant across exposures (unlike edge maps). MTBs are generated from a grayscale image pyramid built from the original image, and alignments are propagated up the pyramid. Rotational as well as translational alignments are supported. This technique was published by Greg in a 2003 paper in the Journal of Graphics Tools.
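A toy version of the MTB computation (a sketch only – the real method computes this per pyramid level, and uses an exclusion mask to ignore pixels so close to the median that sensor noise makes their bit flip between exposures):

```python
def median_threshold_bitmap(gray, tolerance=4):
    """Return a bitmap that is 1 where a pixel exceeds the image median,
    plus an exclusion mask flagging pixels within `tolerance` of the
    median (too unstable to use for alignment)."""
    flat = sorted(v for row in gray for v in row)
    median = flat[len(flat) // 2]
    bitmap = [[1 if v > median else 0 for v in row] for row in gray]
    exclude = [[1 if abs(v - median) <= tolerance else 0 for v in row]
               for row in gray]
    return bitmap, exclude
```

Because the median is (nearly) invariant under exposure change, two exposures’ bitmaps can be compared (XORed) at candidate offsets and the offset with the fewest differing bits taken as the alignment, refined coarse-to-fine up the pyramid.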

Photosphere also automatically removes “ghosts” (caused by objects which moved between exposures) and reconstructs an estimate of the point-spread function (PSF) for glare removal.

Greg then gave a demo of the new Windows version of Photosphere, including its HDR image browsing and cataloging abilities. Its merging capabilities also include the unique option of outputting absolute HDR values for all pixels, if the user inputs an absolute value for a single patch (typically a grey card measured by a separate device). This only needs to be done once per camera.

Image-Based Lighting

Take an HDR (bracketed exposure) image of a mirrored ball and use it for lighting. Use a background plate to fill in the “pinched” region behind the ball. Render synthetic objects with the captured lighting and composite them into the real scene, with optional addition of shadows. Greg’s description of HDR lighting capture is a bit out of date – most VFX houses no longer use mirrored balls for capture (they still use them for reference); panoramic cameras or DSLRs on nodal mounts are typically used instead.

Tone-Mapping and Display

A renderer is like an ideal camera. Tone mapping is medium-specific and goal-specific. The user needs to consider display gamut, dynamic range, and surround. What do we wish to simulate – cinematic camera and film, or human visual abilities and disabilities? Possible goals include colorimetric reproduction, matching visibility, or optimizing contrast & color sensitivity.

Histogram tone-mapping is a technique that generates a histogram of log luminance for the scene, and creates a curve that redistributes luminance to fit the output range.
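A simplified sketch of the histogram approach: build a histogram of scene log luminance, then map each pixel through the normalized cumulative histogram into the display’s log range. (The published operator additionally caps the histogram bin counts so contrast is never expanded beyond what a linear mapping would give; that refinement is omitted here.)

```python
import math

def histogram_tonemap(luminances, bins=100, out_min=1.0, out_max=100.0):
    """Map scene luminances into [out_min, out_max] cd/m2 via the
    cumulative histogram of log10 luminance."""
    logs = [math.log10(max(v, 1e-9)) for v in luminances]
    lo, hi = min(logs), max(logs)
    width = (hi - lo) / bins or 1.0  # guard against a flat image
    counts = [0] * bins
    for x in logs:
        counts[min(int((x - lo) / width), bins - 1)] += 1
    # normalized cumulative distribution over the bins
    total = sum(counts)
    cdf, acc = [], 0
    for c in counts:
        acc += c
        cdf.append(acc / total)
    out_lo, out_hi = math.log10(out_min), math.log10(out_max)
    return [10 ** (out_lo + cdf[min(int((x - lo) / width), bins - 1)]
                   * (out_hi - out_lo))
            for x in logs]
```

Luminance values that are common in the scene get more of the output range; sparsely populated ranges get compressed.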

Greg discussed various other tone mapping methods. He mentioned a SIGGRAPH 2005 paper that used an HDR display to compare many different tone-mapping operators.

HDR Display Technologies

  • Silicon Light Machines Grating Light Valve (GLV) – amazing dynamic range, widest gamut, still in development. Promising for digital cinema.
  • Dolby Professional Reference Monitor PRM-4200: an LED-based 42″ production unit based on technology that Greg worked on. He says this is extended dynamic range, but not true HDR (it goes up to 600 cd/m2).
  • SIM2 Solar Series HDR display: also based on the (licensed) Dolby tech – Greg says this is closer to what Dolby originally had in mind. It’s a 47″ display with a 2,206-LED backlight that goes up to 4,000 cd/m2.

As an interesting example, Greg also discussed an HDR transparency (slide) viewer that he developed back in 1995 to evaluate tone mapping operators. It looks similar to a ViewMaster but uses much brighter lamps (50 Watts for each eye, necessitating a cooling fan and heat-absorbing glass) and two transparency layers – a black-and-white (blurry) “scaling” layer as well as a color (sharp) “detail” layer. Together these layers yield 1:10,000 contrast. The principles used are similar to other dual-modulator displays; the different resolution of the two layers avoids alignment problems. Sharp high-contrast edges work well despite the blurry scaling layer – scattering in the eye masks the artifacts that would otherwise result.
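The dual-modulator principle – a blurry scaling layer multiplied by a sharp detail layer – can be sketched in one dimension. This is an illustrative decomposition under simple assumptions (geometric-mean blur, square-root split), not the actual process used to produce the viewer’s transparencies:

```python
import math

def split_dual_layer(luminance, blur_radius=1):
    """Split a 1-D HDR luminance array into a blurry 'scaling' layer and
    a sharp 'detail' layer whose per-pixel product reproduces the input.
    With a wide enough blur, each layer carries only about the square
    root of the total contrast."""
    n = len(luminance)
    scale = []
    for i in range(n):
        lo, hi = max(0, i - blur_radius), min(n, i + blur_radius + 1)
        window = luminance[lo:hi]
        # blurred geometric mean of sqrt-luminance = low-frequency layer
        logs = [0.5 * math.log(v) for v in window]
        scale.append(math.exp(sum(logs) / len(logs)))
    detail = [v / s for v, s in zip(luminance, scale)]
    return scale, detail
```

Multiplying the two layers recovers the original exactly; in the slide viewer the multiplication happens optically, and scattering in the eye hides the halo around sharp edges that the blurry scaling layer would otherwise cause.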

New displays based on RGB LED backlights have the potential to achieve not just high dynamic range but greatly expanded gamut – the new LEDs are spectrally pure and the LCD filters can select between them easily, resulting in very saturated primaries.

HDR Imaging in Cameras, Displays and Human Vision

The course was presented by Prof. Alessandro Rizzi from the Department of Information Science and Communication at the University of Milan. With John McCann, he co-authored the book “The Art and Science of HDR Imaging” on which this course is based.

HDR Issues

The imaging pipeline starts with scene radiances generated from the illumination and objects. These radiances go through a lens, a sensor in the image plane, and sensor image processing to generate a captured image. This image goes through media processing before being shown on a print or display, to generate display radiances. These go through the eye’s lens and intraocular medium, form an image on the retina, which is then processed by the vision system’s image processing to form the final reproduction appearance. Prof. Rizzi went over HDR issues relating to various stages in the pipeline.

The dynamic range issue relates to the scene radiances. Is it useful to define HDR based on a specific threshold number for the captured scene dynamic range? No. Prof. Rizzi defines HDR as “a rendition of a scene with greater dynamic range than the reproduction media”. In the case of prints, this is almost always the case, since print media has an extremely low dynamic range. Renaissance painters were the first to successfully do HDR renditions – example paintings were shown and compared to similar photographs. The paintings were able to capture a much higher dynamic range while still appearing natural.

A table was shown of example light levels, each listed with luminance in cd/m2. Note that these values are all for the case of direct observation, e.g. “sun” refers to the brightness of the sun when looking at it directly (not recommended!) as opposed to looking at a surface illuminated by the sun (that is a separate entry).

  • Xenon short arc: 200,000 – 5,000,000,000
  • Sun: 1,600,000,000
  • Metal halide lamp: 10,000,000 – 60,000,000
  • Incandescent lamp: 20,000,000 – 26,000,000
  • Compact fluorescent lamp: 20,000 – 70,000
  • Fluorescent lamp: 5,000 – 30,000
  • Sunlit clouds: 10,000
  • Candle: 7,500
  • Blue sky: 5,000
  • Preferred values for indoor lighting: 50 – 500
  • White paper in sun: 10,000
  • White paper in 500 lux illumination (typical office lighting): 100
  • White paper in 5 lux illumination (very dim lighting, similar to candle-light): 1

The next issue, range limits and quantization, refers to the “captured image” stage of the imaging pipeline. A common misconception is that the problem involves squeezing the entire range of intensities which the human visual system can handle, from starlight at 10^-6 cd/m2 to a flashbulb at 10^8 cd/m2, into the 1-100 cd/m2 range of a typical display. The fact is that the 10^-6 – 10^8 cd/m2 range is only obtainable with isolated stimuli – humans can’t perceive a range like that in a single image. Another common misconception is to think of precision and range as being linked; e.g. that 8-bit framebuffers imply a 1:255 contrast. Prof. Rizzi used a “salami” metaphor – the size of the salami represents the dynamic range, and the number of slices represents the quantization. Range and precision are orthogonal.
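The orthogonality of range and precision can be made concrete with a toy encoding (hypothetical, for illustration only): the same 8 bits that cover a 255:1 range linearly can cover a 10^8:1 range logarithmically – the salami gets much longer, while the number of slices stays at 256.

```python
import math

def log_encode(y, y_min=1e-4, y_max=1e4, bits=8):
    """Quantize luminance y into 2^bits log-spaced codes spanning a
    10^8:1 range (illustrative parameters, not a real standard)."""
    levels = 2 ** bits - 1
    span = math.log10(y_max) - math.log10(y_min)
    t = (math.log10(y) - math.log10(y_min)) / span
    return round(max(0.0, min(1.0, t)) * levels)

def log_decode(code, y_min=1e-4, y_max=1e4, bits=8):
    levels = 2 ** bits - 1
    span = math.log10(y_max) - math.log10(y_min)
    return 10 ** (math.log10(y_min) + (code / levels) * span)
```

The price is precision: adjacent codes differ by a factor of 10^(8/255), about 7.5% relative steps – coarse slices, but of an enormously long salami.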

In most cases, the scene has a larger dynamic range than the sensor does. So with non-HDR image acquisition you have to give up some dynamic range in the highlights, the shadows, or both. The “HDR idea” is to bracket multiple acquisitions with different exposures to obtain an HDR image, and then “shrink” during tone mapping. But how? Tone mapping can be general, or can take account of a specific rendering intent. Naively “squeezing” all the detail into the final image leads to the kind of unnatural “black velvet painting”-looking “HDR” images commonly found on the web.

As an example, the response of film emulsions to light can be mapped via a density-exposure curve, commonly called a Hurter-Driffield or “H&D” curve. These curves map negative density vs. log exposure. They typically show an s-shape with a straight-line section in the middle where density is proportional to log exposure, with a “toe” on the underexposed part and a “shoulder” on the overexposed part. In photography, exposure time should be adjusted so densities lie on the straight-line portion of the curve. With a single exposure, this is not possible for the entire scene – you can’t get both shadow detail and highlight detail, so in practice only midtones are captured with full detail.
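The shape of an H&D curve can be modeled with a tanh standing in for the real photochemical response – a toy sketch with made-up parameter values, just to show the toe, straight-line section, and shoulder:

```python
import math

def film_density(log_exposure, d_min=0.1, d_max=3.0, gamma=0.7):
    """Toy H&D curve: negative density vs. log exposure. The tanh gives
    a roughly straight middle section (density proportional to log
    exposure) rolling off into a toe (underexposure) and a shoulder
    (overexposure)."""
    half = (d_max - d_min) / 2.0
    return d_min + half * (1.0 + math.tanh(gamma * log_exposure))
```

Only exposures landing on the middle section are recorded with full tonal separation; values pushed into the toe or shoulder are compressed toward d_min or d_max, which is why a single exposure cannot hold both deep shadow and bright highlight detail.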

History of HDR Imaging

Before the Chiaroscuro technique was introduced, it was hard to convey brightness in painting. Chiaroscuro (the use of strong contrasts between bright and dark regions) allowed artists to convey the impression of very high scene dynamic ranges despite the very low dynamic range of the actual paintings.

HDR photography dates back to the 1850s; a notable example is the photograph “Fading Away” by H. P. Robinson, which combined five exposures. In the early 20th century, C. E. K. Mees (director of research at Kodak) worked on implementing a desirable tone reproduction curve in film. Mees showed a two-negative photograph in his 1920 book as an example of desirable scene reproduction, and worked to achieve similar results with single-negative prints. Under Mees’ direction, the Kodak Research Laboratory found that an s-shaped curve produced pleasing image reproductions, and implemented it photochemically.

Ansel Adams developed the zone system around 1940 to codify a method for photographers to expose their images in such a way as to take maximum advantage of the negative and print film tone reproduction curves. Soon after, in 1941, L. A. Jones and H. R. Condit published an important study measuring the dynamic range of various real-world scenes. The range was between 27:1 and 750:1, with 160:1 being average. They also found that flare is a more important limit on camera dynamic range than the film response.

The Retinex theory of vision developed around 1967 from the observation that luminance ratios between adjacent patches are the same in the sun and the shade. While absolute luminances don’t always correspond to lightness appearance (due to spatial factors), the ratio of luminances at an edge does correspond strongly to the ratio in lightness appearance. Retinex processing starts with ratios of apparent lightness at all edges in the image and propagates these to find a global solution for the apparent lightness of all the pixels in the image. In the 1980s this research led to a prototype “Retinex camera” which was actually a slide developing device. Full-resolution digital processing was not feasible, so a low-resolution (64×64) CCD was used to generate a “correction mask” which modulated a low-contrast photographic negative during development. This produced a final rendering of the image which was consistent with visual appearance. The intent was to incorporate this research in a Polaroid instant camera, but this product never saw the light of day.
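The ratio-propagation idea can be illustrated in one dimension – a toy sketch of one Retinex variant, in which near-unity ratios (slow gradients, presumed to be illumination) are reset to 1 so that only edge ratios accumulate into the lightness estimate:

```python
import math

def retinex_1d(luminance, threshold=0.02):
    """Toy 1-D Retinex: accumulate luminance ratios along a path,
    ignoring gradual changes so only edges affect lightness."""
    lightness = [1.0]
    for prev, cur in zip(luminance, luminance[1:]):
        ratio = cur / prev
        if abs(math.log(ratio)) < threshold:
            ratio = 1.0  # treat slow gradient as illumination; discard
        lightness.append(lightness[-1] * ratio)
    return lightness
```

A 1% drift between neighbors is discarded, while a 2× jump at an edge is kept in full – which is how the same surface can be assigned the same lightness in sun and shade.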

Measuring the Dynamic Range

The sensor’s dynamic range is limited but slowly getting better – Prof. Rizzi briefly went over some recent research into HDR sensor architectures.

Given limited digital sensor dynamic range, multiple exposures are needed to capture an HDR image. This can be done via sequential exposure change, or by using multiple image detectors at once.

There have been various methods developed for composing the exposures. Before Paul Debevec’s 1997 paper “Recovering High Dynamic Range Radiance Maps from Photographs”, the emphasis was on generating pleasing pictures. From 1997 on, research focused primarily on accurately measuring scene radiance values. Combined with recent work on HDR displays, this holds the potential of accurate scene reproduction.

However, veiling glare is a physical limit on HDR image acquisition and display. At acquisition time, glare is composed of various scattered light in the camera – air-glass reflections at the various lens elements, camera wall reflections, sensor surface reflections, etc. The effect of glare on the lighter regions of the image is small, but darker regions are affected much more strongly, which limits the overall contrast (dynamic range).

Prof. Rizzi described an experiment which measured the degree to which glare limits HDR acquisition, for both digital and film cameras. A test target was assembled out of Kodak Print Scale step-wedges (circles divided into 10 wedges which transmit different amounts of light, ranging from 4% to 82%) and neutral density filters to create a test target with almost 19,000:1 dynamic range. This target was photographed against different surrounds to vary the amount of glare.

In moderate-glare scenes, glare reduced the dynamic range at the sensor or film image plane to less than 1,000:1; in high-glare scenes, to less than 100:1. This limited the range that could be measured via multiple digital exposures (negative film has more dynamic range – about 10,000:1 – than the camera glare limit, so in the case of film multiple exposures were pointless).

While camera glare limits the amount of scene dynamic range that can be captured, glare in the eye limits the amount of display dynamic range which is useful to have.

Experiments were also done with observers estimating the brightness of the various sectors on the test target. There was a high degree of agreement between the observers. The perceived brightness was strongly affected by spatial factors; the brightness differences between the segments of each circle were perceived to be very large, and the differences between the individual circles were perceived to be very small. Prof. Rizzi claimed that a global tone scale cannot correctly render appearance, since spatial factors predominate.

Spatial factors also required designing a new target, so that glare could be separated from neural contrast effects. For this target, both single-layer and double-layer projected transparencies were used, allowing them to vary the dynamic range from about 500:1 to about 250,000:1 while keeping glare and surround constant.

For low-glare images (average luminance = 8% of maximum luminance), the observers could detect appearance changes over a dynamic range of a little under 1000:1. For high-glare images (average luminance = 50% max luminance), this decreased to about 200:1. Two extreme cases were also tested: with a white surround (extreme glare) the usable dynamic range was about 100:1 and with black surround (almost no glare at all) it increased to 100,000:1. The black surround case (which is not representative of the vast majority of real images) was the only one in which the high-dynamic range image had a significant advantage, and even there the visible difference only affected the shadow region – the bottom 30% of perceived brightnesses. These results indicate that dramatically increasing display dynamic range has minor effects on the perceived image; glare inside the eye limits the effect.

Separating Glare and Contrast

Glare inside the eye reduces the contrast of the image on the retina, but neural contrast increases the contrast of the visual signal going to the brain. These two effects tend to act in opposition (for example, brightening the surround of an image will increase both effects), but they vary differently with distance and do not cancel out exactly.

It is possible to estimate the retinal image based on the CIE Glare Spread Function (GSF). When doing so for the images in the experiment above, the high-glare target (where observers could identify changes over a dynamic range of 200:1) formed an image on the retina with a dynamic range of about 100:1. With white surround (usable dynamic range of 100:1) the retinal image had a dynamic range of about 25:1 and with black surround (usable dynamic range of 100,000:1) the retinal image had a dynamic range of about 3000:1. It seems that neural contrast partially compensates for the intra-ocular glare; both effects are scene dependent.

Scene Content Controls Appearance

The appearance of a pixel cannot be predicted from its intensity values – no global tone mapping operator can mimic human vision. An image dependent, local operator is needed. The human visual system performs local range compression. It is important to choose a rendering intent – reproduce the original scene radiances, scene reflectances, scene appearance, a pleasing image, etc. If the desire is to predict appearance then Retinex processing does a pretty good job in many cases.

Color in HDR

Two different data sets can be used to describe color: CMF (color matching functions – low-level sensor data) or UCS (uniform color space – high-level perceptual information).

CMF are used for color matching and metamerism preservation. They are linear transforms of cone sensitivities modified by pre-retinal absorptions. They have no spatial information, and cannot predict appearance.

UCS – for example CIEL*a*b*. Lightness (L*) is a cube root of luminance, which compresses the visible range. 99% of possible perceived lightness values fall in a 1000:1 region of scene dynamic range. This fits well with visual limitations caused by glare.
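The cube-root compression of L* and the 1000:1 claim can be checked directly with the standard CIE 1976 lightness formula:

```python
def cie_lightness(Y, Yn=100.0):
    """CIE 1976 lightness L* for luminance Y relative to white Yn."""
    ratio = Y / Yn
    if ratio > 0.008856:
        return 116.0 * ratio ** (1.0 / 3.0) - 16.0  # cube-root branch
    return 903.3 * ratio  # linear branch near black
```

For Y/Yn = 1 this gives L* = 100, while Y/Yn = 0.001 falls on the linear branch and gives L* of about 0.9 – so a 1000:1 luminance range already spans over 99% of the 0-100 lightness scale.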

There are some discrepancies between data from appearance experiments with observers and measurements of retinal cone response.

First discrepancy: the peaks of the color-matching functions do not line up with the peaks of the cone sensitivity functions. This is addressed by including pre-retinal absorptions, which shift peak sensitivities to longer wavelengths.

Second discrepancy: retinal cones have a logarithmic response to light, but observers report a cube-root response. This is addressed by taking account of intra-ocular glare; it turns out that due to glare, a cube-root variation in light entering the eye turns into a logarithmic variation in light at the retina.

HDR Image Processing

Around 2002-2006, Robert Sobol developed a variant of Retinex which was implemented in a (discontinued) line of Hewlett-Packard cameras; the feature was marketed as “Digital Flash”. This produced very good results and could even predict certain features of well-known perceptual illusions such as “Adelson’s Checkerboard and Tower”, which were commonly thought to be evidence of cognitive effects in lightness perception.

ACE (Automatic Color Equalization) (which Prof. Rizzi worked on) and STRESS (Spatio-Temporal Retinex-inspired Envelope with Stochastic Sampling) are other examples of spatially-aware HDR image processing algorithms. Several examples were shown to demonstrate that spatially-aware (local) algorithms produce superior results to global tone mapping operators.

Prof. Rizzi described an experiment made with a “3D Mondrian” model – a physical scene with differently colored blocks, under different illumination conditions. Various HDR processing algorithms were run on captured images of the scene, and compared with observers’ estimations of the colors as well as a painter’s rendition (attempting to reproduce the perceptual appearance as closely as possible). The results were interesting – appearance does not appear to correlate specifically to reflectance vs. illumination, but rather to edges vs. gradients. The results appeared to support the goals of Retinex and similar algorithms.

Prof. Rizzi finished the course with some “take home” points:

  • HDR works well, because it preserves image information, not because it is more accurate (accurate reproduction of scene luminances is not possible in the general case).
  • Dynamic range acquisition is limited by glare, which cannot be removed.
  • Our vision system is also limited by glare, which is counteracted to some degree by neural contrast.
  • Accurate reproduction of scene radiance is not needed; reproduction of appearance is important and possible without reproducing the original stimulus.
  • Appearances are scene-dependent, not pixel-based.
  • Edges and gradients generate HDR appearance and color constancy.