Glare in Computer Vision


Observing Photographs

Whenever you see a beautiful reproduction of a high dynamic range scene, be very careful about using it for computational imaging. Images that are accurate, linear reproductions of natural scene radiances look flat and desaturated. Beautiful images are nonlinear transformations of scene radiance. They contain radiance errors that improve the appearance of the reproduction. Using this transformed data in calculations can introduce systematic errors in output values. Known linear input data is required for many calculations in computer vision.

As well, we must be careful to avoid confusing physical and psychophysical metrics. We cannot look at a display of an image and infer anything about radiance, a physical quantity. We have to use a meter to measure the radiances coming from a display. In our thinking, we have to separate the signal processing incorporated in the display device from the subsequent human spatial image processing we use when looking at the display. Any standard camera image in computer memory is buried in the middle of the imaging chain! Sending that digital image to a display or printer for visual analysis is appropriate for photography, but that nonlinearly transformed digital data is not appropriate for computer vision.

Computer Vision

In most machine vision pipelines, light is directed through a lens system to a sensor pixel grid for sampling. Once the light is captured, various levels of signal and image processing take place, such as demosaicing, white balancing, noise reduction, exposure correction, tone mapping, chroma enhancements, etc.[1,2]

Once the image is formed, higher-level statistical analysis of visual information takes place in order to build meaningful hierarchical representations of the world. This analysis starts from simple edge filters, gradually moves through spatial pooling to more complicated features, and leads to the encoding of surfaces, the detection and recognition of objects and, eventually, to a course of action based on the objectives of the system.

Optical limitations and nonlinear transformations of the sensor's quanta catch affect the performance of all subsequent stages of this complicated pipeline. [SEE REVIEW: Standard sRGB vs. RAW format]

In earlier research, significant effort went into the control of scene lighting. Today, artificial vision systems (smartphones, outdoor security cameras, robots/drones, driver monitoring systems, etc.) operate outdoors in uncontrolled illumination.

Glare in camera optics is a major issue in some applications.  Every optical system scatters a very small amount of light from each pixel onto every other pixel.  Glare is the sum of all the small contributions from all the other pixels. As such, captured images are not an accurate reproduction of the scene, but rather a combination of both scene content and glare.
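To make this model concrete, the following sketch simulates veiling glare by convolving a linear image with a normalized glare spread function. The power-law kernel shape and its parameters are illustrative assumptions, not a measured lens model.

    import numpy as np
    from scipy.signal import fftconvolve

    def glare_spread_function(radius, falloff=2.5):
        # Radially symmetric kernel: a sharp core plus a wide power-law
        # skirt, a simple first-order stand-in for lens scatter (assumed).
        y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1].astype(float)
        r = np.hypot(x, y)
        psf = (1.0 + r) ** -falloff
        return psf / psf.sum()  # normalize so total energy is conserved

    def add_veiling_glare(linear_image, gsf):
        # Every pixel scatters a small fraction of its light onto every
        # other pixel: the captured image is the scene spread by the GSF,
        # so dark regions gain scene-dependent light from bright ones.
        return fftconvolve(linear_image, gsf, mode="same")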

Glare transforms the image. It changes the darker image segments much more than the lighter ones. Locally, the veil of glare distorts the gradient amplitudes of the image, making it more difficult to differentiate noise from meaningful gradient orientations. Gradient-based keypoint detectors, such as SIFT, may be affected, since the extracted keypoints become unstable. The roughness of textures is reduced and colors are washed out. Consequently, the detection thresholds and recognition accuracy of systems trained under “ideal” imaging conditions will be affected.
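A minimal numerical illustration, with assumed values, of why dark segments suffer most: adding even a constant veil collapses the local contrast ratios that normalized detectors rely on.

    import numpy as np

    rng = np.random.default_rng(0)
    dark_texture = 0.010 + 0.005 * rng.random((64, 64))  # low-radiance texture

    glare = 0.02  # veiling glare over this region, in scene radiance units
    veiled = dark_texture + glare

    def michelson(img):
        # Michelson contrast: modulation depth relative to the mean level.
        return (img.max() - img.min()) / (img.max() + img.min())

    print(michelson(dark_texture))  # ~0.20
    print(michelson(veiled))        # ~0.08: local contrast collapses
    # Real glare is scene-dependent rather than constant, so it also
    # perturbs gradient magnitudes and orientations, destabilizing
    # keypoint detectors.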

Glare’s effects vary with:

  1. the relative intensity of the image segment,

  2. the relative intensity of the local segments around the image segment,

  3. the relative intensity population of the entire scene,

  4. the dynamic range of the scene.


HDR images are not rare nowadays. Outdoor security cameras in ATMs have to deal with strong backlight illumination; the captured face images will be compromised, limiting face recognition algorithms. Traffic-monitoring cameras capture roads under constantly varying illumination conditions; glare may be a limiting factor in license plate recognition or the identification of specific cars. Computer vision applications implemented in smartphones may be compromised as well. Outdoor consumer photography (such as mobile-phone panoramas or drone imagery) is also subject to the impact of glare.

  [1] J. McCann and V. Vonikakis, "Accurate Information vs. Looks Good: Scientific vs. Preferred Rendering", Proc. IS&T Color in Graphics, Imaging, and Vision (CGIV), Amsterdam, 6, 231-238 (2012). <http://mccannimaging.com/Retinex/Talks_files/12CGIVf.pdf>

  [2] J. McCann, V. Vonikakis, C. Bonanomi and A. Rizzi, "Chromaticity limits in color constancy calculations", Proc. IS&T/SID Color Imaging Conference, 21, 52-60 (2013). <http://mccannimaging.com/Retinex/Talks_files/13CIC21.pdf>


HDR Engineering

To date, a considerable body of engineering work has dealt with compensating for the range of light in natural images (HDR capture and reproduction) [3-5]. The computational removal of rain, fog and snow is also a popular topic [6-12]. Finally, illumination-invariant feature detectors and descriptors have attempted to compensate for uneven illumination [13,14].

Surprisingly, there are very few works addressing the computational removal of glare [15,16]. All of these algorithms generate images for human inspection. They study photography, which is fine, but photography lacks the precision needed for computer vision.

The considerable success of HDR photography is based on the fact that cameras capture inaccurate scene radiances that are then manipulated to make better spatial renditions of scene information. That rendition on a display is subsequently transformed by human vision. Both physical and physiological systems share major roles in making beautiful HDR reproductions. [SEE REVIEW: Glare in Reproduction]

  [3] P. E. Debevec and J. Malik, "Recovering high dynamic range radiance maps from photographs", Proc. 24th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '97) (1997).

  [4] P. Sen and C. Aguerrebere, "Practical High Dynamic Range Imaging of Everyday Scenes: Photographing the world as we see it with our own eyes", IEEE Signal Processing Magazine, vol. 33, no. 5, pp. 36-44, Sept. 2016.

  [5] J. Fernández-Berni, R. Carmona-Galán and Á. Rodríguez-Vázquez, "Single-Exposure HDR Technique Based on Tunable Balance Between Local and Global Adaptation", IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 63, no. 5, pp. 488-492, May 2016.

  [6] B. Cai, X. Xu, K. Jia, C. Qing and D. Tao, "DehazeNet: An End-to-End System for Single Image Haze Removal", IEEE Transactions on Image Processing, vol. 25, no. 11, pp. 5187-5198, Nov. 2016.

  [7] Z. Li and J. Zheng, "Edge-Preserving Decomposition-Based Single Image Haze Removal", IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 5432-5441, Dec. 2015.

  [8] D. K. Jha, B. Gupta and S. S. Lamba, "l2-norm-based prior for haze-removal from single image", IET Computer Vision, vol. 10, no. 5, pp. 331-341, Aug. 2016.

  [9] K. He, J. Sun and X. Tang, "Single Image Haze Removal Using Dark Channel Prior", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2341-2353, Dec. 2011.

  [10] J. H. Kim, J. Y. Sim and C. S. Kim, "Video Deraining and Desnowing Using Temporal Correlation and Low-Rank Matrix Completion", IEEE Transactions on Image Processing, vol. 24, no. 9, pp. 2658-2670, Sept. 2015.

  [11] V. Toka, N. H. Sankaramurthy, R. P. M. Kini, P. K. Avanigadda and S. Kar, "A fast method of fog and haze removal", Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, pp. 1224-1228 (2016).

  [12] X. Zhao, P. Liu, J. Liu and X. Tang, "Removal of dynamic weather conditions based on variable time window", IET Computer Vision, vol. 7, no. 4, pp. 219-226, Aug. 2013.

  [13] V. Vonikakis, D. Chrysostomou, R. Kouskouridas and A. Gasteratos, "A biologically inspired scale-space for illumination invariant feature detection", Measurement Science and Technology, 24(7), 074024 (2013).

  [14] F. Sur, "Illumination-invariant representation for natural colour images through SIFT matching", Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, BC, pp. 1962-1966 (2013).

  [15] E.-V. Talvala, A. Adams, M. Horowitz and M. Levoy, "Veiling glare in high dynamic range imaging", ACM Transactions on Graphics, 26, 3, Article 37, July 2007.

  [16] R. Raskar, A. Agrawal, C. Wilson and A. Veeraraghavan, "Glare Aware Photography: 4D Ray Sampling for Reducing Glare Effects of Camera Lenses", ACM Transactions on Graphics, vol. 27, no. 3, August 2008.


Transformations that Change Spatial Content

Optical limits and firmware image processing [1,2] modify scene radiance information, thus altering the scene's spatial relationships in the captured image:

  1. High-end professional digital cameras modify the output data to reduce the effects of optical vignetting and other optical artifacts. This approach uses specific camera and lens metadata to reduce known artifacts.

  2. The light on the camera image plane is scene radiance plus scene-dependent veiling glare [17-19]. Although veiling glare removal would be highly desirable, it is not possible [19]. [SEE REVIEW: Camera Glare Spread Function]

  3. Spectral information is not accurately recorded in standard digital color separation data. Improvements in color photography from the use of color enhancement are found in most color film reproduction processes. The technique was introduced by Albert [20] in 1889.

Calibration of scene radiance vs. camera response is required for scientific use of color digital camera output. Departures from accurate scene rendition are well known to camera engineers [21-28]. However, there are many computer experiments in which the actual magnitude of the radiance errors is larger than expected. These departures from accurate scene radiances can have significant adverse effects on quantitative image processing algorithms.
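As a hedged illustration of such a calibration, one can photograph a reflectance step chart, meter the patch luminances, and fit the camera digits against them; all numbers below are invented for illustration.

    import numpy as np

    # Hypothetical telephotometer readings (cd/m^2) for five gray patches,
    # and the corresponding mean RAW camera digits for the same patches.
    luminance = np.array([1.2, 4.8, 19.5, 78.0, 312.0])
    raw_digits = np.array([110.0, 470.0, 1905.0, 7610.0, 30400.0])

    slope, intercept = np.polyfit(luminance, raw_digits, 1)
    # A good linear fit with a near-zero intercept supports using RAW
    # digits as relative scene radiance; a large intercept or curvature
    # signals offsets (black level, glare) or nonlinearities that must
    # be corrected before quantitative use.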

  [17] L. A. Jones and H. Condit, "The Brightness Scale of Exterior Scenes and the Computation of Correct Photographic Exposure", J. Opt. Soc. Am., 31, 651-678 (1941).

  [18] J. J. McCann and A. Rizzi, "Camera and visual veiling glare in HDR images", J. Soc. Information Display, 15(9) (2007). <http://www.mccannimaging.com/Lightness/HDR_Papers_files/07HDR2Exp.pdf>

  [19] ISO 9358:1994, Optics and Optical Instruments: Veiling Glare of Image Forming Systems. Definitions and Methods of Measurement, International Organization for Standardization, Geneva (1994).

  [20] E. Albert, German Patent 101379 (1889).

  [21] J. Adams, K. Parulski and K. Spaulding, "Color processing in digital cameras", IEEE Micro, 18:6 (1998).

  [22] <http://www.color.org/index.xalter>

  [23] K. E. Spaulding, G. J. Woolfe and E. J. Giorgianni, "Image States and Standard Color Encodings (RIMM/ROMM RGB)", Proc. IS&T Color Imaging Conference, 8, 288-294 (2000).

  [24] G. J. Woolfe, K. E. Spaulding and E. J. Giorgianni, "Hue Preservation in Rendering Operations: An Evaluation of RGB Color Encodings", Proc. IS&T Color Imaging Conference, 10, 317-324 (2002).

  [25] J. Holm, I. Tastl, L. Hanlon and P. Hubel, "Color processing for digital photography", in Colour Engineering: Achieving Device Independent Colour, P. Green and L. MacDonald, eds., Wiley (2002).

  [26] ISO 22028-1:2004, "Photography and graphic technology: Extended colour encodings for digital image storage, manipulation and interchange; Part 1: Architecture and requirements", Annex C, Section C12, p. 45 (2004).

  [27] J. Holm, "Color processing for digital cameras", SPIE Electronic Imaging Newsletter, 14.1, January (2004).

  [28] R. Ramanath, W. E. Snyder, Y. Yoo and M. S. Drew, "Color image processing pipeline", IEEE Signal Processing Magazine, 22, Issue 1 (2005).


Accurate Image Data

Accurate radiance information (linear with a real zero) is necessary for computational imaging techniques that use camera digits as scene data, e.g. arithmetic operations such as calculating average values, estimating the reflectance of objects in a scene, or estimating the illumination falling on the scene.

Camera Response Function (CRF) calibrations and LibRAW data extractions can remove nonlinearities introduced by camera signal processing pipelines. We used LibRAW to get as close as possible to the sensor's quanta catch data. [SEE REVIEW: Standard sRGB vs. RAW format] However, CRF and LibRAW cannot remove glare. Accurate scene information from the multiple-exposure technique cannot extend camera ranges limited by veiling glare. The captured range depends on exposure time, camera design and, most important, scene content. If glare and other factors severely limit camera accuracy in a single exposure, then multiple exposures cannot extend the camera's range.
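For example, with the rawpy Python bindings to LibRAW, one can request a linear, non-auto-brightened rendering that stays close to the sensor's quanta catch; the file name is a placeholder.

    import rawpy

    with rawpy.imread("capture.dng") as raw:  # placeholder file name
        linear = raw.postprocess(
            gamma=(1, 1),         # linear output: no tone curve applied
            no_auto_bright=True,  # disable automatic exposure scaling
            use_camera_wb=True,   # fixed, camera-reported white balance
            output_bps=16,        # 16-bit output to preserve precision
        )
    # 'linear' approximates scene radiance up to a scale factor, but any
    # veiling glare captured by the lens is still present in the data.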

This is a curious but important result. Glare is a small error in radiance estimation. If it were a random error, it would not be a problem. However, it is a systematic over-estimate of scene radiance for darker image segments. Further, it varies with scene content. Glare adds variable amounts of light to low-light image segments. The absolute value of the error is small, but it is added to small radiance values, so it is a large fraction of those values. The reflectance calculation for a dark sample is the ratio of a small number, with a large error, divided by a large number. The ratio process in finding reflectance effectively amplifies the error.
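The following worked numbers (illustrative, not measured) show the amplification: a small absolute glare error becomes a large relative error for a dark sample.

    white_radiance = 100.0  # radiance of a 100% reflectance white reference
    dark_radiance = 1.0     # radiance of a 1% reflectance dark sample
    glare = 0.5             # small scene-dependent glare added to both

    true_reflectance = dark_radiance / white_radiance              # 0.0100
    measured = (dark_radiance + glare) / (white_radiance + glare)  # ~0.0149
    error = (measured - true_reflectance) / true_reflectance       # ~0.49
    # A 0.5% absolute radiance error becomes a ~49% error in the estimated
    # reflectance of the dark sample; the white reference barely moves.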

The problem is even more severe in chromaticity. [SEE REVIEW: Standard sRGB vs. RAW format] Chromaticity is the ratio of one camera response to the sum of the three responses. In standard photography, with S-shaped tone-scale responses, the camera chromaticities of a single object vary with camera exposure. This is the result of camera nonlinearities [1,2]. Chromaticities need to be calculated using linear data. However, even in LibRAW images, glare adds unwanted response to dark image segments. That means highly chromatic segments will have accurate high digital values in one or more channels, and inflated low digital values in the others. Again, reproductions will look fine, but calculated chromaticity values have scene-dependent errors. These errors are not problematic in reproduction, but can be of critical importance in image calculations.
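A short sketch, with assumed linear camera responses, shows how glare desaturates computed chromaticities by inflating the small channels.

    import numpy as np

    def rg_chromaticity(rgb):
        # Each channel divided by the sum of the three responses.
        total = rgb.sum()
        return rgb[0] / total, rgb[1] / total

    saturated_red = np.array([900.0, 20.0, 15.0])  # assumed linear responses
    glare = 40.0                                   # glare in all channels

    print(rg_chromaticity(saturated_red))          # (~0.963, ~0.021)
    print(rg_chromaticity(saturated_red + glare))  # (~0.891, ~0.057)
    # The large R response barely changes in relative terms, while the
    # small G and B responses are inflated, shifting the computed
    # chromaticity toward neutral.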

Inverse Camera Response Functions

A physics-based approach to HDR imaging finds the camera's CRF and applies its inverse function to estimate radiance. The fundamental assumption of these algorithms is that cameras respond to scene radiances without any spatial distortions caused by scene content.
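A minimal sketch of the idea, using the inverse sRGB encoding as a stand-in CRF (real camera response functions are device-specific and must be measured):

    import numpy as np

    def inverse_srgb(encoded):
        # Invert the standard sRGB encoding to recover relative linear values.
        v = np.asarray(encoded, dtype=float)
        return np.where(v <= 0.04045, v / 12.92, ((v + 0.055) / 1.055) ** 2.4)

    linear = inverse_srgb(np.array([0.2, 0.5, 0.9]))
    # This linearization is exact only if the camera applied exactly this
    # curve everywhere; it cannot undo scene-dependent, spatially varying
    # effects such as veiling glare.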

Xiong et al. [29] and Kim et al. [30] describe techniques for converting standard images to “estimated RAW” for further processing. The common thread is that these papers attempt to remove the camera response function from the digital data to estimate scene radiances. They propose that these estimates of the original scene could be used in subsequent processing, such as improving the color balance of an image. The authors do not consider glare and scene content. They report a population of “outliers”, namely pixels whose scene radiances are not recoverable. This technique can be applied to improving photos for visual inspection (e.g., white balance and color profiles), as long as the outliers are not particularly visible. Their algorithms can make pictures that look better, but there is no analysis of the accuracy of the “estimated RAW” data. If recovering accurate arrays of scene data for calculations is necessary, this technique is not adequate.

In our work with images, we must honestly confront the observation that cameras distort scene radiances. We need logical arguments that do not make false assumptions about camera performance, and analytical tools that do not make incorrect assumptions. Clearly, a great many computer vision algorithms are indifferent to glare and nonlinear image transforms. However, we need to study the effect of camera-based nonlinear spatial transforms of scene data on the sequence of processes in computer vision.

  [29] Y. Xiong, K. Saenko, T. Darrell and T. Zickler, "From Pixels to Physics: Probabilistic Color De-rendering", Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012).

  [30] S. J. Kim, H. T. Lin, Z. Lu, S. Süsstrunk, S. Lin and M. S. Brown, "A New In-Camera Imaging Model for Color Computer Vision and its Application", IEEE Transactions on Pattern Analysis and Machine Intelligence, 34, 2289-2302 (2012).


We encourage researchers to submit contributions related to computer vision and glare. These include, but are not limited to: techniques, systems, models, deep-learning networks, and computational optimization approaches for estimating and reducing the effects of glare in modern imaging and computer vision systems. Also welcome are comparative studies that quantitatively demonstrate the impact of glare in computer vision and compressive imaging systems.