Recognition of light sources on environment maps
This article presents a Python implementation of the algorithm for recognizing light sources on environment maps (LDR or HDR) using an equirectangular projection. However, after making minor changes, it can also be used with simple background images or cubic maps. Examples of the possible application of the algorithm: ray tracing programs in which it is necessary to recognize primary light sources in order to emit rays from them; in rasterized renderers, it can be used to cast shadows using an environment map; in addition, the algorithm can also be used in spotlight elimination programs, for example in AR.
The algorithm consists of the following steps:
- Decrease in resolution of the original image, for example, to 1024.
- Convert the image to brightness (luminance), if necessary, with image blur.
- Application of the quasi-Monte Carlo method.
- Transformation from spherical coordinates to equidistant coordinates.
- Filtering samples based on the brightness of a neighbor.
- Sort samples based on their brightness.
- Filtering samples based on the Euclidean metric.
- Merging samples using the Bresenham algorithm.
- Calculation of the position of the lighting cluster based on its brightness.
There are many algorithms for reducing the resolution of images. Bilinear filtering is the fastest or easiest to implement, and besides, it is best suited in most cases. To convert brightness in both LDR and HDR images, you can use the standard formula:
lum = img[:, :, 0] * 0.2126 + img[:, :, 1] * 0.7152 + img[:, :, 2] * 0.0722
Additionally, you can apply a slight blur to the brightness image, for example, 1-2 pixels for an image with a resolution of 1024, to eliminate all high-frequency details (in particular, caused by a decrease in resolution).
The most common projection in environment maps is equidistant projection 3 . My algorithm can work with other projections, for example, with panoramic and with cubic maps, however, in the article we will consider only an equally spaced projection. First you need to normalize the image coordinates:
pos = x / width pos = y / height
Then we need to convert from and to Cartesian coordinates using spherical coordinates, i.e. θ and φ, where θ = x * 2π, and φ = y * π.
def sphereToEquirectangular(pos): invAngles = (0.1591, 0.3183) xy = (math.atan2(pos, pos), math.asin(pos)) xy = (xy * invAngles, xy * invAngles) return (xy + 0.5, xy + 0.5) def equirectangularToSphere(pos): angles = (1 / 0.1591, 1 / 0.3183) thetaPhi = (pos - 0.5, pos - 0.5) thetaPhi = (thetaPhi * angles, thetaPhi * angles) length = math.cos(thetaPhi) return (math.cos(thetaPhi) * length, math.sin(thetaPhi) * length, math.sin(thetaPhi))
The next step will be to apply the quasi-Monte Carlo method, for example, Hammersley 2 sampling :
You can use other sampling methods, such as Holton 4 , but Hammersley is faster and provides a good distribution of samples across the sphere. Holton would be a good choice for plane samples if a simple image is used instead of the environment map. A mandatory requirement for Hammersley sampling is the inversion of the roots (row) of van der Corpute, for more details see links 2 . Here is its quick implementation:
def vdcSequence(bits): bits = (bits << 16) | (bits >> 16) bits = ((bits & 0x55555555) << 1) | ((bits & 0xAAAAAAAA) >> 1) bits = ((bits & 0x33333333) << 2) | ((bits & 0xCCCCCCCC) >> 2) bits = ((bits & 0x0F0F0F0F) << 4) | ((bits & 0xF0F0F0F0) >> 4) bits = ((bits & 0x00FF00FF) << 8) | ((bits & 0xFF00FF00) >> 8) return float(bits) * 2.3283064365386963e-10 # / 0x100000000 def hammersleySequence(i, N): return (float(i) / float(N), vdcSequence(i))
Then we use uniform overlay on the sphere:
def sphereSample(u, v): PI = 3.14159265358979 phi = v * 2.0 * PI cosTheta = 2.0 * u - 1.0 # map to -1,1 sinTheta = math.sqrt(1.0 - cosTheta * cosTheta); return (math.cos(phi) * sinTheta, math.sin(phi) * sinTheta, cosTheta)
To sample Hammersley, we use a fixed number of samples, depending on the resolution of the image, and convert from spherical coordinates to Cartesian, and then to equidistant:
samplesMultiplier = 0.006 samples = int(samplesMultiplier * width * height) samplesList =  # apply hammersley sampling for i in range(0, samples): xi = hammersleySequence(i, samples) xyz = sphereSample(xi, xi) # to cartesian imagePos = sphereToEquirectangular(xyz) luminance = lum[imagePos * width, imagePos * height]
This will give us a good distribution of samples that will be checked for the presence of light sources:
Filtering light sources
In the first pass of filtering, we ignore all samples that do not exceed the brightness threshold (for HDR cards, it may be higher), and then sort all the samples by their brightness:
localSize = int(float(12) * (width / 1024.0)) + 1 samplesList =  # apply hammersley sampling for i in range(0, samples): xi = hammersleySequence(i, samples) xyz = sphereSample(xi, xi) # to cartesian imagePos = sphereToEquirectangular(xyz) luminance = lum[imagePos  * width, imagePos  * height] sample = Sample(luminance, imagePos , xyz) luminanceThreshold = 0.8 #do a neighbour search for the maximum luminance nLum = computeNeighborLuminance(lum, width, height, sample.imagePos, localSize) if nLum > luminanceThreshold: samplesList.append(sample) samplesList = sorted(samplesList, key=lambda obj: obj.luminance, reverse=True)
The next pass will filter based on the Euclidean metric and the threshold distance between pixels (depending on the resolution of the image) - this is a spatial data structure that can be used to get rid of complexity O (N 2 ):
euclideanThreshold = int(float(euclideanThresholdPixel) * (width / 2048.0)) # filter based euclidian distance filteredCount = len(samplesList) localIndices = np.empty(filteredCount); localIndices.fill(-1) for i in range(0, filteredCount): cpos = samplesList[i].pos if localIndices[i] == -1: localIndices[i] = i for j in range(0, filteredCount): if i != j and localIndices[j] == -1 and distance2d(cpos, samplesList[j].pos) < euclideanThreshold: localIndices[j] = i
The resulting samples go through the merging stage to further reduce the number of light sources:
Merging light sources
At the last stage, merging samples belonging to the same lighting cluster is performed. To do this, we can use the Bresenham algorithm and start with the samples with the highest brightness, because they are already ordered. When we find a light source that satisfies the Bresenham test, we use its position to change the position of the source based on the weight of the run:
# apply bresenham check and compute position of the light clusters lights =  finalIndices = np.empty(filteredCount); finalIndices.fill(-1) for i in localIndices: sample = samplesList[i] startPos = sample.pos if finalIndices[i] == -1: finalIndices[i] = i light = Light() light.originalPos = np.array(sample.pos) # position of the local maxima light.worldPos = np.array(sample.worldPos) light.pos = np.array(sample.pos) light.luminance = sample.luminance for j in localIndices: if i != j and finalIndices[j] == -1: endPos = samplesList[j].pos if bresenhamCheck(lum, width, height, startPos, startPos, endPos, endPos): finalIndices[j] = i # compute final position of the light source sampleWeight = samplesList[j].luminance / sample.luminance light.pos = light.pos + np.array(endPos) * sampleWeight light.pos = light.pos / (1.0 + sampleWeight) imagePos = light.pos * np.array([1.0 / width, 1.0 / height) light.worldPos = equirectangularToSphere(imagePos) lights.append(light)
The Bresenham function checks for a continuous line that has the same brightness. If the delta in the current pixel exceeds a certain threshold, then the check fails:
def bresenhamCheck(lum, imageSize, x0, y0, x1, y1): dX = int(x1 - x0) stepX = int((dX > 0) - (dX < 0)) dX = abs(dX) << 1 dY = int(y1 - y0) stepY = int((dY > 0) - (dY < 0)) dY = abs(dY) << 1 luminanceThreshold = 0.15 prevLum = lum[x0][y0] sumLum = 0.0 c = 0 if (dX >= dY): # delta may go below zero delta = int (dY - (dX >> 1)) while (x0 != x1): # reduce delta, while taking into account the corner case of delta == 0 if ((delta > 0) or (delta == 0 and (stepX > 0))): delta -= dX y0 += stepY delta += dY x0 += stepX sumLum = sumLum + min(lum[x0][y0], 1.25) c = c + 1 if(abs(sumLum / c - prevLum) > luminanceThreshold and (sumLum / c) < 1.0): return 0 else: # delta may go below zero delta = int(dX - (dY >> 1)) while (y0 != y1): # reduce delta, while taking into account the corner case of delta == 0 if ((delta > 0) or (delta == 0 and (stepY > 0))): delta -= dY x0 += stepX delta += dX y0 += stepY sumLum = sumLum + min(lum[x0][y0], 1.25) c = c + 1 if(abs(sumLum / c - prevLum) > luminanceThreshold and (sumLum / c) < 1.0): return 0 return 1
It should be noted that, if necessary, improvements can be made to the Bresenham test, which will lead to better merging of samples, for example, it can take into account the horizontal transfer of light sources located at the edges of the image. In addition, the function can be easily expanded so that it approximates the area of the light sources. Another improvement: you can add a distance threshold so you do not combine samples that are too far away. Final Results:
Blue denotes the local maxima of the lighting clusters, blue denotes the final positions of the light sources, and red denotes samples that are part of the same lighting cluster and are connected by lines.
Other examples of results: