# Recognition of light sources on environment maps

Original author: Daniel-Alin Loghin
• Transfer

This article presents a Python implementation of the algorithm for recognizing light sources on environment maps (LDR or HDR) using an equirectangular projection. However, after making minor changes, it can also be used with simple background images or cubic maps. Examples of the possible application of the algorithm: ray tracing programs in which it is necessary to recognize primary light sources in order to emit rays from them; in rasterized renderers, it can be used to cast shadows using an environment map; in addition, the algorithm can also be used in spotlight elimination programs, for example in AR.

The algorithm consists of the following steps:

1. Decrease in resolution of the original image, for example, to 1024.
2. Convert the image to brightness (luminance), if necessary, with image blur.
3. Application of the quasi-Monte Carlo method.
4. Transformation from spherical coordinates to equidistant coordinates.
5. Filtering samples based on the brightness of a neighbor.
6. Sort samples based on their brightness.
7. Filtering samples based on the Euclidean metric.
8. Merging samples using the Bresenham algorithm.
9. Calculation of the position of the lighting cluster based on its brightness.

There are many algorithms for reducing the resolution of images. Bilinear filtering is the fastest or easiest to implement, and besides, it is best suited in most cases. To convert brightness in both LDR and HDR images, you can use the standard formula:

lum = img[:, :, 0] * 0.2126 + img[:, :, 1] * 0.7152 + img[:, :, 2] * 0.0722

Additionally, you can apply a slight blur to the brightness image, for example, 1-2 pixels for an image with a resolution of 1024, to eliminate all high-frequency details (in particular, caused by a decrease in resolution).

### Equidistant Projection

The most common projection in environment maps is equidistant projection 3 . My algorithm can work with other projections, for example, with panoramic and with cubic maps, however, in the article we will consider only an equally spaced projection. First you need to normalize the image coordinates:

pos[0] = x / width
pos[1] = y / height

Then we need to convert from and to Cartesian coordinates using spherical coordinates, i.e. θ and φ, where θ = x * 2π, and φ = y * π.

def sphereToEquirectangular(pos):
invAngles = (0.1591, 0.3183)
xy = (math.atan2(pos[1], pos[0]), math.asin(pos[2]))
xy = (xy[0] * invAngles[0], xy[1] * invAngles[1])
return (xy[0] + 0.5, xy[1] + 0.5)
def equirectangularToSphere(pos):
angles = (1 / 0.1591, 1 / 0.3183)
thetaPhi = (pos[0] - 0.5, pos[1] - 0.5)
thetaPhi = (thetaPhi[0] * angles[0], thetaPhi[1] * angles[1])
length = math.cos(thetaPhi[1])
return (math.cos(thetaPhi[0]) * length, math.sin(thetaPhi[0]) * length, math.sin(thetaPhi[1]))

### Hammersley Sampling

The next step will be to apply the quasi-Monte Carlo method, for example, Hammersley 2 sampling :

You can use other sampling methods, such as Holton 4 , but Hammersley is faster and provides a good distribution of samples across the sphere. Holton would be a good choice for plane samples if a simple image is used instead of the environment map. A mandatory requirement for Hammersley sampling is the inversion of the roots (row) of van der Corpute, for more details see links 2 . Here is its quick implementation:

def vdcSequence(bits):
bits = (bits << 16) | (bits >> 16)
bits = ((bits & 0x55555555) << 1) | ((bits & 0xAAAAAAAA) >> 1)
bits = ((bits & 0x33333333) << 2) | ((bits & 0xCCCCCCCC) >> 2)
bits = ((bits & 0x0F0F0F0F) << 4) | ((bits & 0xF0F0F0F0) >> 4)
bits = ((bits & 0x00FF00FF) << 8) | ((bits & 0xFF00FF00) >> 8)
return float(bits) * 2.3283064365386963e-10 # / 0x100000000
def hammersleySequence(i, N):
return (float(i) / float(N), vdcSequence(i))

Then we use uniform overlay on the sphere:

def sphereSample(u, v):
PI = 3.14159265358979
phi = v * 2.0 * PI
cosTheta = 2.0 * u - 1.0 # map to -1,1
sinTheta = math.sqrt(1.0 - cosTheta * cosTheta);
return (math.cos(phi) * sinTheta, math.sin(phi) * sinTheta, cosTheta)

To sample Hammersley, we use a fixed number of samples, depending on the resolution of the image, and convert from spherical coordinates to Cartesian, and then to equidistant:

samplesMultiplier = 0.006
samples = int(samplesMultiplier * width * height)
samplesList = []
# apply hammersley sampling
for i in range(0, samples):
xi = hammersleySequence(i, samples)
xyz = sphereSample(xi[0], xi[1]) # to cartesian
imagePos = sphereToEquirectangular(xyz)
luminance = lum[imagePos[0] * width, imagePos[1] * height]

This will give us a good distribution of samples that will be checked for the presence of light sources:

### Filtering light sources

In the first pass of filtering, we ignore all samples that do not exceed the brightness threshold (for HDR cards, it may be higher), and then sort all the samples by their brightness:

localSize = int(float(12) * (width / 1024.0)) + 1
samplesList = []
# apply hammersley sampling
for i in range(0, samples):
xi = hammersleySequence(i, samples)
xyz = sphereSample(xi[0], xi[1]) # to cartesian
imagePos = sphereToEquirectangular(xyz)
luminance = lum[imagePos [0] * width, imagePos [1] * height]
sample = Sample(luminance, imagePos , xyz)
luminanceThreshold = 0.8
#do a neighbour search for the maximum luminance
nLum = computeNeighborLuminance(lum, width, height, sample.imagePos, localSize)
if nLum > luminanceThreshold:
samplesList.append(sample)
samplesList = sorted(samplesList, key=lambda obj: obj.luminance, reverse=True)

The next pass will filter based on the Euclidean metric and the threshold distance between pixels (depending on the resolution of the image) - this is a spatial data structure that can be used to get rid of complexity O (N 2 ):

euclideanThreshold = int(float(euclideanThresholdPixel) * (width / 2048.0))
# filter based euclidian distance
filteredCount = len(samplesList)
localIndices = np.empty(filteredCount); localIndices.fill(-1)
for i in range(0, filteredCount):
cpos = samplesList[i].pos
if localIndices[i] == -1:
localIndices[i] = i
for j in range(0, filteredCount):
if i != j and localIndices[j] == -1 and distance2d(cpos, samplesList[j].pos) < euclideanThreshold:
localIndices[j] = i

The resulting samples go through the merging stage to further reduce the number of light sources:

### Merging light sources

At the last stage, merging samples belonging to the same lighting cluster is performed. To do this, we can use the Bresenham algorithm and start with the samples with the highest brightness, because they are already ordered. When we find a light source that satisfies the Bresenham test, we use its position to change the position of the source based on the weight of the run:

# apply bresenham check and compute position of the light clusters
lights = []
finalIndices = np.empty(filteredCount); finalIndices.fill(-1)
for i in localIndices:
sample = samplesList[i]
startPos = sample.pos
if finalIndices[i] == -1:
finalIndices[i] = i
light = Light()
light.originalPos = np.array(sample.pos)  # position of the local maxima
light.worldPos = np.array(sample.worldPos)
light.pos = np.array(sample.pos)
light.luminance = sample.luminance
for j in localIndices:
if i != j and finalIndices[j] == -1:
endPos = samplesList[j].pos
if bresenhamCheck(lum, width, height, startPos[0], startPos[1], endPos[0], endPos[1]):
finalIndices[j] = i
# compute final position of the light source
sampleWeight = samplesList[j].luminance / sample.luminance
light.pos = light.pos + np.array(endPos) * sampleWeight
light.pos = light.pos / (1.0 + sampleWeight)
imagePos = light.pos * np.array([1.0 / width, 1.0 / height)
light.worldPos = equirectangularToSphere(imagePos)
lights.append(light)

The Bresenham function checks for a continuous line that has the same brightness. If the delta in the current pixel exceeds a certain threshold, then the check fails:

def bresenhamCheck(lum, imageSize, x0, y0, x1, y1):
dX = int(x1 - x0)
stepX = int((dX > 0) - (dX < 0))
dX = abs(dX) << 1
dY = int(y1 - y0)
stepY = int((dY > 0) - (dY < 0))
dY = abs(dY) << 1
luminanceThreshold = 0.15
prevLum = lum[x0][y0]
sumLum = 0.0
c = 0
if (dX >= dY):
# delta may go below zero
delta = int (dY - (dX >> 1))
while (x0 != x1):
# reduce delta, while taking into account the corner case of delta == 0
if ((delta > 0) or (delta == 0 and (stepX > 0))):
delta -= dX
y0 += stepY
delta += dY
x0 += stepX
sumLum = sumLum + min(lum[x0][y0], 1.25)
c = c + 1
if(abs(sumLum / c - prevLum) > luminanceThreshold and (sumLum / c) < 1.0):
return 0
else:
# delta may go below zero
delta = int(dX - (dY >> 1))
while (y0 != y1):
# reduce delta, while taking into account the corner case of delta == 0
if ((delta > 0) or (delta == 0 and (stepY > 0))):
delta -= dY
x0 += stepX
delta += dX
y0 += stepY
sumLum = sumLum + min(lum[x0][y0], 1.25)
c = c + 1
if(abs(sumLum / c - prevLum) > luminanceThreshold and (sumLum / c) < 1.0):
return 0
return 1

It should be noted that, if necessary, improvements can be made to the Bresenham test, which will lead to better merging of samples, for example, it can take into account the horizontal transfer of light sources located at the edges of the image. In addition, the function can be easily expanded so that it approximates the area of ​​the light sources. Another improvement: you can add a distance threshold so you do not combine samples that are too far away. Final Results:

Blue denotes the local maxima of the lighting clusters, blue denotes the final positions of the light sources, and red denotes samples that are part of the same lighting cluster and are connected by lines.

Other examples of results: