In a diversion from my usual programming related geo-geeking, I'm heading to China on January 20th to do some in person, on the ground examination of geography. I will be visiting a friend there, and returning on the 29th. I will be visiting Shanghai, and Fuzhou (on the coast roughly half way between Shanghai and Hong Kong).

Today I received my visa from the Chinese embassy. This is the first time I've ever had to get a visa for international travel, and I have a new respect for those who have to do it regularly to attend events like FOSS4G. It is nearly a five hour activity to travel to Ottawa and visit the embassy, which I had to do to drop off my request, and to pick up the visa. Also, they were going to refuse to accept my request until I could provide a clear address I would be at. I had to rush out to an internet cafe to book a hotel for the first night just so I would have an address to write on the form. I barely got back before the office closed (9am to 1pm!), which would have cost me another day of travel. Normally I prefer to play things by ear after I arrive.

I'm flying via Newark despite my preference to avoid going through the USA given the security hassles that entails. But the price difference over flying Air Canada through Vancouver was just too significant.

I'm not sure what my connectivity will be, so I'm hoping to load up my laptop with some work I can do offline if needed. Likely I will focus on write/update access for PCIDSK vectors in the PCIDSK SDK.

## Friday, January 15, 2010

## Monday, January 4, 2010

### hsv_merge.py

Recently Trent Hare indicated an interest in an extension to the gdaldem program to produce a color relief image mixed with a hillshade.

The gdaldem utility is a utility program for producing various products from DEMs (elevation models). It was originally written by Matthew Perry, with contributions by Even Rouault, Howard Butler, and Chris Yesson to incorporate it into GDAL proper. It can produce color relief maps, hillshades, slope, aspect and various roughness values.

But what Trent wanted was a product with both a hillshade and a color relief image merged. The color relief shows overall elevation with color, and the hillshade gives a sense of local structure. He has been doing this with ImageMagick to produce the product from the hillshade and relief produced by gdaldem, but this resulted in the georeferencing being lost. It is accomplished by transforming the relief image to HSV (hue, saturation and value) color space, and then replacing the "value" portion with the hillshade greyscale image. The value is really the intensity or brightness. This is a common technique and is often used to merge high resolution greyscale imagery with lower resolution RGB image to produce a result with meaningful color and high resolution.

I reviewed the gdaldem.cpp code with the thought of adding a new mode, but it turns out the program is fairly complicated and a new hillshade/relief mode would likely be hard to maintain properly. So instead I set out to create a utility that could do the rgb/greyscale merge in hsv space as a standalone utility. When possible I prefer to do non-core utilities as python instead of C++. It feels lighter and easier for folks to munge for other purposes.

In Python we generally try to do image processing using the Python numpy package, so that is what I set out to do. For some things numpy lets us treat arrays as if they were scalars, and operate on the whole array with a single statement. So the "scalar" function to convert hsv to rgb looks like:

I was able to change this into a numpy based function that looked fairly similar in parts, but somewhat different in other areas:

The first step just breaks a single hsv array into three component arrays and is just bookkeeping:

The next few statements look very much like the original and are essentially simple math that happens to be done on whole arrays instead of single values. The only odd part is using "astype(int)" to convert to int instead of the int() builtin function that was used for scalars.

But following this there are many if/then else statements which had looked like:

Unfortunately we can't really do something quite like an if statement with numpy that will be evaluated on a per-pixel (array entry) basis. The old scalar logic assumed that if a condition was met, it could do something for that case, and return. But for arrays we are going to have to do things quite differently. The above is essentially a case statement, so we instead use the numpy choose() function like this:

The choose function goes through each entry in the "i" array, and uses it's value as an index into the set of possible return values (v,q,p,p,t,v in the first case). So if i[0] is 0, then the first item v[0] is assigned to r[0]. If i[0] was 4 then t[0] would be assigned to r[0]. Thus the six if statements are handled as a choose() with six possible return results, and the choose is evaluated for each array entry (pixel).

This works reasonably well for this very structured case. We then finish by merging the red, green and blue values into an 3 band array:

One thing missing in my rendition of this function is the saturation == 0 case:

It is very difficult to have the pixels for which the saturation is zero handled differently than all the rest since we can't actually return control to the caller for only some pixels (array entries). In this particular situation it seems that this was just an optimization and if s is zero the result will be v,v,v anyways so I just omitted this statement.

But this helps highlight the difficulty of taking different logic paths for different entries in the array using numpy. The rgb_to_hsv() logic was even more involved. The original was:

This became:

The key mechanism here is the computation of true/value masks (maxc_is_r, maxc_is_g, maxc_is_b) which are used with choose() statements to overlay different results on the pixels that match particular conditions. It may not be immediately evident, but we are essentially evaluating all the equations for all pixels even though the values are only going to be used for some of them. Quite wasteful.

The final result, hsv_merge.py, works nicely but I am once again left frustrated with the unnatural handling of logic paths in numpy. It took me over three hours to construct this small script. Most of that time was spent poring over the numpy reference trying to figure out how to do things that are dead easy in scalar python.

I love Python, but to be honest I still can't say the same for numpy.

The gdaldem utility is a utility program for producing various products from DEMs (elevation models). It was originally written by Matthew Perry, with contributions by Even Rouault, Howard Butler, and Chris Yesson to incorporate it into GDAL proper. It can produce color relief maps, hillshades, slope, aspect and various roughness values.

But what Trent wanted was a product with both a hillshade and a color relief image merged. The color relief shows overall elevation with color, and the hillshade gives a sense of local structure. He has been doing this with ImageMagick to produce the product from the hillshade and relief produced by gdaldem, but this resulted in the georeferencing being lost. It is accomplished by transforming the relief image to HSV (hue, saturation and value) color space, and then replacing the "value" portion with the hillshade greyscale image. The value is really the intensity or brightness. This is a common technique and is often used to merge high resolution greyscale imagery with lower resolution RGB image to produce a result with meaningful color and high resolution.

I reviewed the gdaldem.cpp code with the thought of adding a new mode, but it turns out the program is fairly complicated and a new hillshade/relief mode would likely be hard to maintain properly. So instead I set out to create a utility that could do the rgb/greyscale merge in hsv space as a standalone utility. When possible I prefer to do non-core utilities as python instead of C++. It feels lighter and easier for folks to munge for other purposes.

In Python we generally try to do image processing using the Python numpy package, so that is what I set out to do. For some things numpy lets us treat arrays as if they were scalars, and operate on the whole array with a single statement. So the "scalar" function to convert hsv to rgb looks like:

def hsv_to_rgb(h, s, v):

if s == 0.0: return v, v, v

i = int(h*6.0) # XXX assume int() truncates!

f = (h*6.0) - i

p = v*(1.0 - s)

q = v*(1.0 - s*f)

t = v*(1.0 - s*(1.0-f))

if i%6 == 0: return v, t, p

if i == 1: return q, v, p

if i == 2: return p, v, t

if i == 3: return p, q, v

if i == 4: return t, p, v

if i == 5: return v, p, q

# Cannot get here

I was able to change this into a numpy based function that looked fairly similar in parts, but somewhat different in other areas:

def hsv_to_rgb( hsv ):

h = hsv[0]

s = hsv[1]

v = hsv[2]

i = (h*6.0).astype(int)

f = (h*6.0) - i

p = v*(1.0 - s)

q = v*(1.0 - s*f)

t = v*(1.0 - s*(1.0-f))

r = i.choose( v, q, p, p, t, v )

g = i.choose( t, v, v, q, p, p )

b = i.choose( p, p, t, v, v, q )

rgb = numpy.asarray([r,g,b]).astype(numpy.byte)

return rgb

The first step just breaks a single hsv array into three component arrays and is just bookkeeping:

h = hsv[0]

s = hsv[1]

v = hsv[2]

The next few statements look very much like the original and are essentially simple math that happens to be done on whole arrays instead of single values. The only odd part is using "astype(int)" to convert to int instead of the int() builtin function that was used for scalars.

i = (h*6.0).astype(int)

f = (h*6.0) - i

p = v*(1.0 - s)

q = v*(1.0 - s*f)

t = v*(1.0 - s*(1.0-f))

But following this there are many if/then else statements which had looked like:

if i%6 == 0: return v, t, p

if i == 1: return q, v, p

if i == 2: return p, v, t

if i == 3: return p, q, v

if i == 4: return t, p, v

if i == 5: return v, p, q

Unfortunately we can't really do something quite like an if statement with numpy that will be evaluated on a per-pixel (array entry) basis. The old scalar logic assumed that if a condition was met, it could do something for that case, and return. But for arrays we are going to have to do things quite differently. The above is essentially a case statement, so we instead use the numpy choose() function like this:

r = i.choose( v, q, p, p, t, v )

g = i.choose( t, v, v, q, p, p )

b = i.choose( p, p, t, v, v, q )

The choose function goes through each entry in the "i" array, and uses it's value as an index into the set of possible return values (v,q,p,p,t,v in the first case). So if i[0] is 0, then the first item v[0] is assigned to r[0]. If i[0] was 4 then t[0] would be assigned to r[0]. Thus the six if statements are handled as a choose() with six possible return results, and the choose is evaluated for each array entry (pixel).

This works reasonably well for this very structured case. We then finish by merging the red, green and blue values into an 3 band array:

rgb = numpy.asarray([r,g,b]).astype(numpy.byte)

One thing missing in my rendition of this function is the saturation == 0 case:

if s == 0.0: return v, v, v

It is very difficult to have the pixels for which the saturation is zero handled differently than all the rest since we can't actually return control to the caller for only some pixels (array entries). In this particular situation it seems that this was just an optimization and if s is zero the result will be v,v,v anyways so I just omitted this statement.

But this helps highlight the difficulty of taking different logic paths for different entries in the array using numpy. The rgb_to_hsv() logic was even more involved. The original was:

def rgb_to_hsv(r, g, b):

maxc = max(r, g, b)

minc = min(r, g, b)

v = maxc

if minc == maxc: return 0.0, 0.0, v

s = (maxc-minc) / maxc

rc = (maxc-r) / (maxc-minc)

gc = (maxc-g) / (maxc-minc)

bc = (maxc-b) / (maxc-minc)

if r == maxc: h = bc-gc

elif g == maxc: h = 2.0+rc-bc

else: h = 4.0+gc-rc

h = (h/6.0) % 1.0

return h, s, v

This became:

def rgb_to_hsv( rgb ):

r = rgb[0]

g = rgb[1]

b = rgb[2]

maxc = numpy.maximum(r,numpy.maximum(g,b))

minc = numpy.minimum(r,numpy.minimum(g,b))

v = maxc

minc_eq_maxc = numpy.equal(minc,maxc)

# compute the difference, but reset zeros to ones to avoid divide by zeros later.

ones = numpy.ones((r.shape[0],r.shape[1]))

maxc_minus_minc = numpy.choose( minc_eq_maxc, (ones, maxc-minc) )

s = (maxc-minc) / numpy.maximum(ones,maxc)

rc = (maxc-r) / maxc_minus_minc

gc = (maxc-g) / maxc_minus_minc

bc = (maxc-b) / maxc_minus_minc

maxc_is_r = numpy.equal(maxc,r)

maxc_is_g = numpy.equal(maxc,g)

maxc_is_b = numpy.equal(maxc,b)

h = numpy.zeros((r.shape[0],r.shape[1]))

h = numpy.choose( maxc_is_b, (h,4.0+gc-rc) )

h = numpy.choose( maxc_is_g, (h,2.0+rc-bc) )

h = numpy.choose( maxc_is_r, (h,bc-gc) )

h = numpy.mod(h/6.0,1.0)

hsv = numpy.asarray([h,s,v])

return hsv

The key mechanism here is the computation of true/value masks (maxc_is_r, maxc_is_g, maxc_is_b) which are used with choose() statements to overlay different results on the pixels that match particular conditions. It may not be immediately evident, but we are essentially evaluating all the equations for all pixels even though the values are only going to be used for some of them. Quite wasteful.

The final result, hsv_merge.py, works nicely but I am once again left frustrated with the unnatural handling of logic paths in numpy. It took me over three hours to construct this small script. Most of that time was spent poring over the numpy reference trying to figure out how to do things that are dead easy in scalar python.

I love Python, but to be honest I still can't say the same for numpy.

Subscribe to:
Posts (Atom)