Monday, January 4, 2010

hsv_merge.py

Recently Trent Hare indicated an interest in an extension to the gdaldem program to produce a color relief image mixed with a hillshade.

The gdaldem utility produces various products from DEMs (digital elevation models). It was originally written by Matthew Perry, with contributions by Even Rouault, Howard Butler, and Chris Yesson to incorporate it into GDAL proper. It can produce color relief maps, hillshades, slope, aspect and various roughness values.

But what Trent wanted was a product with both a hillshade and a color relief image merged. The color relief shows overall elevation with color, and the hillshade gives a sense of local structure. He had been using ImageMagick to produce this product from the hillshade and relief generated by gdaldem, but that resulted in the georeferencing being lost. The merge is accomplished by transforming the relief image to HSV (hue, saturation and value) color space, and then replacing the "value" portion with the hillshade greyscale image. The value is really the intensity or brightness. This is a common technique, often used to merge high resolution greyscale imagery with a lower resolution RGB image to produce a result with meaningful color and high resolution.
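The technique can be sketched in a few lines. This is not the actual hsv_merge.py, just a slow per-pixel illustration using the standard library's colorsys module; the function name and the array layout (band-interleaved, 0-255 values) are my own assumptions:

```python
import colorsys

import numpy as np

def merge_relief_and_hillshade(rgb, shade):
    # rgb:   3 x rows x cols color relief, values 0-255
    # shade: rows x cols hillshade greyscale, values 0-255
    out = np.empty_like(rgb)
    rows, cols = shade.shape
    for y in range(rows):
        for x in range(cols):
            # to HSV, keep hue and saturation, substitute the hillshade as value
            h, s, v = colorsys.rgb_to_hsv(rgb[0, y, x] / 255.0,
                                          rgb[1, y, x] / 255.0,
                                          rgb[2, y, x] / 255.0)
            r, g, b = colorsys.hsv_to_rgb(h, s, shade[y, x] / 255.0)
            out[:, y, x] = [round(r * 255), round(g * 255), round(b * 255)]
    return out
```

For real rasters this pixel-by-pixel loop would be painfully slow, which is exactly why the rest of this post pursues a numpy formulation.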

I reviewed the gdaldem.cpp code with the thought of adding a new mode, but it turns out the program is fairly complicated and a new hillshade/relief mode would likely be hard to maintain properly. So instead I set out to write a standalone utility that could do the rgb/greyscale merge in hsv space. When possible I prefer to do non-core utilities in Python instead of C++. It feels lighter and easier for folks to munge for other purposes.

In Python we generally try to do image processing using the numpy package, so that is what I set out to use. For some things numpy lets us treat arrays as if they were scalars, and operate on the whole array with a single statement. So the "scalar" function to convert hsv to rgb (borrowed from the standard library's colorsys module) looks like:


def hsv_to_rgb(h, s, v):
    if s == 0.0: return v, v, v
    i = int(h*6.0) # XXX assume int() truncates!
    f = (h*6.0) - i
    p = v*(1.0 - s)
    q = v*(1.0 - s*f)
    t = v*(1.0 - s*(1.0-f))
    if i%6 == 0: return v, t, p
    if i == 1: return q, v, p
    if i == 2: return p, v, t
    if i == 3: return p, q, v
    if i == 4: return t, p, v
    if i == 5: return v, p, q
    # Cannot get here


I was able to change this into a numpy based function that looked fairly similar in parts, but somewhat different in others:


def hsv_to_rgb( hsv ):

    h = hsv[0]
    s = hsv[1]
    v = hsv[2]

    i = (h*6.0).astype(int)
    f = (h*6.0) - i
    p = v*(1.0 - s)
    q = v*(1.0 - s*f)
    t = v*(1.0 - s*(1.0-f))

    r = i.choose( v, q, p, p, t, v )
    g = i.choose( t, v, v, q, p, p )
    b = i.choose( p, p, t, v, v, q )

    rgb = numpy.asarray([r,g,b]).astype(numpy.byte)

    return rgb

The first step breaks the single hsv array into three component arrays and is just bookkeeping:

h = hsv[0]
s = hsv[1]
v = hsv[2]

The next few statements look very much like the original and are essentially simple math that happens to be done on whole arrays instead of single values. The only odd part is using "astype(int)" to convert to int instead of the int() builtin function that was used for scalars.

i = (h*6.0).astype(int)
f = (h*6.0) - i
p = v*(1.0 - s)
q = v*(1.0 - s*f)
t = v*(1.0 - s*(1.0-f))

But following this the scalar code had a series of if statements, which looked like:

if i%6 == 0: return v, t, p
if i == 1: return q, v, p
if i == 2: return p, v, t
if i == 3: return p, q, v
if i == 4: return t, p, v
if i == 5: return v, p, q

Unfortunately we can't really do something quite like an if statement with numpy that will be evaluated on a per-pixel (array entry) basis. The old scalar logic assumed that if a condition was met, it could do something for that case, and return. But for arrays we are going to have to do things quite differently. The above is essentially a case statement, so we instead use the numpy choose() function like this:

r = i.choose( v, q, p, p, t, v )
g = i.choose( t, v, v, q, p, p )
b = i.choose( p, p, t, v, v, q )

The choose function goes through each entry in the "i" array, and uses its value as an index into the set of possible return values (v,q,p,p,t,v in the first case). So if i[0] is 0, then the first item v[0] is assigned to r[0]. If i[0] were 4, then t[0] would be assigned to r[0]. Thus the six if statements are handled as a choose() with six possible return results, and the choose is evaluated for each array entry (pixel).
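A tiny standalone example (with made-up values) may make the indexing clearer:

```python
import numpy as np

i = np.array([0, 2, 4])
v = np.array([10, 11, 12])
q = np.array([20, 21, 22])
p = np.array([30, 31, 32])
t = np.array([40, 41, 42])

# i[0]=0 selects from v, i[1]=2 selects from p, i[2]=4 selects from t,
# always at the matching position in the selected array.
r = i.choose((v, q, p, p, t, v))
print(r)  # [10 31 42]
```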

This works reasonably well for this very structured case. We then finish by merging the red, green and blue values into a 3-band array:

rgb = numpy.asarray([r,g,b]).astype(numpy.byte)

One thing missing in my rendition of this function is the saturation == 0 case:

if s == 0.0: return v, v, v

It is very difficult to have the pixels for which the saturation is zero handled differently from all the rest, since we can't actually return control to the caller for only some pixels (array entries). In this particular situation it seems this was just an optimization: if s is zero the result will be v,v,v anyway, so I just omitted the statement.
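Where such a per-pixel branch really matters, numpy.where() is the usual substitute for an if statement: both alternatives are computed for every element, and the condition selects between them. A small sketch with made-up values:

```python
import numpy as np

s = np.array([0.0, 0.5, 0.0])
v = np.array([0.2, 0.4, 0.6])
full_math = np.array([9.9, 8.8, 7.7])  # stand-in for the full hsv arithmetic

# take v where the saturation is zero, the computed result elsewhere
r = np.where(s == 0.0, v, full_math)
# r is [0.2, 8.8, 0.6]
```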

But this helps highlight the difficulty of taking different logic paths for different entries in the array using numpy. The rgb_to_hsv() logic was even more involved. The original was:


def rgb_to_hsv(r, g, b):
maxc = max(r, g, b)
minc = min(r, g, b)
v = maxc
if minc == maxc: return 0.0, 0.0, v
s = (maxc-minc) / maxc
rc = (maxc-r) / (maxc-minc)
gc = (maxc-g) / (maxc-minc)
bc = (maxc-b) / (maxc-minc)
if r == maxc: h = bc-gc
elif g == maxc: h = 2.0+rc-bc
else: h = 4.0+gc-rc
h = (h/6.0) % 1.0
return h, s, v

This became:

def rgb_to_hsv( rgb ):

    r = rgb[0]
    g = rgb[1]
    b = rgb[2]

    maxc = numpy.maximum(r,numpy.maximum(g,b))
    minc = numpy.minimum(r,numpy.minimum(g,b))

    v = maxc

    minc_eq_maxc = numpy.equal(minc,maxc)

    # compute the difference, but reset zeros to ones to avoid divide by zero later.
    ones = numpy.ones((r.shape[0],r.shape[1]))
    maxc_minus_minc = numpy.choose( minc_eq_maxc, (maxc-minc, ones) )

    s = (maxc-minc) / numpy.maximum(ones,maxc)
    rc = (maxc-r) / maxc_minus_minc
    gc = (maxc-g) / maxc_minus_minc
    bc = (maxc-b) / maxc_minus_minc

    maxc_is_r = numpy.equal(maxc,r)
    maxc_is_g = numpy.equal(maxc,g)
    maxc_is_b = numpy.equal(maxc,b)

    h = numpy.zeros((r.shape[0],r.shape[1]))
    h = numpy.choose( maxc_is_b, (h,4.0+gc-rc) )
    h = numpy.choose( maxc_is_g, (h,2.0+rc-bc) )
    h = numpy.choose( maxc_is_r, (h,bc-gc) )

    h = numpy.mod(h/6.0,1.0)

    hsv = numpy.asarray([h,s,v])

    return hsv

The key mechanism here is the computation of true/false masks (maxc_is_r, maxc_is_g, maxc_is_b), which are used with choose() statements to overlay different results on the pixels that match particular conditions. It may not be immediately evident, but we are essentially evaluating all the equations for all pixels even though the values are only going to be used for some of them. Quite wasteful.
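In current numpy the same mask-and-overlay pattern is often written with numpy.where(), which reads a little more directly than choose(); a sketch of the hue computation on three made-up pixels (one with each band dominant):

```python
import numpy as np

# Three pixels: red-dominant, green-dominant, blue-dominant.
r = np.array([1.0, 0.2, 0.1])
g = np.array([0.5, 1.0, 0.2])
b = np.array([0.25, 0.5, 1.0])

maxc = np.maximum(r, np.maximum(g, b))
minc = np.minimum(r, np.minimum(g, b))
# substitute 1.0 where max == min to avoid dividing by zero
diff = np.where(minc == maxc, 1.0, maxc - minc)

rc = (maxc - r) / diff
gc = (maxc - g) / diff
bc = (maxc - b) / diff

# start from the "blue is max" case, then overlay the green and red cases
h = 4.0 + gc - rc
h = np.where(maxc == g, 2.0 + rc - bc, h)
h = np.where(maxc == r, bc - gc, h)
h = np.mod(h / 6.0, 1.0)
```

As with choose(), every branch's arithmetic is still evaluated for every pixel; where() only decides which result survives at each position.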

The final result, hsv_merge.py, works nicely but I am once again left frustrated with the unnatural handling of logic paths in numpy. It took me over three hours to construct this small script. Most of that time was spent poring over the numpy reference trying to figure out how to do things that are dead easy in scalar python.

I love Python, but to be honest I still can't say the same for numpy.

11 comments:

  1. Hello Frank,
    you are certainly not a novice in numerical computing. Can you trace your difficulties down to the specific points where you ran into trouble?
    I think sending the concerns to the Numpy Discussion list could help. People there are normally very open for improvement suggestions.

    ReplyDelete
  2. There are functions in the standard library that will let you convert RGB arrays to HSV arrays (and back) in a functional style:

    >>> from colorsys import rgb_to_hsv
    >>> from itertools import starmap
    >>> r = (255, 0)
    >>> g = (255, 0)
    >>> b = (255, 0)
    >>> h, s, v = zip(*starmap(rgb_to_hsv, zip(r, g, b)))
    >>> h, s, v
    ((0.0, 0.0), (0.0, 0.0), (255, 0))

    zip() is its own inverse:

    >>> [x] == zip(*zip(x))
    True

    And since numpy arrays implement the iterator protocol, you can use them instead of the tuples in my example.

    ReplyDelete
  3. Hi!

    Thank you for sharing this kind of tools with the community :)

    I have tried to run your python script from FWTools shell (version 2.4.6) but I got this message error "ImportError: No module named osgeo"

    Can you give some hints on how to run this script?

    Thanks,

    CC

    ReplyDelete
  4. Guys,

    I don't know how to respond directly to a comment (yes, I'm lame!) so I'll respond in a generic comment.

    Timmie, I think my issues with numpy are fairly fundamental, and it would be rude for me to come in spouting about how they ought to change their fundamental approach nearly 15 years after the project was established. I had thought about comparing numpy modelling to the modelling environment I developed while at PCI but time ran short and the post was plenty long!

    Sean, Chris Schmidt pointed the colorsys module out to me, and that was the code I converted to be numpy based. I didn't actually try it, but I assume invoking the colorsys algorithms for each pixel would be prohibitively slow, so I didn't even pursue that option.

    CC, FWTools has a very antique GDAL Python binding environment and also is using Numeric instead of numpy. So unfortunately it will not likely support the script without non-trivial changes. For GDAL 1.7 we have pretty much decided to jettison support for the old GDAL bindings and Numeric.

    ReplyDelete
  5. gdaldem was originally from GRASS, and code that will be in GDAL 1.7 is derived from the pre-GPL GRASS code. Perry started with post-GPL GRASS code and we misappropriated it. There's a bug in GDAL's Trac that describes it in all its gory details.

    ReplyDelete
  6. Try using the formula described in ImageMagick's pegtop_light, which is much faster since it is a single formula, and gives better results (for me). I wrote Python code to do that:

    http://rnovitsky.blogspot.com/2010/04/using-hillshade-image-as-intensity.html

    ReplyDelete
  7. Maybe you have figured out how to handle large images using ImageMagick, but a GDAL solution fixes the two issues with the original ImageMagick "composite" solution:

    1.) Handle huge files over 4GB. I think we were even breaking ImageMagick at 2GB merges.

    2.) Maintain the geospatial header on the output colorshade. (The original script which used ImageMagick did maintain the geoheader by listing the header using listgeo and adding it back in using geotifcp, but that added a couple of annoying extra steps.)

    ReplyDelete
  8. In rgb_to_hsv, the 6th line should look like this:

    maxc_minus_minc = numpy.choose( minc_eq_maxc, (maxc-minc, ones))

    I.e. when the equality is false, pick choice 0 which is the difference. When the equality is true, pick choice 1.

    ReplyDelete
  9. It can be done with GDAL alone, using an alpha channel and a black or white image

    ReplyDelete
  10. "[Trent Hare] has been doing this with ImageMagick to produce the product from the hillshade and relief produced by gdaldem, but this resulted in the georeferencing being lost."

    Do you have the command to do this via ImageMagick ?

    I am looking for such an ImageMagick solution to improve the workflow for creating Wikipedia maps. On our side, we don't need georeferencing in the final bitmap.

    ReplyDelete