Clipping rasters can be trivial with a desktop GIS like QGIS or with command line tools like GDAL. However, I recently ran into a situation where I needed to clip large rasters in an automated, online Python process. It simply wouldn't do to interrupt the procedure and clip them myself. The Python bindings for GDAL/OGR are pretty neat but they are very low-level; how could I use Python to clip the rasters as part of a continuous Python session?
Jared Erickson posted an excellent tutorial on this topic, one of many in his "Python GDAL/OGR Cookbook". Even this simple example hints at how complicated something like clipping can be with the low-level GDAL/OGR API. However, there are still some limitations to the example he provided:
- It only works for single clipping features.
- It only works for multiple-band rasters.
- The clipping features must be bounded within the raster.
Here I'll present a solution to the second and third points: Clipping a raster, stacked or single-band, with clipping features that extend beyond its bounds.
Improvement: Support for Single-Band Images
This is an easy fix.
We simply need to use a try...catch
sequence of statements to anticipate the error in our NumPy indexing when a single-band raster is presented.
# Multi-band image?
try:
clip = rast[:, ulY:lrY, ulX:lrX]
# Nope: Must be single-band
except IndexError:
clip = rast[ulY:lrY, ulX:lrX]
Improvement: Out-of-Bounds Clipping Features
The second part is not so easy. The clipping in Jared's example is based on NumPy arrays and therefore must be conformal. One thing that can happen when the clipping features extend "above" a raster's extent is that the estimated pixel coordinate for the upper-left corner become negative--and negative pixels don't make sense! We have to catch this case, remember the negative pixel coordinate, and set it to zero.
# If the clipping features extend out-of-bounds and ABOVE the raster...
if gt[3] < maxY:
# In such a case... ulY ends up being negative--can't have that!
iY = ulY
ulY = 0
However, in doing this, the clipping features are effectively "pushed down" (southwards) because we prevented the upper-left corner coordinate from being negative.
To compensate, we must pull the clipping features "back up" (northwards).
Note that here we need to add an else
clause for the case when the clipping features don't extend above the raster.
# If the clipping features extend out-of-bounds and ABOVE the raster...
if gt[3] < maxY:
# The clip features were "pushed down" to match the bounds of the
# raster; this step "pulls" them back up
premask = image_to_array(raster_poly)
# We slice out the piece of our clip features that are "off the map"
mask = np.ndarray((premask.shape[-2] - abs(iY), premask.shape[-1]), premask.dtype)
mask[:] = premask[abs(iY):, :]
mask.resize(premask.shape) # Then fill in from the bottom
# Most importantly, push the clipped piece down
gt2[3] = maxY - (maxY - gt[3])
else:
mask = image_to_array(raster_poly)
Finally, we have to deal with the converse problem: clipping features that extend below the raster's bounds. This is trickier. What I've done is created a larger NumPy array onto which the clipping features are projected.
# Clip the image using the mask
try:
clip = gdalnumeric.choose(mask, (clip, nodata))
# If the clipping features extend out-of-bounds and BELOW the raster...
except ValueError:
# We have to cut the clipping features to the raster!
rshp = list(mask.shape)
if mask.shape[-2] != clip.shape[-2]:
rshp[0] = clip.shape[-2]
if mask.shape[-1] != clip.shape[-1]:
rshp[1] = clip.shape[-1]
mask.resize(*rshp, refcheck=False)
clip = gdalnumeric.choose(mask, (clip, nodata))
The Complete Source Code
Here's a complete Python function for clipping a raster. It follows Jared's example but expands on the comments throughout and includes the two improvements I've described. It does not yet support clipping by multiple features at once. Also, I haven't evaluated (nor come across the problem of) its performance with clipping features that extend only to the side of the raster; not sure if this would be a problem.
The function's API consists of two required arguments: The raster to clip and the file path to the clipping features (e.g., a Shapefile). An optional GDAL GeoTransform can be provided. Also, the NoData value can be set.
from osgeo import gdal, gdalnumeric, ogr
from PIL import Image, ImageDraw
import os
import numpy as np
def clip_raster(rast, features_path, gt=None, nodata=-9999):
'''
Clips a raster (given as either a gdal.Dataset or as a numpy.array
instance) to a polygon layer provided by a Shapefile (or other vector
layer). If a numpy.array is given, a "GeoTransform" must be provided
(via dataset.GetGeoTransform() in GDAL). Returns an array. Clip features
must be a dissolved, single-part geometry (not multi-part). Modified from:
http://pcjericks.github.io/py-gdalogr-cookbook/raster_layers.html
#clip-a-geotiff-with-shapefile
Arguments:
rast A gdal.Dataset or a NumPy array
features_path The path to the clipping features
gt An optional GDAL GeoTransform to use instead
nodata The NoData value; defaults to -9999.
'''
def array_to_image(a):
'''
Converts a gdalnumeric array to a Python Imaging Library (PIL) Image.
'''
i = Image.fromstring('L',(a.shape[1], a.shape[0]),
(a.astype('b')).tostring())
return i
def image_to_array(i):
'''
Converts a Python Imaging Library (PIL) array to a gdalnumeric image.
'''
a = gdalnumeric.fromstring(i.tobytes(), 'b')
a.shape = i.im.size[1], i.im.size[0]
return a
def world_to_pixel(geo_matrix, x, y):
'''
Uses a gdal geomatrix (gdal.GetGeoTransform()) to calculate
the pixel location of a geospatial coordinate; from:
http://pcjericks.github.io/py-gdalogr-cookbook/raster_layers.html#clip-a-geotiff-with-shapefile
'''
ulX = geo_matrix[0]
ulY = geo_matrix[3]
xDist = geo_matrix[1]
yDist = geo_matrix[5]
rtnX = geo_matrix[2]
rtnY = geo_matrix[4]
pixel = int((x - ulX) / xDist)
line = int((ulY - y) / xDist)
return (pixel, line)
# Can accept either a gdal.Dataset or numpy.array instance
if not isinstance(rast, np.ndarray):
gt = rast.GetGeoTransform()
rast = rast.ReadAsArray()
# Create an OGR layer from a boundary shapefile
features = ogr.Open(features_path)
if features.GetDriver().GetName() == 'ESRI Shapefile':
lyr = features.GetLayer(os.path.split(os.path.splitext(features_path)[0])[1])
else:
lyr = features.GetLayer()
# Get the first feature
poly = lyr.GetNextFeature()
# Convert the layer extent to image pixel coordinates
minX, maxX, minY, maxY = lyr.GetExtent()
ulX, ulY = world_to_pixel(gt, minX, maxY)
lrX, lrY = world_to_pixel(gt, maxX, minY)
# Calculate the pixel size of the new image
pxWidth = int(lrX - ulX)
pxHeight = int(lrY - ulY)
# If the clipping features extend out-of-bounds and ABOVE the raster...
if gt[3] < maxY:
# In such a case... ulY ends up being negative--can't have that!
iY = ulY
ulY = 0
# Multi-band image?
try:
clip = rast[:, ulY:lrY, ulX:lrX]
except IndexError:
clip = rast[ulY:lrY, ulX:lrX]
# Create a new geomatrix for the image
gt2 = list(gt)
gt2[0] = minX
gt2[3] = maxY
# Map points to pixels for drawing the boundary on a blank 8-bit,
# black and white, mask image.
points = []
pixels = []
geom = poly.GetGeometryRef()
pts = geom.GetGeometryRef(0)
for p in range(pts.GetPointCount()):
points.append((pts.GetX(p), pts.GetY(p)))
for p in points:
pixels.append(world_to_pixel(gt2, p[0], p[1]))
raster_poly = Image.new('L', (pxWidth, pxHeight), 1)
rasterize = ImageDraw.Draw(raster_poly)
rasterize.polygon(pixels, 0) # Fill with zeroes
# If the clipping features extend out-of-bounds and ABOVE the raster...
if gt[3] < maxY:
# The clip features were "pushed down" to match the bounds of the
# raster; this step "pulls" them back up
premask = image_to_array(raster_poly)
# We slice out the piece of our clip features that are "off the map"
mask = np.ndarray((premask.shape[-2] - abs(iY), premask.shape[-1]), premask.dtype)
mask[:] = premask[abs(iY):, :]
mask.resize(premask.shape) # Then fill in from the bottom
# Most importantly, push the clipped piece down
gt2[3] = maxY - (maxY - gt[3])
else:
mask = image_to_array(raster_poly)
# Clip the image using the mask
try:
clip = gdalnumeric.choose(mask, (clip, nodata))
# If the clipping features extend out-of-bounds and BELOW the raster...
except ValueError:
# We have to cut the clipping features to the raster!
rshp = list(mask.shape)
if mask.shape[-2] != clip.shape[-2]:
rshp[0] = clip.shape[-2]
if mask.shape[-1] != clip.shape[-1]:
rshp[1] = clip.shape[-1]
mask.resize(*rshp, refcheck=False)
clip = gdalnumeric.choose(mask, (clip, nodata))
return (clip, ulX, ulY, gt2)