This is the 2nd post in a series of 3 on 2D convex hull algorithms. The first covered the Jarvis March and here I’ll be covering the Graham Scan. The implementation of the Graham Scan is short, but sweet. It handles degenerate cases very well. The next post will cover Chan’s algorithm.
Graham Scan
Although it may not look it at first glance, the Graham Scan is similar to the Jarvis March. We start with an extreme point and then find the edges of the hull iteratively, one at a time. The main difference is that by sorting the points first, we remove the need to spend O(n) time searching for the next hull point, though at the cost of making some “mistakes” (adding points to the hull that aren’t hull points). Instead, we spend our effort on updating the hull at each iteration, fixing previous mistakes. It turns out that fixing the mistakes is pretty easy and we can find the entire hull in only O(n) comparisons (rather then the O(nh) of the Jarvis March). The time complexity of the algorithm actually comes from having to first sort the points in O(n log n) time. Now, much as we did with the Jarvis March, we will require a function that tells us the “turn” of 3 points (that is, whether p,q,r form a LEFT, RIGHT or straight (NONE) turn):
TURN_LEFT, TURN_RIGHT, TURN_NONE = (1, -1, 0)
def turn(p, q, r):
return cmp((q[0] - p[0])*(r[1] - p[1]) - (r[0] - p[0])*(q[1] - p[1]), 0)
The original incantation of the Graham Scan sorted the points in angular order relative to some initial extreme point (eg. sorted in CCW order around the furthest left point). The convex hull is then constructed iteratively by going through the sorted list of points, one by one, each time adding the point to the previous hull. However, adding a point to the previous hull in the Graham Scan is not as simple as it was in the Jarvis March. In the Jarvis March, we knew our previous hull points were a subset of our final hull points, so to “update” the previous hull, we simply add the point found at each iteration to the end of our list of hull points. However, in the Graham Scan, updating our list of hull points at each iteration is not as simple.
The hull we have at the start of iteration i is actually the complete convex hull of the first i-1 points in our sorted list. Updating the hull requires us to find the tangents between the previous hull and point pi, and replacing all hull edges between these 2 tangents with the tangents themselves. This may seem hard at first, but because the points were sorted in CCW order around the first point, p1, it is actually quite simple.
We know the point pi lies to the left of p1pi-1 and all other points pk, 1 < k < i-1, lie to the right of p1pi-1, so one of the tangents will have to be the edge pip1. Knowing this, we can find the other point of tangency by starting at pi-1 (the last point in the list of hull points) and working backwards. The point of tangency is the first point pj, j<i, where pj-1, pj, pi form a left (CCW) turn. After finding the point pj, we simply remove all points between this and the other point of tangency, p1 (ie. we simply remove all points after pj in our hull list). The point pi is then simply appended to the end of our list of hull points.
In Python, we can implement a function to perform this step (adding a point r to the hull) fairly simply: we just keep popping the last item off our list while hull[-2], hull[-1], r don’t form a left turn.
def _keep_left(hull, r):
while len(hull) > 1 and turn(hull[-2], hull[-1], r) != TURN_LEFT:
hull.pop()
# We check that hull[-1] != r to handle degenerate cases (ie. multisets)
if not hull or hull[-1] != r:
hull.append(r)
return hull
What’s nice is that we only backtrack through the list, starting at the last point, by as many points as we will remove, +1 more point pj (which is where we stop). After we remove these points, they will never be looked at again, since they are no longer in the hull. This means, if we account for the +1 point we don’t remove, we will never look at more than 2n hull points during the entire run of the Graham Scan to find the tangents. Since we only go through the sorted list once, we require only O(n) time to find the convex hull, but O(n log n) time to sort the list first.
Putting it together, in Python, we could simply sort the list by angular order, using the turn function as the cmp function, then iterate through the sorted list of points add the points one at a time (with _keep_left). However, a nice (implementation) thing about the Graham Scan is that we can get away with sorting by lexicographical order, instead of angular order. Pretend we added a point at (0, ∞) to the point set. Then sorting by angular order about this point is equivalent to simply sorting by the x-coordinate. The convex hull of this modified point set is actually just the lower hull of our original point set. We can use this idea to split the algorithm into 3 parts. First we sort by x-coordinates (actually, lexicographical order to handle degenerecies) and find the lower hull. Then we reverse this sorted list and find the upper hull. Finally, we merge the upper and lower hull together. Doing this in Python, using the _keep_left function, is actually really straightforward. We simply have to keep in mind that the first point of the lower hull is also the last point of the upper hull and vice versa
def convex_hull(points):
"""Returns points on convex hull of an array of points in CCW order."""
points = sorted(points)
l = reduce(_keep_left, points, [])
u = reduce(_keep_left, reversed(points), [])
return l.extend(u[i] for i in xrange(1, len(u) - 1)) or l
I posted the full source for the Graham Scan in Python (all 20 lines) as a Gist.