Quantcast
Channel: Hacker News
Viewing all articles
Browse latest Browse all 25817

Why recursive data structures?

$
0
0

In this essay, we are going to look at recursive algorithms, and how sometimes, we can organize an algorithm so that it resembles the data structure it manipulates, and organize a data structure so that it resembles the algorithms that manipulate it.

When algorithms and the data structures they manipulate are isomorphic, the code we write is easier to understand for exactly the same reason that code like template strings and regular expressions are easy to understand: The code resembles the data it consumes or produces.

We’ll finish up by observing that we also can employ optimizations that are only possible when algorithms and the data structures they manipulate are isomorphic.

Here we go.


GEB recursive

GEB Recursive, © 2006 Alexandre Duret-Lutz, some rights reserved


an exercise: rotating a square

Here is a square composed of elements, perhaps pixels or cells that are on or off. We could write them out like this:

⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚫️⚪️⚪️⚪️
⚪️⚪️⚫️⚫️⚫️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️

Consider the problem of rotating our square. There is an uncommon, but particularly delightful way to do this. First, we cut the square into four smaller squares:

⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚫️⚪️⚪️⚪️

⚪️⚪️⚫️⚫️ ⚫️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️

Now, we rotate each of the four smaller squares 90 degrees clockwise:

⚪️⚪️⚪️⚪️ ⚫️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚫️⚪️⚪️ ⚪️⚪️⚪️⚪️

⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚫️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️ ⚪️⚪️⚪️⚪️

Finally, we move the squares as a whole, 90 degrees clockwise:

⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️ ⚪️⚫️⚪️⚪️

⚪️⚪️⚪️⚫️ ⚫️⚪️⚪️⚪️ 
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️

Then reassemble:

⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️⚪️⚫️⚪️⚪️
⚪️⚪️⚪️⚫️⚫️⚪️⚪️⚪️ 
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️

How do we rotate each of the four smaller squares? Exactly the same way. For example,

⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️
⚪️⚪️⚪️⚫️

Becomes:

⚪️⚪️ ⚪️⚪️
⚪️⚪️ ⚪️⚪️

⚪️⚪️ ⚪️⚫️
⚪️⚪️ ⚪️⚫️

By rotating each smaller square, it becomes:

⚪️⚪️ ⚪️⚪️
⚪️⚪️ ⚪️⚪️

⚪️⚪️ ⚪️⚪️
⚪️⚪️ ⚫️⚫️

And we rotate all for squares to finish with:

⚪️⚪️ ⚪️⚪️
⚪️⚪️ ⚪️⚪️

⚪️⚪️ ⚪️⚪️
⚫️⚫️ ⚪️⚪️

Reassembled, it becomes this:

⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️
⚫️⚫️⚪️⚪️

How would we rotate the next size down?

⚪️⚪️
⚫️⚫️

Becomes:

⚪️ ⚪️

⚫️ ⚫️

Rotating an individual dot is a NOOP, so all we have to do is rotate the four dots around, just like we do above:

⚫️ ⚪️

⚫️ ⚪️

Reassembled, it becomes this:

⚫️⚪️
⚫️⚪️

Voila! Rotating a square consists of dividing it into four “region” squares, rotating each one clockwise, then moving the regions one position clockwise. It brings whirling dervishes to mind.


recursion, see recursion

In From Higher-Order Functions to Libraries And Frameworks, we had a look at multirec, a recursive combinator.

functionmapWith(fn){returnfunction*(iterable){for(constelementofiterable){yieldfn(element);}};}functionmultirec({indivisible,value,divide,combine}){returnfunctionmyself(input){if(indivisible(input)){returnvalue(input);}else{constparts=divide(input);constsolutions=mapWith(myself)(parts);returncombine(solutions);}}}

With multirec, we can write functions that perform computation using divide-and-conquer algorithms. multirec handles the structure of divide-and-conquer, we just have to write four smaller functions that implement the parts specific to the problem we are solving.

In computer science, divide and conquer (D&C) is an algorithm design paradigm based on multi-branched recursion. A divide and conquer algorithm works by recursively breaking down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem.—Wikipedia

We’ll implement rotating a square using multirec. Let’s begin with a naïve representation for squares, a two-dimensional array. For example, we would represent the square:

⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️
⚪️⚪️⚪️⚫️

With this array:

[['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚪️','⚪️','⚫️'],['⚪️','⚪️','⚪️','⚫️']]

To use multirec, we need four pieces:

  1. An indivisible predicate function. It should report whether an array is too small to be divided up. It’s simplicity itself: (square) => square.length === 1.
  2. A value function that determines what to do with a value that is indivisible. For rotation, we simply return what we are given: (something) => something
  3. A divide function that breaks a divisible problem into smaller pieces. Our function will break a square into four regions. We’ll see how that works below.
  4. A combine function that puts the result of rotating the smaller pieces back together. Our function will take four region squares and put them back together into a big square.

As noted, indivisible and value are trivial. We’ll call our functions hasLengthOne, and, itself:

consthasLengthOne=(square)=>square.length===1;constitself=(something)=>something;

divide involves no more than breaking arrays into halves, and then those halves again. We’ll write a divideSquareIntoRegions function for this:

constfirstHalf=(array)=>array.slice(0,array.length/2);constsecondHalf=(array)=>array.slice(array.length/2);constdivideSquareIntoRegions=(square)=>{constupperHalf=firstHalf(square);constlowerHalf=secondHalf(square);constupperLeft=upperHalf.map(firstHalf);constupperRight=upperHalf.map(secondHalf);constlowerRight=lowerHalf.map(secondHalf);constlowerLeft=lowerHalf.map(firstHalf);return[upperLeft,upperRight,lowerRight,lowerLeft];};

Our combine function, rotateAndCombineArrays, makes use of a little help from some functions we saw in an essay about generators:

functionsplit(iterable){constiterator=iterable[Symbol.iterator]();const{done,value:first}=iterator.next();if(done){return{rest:[]};}else{return{first,rest:iterator};}};function*join(first,rest){yieldfirst;yield*rest;};function*zipWith(fn,...iterables){constasSplits=iterables.map(split);if(asSplits.every((asSplit)=>asSplit.hasOwnProperty('first'))){constfirsts=asSplits.map((asSplit)=>asSplit.first);constrests=asSplits.map((asSplit)=>asSplit.rest);yield*join(fn(...firsts),zipWith(fn,...rests));}}constconcat=(...arrays)=>arrays.reduce((acc,a)=>acc.concat(a));constrotateAndCombineArrays=([upperLeft,upperRight,lowerRight,lowerLeft])=>{// rotate[upperLeft,upperRight,lowerRight,lowerLeft]=[lowerLeft,upperLeft,upperRight,lowerRight];// recombineconstupperHalf=[...zipWith(concat,upperLeft,upperRight)];constlowerHalf=[...zipWith(concat,lowerLeft,lowerRight)];returnconcat(upperHalf,lowerHalf);};

Armed with hasLengthOne, itself, divideSquareIntoRegions, and rotateAndCombineArrays, we can use multirec to write rotate:

constrotate=multirec({indivisible:hasLengthOne,value:itself,divide:divideSquareIntoRegions,combine:rotateAndCombineArrays});rotate([['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚪️','⚪️','⚫️'],['⚪️','⚪️','⚪️','⚫️']])//=>([['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚪️','⚪️','⚪️'],['⚫️','⚫️','⚪️','⚪️']])

Voila!


accidental complexity

Rotating a square in this recursive manner is very elegant, but our code is encumbered with some accidental complexity. Here’s a flashing strobe-and-neon hint of what it is:

constfirstHalf=(array)=>array.slice(0,array.length/2);constsecondHalf=(array)=>array.slice(array.length/2);constdivideSquareIntoRegions=(square)=>{constupperHalf=firstHalf(square);constlowerHalf=secondHalf(square);constupperLeft=upperHalf.map(firstHalf);constupperRight=upperHalf.map(secondHalf);constlowerRight=lowerHalf.map(secondHalf);constlowerLeft=lowerHalf.map(firstHalf);return[upperLeft,upperRight,lowerRight,lowerLeft];};

divideSquareIntoRegions is all about extracting region squares from a bigger square, and while we’ve done our best to make it readable, it is rather busy. Likewise, here’s the same thing in rotateAndCombineArrays:

constrotateAndCombineArrays=([upperLeft,upperRight,lowerRight,lowerLeft])=>{// rotate[upperLeft,upperRight,lowerRight,lowerLeft]=[lowerLeft,upperLeft,upperRight,lowerRight];// recombineconstupperHalf=[...zipWith(concat,upperLeft,upperRight)];constlowerHalf=[...zipWith(concat,lowerLeft,lowerRight)];returnconcat(upperHalf,lowerHalf);};

rotateAndCombineArrays is a very busy function. The core thing we want to talk about is actually the rotation: Having divided things up into four regions, we want to rotate the regions. The zipping and concatenating is all about the implementation of regions as arrays.

We can argue that this is necessary complexity, because squares are arrays, and that’s just what we programmers do for a living, write code that manipulates basic data structures to do our bidding.

But what if our implementation wasn’t an array of arrays? Maybe divide and combine could be simpler? Maybe that complexity would turn out to be unnecessary after all?


Recursive Chessboard

Recursive Chessboard, © 2007 fdecomite, some rights reserved


isomorphic data structures

When we have what ought to be an elegant algorithm, but the interface between the algorithm and the data structure ends up being as complicated as the rest of the algorithm put together, we can always ask ourselves, “What data structure would make this algorithm stupidly simple?”

The answer can often be found by imagining a data structure that looks like the algorithm’s basic form. If we follow that heuristic, our data structure would be recursive, rather than ‘flat.’ Since we do all kinds of work sorting out which squares form the four regions of a bigger square, our data structure would describe a square as being composed of four region squares.

Such a data structure already exists, it’s called a quadtree. Squares are represented as four regions, each of which is a smaller square or a cell. A simple implementation is a “Plain Old JavaScript Object” (or “POJO”) with properties for each of the regions. If the property contains a string, it’s cell. If it contains another POJO, it’s a quadtree.

A square that looks like this:

⚪️⚫️⚪️⚪️
⚪️⚪️⚫️⚪️
⚫️⚫️⚫️⚪️
⚪️⚪️⚪️⚪️

Is composed of four regions, the ul (“upper left”), ur (“upper right”), lr (“lower right”), and ll (“lower left”), something like this:

Thus, for example, the ul is:

⚪️⚫️
⚪️⚪️

And the ur is:

⚪️⚪️
⚫️⚪️

And so forth. Each of those regions is itself composed of four regions. Thus, the ul of the ul is ⚪️, and the ur of the ul is ⚫️.

The quadtree could be expressed in JavaScript like this:

constquadTree={ul:{ul:'⚪️',ur:'⚫️',lr:'⚪️',ll:'⚪️'},ur:{ul:'⚪️',ur:'⚪️',lr:'⚪️',ll:'⚫️'},lr:{ul:'⚫️',ur:'⚪️',lr:'⚪️',ll:'⚪️'},ll:{ul:'⚫️',ur:'⚫️',lr:'⚪️',ll:'⚪️'}};

Now to our algorithm. Rotating a quadtree is simpler than rotating an array of arrays. First, our test for indivisibility is now whether something is a string or not:

constisString=(something)=>typeofsomething==='string';

The value of an indivisible cell remain the same, itself.

Our divide function is simple: quadtrees are already divided in the manner we require, we just have to turn them into an array of regions:

constquadTreeToRegions=(qt)=>[qt.ul,qt.ur,qt.lr,qt.ll];

And finally, our combine function reassembles the rotated regions into a POJO, rotating them in the process:

constregionsToRotatedQuadTree=([ur,lr,ll,ul])=>({ul,ur,lr,ll});

And here’s our function for rotating a quadtree:

constrotateQuadTree=multirec({indivisible:isString,value:itself,divide:quadTreeToRegions,combine:regionsToRotatedQuadTree});

Let’s put it to the test:

rotateQuadTree(quadTree)//=>({ul:{ll:"⚪️",lr:"⚫️",ul:"⚪️",ur:"⚫️"},ur:{ll:"⚪️",lr:"⚫️",ul:"⚪️",ur:"⚪️"},lr:{ll:"⚪️",lr:"⚪️",ul:"⚫️",ur:"⚪️"},ll:{ll:"⚪️",lr:"⚪️",ul:"⚪️",ur:"⚫️"}})

If we reassemble the square by hand, it’s what we expect:

⚪️⚫️⚪️⚪️
⚪️⚫️⚪️⚫️
⚪️⚫️⚫️⚪️
⚪️⚪️⚪️⚪️

separation of concerns

Of course, all we’ve done so far is moved the “faffing about” out of our code and we’re doing it by hand. That’s bad: we don’t want to retrain our eyes to read quadtrees instead of flat arrays, and we don’t want to sit at a computer all day manually translating quadtrees to flat arrays and back.

If only we could write some code to do it for us… Some recursive code…

Here’s a function that recursively turns a two-dimensional array into a quadtree:

constisOneByOneArray=(something)=>Array.isArray(something)&&something.length===1&&Array.isArray(something[0])&&something[0].length===1;constcontentsOfOneByOneArray=(array)=>array[0][0];constregionsToQuadTree=([ul,ur,lr,ll])=>({ul,ur,lr,ll});constarrayToQuadTree=multirec({indivisible:isOneByOneArray,value:contentsOfOneByOneArray,divide:divideSquareIntoRegions,combine:regionsToQuadTree});arrayToQuadTree([['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚫️','⚪️','⚪️'],['⚫️','⚪️','⚪️','⚪️'],['⚫️','⚫️','⚫️','⚪️']])//=>({ul:{ul:"⚪️",ur:"⚪️",lr:"⚫️",ll:"⚪️"},ur:{ul:"⚪️",ur:"⚪️",lr:"⚪️",ll:"⚪️"},lr:{ul:"⚪️",ur:"⚪️",lr:"⚪️",ll:"⚫️"},ll:{ul:"⚫️",ur:"⚪️",lr:"⚫️",ll:"⚫️"}})

Naturally, we can also write a function to convert quadtrees back into two-dimensional arrays again:

constisSmallestActualSquare=(square)=>isString(square.ul);constasTwoDimensionalArray=({ul,ur,lr,ll})=>[[ul,ur],[ll,lr]];constregions=({ul,ur,lr,ll})=>[ul,ur,lr,ll];constcombineFlatArrays=([upperLeft,upperRight,lowerRight,lowerLeft])=>{constupperHalf=[...zipWith(concat,upperLeft,upperRight)];constlowerHalf=[...zipWith(concat,lowerLeft,lowerRight)];returnconcat(upperHalf,lowerHalf);}constquadTreeToArray=multirec({indivisible:isSmallestActualSquare,value:asTwoDimensionalArray,divide:regions,combine:combineFlatArrays});quadTreeToArray(arrayToQuadTree([['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚫️','⚪️','⚪️'],['⚫️','⚪️','⚪️','⚪️'],['⚫️','⚫️','⚫️','⚪️']]))//=>([["⚪️","⚪️","⚪️","⚪️"],["⚪️","⚫️","⚪️","⚪️"],["⚫️","⚪️","⚪️","⚪️"],["⚫️","⚫️","⚫️","⚪️"]])

And thus, we can take a two-dimensional array, turn it into a quadtree, rotate the quadtree, and convert it back to a two-dimensional array again:

quadTreeToArray(rotateQuadTree(arrayToQuadTree([['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚫️','⚪️','⚪️'],['⚫️','⚪️','⚪️','⚪️'],['⚫️','⚫️','⚫️','⚪️']])))//=>([["⚫️","⚫️","⚪️","⚪️"],["⚫️","⚪️","⚫️","⚪️"],["⚫️","⚪️","⚪️","⚪️"],["⚪️","⚪️","⚪️","⚪️"]])

but why?

Now, we argued above that we’ve neatly separated the concerns by making three separate functions, instead of interleaving dividing two-dimensional squares into regions, rotating regions, and then reassembling two-dimensional squares.

But the converse side of this is that what we’re doing is now a lot less efficient: We’re recursing through our data structures three separate times, instead of once. And frankly, multirec was designed such that the divide function breaks things up, and the combine function puts them back together, so these concerns are already mostly separate once we use multirec instead of a bespoke recursive function.

One reason to break the logic up into three separate functions would be if we want to do lots of different kinds of things with quadtrees. Besides rotating quadtrees, what else might we do?

Well, we might want to superimpose one image on top of another. This could be part of an image editing application, where we have layers of images and want to superimpose all the layers to derive the finished image for the screen. Or we might be implementing Conway’s Game of Life, and might want to ‘paste’ a pattern like a glider onto a larger universe.

Let’s go with a very simple implementation: We’re only editing black-and-white images, and each ‘pixel’ is either a ⚪️ or ⚫️. If we use two-dimensional arrays to represent our images, we need to iterate over every ‘pixel’ to perform the superimposition:

constsuperimposeCell=(left,right)=>(left==='⚫️'||right==='⚫️')?'⚫️':'⚪️';constsuperimposeRow=(left,right)=>[...zipWith(superimposeCell,left,right)];constsuperimposeArray=(left,right)=>[...zipWith(superimposeRow,left,right)];constcanvas=[['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚪️','⚪️','⚫️'],['⚪️','⚪️','⚪️','⚫️']];constglider=[['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚫️','⚪️','⚪️'],['⚫️','⚪️','⚪️','⚪️'],['⚫️','⚫️','⚫️','⚪️']];superimposeArray(canvas,glider)//=>([['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚫️','⚪️','⚪️'],['⚫️','⚪️','⚪️','⚫️'],['⚫️','⚫️','⚫️','⚫️']])

Seems simple enough. How about superimposing a quadtree on a quadtree?


two trees

Two trees, © 2013 Jon Bunting, some rights reserved


recursive operations on pairs of quadtrees

We can use multirec to superimpose one quadtree on top of another: Our function will take a pair of quadtrees, using destructuring to extract one called left and the other called right:

constsuperimposeQuadTrees=multirec({indivisible:({left,right})=>isString(left),value:({left,right})=>right==='⚫️'?right:left,divide:({left,right})=>[{left:left.ul,right:right.ul},{left:left.ur,right:right.ur},{left:left.lr,right:right.lr},{left:left.ll,right:right.ll}],combine:([ul,ur,lr,ll])=>({ul,ur,lr,ll})});quadTreeToArray(superimposeQuadTrees({left:arrayToQuadTree(canvas),right:arrayToQuadTree(glider)}))//=>([['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚫️','⚪️','⚪️'],['⚫️','⚪️','⚪️','⚫️'],['⚫️','⚫️','⚫️','⚫️']])

Again, this feels like faffing about just so we can be recursive. But we are in position to do something interesting!


optimizing recursive algorithms with isomorphic data structures

Many images have large regions that are entirely white or black. When superimposing one region on another, if either region is entirely white, we know the result must be the same as the other region. When superimposing one region on another, if either region is entirely black, the result must be entirely black.

We can use the quadtree’s hierarchal representation to exploit this. We’ll store some extra information in each quadtree, its colour: If it is entirely white, its colour will be white. If it is entirely black, its colour will be black. And if it contains a mix of white and black cells, its colour will be a question mark.

constisOneByOneArray=(something)=>Array.isArray(something)&&something.length===1&&Array.isArray(something[0])&&something[0].length===1;constcontentsOfOneByOneArray=(array)=>array[0][0];constdivideSquareIntoRegions=(square)=>{constupperHalf=firstHalf(square);constlowerHalf=secondHalf(square);constupperLeft=upperHalf.map(firstHalf);constupperRight=upperHalf.map(secondHalf);constlowerRight=lowerHalf.map(secondHalf);constlowerLeft=lowerHalf.map(firstHalf);return[upperLeft,upperRight,lowerRight,lowerLeft];};constcolour=(something)=>{if(something.colour!=null){returnsomething.colour;}elseif(something==='⚪️'){return'⚪️';}elseif(something==='⚫️'){return'⚫️';}else{throw"Can't get the colour of this thing";}};constcombinedColour=(...elements)=>elements.reduce((acc,element=>acc===element?element:'❓'))constregionsToQuadTree=([ul,ur,lr,ll])=>({ul,ur,lr,ll,colour:combinedColour(ul,ur,lr,ll)});constarrayToQuadTree=multirec({indivisible:isOneByOneArray,value:contentsOfOneByOneArray,divide:divideSquareIntoRegions,combine:regionsToQuadTree});arrayToQuadTree([['⚪️','⚪️'],['⚪️','⚪️']]).colour//=> "⚪️"arrayToQuadTree([['⚪️','⚪️'],['⚪️','⚫️']]).colour//=> "❓"arrayToQuadTree([['⚫️','⚫️'],['⚫️','⚫️']]).colour//=> "⚫️"arrayToQuadTree([['⚪️','⚪️','⚪️','⚪️'],['⚪️','⚫️','⚪️','⚪️'],['⚫️','⚪️','⚪️','⚪️'],['⚫️','⚫️','⚫️','⚪️']]).colour//=> "❓"

Now, we can take advantage of every region’s computed colour when we superimpose “coloured” quadtrees:

consteitherAreEntirelyColoured=({left,right})=>colour(left)!=='❓'||colour(right)!=='❓';constsuperimposeColoured=({left,right})=>{if(colour(left)==='⚪️'||colour(right)==='⚫️'){returnright;}elseif(colour(left)==='⚫️'||colour(right)==='⚪️'){returnleft;}else{throw"Can't superimpose these things";}};constdivideTwoQuadTrees=({left,right})=>[{left:left.ul,right:right.ul},{left:left.ur,right:right.ur},{left:left.lr,right:right.lr},{left:left.ll,right:right.ll}];constcombineColouredRegions=([ul,ur,lr,ll])=>({ul,ur,lr,ll,colour:combinedColour(ul,ur,lr,ll)});constsuperimposeColouredQuadTrees=multirec({indivisible:eitherAreEntirelyColoured,value:superimposeColoured,divide:divideTwoQuadTrees,combine:combineColouredRegions});

We get the same output, but now instead of comparing every cell whenever we superimpose quadtrees, we compare entire regions at a time. If either is “entirely coloured,” we can return the other one without recursively drilling down to the level of individual pixels.

There is no savings if both quadtrees are composed of a fairly evenly spread mix of black and white pixels (e.g. a checkerboard pattern), but in cases where there are large expanses of white or black, the difference is substantial.

In the case of comparing the 4x4 canvas and glider images above, the superimposeArray function requires sixteen comparisons. The superimposeQuadTrees function requires twenty comparisons. But the superimposeColouredQuadTrees function requires just seven comparisons.

If we were writing an image manipulation application, we’d provide much snappier behaviour using coloured quadtrees to represent images on screen.

The interesting thing about this optimization is that it is tuned to the characteristics of both the data structure and the algorithm: It is not something that is easy to perform in the algorithm without the data structure, or in the data structure without the algorithm.

And it’s not the only optimization. Remember our ‘whirling regions’ implementation of rotateQuadTree? Here’s rotateColouredQuadTree:

constisEntirelyColoured=(something)=>colour(something)!=='❓';constrotateColouredQuadTree=multirec({indivisible:isEntirelyColoured,value:itself,divide:quadTreeToRegions,combine:regionsToRotatedQuadTree});

Any region that is entirely white or entirely black is its own rotation, so no further dividing and conquering need be done. For images that have large areas of blank space, the “whirling regions” algorithm is not just aesthetically delightful, it’s faster than a brute-force transposition of array elements.

Optimizations like this can only be implemented when the algorithm and the data structure are isomorphic to each other.


Game of Life 6

Detail from “Game of Life 6,” © Windell Oskay, some rights reserved


why!

So back to, “Why convert data into a structure that is isomorphic to our algorithm.”

The first reason to do so, is that the code is clearer and easier to read if we convert, then perform operations on the data structure, and then convert it back (if need be).

The second reason do do so, is that if we want to do lots of different operations on the data structure, it is much more efficient to keep it in the form that is isomorphic to the operations we are going to perform on it.

The example we saw was that if we were building a hypothetical image processing application, we could convert an image into quad trees, then rotate or superimpose images at will. We would only need to convert our quadtrees when we need to save or display the image in a rasterized (i.e. array-like) format.

And third, we saw that once we embraced a data structure that was isomorphic to the form of the algorithm, we could employ elegant optimizations that are impossible (or ridiculously inconvenient) when the algorithm and data structure do not match.

Separating conversion from operation allows us to benefit from all three reasons for ensuring that our algorithms and data structures are isomorphic to each other.


afterward

Have an observation? Spot an error? You can open an issue, discuss this on hacker news or reddit, or even edit this post yourself.

p.s. Thank you for reading this far. Here is your reward, An Algorithm for Compressing Space and Time. And hey! If you like this kind of thing, JavaScript Allongé is exactly the kind of thing you’ll like.


notes


Viewing all articles
Browse latest Browse all 25817

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>