Algorithm treatises focus on traditional data structures. The linked list with its nodes. The binary tree. The dictionary. Of course, the array and the string. But they all skip the most basic data structure of all. The bitmap.
I love Jon Bentley’s first chapter in Programming Pearls. The chapter itself is called “Cracking the Oyster,” and the first pearl you discover is amazing. I don’t want to give it all away right away. Definitely buy the book. It’s a must-read.
And it turns out that there are many applications of this data structure, which we’ll get into in the next post. But the fundamental mind-bending trick here is as follows:
Imagine that you need to represent an array of integers, and that the integers fall in the range 0 to, say, ten million. Now let us say you need to sort these integers, and do it quickly with very limited memory. As a programmer, the first thing you might ask is: 16-bit integers? No, that’s too small; 16 bits can only distinguish about 65,000 values. 32-bit integers? 64-bit integers? What’s the system’s natural size of an int? Probably 32-bit integers would work: I could represent the numbers from 0 to 10 million that way. And if I have, say, 1 million numbers, then I need 1 million × 32 bits (4 bytes each), or a total of 4 megabytes, to represent my data.
Now clear your mind and ask a different question:
If I had a ten-million-bit-long array. Of bits. And a bit was set (1) if its position was a value in my array of integers, and not set (0) if it was not in my array, then could I represent my data this way?
Yes!
And how much space would it take? Well, we already said: 10 million bits, or about 1.25 megabytes. Less than a third of the space.
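To make the trick concrete, here is a minimal sketch of the idea in Python. It assumes, as the setup above does, that the values are distinct integers in the range 0 to ten million (a plain bitmap records membership only, so duplicates would be lost). The names here are illustrative, not from the book:

```python
MAX_VALUE = 10_000_000  # values are assumed to lie in 0..MAX_VALUE-1

def bitmap_sort(values):
    """Sort distinct non-negative integers using one bit per possible value."""
    # Ten million bits is ~1.25 MB, regardless of how many values we hold.
    bitmap = bytearray(MAX_VALUE // 8 + 1)
    for v in values:
        bitmap[v // 8] |= 1 << (v % 8)  # set bit v: "v is present"
    # Scanning the bits in order yields the values in sorted order.
    out = []
    for v in range(MAX_VALUE):
        if bitmap[v // 8] & (1 << (v % 8)):
            out.append(v)
    return out
```

Notice there is no comparison anywhere: the sort falls out of the fact that bit positions are already ordered. Setting a bit is O(1), and the final scan is one linear pass over the range.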
Can you use the bitmap data structure to represent more complex data? More on that next time.