The Priority Queue is a fundamental data structure, but one that is relatively new to the game.
Is it a data structure? Or is it just a way of looking at queues? and arrays?
In this entry, we’re using Priority Queues as a way to hold only the small number of max or min values from a streaming input.
It has a number of applications, including Huffman data compression and Dijkstra’s algorithm. So if you want to make a billion dollars by developing a new data compression method, you will want to understand the Priority Queue.
Imagine that you have a franchise of movie theaters with hundreds of transactions per day at each location and thousands of locations. And that you are interested in which theaters are doing the best business and which movies are doing the best. Sounds like a common use case, right?
Extend the example a bit: You are receiving the data in real time from each theater to your phone. On a given summer weekend evening you’re receiving a million transactions on every each quarter hour — shortly after the movies in your multiplexes start. And you are receiving all this data on your phone. It is being streamed to you as space-delimited data with each line containing the movie theater, the movie, the date and the dollar value of the gross receipts for that movie, and a carriage return at the end of each line.
Like this:
ohiocineplex14 Rocky2 4/12/2022 9:15 $500
indianacineplex23 Rocky3 4/12/2022 9:15 $900
Not only do you not want to store all the data, you can’t. If you store the data every night, you find that the memory allocation in your app gets filled and your app crashes. In any case, you don’t really want to see all the data. All you really want is the data from the movies that are bringing in the most money and you want to know the theaters that are doing the best.
So lets create a Priority Queue that keeps the income from the top 5 movies or the top 5 theater locations. Let us call the dictionary that holds the income from a specific movie at a specific location <MovieReceipts>.
With our Priority Queue, we want to find the largest k items in a stream of n items, where those items are of type <MovieReceipts>.
To do this, the general concept is to create a Priority Queue where we can delete the minimum. Once we have this, we can read the data stream, and add incoming items to the Priority Queue. If the number of items in the Priority Queue exceeds k, then we delete the minimum item.
Consider the elementary implementations first. In this implementation, we’ll create a Priority Queue as an array. And when the number of entries in the Priority Queue exceeds the max, we will iterate through the array to remove the minimum. Most of the code is parsing the lines.
First, here’s a list of results. Copy these to a text file called movieReceipts.txt and move it to the home folder.
ohiocineplex14 Rocky 4/12/2022 $1400
ohiocineplex14 Rocky2 4/12/2022 $1500
ohiocineplex14 Rocky3 4/12/2022 $200
ohiocineplex14 Rocky4 4/12/2022 $16000
ohiocineplex14 Rocky 4/12/2022 $1300
ohiocineplex14 Rocky2 4/12/2022 $14500
ohiocineplex14 Rocky3 4/12/2022 $2050
ohiocineplex14 Rocky4 4/12/2022 $160300
ohiocineplex14 Rocky 4/12/2022 $140
ohiocineplex14 Rocky2 4/12/2022 $500
ohiocineplex14 Rocky3 4/12/2022 $900
ohiocineplex14 Rocky4 4/12/2022 $1000
ohiocineplex14 Rocky 4/12/2022 $144500
ohiocineplex14 Rocky2 4/12/2022 $150
ohiocineplex14 Rocky3 4/12/2022 $20
ohiocineplex14 Rocky4 4/12/2022 $1600
Then create a Mac console app like we’ve done before. The first func we’ll add — ‘runPriorityQueue()’ — will read in the file and parse the lines one by one. It’s not a real streaming input, but since we’re parsing the lines one at a time, it will behave like one:
public func runPriorityQueue() {
// Input the path
print(“Please enter the name of a file in your home directory:”)
let filePath = readLine() ?? “”
let home = FileManager.default.homeDirectoryForCurrentUser
let maybeFile = home.appendingPathComponent(filePath)
let fileManager = FileManager.default
do {
let contents = try fileManager.contentsOfDirectory(atPath: home.path)
let urls = contents.map { home.appendingPathComponent($0) }
if urls.contains(maybeFile) {
print(“found the file”)
}
} catch {
print(“couldn’t find file”)
}
var inString = “”
do {
inString = try String(contentsOf: maybeFile)
let s = String(inString)
// read in the lines one by one:
var lowerIndexInt = 0
var upperIndexInt = s.count
var lowerIndex = s.startIndex
var upperIndex = s.endIndex
let end = String.Index(utf16Offset: upperIndexInt, in: s)
var done: Bool = false
while !done {
// Create a substring to the first n
let start = String.Index(utf16Offset: lowerIndexInt, in: s)
let substring = String(s[start..<end])
lowerIndex = substring.startIndex
upperIndex = substring.firstIndex(of: “n”) ?? substring.endIndex
var line: String
if upperIndex == substring.endIndex {
line = substring
done = true
} else {
line = String(substring[lowerIndex…upperIndex])
lowerIndexInt = lowerIndexInt + (upperIndex.utf16Offset(in: substring) + 1)
}
// Add that substring to the Priority Queue
addToPriorityQueue(movieReceipt: line)
}
} catch {
assertionFailure(“Failed reading from URL: (maybeFile), Error: ” + error.localizedDescription)
}
print(“final PQ = (unorderedPriorityQueue)”)
}
This function will parse the lines and then add them one at a time to the Priority Queue — unorderedPriorityQueue
var unorderedPriorityQueue = [String]()
let priorityQueueMaxSize = 6
// Add a new value to the Priority Queue and delete the minimum if the size of the Priority Queue exceeds the max:
public func addToPriorityQueue(movieReceipt: String) {
unorderedPriorityQueue.append(movieReceipt)
if unorderedPriorityQueue.count > priorityQueueMaxSize {
deleteMin()
}
}
After we exceed the priorityQueueMaxSize, we’ll call the code that deletes the minimum value:
// Find the minimum value in the Priority Queue and delete it:
public func deleteMin() {
var minIndex: Int = 0
for (index, _) in unorderedPriorityQueue.enumerated() {
if compare(s1: unorderedPriorityQueue[index], s2: unorderedPriorityQueue[minIndex]) {
minIndex = index
}
}
unorderedPriorityQueue = exchangeWithEnd(a1: unorderedPriorityQueue, s1: minIndex)
unorderedPriorityQueue.remove(at: unorderedPriorityQueue.count – 1)
}
public func exchangeWithEnd(a1: [String], s1: Int) -> [String] {
var a = a1
let temp = a[s1]
a[s1] = a[a.count – 1]
a[a.count – 1] = temp
return a
}
// Compare two income values:
public func compare(s1: String, s2: String) -> Bool {
return getValue(s: s1) < getValue(s: s2)
}
// Parse the income Float value:
public func getValue(s: String) -> Float {
var lowerIndex = s.firstIndex(of: “$”) ?? s.startIndex
lowerIndex = s.index(after: lowerIndex)
let upperIndex = s.firstIndex(of: “n”) ?? s.endIndex
var str: String
if upperIndex == s.endIndex {
str = String(s[lowerIndex..<upperIndex])
} else {
str = String(s[lowerIndex..<upperIndex])
}
let val = Float(str) ?? 0.0
return val
}
And that’s it. You can run this by calling runPriorityQueue() from the main.swift file, and then enter the file name ‘movieReceipts.txt’ in the console below the prompt.
Please enter the name of a file in your home directory:
We’ve not discussed any applications yet. But you can already start to imagine how you might process the adjacent pixel values from an image while running a compression algorithm — put the adjacent values in a priority queue, process them and move on. Next time we’ll get into more complex — and faster — variations.