Day 1: Historian Hysteria – Advent of Code 2024 in Racket
Published on 04 Dec 2024 by Ashpool.
We are naitsu, and we welcome you to our new blog! We plan to cover a variety of tech-related topics on this website over time, so stay tuned via RSS or Telegram to learn more about who we are and what we do.
My name is Ashpool, and today I’m starting out with a new series of posts about solving the year 2024 Advent of Code puzzles in a programming language called Racket.
If you love programming for fun and don’t know what Advent of Code is, you should totally check it out! It’s a coding-themed advent calendar that takes place every December. Every day between the 1st and the 25th there’s a unique two-part programming challenge to solve, and the nice thing about it is that you can use any programming language or tool you like. Yes, really! The website only expects you to input the answer, and how you compute that answer is up to you. Heck, you can even do it by hand!
There’s also a competition aspect to it. If you are really, really fast you can try and compete for a place on the leaderboard, which means trying to be among the first 100 people to solve the problem each day. But that’s not what I’m planning to do this time, as I’m still feeling my way around Racket and the ecosystem. Instead, what I’m going to try to achieve in this series of posts is teaching Racket by example, memorising it better myself in the process, as well as brushing up on my algorithms.
Much of the code discussed in these articles is available in my GitHub repository. Note that I might have updated the code on GitHub since writing this article. All Racket code snippets in this article are released into the Public Domain.
Let’s get started with our first problem!
Table of Contents
Part One
Problem statement (in a nutshell)
We are given two whitespace-separated columns of integers. We need to produce a series of pairs of integers as follows.
Match the smallest number from the left column with the smallest number from the right column, taking them to produce a pair. Then, match the smallest remaining number from the left column with the smallest remaining number from the right column. Produce another pair. Continue the process until there is no more numbers in either column.
Having done that, we need to compute the sum of distances for the new series of pairs. A pair’s distance is simply how far apart the two integers are on the number line. Calculate that for each pair, add all the values together and you have the answer.
Solution
The solution for this part seems straightforward.
- Read the numbers in each column into a separate list.
- Sort each column (list) of integers in ascending order. By doing so, we will match each element on the left with its corresponding element on the right, from smallest to largest.
- Calculate the distance for each pair.
- Sum up all the distances.
Code
Simplified version
|
|
This straightforward implementation is the first iteration of my solution. It only uses native lists to pass series of values around, thus lacking laziness. Let’s go over it line-by-line.
|
|
This line just tells Racket that the program is written in plain Racket, and not some other module language. Because Racket is a “language for building languages”, programs built with Racket may be written in different languages created on top of it.
|
|
This little function takes one string argument: str
, and returns a list of numbers contained in that string (assuming they are whitespace-separated). It does so in the following order:
- Use
string-split
to obtain a list of whitespace-separated tokens from the whole string.- So, for example, a string
"123 456"
becomes a list'("123" "456")
.
- So, for example, a string
- Use
map
to apply a built-in functionstring->number
to each element of the list returned fromstring-split
, and save all results in a new list, returning it.- Through this operation,
'("123" "456")
becomes'(123 456)
. Each string in the list gets converted to a number. Note that this does not change the original list, instead building a new list.
- Through this operation,
|
|
This is the beginning of the main
function. The let*
syntax allows us to perform step-by-step computation using local binding. It’s a different syntax from plain let
, and unlike let
, it allows every step of the computation to access the results of the previous steps.
|
|
First, we read all lines from the current input port (in-lines
), save them in a list (sequence->list
), and apply the string->numbers
function we implemented earlier to each element of the list using map
, which returns the results as a new list. We then bind the list of pairs of numbers to the name pairs
.
Later in the program we pass main
to the with-input-from-file
function.
|
|
This function leverages dynamic binding to set the parameter current-input-port
to a newly created input port pointing to the file named "input"
, and calls main
inside it with the parameter set. This parameter then gets picked up by the in-lines
function, which uses it as the source from where to read lines. with-input-from-file
also takes care of closing the input file after main
is done with it.
Remember, we can only access the variable pairs
here because we used let*
for our step-by-step program. If we used plain let
, we wouldn’t be able to do it.
|
|
In these lines we split the list of pairs into two lists, each of which represents a column. We do that using map
and the functions first
and second
, which take the first and the second element of each pair respectively. We then sort
each column using <
as a comparator function, which means sorting them in ascending order.
|
|
Here we define an ad-hoc function named pair->distance
which we later use to calculate the distance between the numbers in each pair. We use λ
(which is a built-in alias for lambda
) to define a function that takes two arguments: id-left
and id-right
. All that function does is find the absolute value of the second argument subtracted from the first argument.
|
|
A “hidden superpower” of map
is that it can work on several lists in parallel if the mapped function takes more than one argument. We use that in this instance, mapping the two-argument function pair->distance
onto two lists: ids-left
and ids-right
. The result is a single list of calculated distances.
We then invoke apply
to call the function +
with the contents of distances
as its arguments, which effectively adds all the numbers in it together. And by returning the result from let*
, we arrive at our answer.
“Lazy” version
Two things I like about Racket are its support for lazy evaluation and generic data structures. We will be relying on some third-party libraries for this, sure, but we don’t have to. The program just wouldn’t be as generic if we used the built-in streams.
Here’s the original version of the program I used to get my answer to the puzzle.
|
|
The underlying logic is exactly the same, so I won’t be going over it in much detail. I will, however, highlight the differences to the simplified version.
Notice how in the original program we had to construct intermediary lists in memory several times:
- after reading the lines from the file,
- after mapping
string->numbers
, - after extracting the columns,
- after calculating the distances.
By using a lazy data structure (a stream in this case), we effectively eliminate all intermediary lists except for one step (after extracting the columns). And even that we don’t have to do explicitly: sort
from the relation/order
library supports lazy data structures, but it will collect all the values in memory before sorting because sorting requires access to the entire series of values. So in this case we can’t build a program that is lazy all the way up to calculating the final answer, but we can eliminate intermediary lists where possible.
In order to achieve this, instead of using sequence->list
, we use ->stream
from relation/type
, which is a generic function that supports many types of sequences and converts them to a stream. Thus we delay converting the series of lines into a list, saving some memory. Note the prefix type:
before ->stream
: we chose that prefix when we required relation/type
above. The colon in this case is not a special syntax, it is merely a conventional character to end prefixes with.
Why can’t we work with the result of in-lines
immediately, without casting it to a stream? Well, we plan on using some functions that require the data structure to be a generic collection, that is, to satisfy functor?
. Oddly, native Racket sequences don’t implement that interface, while streams do. In general, when working with data series, we don’t want to think what type of collection it is: sequence, stream, list, vector, or anything else. We want to use generic concepts like map, not special functions like sequence-map
or stream-map
. So we use generic functions seq:map
and ord:sort
instead of native map
and sort
.
Another thing I did was eliminate an extra line of code where I declared a lambda named pair->distance
. Instead I used compose
to compose functions abs
and -
in-line for some more compact code.
Finally, I replaced the application of +
using apply
with the use of foldl
function from data/collection
(which is re-exported by seq
). There is a generic apply
in data/collection
, but it would need to construct an intermediate collection. Meanwhile, foldl
operates on one element at a time and has a constant memory footprint.
Part Two
Problem statement (in a nutshell)
Again, we are given a series or pairs of integers that actually represents two series of integers displayed side-by-side. After splitting it into the two columns it represents, for each item in left column we need to find how many times (if any) it occurs in the right column. Then, we need to multiply the item itself by the number of times it accurs in the right column. By doing that for every item in the left column and adding all the values together, we arrive at our answer.
Solution
- Read the numbers in each column into a separate series.
- For the right column, build a hash table mapping each item that occurs at least once to the count of occurences of that item.
- For each item in the left column, use the hash table to look up the number of times it occurs in the right list. Find that item’s score by multiplying the item by the count of occurences. If it does not occur in the right column, the score is zero.
- Sum up all the scores over all elements of the left column.
Code
|
|
The beginning of the main
function is exactly the same as in the first part: we read the lines into a stream an map string->numbers
onto it. Then we split the columns into separate series, except this time there’s no need to sort the series.
The interesting bit here is the hash table that we build in the function item-counts
.
|
|
Here we used a mutable hash table for efficiency, starting out with an empty hash table returned from make-hash
. We then go over all items in an input sequence seq
, and for each item we update the hash table as follows.
- Find out if the key (item) is already present in the hash. If not, use
0
. - Add one to the value corresponding to the key (or zero).
- Store the new value in the hash table under the same key.
The nice thing about hash-update!
is that Racket doesn’t have to look up the value twice to update it, unlike some other languages.
Further in the main
function we define a function id-score
to multiple the ID by the its number of occurences in the right column, multiplying by 0
if the value was not found in the right column. We then map this function onto ids-left
, and finally add all the scores together using foldl
as before.
Conclusion
The problems themselves are what you’d expect from the first day of Advent of Code – not too difficult. So with this post I took some time to introduce the reader to some features of the Racket language and the ecosystem that we will be using in the future solutions.
I hope you learned something. Again, please subscribe to our RSS and/or Telegram to be the first to learn about our progress with this project and others. I’ll see you in the next posts!
Thanks to ftvkyo for reviewing my solution and helping me stay motivated. Check out their website, there’s some pretty cool stuff on there.
Ashpool, naitsu team.
Last updated: 06 Dec 2024.