Day 1: Historian Hysteria – Advent of Code 2024 in Racket

Published on 04 Dec 2024 by Ashpool.

We are naitsu, and we welcome you to our new blog! We plan to cover a variety of tech-related topics on this website over time, so stay tuned via RSS or Telegram to learn more about who we are and what we do.

My name is Ashpool, and today I’m starting out with a new series of posts about solving the year 2024 Advent of Code puzzles in a programming language called Racket.

If you love programming for fun and don’t know what Advent of Code is, you should totally check it out! It’s a coding-themed advent calendar that takes place every December. Every day between the 1st and the 25th there’s a unique two-part programming challenge to solve, and the nice thing about it is that you can use any programming language or tool you like. Yes, really! The website only expects you to input the answer, and how you compute that answer is up to you. Heck, you can even do it by hand!

There’s also a competition aspect to it. If you are really, really fast you can try and compete for a place on the leaderboard, which means trying to be among the first 100 people to solve the problem each day. But that’s not what I’m planning to do this time, as I’m still feeling my way around Racket and the ecosystem. Instead, what I’m going to try to achieve in this series of posts is teaching Racket by example, memorising it better myself in the process, as well as brushing up on my algorithms.

Much of the code discussed in these articles is available in my GitHub repository. Note that I might have updated the code on GitHub since writing this article. All Racket code snippets in this article are released into the Public Domain.

Let’s get started with our first problem!

Table of Contents

Part One

Problem statement (in a nutshell)

We are given two whitespace-separated columns of integers. We need to produce a series of pairs of integers as follows.

Match the smallest number from the left column with the smallest number from the right column, taking them to produce a pair. Then, match the smallest remaining number from the left column with the smallest remaining number from the right column. Produce another pair. Continue the process until there is no more numbers in either column.

Having done that, we need to compute the sum of distances for the new series of pairs. A pair’s distance is simply how far apart the two integers are on the number line. Calculate that for each pair, add all the values together and you have the answer.

Solution

The solution for this part seems straightforward.

  1. Read the numbers in each column into a separate list.
  2. Sort each column (list) of integers in ascending order. By doing so, we will match each element on the left with its corresponding element on the right, from smallest to largest.
  3. Calculate the distance for each pair.
  4. Sum up all the distances.

Code

Simplified version

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
#lang racket

(define (string->numbers str)
  (map string->number (string-split str)))

(define (main)
  (let*
      ([pairs (map string->numbers (sequence->list (in-lines)))]
       [ids-left (sort (map first pairs) <)]
       [ids-right (sort (map second pairs) <)]
       [pair->distance (λ (id-left id-right) (abs (- id-left id-right)))]
       [distances (map pair->distance ids-left ids-right)]
       [distances-sum (apply + distances)])
    distances-sum))

(with-input-from-file "input" main)

This straightforward implementation is the first iteration of my solution. It only uses native lists to pass series of values around, thus lacking laziness. Let’s go over it line-by-line.

1
#lang racket

This line just tells Racket that the program is written in plain Racket, and not some other module language. Because Racket is a “language for building languages”, programs built with Racket may be written in different languages created on top of it.

3
4
(define (string->numbers str)
  (map string->number (string-split str)))

This little function takes one string argument: str, and returns a list of numbers contained in that string (assuming they are whitespace-separated). It does so in the following order:

  1. Use string-split to obtain a list of whitespace-separated tokens from the whole string.
    • So, for example, a string "123 456" becomes a list '("123" "456").
  2. Use map to apply a built-in function string->number to each element of the list returned from string-split, and save all results in a new list, returning it.
    • Through this operation, '("123" "456") becomes '(123 456). Each string in the list gets converted to a number. Note that this does not change the original list, instead building a new list.
6
7
(define (main)
  (let*

This is the beginning of the main function. The let* syntax allows us to perform step-by-step computation using local binding. It’s a different syntax from plain let, and unlike let, it allows every step of the computation to access the results of the previous steps.

8
      ([pairs (map string->numbers (sequence->list (in-lines)))]

First, we read all lines from the current input port (in-lines), save them in a list (sequence->list), and apply the string->numbers function we implemented earlier to each element of the list using map, which returns the results as a new list. We then bind the list of pairs of numbers to the name pairs.

Later in the program we pass main to the with-input-from-file function.

16
(with-input-from-file "input" main)

This function leverages dynamic binding to set the parameter current-input-port to a newly created input port pointing to the file named "input", and calls main inside it with the parameter set. This parameter then gets picked up by the in-lines function, which uses it as the source from where to read lines. with-input-from-file also takes care of closing the input file after main is done with it.

Remember, we can only access the variable pairs here because we used let* for our step-by-step program. If we used plain let, we wouldn’t be able to do it.

 9
10
       [ids-left (sort (map first pairs) <)]
       [ids-right (sort (map second pairs) <)]

In these lines we split the list of pairs into two lists, each of which represents a column. We do that using map and the functions first and second, which take the first and the second element of each pair respectively. We then sort each column using < as a comparator function, which means sorting them in ascending order.

11
       [pair->distance (λ (id-left id-right) (abs (- id-left id-right)))]

Here we define an ad-hoc function named pair->distance which we later use to calculate the distance between the numbers in each pair. We use λ (which is a built-in alias for lambda) to define a function that takes two arguments: id-left and id-right. All that function does is find the absolute value of the second argument subtracted from the first argument.

12
13
14
       [distances (map pair->distance ids-left ids-right)]
       [distances-sum (apply + distances)])
    distances-sum))

A “hidden superpower” of map is that it can work on several lists in parallel if the mapped function takes more than one argument. We use that in this instance, mapping the two-argument function pair->distance onto two lists: ids-left and ids-right. The result is a single list of calculated distances.

We then invoke apply to call the function + with the contents of distances as its arguments, which effectively adds all the numbers in it together. And by returning the result from let*, we arrive at our answer.

“Lazy” version

Two things I like about Racket are its support for lazy evaluation and generic data structures. We will be relying on some third-party libraries for this, sure, but we don’t have to. The program just wouldn’t be as generic if we used the built-in streams.

Here’s the original version of the program I used to get my answer to the puzzle.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
#lang racket

(require
 (prefix-in seq: seq)
 (prefix-in type: relation/type)
 (prefix-in ord: relation/order))

(define (string->numbers str)
  (map string->number (string-split str)))

(define (main)
  (let*
      ([pairs (seq:map string->numbers (type:->stream (in-lines)))]
       [ids-left (ord:sort ord:< (seq:map first pairs))]
       [ids-right (ord:sort ord:< (seq:map second pairs))]
       [distances (seq:map (compose abs -) ids-left ids-right)])
    (seq:foldl + 0 distances)))

(with-input-from-file "input" main)

The underlying logic is exactly the same, so I won’t be going over it in much detail. I will, however, highlight the differences to the simplified version.

Notice how in the original program we had to construct intermediary lists in memory several times:

  • after reading the lines from the file,
  • after mapping string->numbers,
  • after extracting the columns,
  • after calculating the distances.

By using a lazy data structure (a stream in this case), we effectively eliminate all intermediary lists except for one step (after extracting the columns). And even that we don’t have to do explicitly: sort from the relation⁠/⁠order library supports lazy data structures, but it will collect all the values in memory before sorting because sorting requires access to the entire series of values. So in this case we can’t build a program that is lazy all the way up to calculating the final answer, but we can eliminate intermediary lists where possible.

In order to achieve this, instead of using sequence->list, we use ->stream from relation/type, which is a generic function that supports many types of sequences and converts them to a stream. Thus we delay converting the series of lines into a list, saving some memory. Note the prefix type: before ->stream: we chose that prefix when we required relation/type above. The colon in this case is not a special syntax, it is merely a conventional character to end prefixes with.

Why can’t we work with the result of in-lines immediately, without casting it to a stream? Well, we plan on using some functions that require the data structure to be a generic collection, that is, to satisfy functor?. Oddly, native Racket sequences don’t implement that interface, while streams do. In general, when working with data series, we don’t want to think what type of collection it is: sequence, stream, list, vector, or anything else. We want to use generic concepts like map, not special functions like sequence-map or stream-map. So we use generic functions seq:map and ord:sort instead of native map and sort.

Another thing I did was eliminate an extra line of code where I declared a lambda named pair->distance. Instead I used compose to compose functions abs and - in-line for some more compact code.

Finally, I replaced the application of + using apply with the use of foldl function from data/collection (which is re-exported by seq). There is a generic apply in data/collection, but it would need to construct an intermediate collection. Meanwhile, foldl operates on one element at a time and has a constant memory footprint.

Part Two

Problem statement (in a nutshell)

Again, we are given a series or pairs of integers that actually represents two series of integers displayed side-by-side. After splitting it into the two columns it represents, for each item in left column we need to find how many times (if any) it occurs in the right column. Then, we need to multiply the item itself by the number of times it accurs in the right column. By doing that for every item in the left column and adding all the values together, we arrive at our answer.

Solution

  1. Read the numbers in each column into a separate series.
  2. For the right column, build a hash table mapping each item that occurs at least once to the count of occurences of that item.
  3. For each item in the left column, use the hash table to look up the number of times it occurs in the right list. Find that item’s score by multiplying the item by the count of occurences. If it does not occur in the right column, the score is zero.
  4. Sum up all the scores over all elements of the left column.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
#lang racket

(require
 (prefix-in seq: seq)
 (prefix-in type: relation/type))

(define (string->numbers str)
  (map string->number (string-split str)))

(define (item-counts seq)
  (let ([counts (make-hash)])
    (for ([item seq])
      (hash-update! counts item add1 0))
    counts))

(define (main)
  (let*
      ([pairs (seq:map string->numbers (type:->stream (in-lines)))]
       [ids-left (seq:map first pairs)]
       [ids-right (seq:map second pairs)]
       [ids-right-counts (item-counts ids-right)]
       [id-score (λ (id) (* id (hash-ref ids-right-counts id 0)))]
       [ids-left-scores (seq:map id-score ids-left)])
    (seq:foldl + 0 ids-left-scores)))

(with-input-from-file "input" main)

The beginning of the main function is exactly the same as in the first part: we read the lines into a stream an map string->numbers onto it. Then we split the columns into separate series, except this time there’s no need to sort the series.

The interesting bit here is the hash table that we build in the function item-counts.

10
11
12
13
14
(define (item-counts seq)
  (let ([counts (make-hash)])
    (for ([item seq])
      (hash-update! counts item add1 0))
    counts))

Here we used a mutable hash table for efficiency, starting out with an empty hash table returned from make-hash. We then go over all items in an input sequence seq, and for each item we update the hash table as follows.

  1. Find out if the key (item) is already present in the hash. If not, use 0.
  2. Add one to the value corresponding to the key (or zero).
  3. Store the new value in the hash table under the same key.

The nice thing about hash-update! is that Racket doesn’t have to look up the value twice to update it, unlike some other languages.

Further in the main function we define a function id-score to multiple the ID by the its number of occurences in the right column, multiplying by 0 if the value was not found in the right column. We then map this function onto ids-left, and finally add all the scores together using foldl as before.

Conclusion

The problems themselves are what you’d expect from the first day of Advent of Code – not too difficult. So with this post I took some time to introduce the reader to some features of the Racket language and the ecosystem that we will be using in the future solutions.

I hope you learned something. Again, please subscribe to our RSS and/or Telegram to be the first to learn about our progress with this project and others. I’ll see you in the next posts!

Thanks to ftvkyo for reviewing my solution and helping me stay motivated. Check out their website, there’s some pretty cool stuff on there.

Ashpool, naitsu team.

Last updated: 06 Dec 2024.