getting it done

june 2025

code is a hammer and i whack every nail with it

i write one-off scripts fairly often, sometimes to sanity check my math, sometimes to solve a mundane task, sometimes both. here are a few.

visiting portugal, and the game of azul

a year ago i visited lisbon. i brought back lots of postcards of portugal's famous azulejo tiles. and, without my knowledge, i also brought back something else - a consistent starting advantage while playing azul.

let me explain. azul is a board game about laying tiles, named after portuguese azulejos. like most board games, going first gives a significant advantage. unlike most board games, azul has a deterministic and rather special way to decide who the starting player is:

"the player who most recently visited portugal takes the starting player marker"

and that was me. for the entire year after my trip, no one among the azul peeps visited portugal.

finally, people were tired of me going first, and a new house rule was created. it looks like this:

    while no starting player:
        generate a random country
        has anyone been there?
            yes! starting player = most recent visitor
            no :( keep going

for this, we needed a random country generator. options online generate one country at a time, and then give you a 10 second advertisement. i spent a few mins writing a script with a simple for loop that prints out random.choice(countries) 10 times. then i wrote an alias in my .zshrc that runs the script.

    ~ azul
    Australia
    Serbia
    Chile
    Bosnia and Herzegovina
    Palau
    Vatican City
    Peru
    Laos
    Tanzania
    Zimbabwe
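roughly, the script could look like the sketch below. the country list is a shortened placeholder (the real one would need every country), and `pick_countries` is a hypothetical name used only for illustration.

```python
import random

# placeholder list, shortened for illustration; the real script would
# need a full list of countries
COUNTRIES = [
    "Australia", "Serbia", "Chile", "Bosnia and Herzegovina", "Palau",
    "Vatican City", "Peru", "Laos", "Tanzania", "Zimbabwe", "Portugal",
]


def pick_countries(n=10, countries=COUNTRIES):
    """Return n random countries (repeats allowed, like random.choice in a loop)."""
    return [random.choice(countries) for _ in range(n)]


if __name__ == "__main__":
    for country in pick_countries():
        print(country)
```

printing ten at a time means one run is usually enough to hit a country someone has actually visited.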

archival translation, and the leetcode question

my roommate sofia works on archives, and part of her job is translation. a few months ago, she showed me two excel sheets, one containing a spanish vocabulary, the other containing an english vocabulary, and gave me a problem:

take the identifiers in the english sheet and compare them to the identifiers in the spanish sheet. if an identifier exists in both sheets, ignore it. if an identifier is only present in the english sheet, keep the pre-label, alt-label, scope note and history note. in other words, we want all terms in the english sheet that have not yet been translated into spanish.

leetcode pandas easy, i thought to myself. should just be a filter.

and that was it. read in the excel files with pandas, do a filter by the identifiers, take only a subset of columns, and then write it back into an excel sheet.

    import pandas as pd

    spanish = pd.read_excel('data/spanish.xlsx')
    english = pd.read_excel('data/english.xlsx')

    result = english[~english['identifier'].isin(spanish['Identifier'])]
    result = result[['identifier', 'prefLabel', 'altLabel', 'description', 'historyNote']]
    result.reset_index(drop=True, inplace=True)

    result.to_excel('data/english_but_not_spanish.xlsx', index=False)
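to see the filter on its own, here is a small sketch with two in-memory frames standing in for the excel sheets. the identifiers and labels are made up for illustration; only the `~isin` pattern comes from the script above.

```python
import pandas as pd

# toy stand-ins for the two sheets; identifiers and labels are made up
english = pd.DataFrame({
    "identifier": ["t1", "t2", "t3"],
    "prefLabel": ["archive", "ledger", "seal"],
})
spanish = pd.DataFrame({
    "Identifier": ["t1", "t3"],
    "prefLabel": ["archivo", "sello"],
})

# keep english rows whose identifier never appears in the spanish sheet
untranslated = english[~english["identifier"].isin(spanish["Identifier"])]
untranslated = untranslated.reset_index(drop=True)

print(untranslated)
```

only the `t2` row survives, since it is the one identifier missing from the spanish frame.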

the prime trick, and number theory

i went to budapest last summer to study math. one of the two classes i took involved solving math puzzles. though i spent most of my time staring at 8 by 8 boards or muttering about patterns to deterministically crack open a door whose state is unknown, i also spent some time coding - there were many problems that required hand calculating long sums or products, or brute force computing a sequence until a pattern emerged - and i was certainly not known for my patience.

one of the first problems given to us was called the prime trick, and our problem was to figure out the magic sauce to it:

"here's a magic trick, think of at most 7 prime numbers which are greater than 10. calculate the squares of primes and add up these squares, tell me the sum and i will tell you how many numbers were added up."

needless to say, spoilers to the solution ahead, pause now if you want to attempt it.

usually with problems like these, the conditions stated are both necessary and sufficient. this problem is tricky, however, because the conditions stated are actually sufficient but not strictly necessary. in fact, this trick works as long as all numbers are odd and larger than or equal to 3 - nothing to do with primes, nothing to do with them being greater than 10.

*sigh*

the hint to this problem lies in the words "at most 7", and the answer can be found through reasoning with basic number theory.

all primes larger than 10 are odd, and can be written as \( 2n + 1\), \( n \in \mathbb{Z}^+\).

square that, and you get: \( (2n + 1)^2 = 4n^2 + 4n + 1 = 4n(n+1) + 1 \)

\( n (n + 1)\) will always be even, and \( 4n (n + 1) \) will always be a multiple of 8 (one more than the magic number 7).

so \( (2n + 1)^2 \equiv 1 \pmod{8} \), and the sum of the squares of \( k \) primes larger than 10 is \( \equiv k \pmod{8} \). the congruence will not "cycle" because we capped the number of squared primes at 7, so \( k \) never gets large enough to wrap around modulo 8.
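a quick numerical check of the argument above, as a sketch rather than a proof: the range of odd numbers and the list of primes are arbitrary choices made for illustration.

```python
# every odd number squared should leave remainder 1 modulo 8
odd_numbers = range(1, 1000, 2)
assert all((m * m) % 8 == 1 for m in odd_numbers)

# summing k such squares gives k modulo 8, which is why the cap of 7 matters:
# with 8 squares the remainder wraps around to 0
primes = [11, 13, 17, 19, 23, 29, 31, 37]
remainders = [sum(p * p for p in primes[:k]) % 8 for k in range(1, 9)]
print(remainders)  # [1, 2, 3, 4, 5, 6, 7, 0]
```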

of course none of this came naturally. instead, i caught a vibe from the "at most 7 prime numbers" part that the answer had something to do with the sum being congruent to something modulo something, and brute forced my way to a solution via code.

here's the code, with docstrings only for obvious helper functions for brevity.

    import numpy as np

    def squared_primes(lowest, highest):
        """Computes all primes between lowest and highest, and squares them."""

    def sums_of_items(items, k):
        """
        Computes all combinations of `k` elements of the list `items`,
        and returns the sum of elements for each.
        """

    # compute all squares of primes to some arbitrary upper limit (40)
    squares = squared_primes(10, 40)

    # for all possible numbers of elements in the sum (1-7)
    for i in range(7):
        # take all possible combinations of i + 1 elements of the list
        # of squared primes, sum each combination, and compute its value
        # modulo 8.
        print(np.mod(np.array(sums_of_items(squares, i + 1)), 8))
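since the helper bodies are left out, here is one way they might be filled in so the whole thing runs. this is a sketch under assumptions, not the original implementation: the trial-division primality check, the inclusive bounds, and the `modulus` variable are all guesses, and combinations are taken without repetition. the helper is named `sums_of_items` to match the call in the loop.

```python
from itertools import combinations

import numpy as np


def squared_primes(lowest, highest):
    """Computes all primes between lowest and highest, and squares them."""
    def is_prime(n):
        # trial division is plenty for numbers this small
        return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

    return [p * p for p in range(lowest, highest + 1) if is_prime(p)]


def sums_of_items(items, k):
    """Sums every combination of k elements of the list items."""
    return [sum(combo) for combo in combinations(items, k)]


squares = squared_primes(10, 40)
modulus = 8  # the value that gets swapped out by hand between runs

for k in range(1, 8):
    print(np.mod(np.array(sums_of_items(squares, k)), modulus))
```

there are 8 primes between 10 and 40, so the rows come out with 8, 28, 56, 70, 56, 28 and 8 entries.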

i reran this code many times, swapping out the magic modulus manually with increasing numbers starting from 2, and eventually, with 8 as the modulus:

misc git:(main) ✗ python3 math/sum_of_primes.py [1 1 1 1 1 1 1 1] [2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2] [3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3] [4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4] [5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5] [6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6] [7 7 7 7 7 7 7 7]

bingo! the sum of \( k \) squared primes fitting the conditions always gave \( k \) modulo 8, and performing the magic trick was simply a matter of computing the remainder of the sum modulo 8. what was left was coming up with the number theory as to why, which went way faster than the initial fumbling and bumbling it took to find the pattern.
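performing the trick then boils down to a one-liner. a minimal sketch, where `guess_count` is a made-up name and the three primes are an arbitrary example:

```python
def guess_count(total):
    """Given the sum of up to 7 squared odd numbers, return how many were added."""
    return total % 8


# example: three primes greater than 10
chosen = [11, 13, 29]
total = sum(p * p for p in chosen)  # 121 + 169 + 841 = 1131
print(guess_count(total))  # 3
```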

homework! :(

the first homework in my theory of computation class required us to count.

  1. how many strings of length 5 over the alphabet {0, 1} contain the substring 01 but not the substring 11?
  2. how many strings of length 5 over the alphabet {0, 1} contain the substring 11 but not the substring 01?
  3. how many strings of length 5 over the alphabet {0, 1} contain the substring 11 but not the substring 00?

i didn't want to do this by hand: write out all 32 possible length 5 strings, manually check each against the three conditions, then agonize and check again to make sure i didn't make a mistake. so i wrote a script. the homework also required showing your work, so i added a screenshot of my code for good measure, to ensure full credit.

    from itertools import product

    all_bin = [''.join(p) for p in product(["0", "1"], repeat=5)]

    a = [s for s in all_bin if "01" in s and "11" not in s]
    b = [s for s in all_bin if "11" in s and "01" not in s]
    c = [s for s in all_bin if "11" in s and "00" not in s]

    print(all_bin)
    print(a, f"count: {len(a)}")
    print(b, f"count: {len(b)}")
    print(c, f"count: {len(c)}")
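the three questions share one shape, so the same brute force could be folded into a single helper. a sketch only: `count_strings` and its parameters are hypothetical names, not part of the homework script.

```python
from itertools import product


def count_strings(length, must_contain, must_avoid, alphabet="01"):
    """Count strings over alphabet that contain one substring and avoid another."""
    strings = ("".join(p) for p in product(alphabet, repeat=length))
    return sum(1 for s in strings if must_contain in s and must_avoid not in s)


print(count_strings(5, "01", "11"))  # 11
print(count_strings(5, "11", "01"))  # 4
print(count_strings(5, "11", "00"))  # 11
```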

finder

i use zotero to manage papers and citations, and the papers i save go into a local directory for future reading when i'm not connected to the interwebs. i customized the name format the papers are saved with, which includes the full title and the first few authors, so searching for papers by keyword could be done trivially in my file system.

then, i found myself pulling up the same papers a lot, like a lot. so i wrote a shell script that takes a keyword, looks through the directory zotero saves to, and opens the closest match. the shell script looks like this:

    #!/bin/bash
    keyword="$1"

    # find the top match with find (recursive, case-insensitive via -iname)
    match=$(find "$HOME/papers" -type f -iname "*$keyword*" | head -n 1)

    # open if a match is found
    if [ -n "$match" ]; then
        open "$match"
    else
        echo "no paper found! :("
    fi

there's a lot left to be desired about the search algorithm, but it remains a surprisingly handy tool, especially when all i remember is a single, distinctive keyword. i have a similar tool for digital textbooks, so pulling up a chapter on a forgotten concept is as simple as typing book bishop into my terminal.