# The longest interstate walk

Meandering through US states and territories with Julia, never looking back. Well, just their USPS codes.

Meandering through US states and territories with Julia, never looking back. Well, just their USPS codes.

This is not going to be as didactic (and thus not nearly as long) as my last post, but I thought I'd share my Julia code for this week's Riddler Classic.

# The puzzle

The idea is to chain USPS 2-letter codes for US states & territories, overlapping 1 character in between, and not repeating any code in a chain. You can think of it as states/territories share a border if they share a letter in their 2-letter code. Or as they put it:

The challenge is to find the longest string of letters in which (1) every pair of consecutive letters is a two-letter state or territory abbreviation, and (2) no state abbreviation occurs more than once. For example, Guam, Utah and Texas can be combined into the valid four-letter string GUTX. Another valid string is ALAK (Alabama, Louisiana and Alaska), while ALAL (Alabama, Louisiana and Alabama) is invalid because it includes the same state, Alabama, twice.

They also provided a link to a USPS page with a list of codes.

# Julia solution

I first just copy/pasted the codes into a raw string, and parsed it. I also added the codes for reaching the armed forces.

tbl = """
State/Possession

Abbreviation

Alabama

AL

AK
…

Wisconsin

WI

Wyoming

WY
"""

codes = vcat(map(strip, split(tbl, "\n\n")[4:2:end]), ["AA", "AP", "AE"])


Because these are always 2-letter codes, the simplest solution is probably to use a Char-to-Char multimap. I opted to use a Trie/prefix tree:

using DataStructures

top = prefix[1:2]
if length(results[top]) < length(prefix)
println("Replacing $(results[top]) ->$prefix.")
results[top] = prefix
end
c = string(prefix[end])
s = subtrie(t, c)
s == nothing && return
for k in keys(s)
t = deepcopy(t)
delete!(t, string(c,k))
end
end


This function takes results to keep track of the longest observed sequence, prefix — the current state of the string, and t, the Trie with the currently available codes. It builds a sub-trie based on the end of prefix, and recursively calls itself with a pruned trie.

The search gets kicked off with this function:

function getlongest(codes)
results = DefaultDict{String,String}("")
for first in codes
t = Trie(codes)
delete!(t, first)
end
results
end


Both of these functions rely on deleting a key from the Trie, which actually, is not provided in this library. In Julia however, it's easy to create new methods for types you don't own:

function Base.delete!(t::Trie{T}, k::AbstractString) where T
node = subtrie(t, k)
if node != nothing node.is_key = false end
t
end


# Solution

There's not much to do from here. Running:

@time results = getlongest(codes)
# Abridged output:
#
#  Replacing  -> FM.
#  Replacing FM -> FMT.
#  Replacing FMT -> FMTX.
#  Replacing FMTX -> FMTNM.
#  Replacing FMTNM -> FMTNMP.
#  Replacing FMTNMP -> FMTNMPA.
#  Replacing FMTNMPA -> FMTNMPAP.
#  Replacing FMTNMPAP -> FMTNMPAPR.
#  Replacing FMTNMPAPR -> FMTNMPAPRI.
#  Replacing FMTNMPAPRI -> FMTNMPAPRID.
#  Replacing FMTNMPAPRID -> FMTNMPAPRIDE.
#  Replacing FMTNMPAPRIDE -> FMTNMPAPRIDCA.
#  Replacing FMTNMPAPRIDCA -> FMTNMPAPRIDCAK.
#  Replacing FMTNMPAPRIDCAK -> FMTNMPAPRIDCAKY.
#  Replacing FMTNMPAPRIDCAKY -> FMTNMPAPRIDCAKSD.
#  Replacing FMTNMPAPRIDCAKSD -> FMTNMPAPRIDCAKSCT.
#  Replacing FMTNMPAPRIDCAKSCT -> FMTNMPAPRIDCAKSCOK.
#  Replacing FMTNMPAPRIDCAKSCOK -> FMTNMPAPRIDCAKSCOHI.
#  Replacing FMTNMPAPRIDCAKSCOHI -> FMTNMPAPRIDCAKSCOHIA.
#  Replacing FMTNMPAPRIDCAKSCOHIA -> FMTNMPAPRIDCAKSCOHIAA.
#  Replacing FMTNMPAPRIDCAKSCOHIAA -> FMTNMPAPRIDCAKSCOHIAAL.
#  Replacing FMTNMPAPRIDCAKSCOHIAAL -> FMTNMPAPRIDCAKSCOHIAALA.
#  Replacing FMTNMPAPRIDCAKSCOHIAALA -> FMTNMPAPRIDCAKSCOHIAALAE.
#  Replacing FMTNMPAPRIDCAKSCOHIAALAE -> FMTNMPAPRIDCAKSCOHINVAALA.
#  Replacing FMTNMPAPRIDCAKSCOHINVAALA -> FMTNMPAPRIDCAKSCOHINVAALAE.
#  Replacing FMTNMPAPRIDCAKSCOHINVAALAE -> FMTNMPAPRIAKSDCAALASCOHINVA.
#  Replacing FMTNMPAPRIAKSDCAALASCOHINVA -> FMTNMPAPWAKSDCAALARIASCOHINJ.
#  Replacing FMTNMPAPWAKSDCAALARIASCOHINJ -> FMTNMPAPWAKSDCAALARIASCOHINVA.
#  Replacing FMTNMPAPWAKSDCAALARIASCOHINVA -> FMTNMPAPWVAKSDCAALARIASCOHINVI.
#  Replacing FMTNMPAPWVAKSDCAALARIASCOHINVI -> FMPAPWAKSDCAALARIASCTNMNVINCOHI.
#  Replacing FMPAPWAKSDCAALARIASCTNMNVINCOHI -> FMPAPWVAKSDCAALARIASCTNMNVINCOHI.
#
#  1631.254820 seconds (4.47 G allocations: 591.410 GiB, 11.13% gc time)


You can see this was probably not a very good approach, with 4 billion allocations (probably 1 for each deepcopy). Note that actual memory usage did not exceed ~1GB.

As you might have guessed, I only showed the output for the actual longest walk starting from FM (Federated States of Micronesia, and provider of the TLD of your favorite podcast's domain, I'm sure).

To find the longest sequence:

maximum(v -> (length(v), v), values(results))
# Output:
#  (32, "FMPAPWVAKSDCAALARIASCTNMNVINCOHI")


## Itinerary

This is quite the mileage run:

1. FM: Federated Sates of Micronesia
2. MP: Northern Mariana Islands
3. PA: Pennsylvania
4. AP: Armed Forces Pacific
5. PW: Palau
6. WV: West Virginia
7. VA: Virginia
9. KS: Kansas
10. SD: South Dakota
11. DC: District of Columbia
12. CA: California
13. AA: Armed Forces Americas
14. AL: Alabama
15. LA: Louisiana
16. AR: Arkansas
17. RI: Rhode Island
18. IA: Iowa
19. AS: American Samoa
20. SC: South Carolina
21. CT: Connecticut
22. TN: Tennessee
23. NM: New Mexico
24. MN: Minnesota