The 2nd part of this week's Perl challenge is to find the longest word you can spell using only the two-letter postal abbreviations for the US states. This is my solution, and the solution for a variant of the challenge.
Here's the wording of the challenge:
Using only the official postal (2-letter) abbreviations for the 50 U.S. states, write a script to find the longest English word you can spell?
I've got the state codes in a file, so the first thing is to load them into an array:
use File::Slurper qw/ read_lines /;
my @states = read_lines('state-codes.txt');
I've got a word list, with one word per line, so I can similarly load those:
my @words = read_lines('word-list.txt');
We're only interested in words that are made up of state codes. We can make a regular expression for that:
my $regexp = join('|', @states);
my @words = grep { /^($re)+$/ }
read_lines('word-list.txt');
State codes are all two letters, so we know that the string is going to have an even number of letters in it. The regular expression will enforce that, but applying an even-length check before applying the regular expression check will be faster:
my $regexp = join('|', @states);
my @words = grep { /^($re)+$/ }
grep { length($_) % 2 == 0 }
read_lines('word-list.txt');
It takes less than a second to run my solution anyway, so I didn't keep the even-length check.
We want to find the longest such word, so we can sort into order of decreasing length, and then the first string we find is the answer:
sort { length($b) <=> length($a) }
Putting all of those together, I came up with this:
use File::Slurper qw/ read_lines /;
my @states = read_lines('state-codes.txt');
my $re = join('|', @states);
my @words = grep { /^($re)+$/ }
sort { length($b) <=> length($a) }
read_lines('word-list.txt');
print $words[0], "\n";
Which, for my word list, produces cascarilla ("the aromatic bark of a West Indian shrub").
The above solution lets you use states more than once. But imagine you have a bag of 50 scrabble-like tiles, one for each state, and you have to make the longest word you can with these.
As it happens, the longest word found, cascarilla, doesn't use any state more than once. But what if we're listing all of the words we can make with the tiles?
For this, split each word into pairs of letters, and then remove any duplicates. If the resulting list is no different, then the word's good.
For removing duplicates my first thought was a hash:
foreach my $word (@words) {
my @pairs = $word =~ /([a-z][a-z])/g;
my @uniqpairs = keys %{ { map { $_ => 1 } @pairs } };
print "$word\n" if @pairs == @uniqpairs;
}
That line's a bit hairy, so my next thought was to use uniqstr
from List::Util:
use List::Util qw/ uniqstr /;
foreach my $word (@words) {
my @pairs = $word =~ /([a-z][a-z])/g;
my @uniqpairs = uniqstr @pairs;
print "$word\n" if @pairs == @uniqpairs;
}
This removes 20 words that were found by the first solution.
If we had those 50 tiles, and were only allowed words of four letters or longer, what combination of words would let you use up the most tiles?
comments powered by Disqus