As part of my Scala-Docker-Postgres project I wanted to do something a little more useful than the ā€œHello World on steroidsā€ app I have done so far.

I thought it might be fun to make a Scrabble-style dictionary, where the user inputs any combination of characters and the application returns a list of valid English words that can be constructed from those letters.

For that, I decided to make a CSV file that I can import into Postgres containing the columns word and canonical_word

So for a row here’s how it would look:

ID |   word     |  canonical_word
---------------------------------
1  |  aardvark  |    aaadkrrv

The idea is when a user submits any collection of characters, we just alphabetize the collection and search against a list of all alphabetized or ā€œcanonical wordsā€ and return the English equivalents.

Since Scala runs on the JVM we have access to the famous java.io library which makes parsing the list a piece of cake.

here’s the list of words I’m using: https://github.com/dwyl/english-words/blob/master/words.txt

url="https://raw.githubusercontent.com/dwyl/english-words/master/words.txt"

curl $url >> words.txt

touch WordMapper.scala

in WordMapper.scala:

import java.io._
import scala.io.Source

val inputFile = "words.txt"
val outputFile = new File("dict.csv")
val fileWriter = new BufferedWriter(new FileWriter(outputFile))

for (word <- Source.fromFile(inputFile).getLines) {
  val canonical = word.split("").sorted.mkString("")
  fileWriter.write(s"""$word, $canonical\n""")
}

fileWriter.close

Back in the bash terminal:

scala WordMapper.scala

tail dict.csv

# zymotic, cimotyz
# zymotically, acillmotyyz
# zymotize, eimotyzz
# zymotoxic, cimootxyz
# zymurgies, egimrsuyz
# zymurgy, gmruyyz
# zythem, ehmtyz
# zythum, hmtuyz
# zyzzyva, avyyzzz
# zyzzyvas, asvyyzz

Nice and easy. Now we have a file ready to import into Postgres.