Case Study: Word Play

This discussion focuses on word puzzles that involve searching for words with special properties.

The first thing is to download a word list we can use for these exercises. Download the file CROSSWD.TXT and rename it words.txt.

This file is a list of 113,809 official crosswords, that is, words considered valid in crossword puzzles and other word games. It was compiled as part of the Moby Project, which is described on Wikipedia.

This is a plain text file that can be easily opened with a text editor or read in by Python.

Reading from a Text File

open() function

To read a text file you must first open it. Use the open() function, which takes the name of the file and returns a file object you can use to read the file.
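
Opening a file might look like the sketch below. So that it runs on its own, it first writes a tiny stand-in file (sample_words.txt, a hypothetical name holding four sample words) to a temporary directory; with the real word list you would just call open("words.txt").

```python
import os
import tempfile

# Stand-in for words.txt so the sketch runs anywhere (hypothetical sample data).
# The with statement closes the file automatically when its block ends.
sample_path = os.path.join(tempfile.mkdtemp(), "sample_words.txt")
with open(sample_path, "w") as out:
    out.write("aa\naah\naahed\naahing\n")

# open() takes the name of the file and returns a file object.
# With the real word list this would be: fin = open("words.txt")
fin = open(sample_path)
print(fin.mode)    # r  (files are opened for reading by default)
print(fin.closed)  # False
fin.close()
```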

If the open() function was successful in opening the file, the next thing is to read from it.

You can read a single line using the .readline() method of the file object. The method reads characters up to and including the next newline character and returns them as a string.
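
Here is a small sketch of .readline() in action. It prints the line twice, once directly and once through repr(). The stand-in file sample_words.txt is a hypothetical substitute for words.txt so the example runs anywhere.

```python
import os
import tempfile

# Stand-in for words.txt (hypothetical sample data).
sample_path = os.path.join(tempfile.mkdtemp(), "sample_words.txt")
with open(sample_path, "w") as out:
    out.write("aa\naah\naahed\n")

fin = open(sample_path)
line = fin.readline()  # reads the first line, newline included
print(line)            # aa, then an extra blank line from the trailing newline
print(repr(line))      # 'aa\n'
fin.close()
```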

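Printing the line with the repr() function shows a representation of the string as you would type it in Python source code, with special characters written as escape sequences.

The important thing here is its output. Notice the \n at the end of the string. This is the newline character. In Python 3, open() translates line endings to \n by default, so you would see \r, the carriage return, only if the file contains one and you open it with newline='' or in binary mode.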

The newline is part of the returned string, so keep that in mind.

The file object keeps track of where it is in the file, so if you call the .readline() method again, you will get the next line, which in this file is the next word.
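
A short sketch of repeated .readline() calls, again using a hypothetical two-word stand-in file in place of words.txt. It also shows what happens once the end of the file is reached.

```python
import os
import tempfile

# Stand-in for words.txt (hypothetical sample data, only two lines).
sample_path = os.path.join(tempfile.mkdtemp(), "sample_words.txt")
with open(sample_path, "w") as out:
    out.write("aa\naah\n")

fin = open(sample_path)
first = fin.readline()
second = fin.readline()  # continues where the previous call stopped
third = fin.readline()   # nothing left to read
print(repr(first))       # 'aa\n'
print(repr(second))      # 'aah\n'
print(repr(third))       # ''  (an empty string signals the end of the file)
fin.close()
```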

strip() method

Often you will want to process the string without the newline character. To remove it use the .strip() method, which returns a copy of the string with all leading and trailing whitespace removed.
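
As a rough illustration, the sketch below compares a line before and after .strip(). The stand-in file is hypothetical sample data used in place of words.txt.

```python
import os
import tempfile

# Stand-in for words.txt (hypothetical sample data).
sample_path = os.path.join(tempfile.mkdtemp(), "sample_words.txt")
with open(sample_path, "w") as out:
    out.write("aa\naah\n")

fin = open(sample_path)
line = fin.readline()
word = line.strip()  # a new string; line itself is unchanged
print(repr(line))    # 'aa\n'
print(repr(word))    # 'aa'
fin.close()

# strip() removes every kind of leading and trailing whitespace,
# not only the newline.
padded = "  aah \r\n"
print(repr(padded.strip()))  # 'aah'
```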

close() method

Once you are finished reading from the file you can close it using the .close() method.
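
A minimal sketch of closing a file, using the same kind of hypothetical stand-in file. It also shows that a closed file object can no longer be read.

```python
import os
import tempfile

# Stand-in for words.txt (hypothetical sample data).
sample_path = os.path.join(tempfile.mkdtemp(), "sample_words.txt")
with open(sample_path, "w") as out:
    out.write("aa\naah\n")

fin = open(sample_path)
word = fin.readline().strip()
fin.close()
print(word)        # aa
print(fin.closed)  # True

# Reading from a closed file raises ValueError.
try:
    fin.readline()
except ValueError as err:
    print("Cannot read:", err)
```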

Reading a file using a for loop

The file object can be iterated over using a for statement. Let's see how that is done.
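
One way this could look, sketched with a hypothetical four-word stand-in file instead of words.txt. Each pass through the loop receives one line, newline included, so the sketch strips it before printing.

```python
import os
import tempfile

# Stand-in for words.txt (hypothetical sample data).
sample_path = os.path.join(tempfile.mkdtemp(), "sample_words.txt")
with open(sample_path, "w") as out:
    out.write("aa\naah\naahed\naahing\n")

fin = open(sample_path)
count = 0
for line in fin:         # the loop asks the file object for one line at a time
    word = line.strip()  # drop the trailing newline
    print(word)
    count += 1
fin.close()
print(count, "words read")
```

The same loop works unchanged on the full word list; only the file name passed to open() differs.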

Hands-on exercises

Here are some exercises you should attempt on your own. I have provided solutions at the end of this discussion, but try to refer to them only if you get really stuck.

Note: all these exercises should use the words.txt file.

"Good luck young grasshopper."


ex1: words20.py

Write a function called wordsOfLength() that reads words.txt and prints only the words with more than 20 characters (not counting whitespace).

Output:

counterdemonstrations
hyperaggressivenesses
microminiaturizations

ex2: wordsSansE.py

In 1939 Ernest Vincent Wright published a 50,000-word novel called Gadsby that does not contain the letter "e". Since "e" is the most common letter in English, that's not easy to do.

Write a function called hasNoE() that returns True if the given word doesn't have the letter "e" in it.

The program should print the percentage of the words in the list that have no "e".

Output:

33.1% of the words have no e in them.

ex3: forbiddenLetters.py

Write a function isAvoiding() that takes a word and a string of forbidden letters, and that returns True if the word doesn't use any of the forbidden letters.

The program should prompt the user to enter a string of forbidden letters and then print the number of words that don't contain any of them.

Output:

37641 words avoid the letter(s): e

ex4: permissibleLetters.py

Write a function named usesOnly() that takes a word and a string of letters, and that returns True if the word contains only letters from that string.

Output:

188 words use only the letters: acefhlo
5 words use only the letters: aeiou

ex5: requiredLetters.py

Write a function named usesAll() that takes a word and a string of required letters, and that returns True if the word uses all the required letters at least once. How many words are there that use all the vowels aeiou? How about aeiouy?

Output:

598 words use all of the letters: aeiou
42 words use all of the letters: aeiouy

ex6: abecedarian.py

Write a function isAbecedarian() that returns True if the letters in a word appear in alphabetical order (double letters are okay).

How many abecedarian words are there?

Output:

Found 596 abecedarian words.

Solution - ex1: words20.py
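
One possible solution, offered as a sketch rather than the only answer. The filename and length parameters (with their defaults) and the os.path.exists() guard are assumptions added here so the file can be run even when words.txt is missing.

```python
import os

def wordsOfLength(filename="words.txt", length=20):
    """Print every word in the file that has more than `length` characters."""
    fin = open(filename)
    for line in fin:
        word = line.strip()  # whitespace, including the newline, is not counted
        if len(word) > length:
            print(word)
    fin.close()

# Run only when the word list is present in the current directory.
if os.path.exists("words.txt"):
    wordsOfLength()
```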


Solution - ex2: wordsSansE.py
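
One possible solution, as a sketch. The helper percentWithoutE() and its filename parameter are hypothetical names introduced here; the check assumes the words are lowercase, as they are in words.txt.

```python
import os

def hasNoE(word):
    """Return True if the word does not contain the letter e."""
    return "e" not in word

def percentWithoutE(filename="words.txt"):
    """Return the percentage of words in the file that have no e."""
    total = 0
    count = 0
    fin = open(filename)
    for line in fin:
        word = line.strip()
        total += 1
        if hasNoE(word):
            count += 1
    fin.close()
    if total == 0:  # avoid dividing by zero on an empty file
        return 0.0
    return 100 * count / total

# Run only when the word list is present in the current directory.
if os.path.exists("words.txt"):
    print(f"{percentWithoutE():.1f}% of the words have no e in them.")
```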


Solution - ex3: forbiddenLetters.py
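
One possible solution, as a sketch. The counting helper countAvoiding() is a hypothetical name introduced here, and the prompt text is an assumption; the program only prompts when words.txt is found in the current directory.

```python
import os

def isAvoiding(word, forbidden):
    """Return True if the word uses none of the forbidden letters."""
    for letter in word:
        if letter in forbidden:
            return False
    return True

def countAvoiding(filename, forbidden):
    """Count the words in the file that avoid every forbidden letter."""
    count = 0
    fin = open(filename)
    for line in fin:
        if isAvoiding(line.strip(), forbidden):
            count += 1
    fin.close()
    return count

# Run only when the word list is present in the current directory.
if os.path.exists("words.txt"):
    forbidden = input("Enter the forbidden letters: ")
    count = countAvoiding("words.txt", forbidden)
    print(count, "words avoid the letter(s):", forbidden)
```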


Solution - ex4: permissibleLetters.py
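
One possible solution, as a sketch. The helper countUsingOnly() is a hypothetical name introduced here; the two letter sets are the ones shown in the exercise output.

```python
import os

def usesOnly(word, available):
    """Return True if every letter of the word appears in `available`."""
    for letter in word:
        if letter not in available:
            return False
    return True

def countUsingOnly(filename, available):
    """Count the words in the file made up only of the available letters."""
    count = 0
    fin = open(filename)
    for line in fin:
        if usesOnly(line.strip(), available):
            count += 1
    fin.close()
    return count

# Run only when the word list is present in the current directory.
if os.path.exists("words.txt"):
    for letters in ("acefhlo", "aeiou"):
        count = countUsingOnly("words.txt", letters)
        print(count, "words use only the letters:", letters)
```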


Solution - ex5: requiredLetters.py
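
One possible solution, as a sketch. It loops over the required letters rather than over the word; the helper countUsingAll() is a hypothetical name introduced here.

```python
import os

def usesAll(word, required):
    """Return True if the word contains every required letter at least once."""
    for letter in required:
        if letter not in word:
            return False
    return True

def countUsingAll(filename, required):
    """Count the words in the file that use all of the required letters."""
    count = 0
    fin = open(filename)
    for line in fin:
        if usesAll(line.strip(), required):
            count += 1
    fin.close()
    return count

# Run only when the word list is present in the current directory.
if os.path.exists("words.txt"):
    for letters in ("aeiou", "aeiouy"):
        count = countUsingAll("words.txt", letters)
        print(count, "words use all of the letters:", letters)
```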


Solution - ex6: abecedarian.py
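
One possible solution, as a sketch. It compares each letter with the one before it; the helper countAbecedarian() is a hypothetical name introduced here.

```python
import os

def isAbecedarian(word):
    """Return True if the letters appear in alphabetical order (doubles allowed)."""
    previous = ""  # the empty string sorts before every letter
    for letter in word:
        if letter < previous:
            return False
        previous = letter
    return True

def countAbecedarian(filename):
    """Count the abecedarian words in the file."""
    count = 0
    fin = open(filename)
    for line in fin:
        if isAbecedarian(line.strip()):
            count += 1
    fin.close()
    return count

# Run only when the word list is present in the current directory.
if os.path.exists("words.txt"):
    print("Found", countAbecedarian("words.txt"), "abecedarian words.")
```
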

Practice problems

Create a separate Python source file (.py) in VS Code to complete each exercise.

Note: Use the file words.txt from our discussion above for these exercises.


p1: consecutivePairs.py

This question is based on a Puzzler that was broadcast on the radio program Car Talk [http://www.cartalk.com/content/puzzlers]:

Give me a word with three consecutive double letters.

I'll give you a couple of words that almost qualify, but don't. For example, the word committee, c-o-m-m-i-t-t-e-e. It would be great except for the i that sneaks in there.

Or Mississippi: M-i-s-s-i-s-s-i-p-p-i. If you could take out those i's it would work. But there is a word that has three consecutive pairs of letters and to the best of my knowledge this may be the only word. Of course there are probably 500 more, but I can only think of one. What is the word?

Write a program to find it.


p2: palindromicOdometer.py

Here's another Car Talk Puzzler:

"I was driving on the highway the other day and I happened to notice my odometer. Like most odometers, it shows six digits, in whole miles only. So, if my car had 300,000 miles, for example, I'd see 3-0-0-0-0-0.

Now, what I saw that day was very interesting. I noticed that the last 4 digits were palindromic; that is, they read the same forward as backward. For example, 5-4-4-5 is a palindrome, so my odometer could have read 3-1-5-4-4-5.

One mile later, the last 5 numbers were palindromic. For example, it could have read 3-6-5-4-5-6. One mile after that, the middle 4 out of 6 numbers were palindromic. And you ready for this? One mile later, all 6 were palindromic!

The question is, what was on the odometer when I first looked?"

Write a Python program that tests all the six-digit numbers and prints any numbers that satisfy these requirements.
