2007 North American Computational Linguistics Olympiad

The Contest










Information about Sample Problems

Following the tradition established at the Moscow Olympiads, the linguistic problems offered at the International Olympiad were all new and self-sufficient, i.e., they required no special knowledge and no previous training in linguistics, and each was dedicated to an interesting phenomenon in one of several languages. The Olympiad included two contests. The first was an individual contest in which each participant was given five problems; the second was a team contest with three problems for each team.

The North American Computational Linguistics Olympiad problems will also be new and self-sufficient. In addition to linguistic problems similar to those used in other contests, NAMCLO will have a few computationally focused problems. Example problems are shown below.


Sample Problem


Babylonian

The earliest known writing system originated over 5000 years ago in what is now Iran, Iraq and other parts of Western Asia. This writing system, called "cuneiform," was used by the ancient Persian kings to make their decrees known, and to audit the tax returns of their many subjects. The characters were inscribed on clay or stone tablets using wedge-like instruments. Although many inscriptions have survived, the writing system was not deciphered by modern scholars until 1846.


In this problem you will carry out the kind of work that these scholars had to do to decipher the cuneiform writing system. The following (see image on the right) is an actual fragment from a Babylonian educational document that was discovered in 1811. It was this document that allowed scholars to unlock the number system used by the ancient Babylonians. From this, scholars were able to extend their understanding to the entire writing system. Many of the characters are illegible because of the ravages of time. Nevertheless, it is possible to figure out what the missing characters should be. Your job is to fill in the missing characters:





Phonebook

"Last names in the Dictionary"
Some words in your dictionary also appear as last names in your phonebook. For example, "brooks", "brown", "butler", "hall" and "wright" are in your dictionary, and "Brooks", "Brown", "Butler", "Hall" and "Wright" are all common last names in the US.
You would like to make a list of all such words. The inefficient way would be to go through the dictionary in order: for each dictionary word, you open the phone book, look up that word, add it to your list if you find it as a last name, and close the phone book again.

a) Why is it more efficient to keep the phone book open between word look-ups?

b) What if you had a friend to help you (and two copies of the dictionary and phone book)? How can the two of you divide up the work safely and finish twice as fast?

c) What if there are three of you instead of two?
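
The idea behind parts (a) and (b) can be pictured with a small sketch. It assumes both books are alphabetically sorted lists and that each dictionary word is listed once; the function names `shared_words` and `split_at`, and the choice of "m" as a dividing letter, are illustrative assumptions, not part of the original problem or its official solution.

```python
def shared_words(dictionary_words, phonebook_names):
    """Return dictionary words that also occur as last names.

    Both inputs are assumed to be in alphabetical order. Each book is
    read front to back exactly once, like keeping both books open and
    turning pages forward only.
    """
    matches = []
    i = j = 0
    while i < len(dictionary_words) and j < len(phonebook_names):
        word = dictionary_words[i]
        name = phonebook_names[j].lower()
        if word == name:
            matches.append(word)
            i += 1
            j += 1
        elif word < name:
            i += 1  # this word is not a last name; move on in the dictionary
        else:
            j += 1  # this name is not a word (or is a repeat); move on in the phone book
    return matches


def split_at(sorted_items, boundary):
    """Split a sorted list into the entries before `boundary` and the rest."""
    first = [x for x in sorted_items if x.lower() < boundary]
    second = [x for x in sorted_items if x.lower() >= boundary]
    return first, second


dictionary = ["apple", "brooks", "brown", "butler", "cat", "hall", "wright", "zebra"]
phonebook = ["Brooks", "Brown", "Brown", "Butler", "Garcia", "Hall", "Nguyen", "Wright"]

# One person working alone.
print(shared_words(dictionary, phonebook))

# Two people: each takes one half of the alphabet in both books.
dict_a, dict_b = split_at(dictionary, "m")
phone_a, phone_b = split_at(phonebook, "m")
print(shared_words(dict_a, phone_a) + shared_words(dict_b, phone_b))
```

Both print statements give the same list, which suggests why splitting by alphabet range is safe: a word in one range can only match a name in the same range. With three helpers, the same split could be applied at two boundary letters instead of one.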









Olympiad Locations

Pittsburgh area (hosted by Carnegie Mellon University)
contact: Lori Levin, lslcs.cmu.edu

Philadelphia area (hosted by U. of Pennsylvania)
contact: Mitch Marcus, mitchcis.upenn.edu

Boston area (hosted by Brandeis University, Cambridge)
contact: James Pustejovsky, boston.olympiadgmail.com

Ithaca area (hosted by Cornell University)
contact: Claire Cardie, cardiecs.cornell.edu

Online participation
contact: Dragomir R. Radev, radevumich.edu


Organizing Committee

Lori Levin (General Chair), Carnegie Mellon University
Thomas Payne (General Chair), University of Oregon
Dragomir R. Radev (Program Chair), University of Michigan
William Lewis (Outreach Chair), University of Washington
James Pustejovsky (Sponsorship Chair), Brandeis University
Barbara Di Eugenio (Follow-up Chair), University of Illinois at Chicago

Supported by NSF
Website Developed by The LINGUIST List
The Association for Computational Linguistics
Google
NAACL