Simple NLP – 1
This is the dog, that worried the cat, that killed the rat, that ate the malt, that lay in the house that Jack built.
Mother Goose, The house that Jack built
This is the malt that the rat that the cat that the dog worried killed ate.
Victor H. Yngve (1960)
Both quotes are taken from Speech and Language Processing by Jurafsky and Martin
A Brief History – the Beginning
1949
It was 1949. The Second World War had been over for four years, yet the world was still recovering from its wounds. Many of the first-generation computer scientists, including Warren Weaver, Alan Turing, and Claude Shannon, had worked on cryptography during the war. Their work helped the Allies decipher crucial intelligence that contributed to the final victory, and it also put them at the cutting edge of communication, linguistics, and mathematics. There was one problem, however: there were no more German or Japanese secret messages for them to work on. The 55-year-old Warren Weaver decided to put his knowledge to good use. Inspired by his wartime cryptography research, he proposed that all human languages are, in effect, encrypted, and that machines could “decrypt” them, that is, translate them automatically into other languages, just as codes were broken during the war. German was essentially encrypted English. Encouraged by his fellow mathematicians and scientists, he published his idea in a memorandum called “Translation”, from which the modern fields of Natural Language Processing (NLP) and machine translation were born.
Realizing the potential impact of NLP, the US government funded the research in the hope of having a working system within a decade or two. After all, if a two-year-old can understand language, surely it can’t be difficult for a machine built and taught by the smartest minds in the world? The period from the 1940s to the 1960s was filled with boundless trust in science and its ability to bring progress to human society, as it had in the past and surely would in the future.
ELIZA
After a few years and millions of dollars, a chatbot named ELIZA was developed that represented the best NLP technology of its time. You can try a demo by visiting this page. A sample dialog can be found below:
Person : Men are all alike.
ELIZA : In what way?
Person : They’re always bugging us about something or other.
ELIZA : Can you think of a specific example?
Person : Well, my boyfriend made me come here.
ELIZA : Your boyfriend made you come here?
Person : He says I’m depressed much of the time.
ELIZA : I am sorry to hear that you are depressed.
Person : It’s true. I am unhappy.
ELIZA : Do you think coming here will help you not to be unhappy?
Person : I need some help, that much seems certain.
ELIZA : What would it mean to you if you got some help?
Person : Perhaps I could learn to get along with my mother.
ELIZA : Tell me more about your family
If you spend five minutes playing around with it, you’ll soon notice that something is not quite right: ELIZA tends to repeat itself a lot.
ELIZA works by finding simple patterns and keywords in the input it is given. For example, if ELIZA sees the word “alike” or “same”, it replies “In what way?”. If it sees the pattern “My {family-member} is Y”, it asks “Who else in your family is Y?”. Whenever it cannot find any of its pre-programmed keywords and has no idea what you just said, it replies “I see.”. It is a program designed specifically for the Turing Test, without actually understanding anything about the conversation; the only thing it needs to do is trick humans into thinking they are talking with another human across the screen. Natural language was simply too much for a computer of that era to really handle. After all, computers were built first to, well, compute. They were good at math but not much into talking.
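The keyword-and-template mechanism described above can be sketched in a few lines of Python. This is a hypothetical reconstruction of the idea, not Weizenbaum’s original script; the rules and function names are my own.

```python
import re

# A toy ELIZA-style responder: scan a fixed rule list and answer with the
# first matching template. \1 and \2 in a template refer to captured groups.
RULES = [
    (r".*\b(?:alike|same)\b.*", "In what way?"),
    (r"my (mother|father|sister|brother) (.*)", r"Who else in your family \2?"),
    (r"i am (.*)", r"I am sorry to hear that you are \1."),
    (r".*", "I see."),  # fallback when no keyword matches
]

def respond(text):
    """Return the first matching rule's response, like ELIZA's keyword scan."""
    cleaned = text.strip().rstrip(".")
    for pattern, template in RULES:
        m = re.fullmatch(pattern, cleaned, flags=re.IGNORECASE)
        if m:
            return m.expand(template)
    return "I see."

print(respond("Men are all alike."))  # In what way?
print(respond("I am depressed."))     # I am sorry to hear that you are depressed.
```

The key design point is that there is no understanding anywhere: the program never parses the sentence, it only reflects fragments of the input back inside canned templates, which is exactly why it repeats itself once you exhaust its keyword list.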
Turing Test Diagram:
The Turing Test is a test for machine intelligence. If the judge cannot tell whether the response comes from a machine or from a human, then the machine passes the test.
What was preventing the best scientists from inventing a translation machine? As researchers explored this unknown field, they found two great walls standing between human language capacity and that of a computer: ambiguity and infinity.
You: Call me a cab.
Bot: Ok, you are a cab.
Ambiguity
The inherent ambiguity of language is everywhere, yet we seldom notice it until we work with a machine. One joke popular among linguists goes like this: “One morning I shot an elephant in my pajamas. How he got in my pajamas, I don’t know.” The joke plays on an ambiguity in English: am I the one wearing my pajamas, or is the elephant? Both interpretations are possible, although the second is likely only in a Disney movie where elephants wear pajamas. We humans usually don’t even notice the ambiguity.
Here’s another example. If you speak any language other than English, I invite you to pull up your favourite translation software or website and type in the following sentence: “I can’t believe my 15-year-old son grew another foot last summer.” I have yet to find a single translation system that gets this sentence right. (If you have, I’d be very interested to know. And for those confused, “foot” here is a unit of length, not a body part.)
Infinity
I call the second obstacle “infinity” to suggest the sheer complexity of any human language system. Different from ambiguity, where more than one interpretation makes sense and we are asked to pick the one the speaker most likely meant, infinity is about finding the few potentially correct readings among millions of nonsensical ones.
When I say millions, I literally mean millions. Consider an English sentence with 15 words. Assuming that each word has 3 different possible meanings and/or parts of speech, we’re facing 3¹⁵, or roughly 14 million, possibilities. And that’s just for one sentence. For comparison, the IBM 7090 in 1960, selling for $2.9 million, ran at 0.1 million FLOPS. Reasonable approximations will tell you that processing just one sentence at that rate would take anywhere from around 2 minutes to an hour.
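The arithmetic above is easy to verify, assuming (as the text does) 15 words with 3 readings each and, at the optimistic end, one reading checked per floating-point operation:

```python
# Back-of-the-envelope check of the numbers above.
readings = 3 ** 15                 # 15 words, 3 readings each
print(readings)                    # 14348907, i.e. roughly 14 million

flops_1960 = 0.1e6                 # IBM 7090: about 0.1 million FLOPS
minutes = readings / flops_1960 / 60
print(round(minutes, 1))           # about 2.4 minutes for a single sentence
```

With more than one operation needed per reading, the estimate stretches from a couple of minutes toward the hour quoted above.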
What further complicates things is that sentences can be arbitrarily long. One example would be the sentence quoted at the very beginning of this blog: “This is the dog, that worried the cat, that killed the rat, that ate the malt, that lay in the house that Jack built.” In theory, a sentence can be made infinitely long by adding additional parts onto it, although in practice people seldom do so.
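The unbounded-length point can be made concrete with a toy recursive rule. This is my own illustrative example, not drawn from the post’s sources: a relative clause can always be attached to the noun phrase built so far, so there is no longest sentence.

```python
# Stack relative clauses in the spirit of "The House that Jack Built".
CLAUSES = ["that worried the cat", "that killed the rat", "that ate the malt"]

def jack_built(depth):
    """Build a sentence with `depth` stacked relative clauses.

    The same rule applies at every step, so `depth` can grow without bound.
    """
    parts = ["This is the dog"]
    for i in range(depth):
        parts.append(CLAUSES[i % len(CLAUSES)])  # reuse clauses to nest forever
    return ", ".join(parts) + "."

print(jack_built(3))
# This is the dog, that worried the cat, that killed the rat, that ate the malt.
```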
The NLP Ice Age
These two huge walls, ambiguity and infinity, still stand tall and firm before every NLP researcher today, casting a long shadow over the NLP community. In 1966, the Automatic Language Processing Advisory Committee (ALPAC) “publishes a report on Machine Translation concluding that years of research haven’t produced useful results” (quoted from Machine Translation’s Past and Future), and the US government decided to cease funding NLP research. It was a bleak time for NLP.
The question I’ll try to answer in my next blog is, how and when did things get better? Thanks for reading.
Citation
I’ve gathered information and drawn inspiration from the following sources:
Speech and Language Processing by Daniel Jurafsky & James H. Martin
Portland University ECE 478/578/678 slides
Introduction to Computational Linguistics and Natural Language Processing by Mark Johnson
Natural language processing: a historical review by Karen Sparck Jones
Natural Language Processing (NLP) – Tees
Machine Translation’s Past and Future
In the future, I will add new citations to this list.