Autumn in the Heavenly Kingdom

This book is a history of the Taiping Civil War during the Qing dynasty of China. Even though I am Chinese by birth, I know little about this piece of history. There aren’t many famous novels about that period that I’m aware of, and in school I remember the Taiping Civil War being covered in about twenty minutes. This is shocking considering that the Taiping Civil War is arguably the bloodiest war in human history, counting by the number of deaths it caused; depending on how deaths are counted, it even exceeded World War II. So my motivation to read this book was pure curiosity: why has the Taiping Rebellion received so little attention in both world history and Chinese history?

This book has three almost parallel narratives: one following Zeng Guofan, one following the Heavenly King Hong Xiuquan, and a third following the foreigners in China.

Zeng Guofan was a typical Confucian scholar who got his job through the Keju, the imperial examination for selecting government officials. He was appointed general by chance and, frankly speaking, he was not good at war. What he was good at was using people who were good at waging wars. He was also good at organizing campaigns, and he created a new system for selecting military officers and recruiting soldiers.

Hong Xiuquan also once aimed to be an official. After he failed the exam, though, he had visions in which God visited him, and his personality changed completely afterward. He got in touch with foreign Christians in the area and started preaching. But his “version” of Christianity was different from other sects, in that he cast himself as the son of God. Many Chinese emperors had made similar claims in the past, and on that basis he became a king. His speeches were very persuasive, and his administration was family-oriented, with key positions controlled by his brothers and cousins.

I was surprised to learn that the Heavenly Kingdom was in fact Christian. When I studied that period of history, I never got that impression; I just thought it was like other rebellions in the past, where people angry at bad government simply rose up. I was even more surprised that European countries would actually align with the Qing dynasty on this matter – especially so shortly after they had fought wars against the Qing dynasty.

The attitude of the West toward the Heavenly Kingdom was heavily influenced by the generals in charge of the trading ports. Those generals despised the Chinese, so even though the Heavenly Kingdom was very friendly to foreigners due to a shared religious belief, the generals’ reports depicted the Heavenly Kingdom as an evil rebellion that brought disruption to trade. It is hard to imagine that one person could influence the foreign view of China so much, but that is what happened.

Another thing I remember most clearly from the reading is the reason the foreign army burned down the Summer Palace. In modern Chinese history it is depicted as the day when foreigners basically stepped on and spat at China’s half-dead body. But in the foreigners’ view it was completely different: it was an act meant to prevent further harm to civilians and to destroy the pride of the Qing emperor. Soldiers usually plunder after they win a war, either from civilians or, in this case, from royal palaces. But the Summer Palace was destroyed because the Qing emperor refused to negotiate even after Beijing was taken, and he tortured and killed the messengers sent to him. So the destruction of the Summer Palace was twofold: one part caused by the greed of soldiers, the other by the useless pride of the Qing dynasty.

I did not finish this book, but it gave me interesting perspectives on the society of that time – the same period as the American Civil War. Life was hard back then. People died in the millions, and cannibalism was not unusual due to hunger. Superstition, prejudice, and pride drove the events that shaped Chinese history forever. I feel that the world would be different if the Taiping Rebellion had succeeded. Almost half of China was under the Heavenly Kingdom at one time! China would have become a Christian country, a thing I would never have imagined possible, with a half-crazy king who claimed to be the son of a foreign God. Yet people followed Hong Xiuquan, partly because Hong was Chinese instead of Manchu, and partly because of his charisma.

I wish I could have finished the book, but it’s just too heavy, and I can more or less see how the Heavenly Kingdom failed without foreign support. I wish the book were more concise so that I could fully experience a totally different world from 150 years ago.

Amazon Link

Simple NLP – Early Machine Translation

Simple NLP – 3

Author: Jerry Li

“Siri, what is zero divided by zero?”

“Imagine that you have zero cookies, and you split them evenly among zero friends, how many cookies does each person get? See? It doesn’t make sense. And Cookie Monster is sad that there are no cookies. And you are sad that you have no friends.”

A Brief History – Early Machine Translation

Last time we talked about the birth of NLP research and its initial difficulties from the 1940s to the 1960s. NLP research continued in other countries after the United States government withdrew funding in the US. According to “Machine translation: a concise history”, the METEO System was developed in Canada for translating weather forecasts between English and French; it was formally adopted in the 1980s to replace junior translators. Systran was a translation system developed in 1968 to support translation among European languages. It was first installed in 1970 and used in intergovernmental institutions including NATO and the International Atomic Energy Agency, as well as international companies such as General Motors.

If you remember ELIZA the chatbot from last time, it may seem unlikely that machine translation systems could have become feasible in such a short period of time. How did the METEO System and Systran work? Did the US government greatly misjudge the difficulty of NLP? To answer those questions, we need to know how early machine translation systems worked.

Machine Translation in Three Steps

The very first automatic translation systems usually translated sentences one by one (which is still true for most translation software today). For each sentence, the translation consists of three separate steps:

  1. Extract syntactic (grammar) information from the sentence.
  2. Reassemble the syntactic structure per the target language.
  3. Translate each word in source language into target language.

The system resembles a simplified version of the translation procedure a bilingual human follows. Since those three steps will reappear again and again in later NLP systems, let’s go through them one by one.
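To make the pipeline concrete before we walk through it, here is a toy sketch in Python. Nothing in it comes from any historical system: the two dictionaries and the trivial reordering step are assumptions just big enough to handle the example sentence used in the rest of this post.

```python
# A toy three-step translator. Every resource here is a deliberate
# simplification: the POS dictionary, the identity "reordering", and the
# word dictionary only cover this one sentence.

POS = {"The": "DT", "weather": "NN", "in": "IN",
       "Montreal": "NNP", "is": "VBZ", "cloudy": "JJ", ".": "."}
FR = {"The": "La", "weather": "météo", "in": "à",
      "Montreal": "Montréal", "is": "est", "cloudy": "nuageuse", ".": "."}

def translate(sentence: str) -> str:
    words = sentence.replace(".", " .").split()
    tagged = [(w, POS[w]) for w in words]    # step 1: extract syntax (here, just POS tags)
    reordered = [w for w, _tag in tagged]    # step 2: French keeps the English order here
    translated = [FR[w] for w in reordered]  # step 3: word-for-word dictionary lookup
    return " ".join(translated).replace(" .", ".")

print(translate("The weather in Montreal is cloudy."))  # La météo à Montréal est nuageuse.
```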

Suppose we are working for the Canadian government, translating their weather reports. We are given the sentence “The weather in Montreal is cloudy.” and we would like to translate it into French. I personally know no French and I am not a native English speaker, so I’ll do it the slow way. What I have in hand is an English-French dictionary, an introductory grammar book, and a friend of mine, George, who knows a little French.

Extract syntactic (grammar) information from the sentence

Part of Speech

The first thing I need is the Part of Speech (POS) of each word. Words are grouped into different POS categories based on how they function grammatically in a language: Nouns, Verbs, Adjectives, etc. I go through the dictionary and look up the first word “The”. It tells me that “The” is a Determiner, a part of speech that also includes “a” and “an”. I go to the next word “weather” and do the same thing. Soon I have labeled the part of speech for every word in the sentence: “The/DT weather/NN in/IN Montreal/NNP is/VBZ cloudy/JJ ./.”. Don’t worry if you don’t know what all those DT, NN, and other tags stand for, because I didn’t either. They are abbreviations for English parts of speech. For a complete list of English POS tags please visit here.
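The tags above were produced with the Stanford Parser (see the citations at the end of this post). As a convenient modern stand-in, the same labeling can be reproduced in a few lines with the NLTK Python library. A minimal sketch, assuming NLTK is installed (resource names can vary slightly across NLTK versions):

```python
import nltk

# One-time downloads of the tokenizer and tagger models.
nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

tokens = nltk.word_tokenize("The weather in Montreal is cloudy.")
print(nltk.pos_tag(tokens))
# [('The', 'DT'), ('weather', 'NN'), ('in', 'IN'), ('Montreal', 'NNP'),
#  ('is', 'VBZ'), ('cloudy', 'JJ'), ('.', '.')]
```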

Parse Tree

The next thing I need is a parse tree for the sentence. A parse tree represents the grammar structure of a given piece of text. To make things easier, let’s first look at parse trees for math expressions, which are, believe it or not, much simpler than those for language. To compute x + y * z, you know that you first need to compute y * z before doing the addition. The parse tree indicates whether to do the addition or the multiplication first. It works the same way when applied to natural languages – it tells you things like which noun an adjective modifies and which word is the subject of the sentence.

The correct parse tree for x + y * z is on the left. The right one represents (x + y) * z instead.

Math Parse Tree

Going back to our example “The weather in Montreal is cloudy.”, I know intuitively that the word “The” and the word “weather” should go together, because “The” is the determiner for the noun “weather”. According to the grammar book, Determiner + Noun => Noun Phrase, or written the fancy way, like a linguist, DT + NN => NP. Similarly, I know “in” and “Montreal” should go together, and the grammar book tells me Preposition + Noun Phrase => Preposition Phrase. I repeat the process until I reach the final step: Noun Phrase + Verb Phrase + . => a complete sentence! The whole parse tree, drawn out, looks something like the graph below:

Parse tree for “The weather in Montreal is cloudy.”

Example Parse Tree
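If you want a machine to build this tree for you, the grammar-book rules above can be written down as a context-free grammar. Here is a minimal sketch using NLTK’s chart parser; the rules and the tiny lexicon are assumptions just big enough to cover our one sentence, whereas real systems carried thousands of rules:

```python
import nltk

# A toy grammar: exactly the rules from our grammar book, plus a lexicon
# that only knows the words of the example sentence.
grammar = nltk.CFG.fromstring("""
    S    -> NP VP PUNC
    NP   -> DT NN PP | NNP
    PP   -> IN NP
    VP   -> VBZ JJ
    DT   -> 'The'
    NN   -> 'weather'
    IN   -> 'in'
    NNP  -> 'Montreal'
    VBZ  -> 'is'
    JJ   -> 'cloudy'
    PUNC -> '.'
""")

parser = nltk.ChartParser(grammar)
tokens = ["The", "weather", "in", "Montreal", "is", "cloudy", "."]
for tree in parser.parse(tokens):
    tree.pretty_print()  # draws the parse tree as ASCII art
```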

Reassemble the syntactic structure into that of the target language

Now I need to turn the parse tree into a French sentence structure. I ask my friend George: “What is the corresponding sentence structure for this parse tree?” He looks at it for a second, then kindly replies, “Same thing as in English. You know there’s a thing called Google Translate, right?” Ignoring his comment, I copy down “The/DT weather/NN in/IN Montreal/NNP is/VBZ cloudy/JJ ./.”. If you ignore the English words and just look at the POS tags, that is the sentence structure in French, which happens to be the same as in English for this sentence.

Translate each word in source language into target language

I take out my English-French dictionary and look up the French word for “The”. It gives me “La”. Then I go on to “weather” and it gives me “météo”. I repeat until I have a translation for every word. It ends up looking like “La/DT météo/NN à/IN Montréal/NNP est/VBZ nuageuse/JJ ./.”. After I remove the POS tags, it becomes “La météo à Montréal est nuageuse.”, and it looks correct.

Finite State Machine and Dictionaries

In the sections above, we showed how the three-step machine translation system works, and we introduced two important linguistic concepts along the way: Part of Speech (POS) and the parse tree. POS tells us whether a word is a Noun, a Verb, or something else. A parse tree is a structure that represents the relations among words that we hold in our minds. Additionally, in each of the three steps we made some assumptions about the resources we have at hand.

In the first step, we assumed that we have a dictionary in which we can look up a word’s POS. Previously we mentioned that words can have different meanings and/or parts of speech, but because we are only translating texts on one specific topic, as in the case of the METEO System, words and their POS usually have little ambiguity. Therefore, we can store a POS dictionary in the machine without having to worry about choosing the correct POS from many possible ones.

Next, after tagging the POS of each word, we constructed a parse tree of the sentence. As humans, we know which word goes along with which – the correct parse tree just makes sense. But a machine has to rely on rules and logic, and it has to have internal states to record things like “Did I see a verb? Is the sentence missing an object? I just saw a ‘the’, so I should be expecting a noun next.”. A machine that contains internal states and operates according to a set of rules governing transitions among those states is called a Finite-State Machine (FSM).

Some of the best examples for understanding FSMs come from video game AI. Take Pacman: the FSM controlling each enemy unit in Pacman basically has only 7 states, and it decides to chase or avoid Pacman based on whether Pacman ate a pill. “Chase Pacman” and “avoid Pacman” are states, and “Pacman ate a pill” is a transition rule. Once a transition rule is satisfied, it takes the machine to a new state. The FSM for language is more complicated than Pacman AI, but it works the same way – at each step the machine starts in its current state, checks the rules to see which one fits the current situation, proceeds to the next state, checks the rules again, and repeats until a terminal state is reached. FSMs are widely used in Computer Science, and you should look them up if interested.

Pacman AI FSM

Pacman AI FSM
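To make the mechanics concrete, here is a minimal two-state ghost FSM in Python. It is a deliberate simplification of the real Pacman AI (which, as noted above, has about 7 states): only the chase/avoid pair and one transition rule in each direction are modeled.

```python
from enum import Enum, auto

class GhostState(Enum):
    CHASE = auto()   # hunt Pacman
    AVOID = auto()   # flee after Pacman eats a power pill

def step(state: GhostState, pill_eaten: bool, pill_worn_off: bool) -> GhostState:
    """Check the transition rules and move to the next state."""
    if state is GhostState.CHASE and pill_eaten:
        return GhostState.AVOID
    if state is GhostState.AVOID and pill_worn_off:
        return GhostState.CHASE
    return state     # no rule fired: stay in the current state

state = GhostState.CHASE
state = step(state, pill_eaten=True, pill_worn_off=False)
print(state)  # GhostState.AVOID
```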

Going back to machine translation: we just used an FSM to generate a parse tree. Then, in the second step, we introduced my friend George, who derived the French grammar structure from the parse tree. Sadly, the machine translation system did not have any friends, but the same job can also be done by an FSM. Previously we used an FSM to construct a parse tree from a sentence; now the task is just the other way around. The input is the parse tree and the output is the sentence order of the target language. There isn’t much technical difference one way or the other.
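A minimal sketch of that reverse step, assuming the parse tree is an nltk.Tree and that reassembly can be expressed as a simple permutation of each node’s children. The TRANSFER table and the example rule in the comment are my own illustration; real transfer components were far richer than this:

```python
import nltk

# Hypothetical transfer table: for each phrase label, the order in which
# its children should appear in the target language. English -> French
# needs no reordering for our sentence, so the table is empty; an
# English -> Japanese system might use {"VP": (1, 0)} to move the verb
# behind its complement.
TRANSFER: dict[str, tuple[int, ...]] = {}

def reorder(node):
    """Recursively rewrite a parse tree into target-language word order."""
    if isinstance(node, str):        # a leaf, i.e. a word
        return node
    kids = [reorder(kid) for kid in node]
    order = TRANSFER.get(node.label(), range(len(kids)))
    return nltk.Tree(node.label(), [kids[i] for i in order])
```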

Lastly, I used an English-French dictionary. As with the POS dictionary from the first step, the machine can store and use such a dictionary easily, without worrying about ambiguity issues. We’ve now gone through all the resources used in our three-step translation system: a machine can do the translation for us with FSMs and dictionaries.

Downsides of FSM

Comparing ELIZA to machine translation systems like METEO, we can see that replacing pattern matching with FSMs worked. The system was compact enough to fit into a computer of that time and robust enough to handle tasks confined to a certain domain. But what were the downsides of FSMs? Why couldn’t the same approach be applied to a wider range of tasks beyond translating weather reports or government documents?

One of the main problems with FSMs is that they need programmers and scientists to write the rules and states by hand. Furthermore, the quality of the rules and states directly determines the quality of the output. If the rules are too specific, they will only work for a small number of sentences, and researchers will need to write lots of rules to cover all cases. On the other hand, if the rules are too general, the system will have a higher error rate because of the large number of situations a rule can be erroneously applied to. It is hard even for experts to find the right balance between quality and quantity. And if you want to expand the same system to a broader range of tasks, there will be more possible inputs, which means more ambiguity, more rules to handle them, and more human effort.

Example AI bug from YouTube

Assassin's Creed AI bug

Again, I will use video game AI to illustrate the downsides of FSMs. AI in video games needs to handle a lot of complicated situations, which is why it often has bugs. I once saw an AI character in an RPG world who all of a sudden lost its dreams and started repeatedly running right into a wall for a few seconds, backing off, and running into the wall again. Here’s how a badly designed FSM-based AI could cause such a bug:

Diagram of a badly designed FSM causing an AI bug

AI bug caused by badly designed FSM

Bound by the inherent limits of FSM-based NLP systems, early translation software could only be applied to a specific domain. If I want to translate weather reports, I can use software A, but if I want to translate, say, restaurant menus, I have to switch to software B. Struggling, some computer scientists started to wonder: “Can we ask the machines to generate the rules for us while we lie on the couch and chill?” The funny thing is, those people later became the pioneers of Machine Learning. They brought NLP to life as we know it today, and their work will be the topic of my next blog.

P.S.

Formally, the FSM-based systems described above were called Augmented Transition Networks (by Thorne and Woods).

Citation

Machine translation: a concise history – John Hutchins

THE METEO SYSTEM

I used the Stanford Parser for parsing the example sentence. I used the Syntax Tree Generator webpage to generate Syntax Tree graphs. I used gif.com to generate the gif from this AI bug video.

Image source:

Pacman image

Math parse tree

Cookie monster image


Simple NLP – Language Invention

Simple NLP – 2

Author: Jerry Li

Previous Blog: A Brief History – the Beginning

Linguistics – Language Invention

If I tell you that language can be invented, you probably won’t be surprised. You see examples all the time of how parts of a language change, like new phrases being invented on social media. You and your friends may have secret phrases referring to a memorable moment. J. R. R. Tolkien, the author of The Lord of the Rings, constructed languages that only exist in his fictional world, like Elvish, Dwarvish, and others. It’s really cool for one single person to be able to develop such a rich system of languages just for his novels, but what’s even more amazing is that Tolkien is not the only one with the ability to invent a language. In fact, everyone can, if they are put in the right environment at the right age.

Elvish on the One Ring from The Lord of the Rings

The Ring

What is language?

Before delving into the story of how a language can be created, let’s first talk about what exactly language is. If language is just a system for communication, then do other animals have language? Dogs bark at each other. Birds sing. Bees use dances as well as chemicals to tell others about food sources or enemies that may be miles away. So, what’s the difference? It is widely believed that only humans, with the largest brain-to-body-weight ratio of all species, have the ability to learn and invent such a complex communication system. Even though some animals, like Kanzi and Chantek, learned human sign languages after training, their language ability only compares to that of a 4-year-old child. Sentience is the prerequisite for using a language: a language is a system expressive enough to communicate our sophisticated thoughts.

Chantek the orangutan who learned American Sign Language

Chantek

Is Language Innate?

Humans can do lots of things other animals cannot. We can cook, write, do math, farm, etc. Those are mostly considered technologies rather than innate human abilities. Is language one of those technologies, discovered by someone by accident, or is it more like the ability to walk – innate to all humans without formal teaching? So far, the innateness of language ability is still under debate among scientists, but there are a few pieces of evidence suggesting that at least some knowledge about language is written into our brains rather than acquired through learning.

Universal Traits in All Human Languages

Isn’t it kind of amazing that no matter where we are born, we all walk more or less the same way? This fact indicates that there is something universal about how muscles and the brain work together when we walk. Similarly for languages: if we can find universal traits shared by all languages, that is an indication that at least something about the languages we use is written in our genes.

The Ethnologue catalogue of world languages, one of the best linguistic resources, says that there are around 6,909 living languages in the world (from Number of languages). All of them differ in one way or another. For instance, the grammar structure of a sentence is not necessarily the same across languages. In Japanese, the verb is put at the end of the sentence, but in English, the verb is put between the subject and the object. Here’s an example:

I ate an apple.

私 は リンゴ を 食べた。

I (Topic marker) apple (object marker) ate.

As another example, according to Linguistics Society, in Welsh, the usual order is for the verb to come first, followed by the subject, followed in turn by the object:

The student bought the book.

prynodd y myfyriwr y llyfr

bought the student the book

The pronunciation of a language seems quite arbitrary as well, as shown by the fact that different languages usually call the same animals by different names. In addition, words were probably pronounced differently centuries ago than they are now. Grammar is no exception: if you have ever learned grammar or tried to explain it to a non-native speaker, you’ll soon find that some of it, too, seems arbitrary. “That’s just how English works,” I was often told when I learned English.

So, out of the 6,909 living languages, and even more dead ones, are there any traits shared by all of them?

There are. In fact, scientists have found quite a few. Here are four of them:

  • All languages have nouns, verbs, objects, and pronouns (like I, we, they).
  • All languages have at least two vowels (vowels are like a, i, u, e, o etc.).
  • All languages have at least three sizes of grammatical units: word, phrase, and clause.
  • “If a language distinguishes dual number (a grammatical category indicating “two”) in pronouns, it also distinguishes plural number.” (From scribd.com).

Those facts seem to suggest that the basic structure of human language is written in our genes. This is the central idea of Universal Grammar, proposed by Noam Chomsky, one of the most famous linguists in history. Even more remarkably, when brand-new languages are invented, they follow the same commonalities.

Inventing a Language

In the 16th century, not long after the (re-)discovery of the New World by Columbus, the brutal slave trade that abducted 10 million Africans began. Those unfortunate African slaves who would spend the rest of their lives on American plantations came from all over Africa and did not speak the same language among themselves. Indeed, slaves from different regions or tribes were placed together precisely because they could not communicate with one another – no communication, no rebellion. To work with other slaves and receive orders from their masters, most first-generation African slaves picked up pieces of language from the slave masters: short phrases, words, and sentence fragments with limited vocabulary and no unified grammar structure.

This kind of broken language is called a pidgin. Pidgin languages can be found everywhere around the world when two or more groups of people interact with one another without a common language base. Words are borrowed from other languages and are adapted to serve new purposes.

According to eyeofhawaii.com, in Hawaiian pidgin the word “brah” (which is also used in contemporary English slang) means “brother” and the word “cockaroach” means “to steal”.

Back to the American plantations. When slaves married, usually also between two people speaking different mother tongues, the couple communicated with each other and with their children using a pidgin. The children growing up listening to pidgin then did something that fascinated linguists.

When the children heard those fragmented words and phrases, they spontaneously filled in the missing grammar. For example, if their parents’ pidgin sentence seemed to be missing an implied object, the kids would fill it in. If the parents did not know the word for some object, the kids reused and combined other words. According to The Language Instinct, simple verbs in pidgin such as “go”, “stay”, and “came” are used systematically in Hawaiian Creole grammar as auxiliaries, prepositions, case markers, and relative pronouns. Moreover, “The English past tense ending -ed may have evolved from the verb do: He hammered was originally something like He hammer-did.”

Furthermore, when the slaves’ children got together, their languages started to merge and form a new language. What if the kids didn’t like how a word sounded to them? They just came up with a new one and started using it. If they found a piece of grammar counterintuitive, they spoke in whatever grammar felt right to them. This kind of language is called a creole – the full language that emerges when a pidgin becomes a generation’s mother tongue. Those children invented their own brand-new language within a generation, with their own set of words, new grammar, and a new group of people to speak it.

Those children of slaves were the inventors of new languages.

Slave Family

Creole languages have made a deep impact on how we use language today. Some linguists believe that the Black English widely spoken today among African Americans, also known as Black Vernacular English (BVE), probably began as an English-based creole language. Sentences like “Don’t nobody know the answer” and “Ain’t nothing going on” are grammatically correct in BVE, but not in standard English. Creole languages are excellent examples of how languages are constantly borrowed, created, and adapted. And guess what: all creole languages also adhere to the same set of traits common to all other human languages, even though most of their inventors never even went to elementary school.

In the next blog of the Linguistics series, I plan to show some interesting facts about children learning language, which will provide some insight into how humans perceive language and how we can use that information for NLP research. If you have topics you’d like to read about, just let me know. Thanks for reading!

Citation

The Language Instinct

History of African American English in the U.S.

Pidgin language example

Nigerian Pidgin Wiki

Image source:

The Ring

Chantek

Slave family


Simple NLP – the Beginning

Simple NLP – 1

Author: Jerry Li

Previous Blog

This is the dog, that worried the cat, that killed the rat, that ate the malt, that lay in the house that Jack built.

Mother Goose, The house that Jack built

This is the malt that the rat that the cat that the dog worried killed ate.

Victor H. Yngve (1960)

Both quotes are taken from Speech and Language Processing by Jurafsky and Martin

A Brief History – the Beginning

1949

It was 1949. The Second World War had been over for four years, yet the whole world was still recovering from its wounds. Many first-generation computer scientists, including Warren Weaver, Alan Turing, and Claude Shannon, had worked on cryptography during WWII. Their work helped the Allies decipher important information, which contributed to the final victory, and it put those scientists on the cutting edge of communication, linguistics, and math. However, there was one problem: there were no more German or Japanese secret messages to work on. The 55-year-old Warren Weaver decided to put his knowledge to good use. Inspired by his cryptography research during the war, he proposed that all human languages could be treated as encrypted text and “decrypted”, that is, translated, automatically into other languages by machines, just as was done during WWII. German was essentially encrypted English. Encouraged by his fellow mathematicians and scientists, he published his idea in a memorandum called “Translation”, from which the modern fields of Natural Language Processing (NLP) and machine translation were born.

WWII Cryptography

Realizing the potential impact of NLP, the US government funded the research in the hope of having a working system within a decade or two. After all, if a two-year-old can understand language, surely it couldn’t be difficult for a machine built and taught by the smartest minds in the world? The 1940s to 1960s were a period filled with boundless trust in science and its ability to bring progress to human society.

ELIZA

After a few years and millions of dollars, a chatbot named ELIZA was developed; it represented the best NLP technology of its time. You can try a demo by visiting this page. A sample dialog is below:

Person : Men are all alike.

ELIZA : In what way?

Person : They’re always bugging us about something or other.

ELIZA : Can you think of a specific example?

Person : Well, my boyfriend made me come here.

ELIZA : Your boyfriend made you come here?

Person : He says I’m depressed much of the time.

ELIZA : I am sorry to hear that you are depressed.

Person : It’s true. I am unhappy.

ELIZA : Do you think coming here will help you not to be unhappy?

Person : I need some help, that much seems certain.

ELIZA : What would it mean to you if you got some help?

Person : Perhaps I could learn to get along with my mother.

ELIZA : Tell me more about your family

If you spend 5 minutes playing around with it, you’ll soon notice that something is not quite right: ELIZA tends to repeat itself a lot.

ELIZA works by finding simple patterns and keywords in the provided input. For example, if ELIZA sees the word “alike” or “same”, the reply will be “In what way?”. If it sees the pattern “My {family-member} is Y”, ELIZA will ask “Who else in your family is Y?”. Whenever it cannot find any pre-programmed keywords and has no idea what you just said, it replies “I see.”. It is a program designed specifically for the Turing Test, without actually understanding anything about the conversation. The only thing it needed to do was trick humans into thinking they were talking with another human across the screen. Natural language was simply too much for a computer of that time to really handle. After all, computers were built first to, well, compute. They were good at math but not too much into talking.
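A minimal sketch of this kind of pattern matching in Python: the first two rules are lifted straight from the examples above, the regular expressions and names are my own illustration, and the real ELIZA script was much larger.

```python
import re

# Pattern/response rules, tried top to bottom; "I see." is the fallback.
RULES = [
    (re.compile(r"\b(alike|same)\b", re.I),
     lambda m: "In what way?"),
    (re.compile(r"\bmy (?:mother|father|sister|brother) is ([^.!?]+)", re.I),
     lambda m: f"Who else in your family is {m.group(1).strip()}?"),
]

def eliza_reply(text: str) -> str:
    for pattern, respond in RULES:
        match = pattern.search(text)
        if match:
            return respond(match)
    return "I see."  # no keyword matched

print(eliza_reply("Men are all alike."))           # In what way?
print(eliza_reply("My brother is mean to me."))    # Who else in your family is mean to me?
print(eliza_reply("I had cereal for breakfast."))  # I see.
```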

Turing Test Diagram:

The Turing Test is a test for machine intelligence. If the judge cannot tell whether the response comes from a machine or from a human, then the machine passes the test.

Turing Test Diagram

What was preventing the best scientists from inventing a translation machine? As researchers approached this unknown field, they found two great walls standing between human language capacity and that of a computer, namely ambiguity and infinity.

You: Call me a cab.

Bot: Ok, you are a cab.

Ambiguity

The inherent ambiguity of language is everywhere, yet we seldom notice it until we work with a machine. One joke popular among linguists goes like this: “One morning I shot an elephant in my pajamas. How he got in my pajamas, I don’t know.” The joke plays on the ambiguity of English: am I the one wearing my pajamas, or is it the elephant? Both explanations are possible, although the second is likely only in a Disney movie where elephants wear pajamas. We humans often don’t even notice the ambiguity.

Here’s another example. If you speak any language other than English, I invite you to pull up your favourite translation software or website and type in the following sentence: “I can’t believe my 15-year-old son grew another foot last summer.” I haven’t yet found a single translation system that gives the right translation for this sentence. (If you have, I’ll be super interested to know. And for those confused, “foot” here is used as a unit of length, not a physical part of an animal.)

Infinity

I call the second obstacle “infinity” to suggest the sheer complexity of any human language system. Unlike ambiguity, where more than one explanation makes sense and we are asked to pick the one the speaker most likely meant, infinity is about finding the few potentially correct interpretations among millions of nonsensical ones.

When I say millions, I literally mean millions. Consider an English sentence with 15 words. Assuming each word has 3 different possible meanings and/or parts of speech, we’re facing 3^15, which is around 14 million possibilities. And that’s just for one sentence. For comparison, the IBM 7090, which sold for $2.9 million in 1960, ran at 0.1 million FLOPS. Reasonable approximations tell you that processing just one sentence at that rate would take anywhere from around 2 minutes to 1 hour.
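The back-of-the-envelope arithmetic behind that range, assuming one floating-point operation per candidate reading at the low end and a few dozen at the high end:

```python
interpretations = 3 ** 15    # 14,348,907 candidate readings of the sentence
flops = 0.1e6                # IBM 7090: roughly 100,000 operations per second

print(interpretations / flops)              # ~143 s, a bit over 2 minutes, at 1 op per reading
print(interpretations * 25 / flops / 3600)  # ~1 hour at ~25 ops per reading
```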

What further complicates things is that sentences can be arbitrarily long. One example is the sentence quoted at the very beginning of this blog: “This is the dog, that worried the cat, that killed the rat, that ate the malt, that lay in the house that Jack built.” In theory, a sentence can be made infinitely long by adding more parts onto it, although in practice people seldom do so.

The NLP Ice Age

These two huge walls, ambiguity and infinity, still stand tall and firm before every NLP researcher today, casting a long shadow over the NLP community. In 1966, the Automatic Language Processing Advisory Committee (ALPAC) “publishes a report on Machine Translation concluding that years of research haven’t produced useful results” and the US government decided to cease funding NLP research (cited from Machine Translation’s Past and Future). It was a bleak time for NLP.

The question I’ll try to answer in my next blog is: how and when did things get better? Thanks for reading.

Citation

I’ve gathered information and drawn inspiration from the following sources:

Speech and Language Processing by Daniel Jurafsky & James H. Martin

Portland University ECE 478/578/678 slides

Introduction to Computational Linguistics and Natural Language Processing by Mark Johnson

Natural language processing: a historical review by Karen Sparck Jones

2. Natural Language Processing (NLP) – Tees

Machine Translation’s Past and Future

In the future, I will add new citations to this list.

Image sources:

Turing Test

WWII Cryptography Enigma machine


Simple NLP – Preface

Simple NLP – 0

Author: Jerry Li

Hi there! This is the introduction. Click HERE to jump to the first chapter.

For the Chinese version, click here.

If you’d like to read a Japanese version, don’t hesitate to let me know!

Preface

There is tons of news floating around about AI these days. Indeed, it has become a trend to associate any company, product, or even YouTube idol with Machine Learning or some sort of AI. And yet there seems to be a gap between the general public and the AI/Machine Learning field, a gap that to some extent builds the frenzied yet mysterious atmosphere around it.

AI in the past vs. AI now

HAL9000 / Kizuna Ai

While it is good to see my friends interested in AI one way or another, I remember just as clearly people (including myself) complaining that AI does not make sense. For researchers, scientists, and engineers, that complaint may be about artificial neural nets, a Machine Learning model inspired by neurobiology. For others, it is often about the field in general. Yes, AI nowadays can (finally) learn things from data, but not much effort has gone into explaining, in language anyone can understand, what AI can or cannot learn, and why.

I hope to fill in the knowledge gap between the public and the people doing cutting-edge work in research and industry. This series of blogs is intended for people interested in Machine Learning and/or AI who don’t want to spend 4+ years in college or whole weekends taking online courses on the subject. No prior experience in Math, Computer Science, or Linguistics is needed. However, the series is not intended to prepare you for working in the field; I chose to omit most technical details to keep the reading easy to understand. My hope is that after you’ve read the blog, you can say, “Ha, I know a little bit more about how Siri works now!”

I chose Natural Language Processing (NLP) as the topic for this series, both because of my personal interest and because it is probably one of the most common pieces of AI people use daily. NLP is a field of research on the interaction between computers and human natural languages like English, Spanish, Hindi, etc. (Sadly, C++ and Python do not count, but they will when machines finally take over the world as Elon Musk has predicted. Muwhahaha…) You’ve likely heard of or used personal assistants on smartphones like Google Now and Siri. If you have learned a foreign language or traveled abroad, you have probably used translation software. Even something as simple as the word counter in Microsoft Word uses NLP technology. (Counting words or sentences is not trivial, even in English, where words are usually separated by spaces.)

Here are some topics I am planning to talk about (not necessarily in order):

  • History of NLP
  • Linguistics (the fun part at least)
  • Latest developments in research and industry
  • Deep (and shallow) Learning in NLP
  • Other NLP Applications
  • Other topics you are interested in

P.S.

I would not call myself an expert in the field, and there will likely be mistakes in my blogs. If you find one, please feel free to let me know. Also, if you have any suggestions, questions, or concerns, don’t hesitate to reach out! All writers like feedback from their readers.

One last thing: feel free to repost this series elsewhere if you like it. Just cite the author. Thanks!

Image sources:

HAL9000

Kizuna Ai