Tuesday, April 19, 2016

Natural Language Processing. Brief intro

For the last year, I've been working with Natural Language Processing (mostly with Deep Learning), and I've decided to write a series of blog posts describing the most popular ideas in the field. So, let's start from the very beginning.

Natural Language Processing (NLP) is a field at the intersection of computer science, artificial intelligence and linguistics. The main goal of NLP is to "understand" natural language in order to perform some useful task, like question answering.

Some examples of NLP applications:
  • Spell checking, keyword search, finding synonyms
  • Extracting information from websites, such as times, product prices, dates, locations, and names of people or companies
  • Classifying texts 
  • Text summarisation
  • Finding similar texts
  • Sentiment analysis
  • Machine translation
  • Search
  • Spoken dialog systems
  • Complex query answering
  • Speech recognition
Texts can be analyzed at different levels: phonemes, morphemes, words, sub-sentences, sentences, paragraphs and whole documents.
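To make a few of these levels concrete, here is a minimal Python sketch that splits a text into paragraphs, sentences and words. The function name split_levels and the regular expressions are my own illustrative assumptions; real tokenizers handle abbreviations, numbers and punctuation far more carefully.

```python
import re


def split_levels(text):
    """Split text into paragraphs -> sentences -> words (naive rules)."""
    # Assumption: paragraphs are separated by at least one blank line.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    result = []
    for paragraph in paragraphs:
        # Assumption: a sentence ends with '.', '!' or '?' followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
        # Assumption: a word is a run of letters, digits or underscores.
        result.append([re.findall(r"\w+", s) for s in sentences])
    return result


if __name__ == "__main__":
    sample = "NLP is hard. It is fun!\n\nDeep learning helps."
    for paragraph in split_levels(sample):
        print(paragraph)
```

The result is a nested list: one entry per paragraph, each holding one list of words per sentence. Phonemes and morphemes are not covered here, since they need linguistic resources beyond simple pattern matching.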
From a linguistic point of view, analysis can be done at these levels:
  • Syntax (what is grammatical)
  • Semantics (what does it mean)
  • Pragmatics (what does it do)
Many clever algorithms have been developed for these various tasks.

NLP is hard, first of all because of:
  • ambiguity - more than one possible (precise) interpretation (e.g. "Foreigners are hunting dogs": either foreigners hunt dogs, or foreigners are dogs that hunt)
  • vagueness - the text does not specify full information
  • uncertainty - due to imperfect statistical models
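
The ambiguity point can be illustrated with a small sketch. The toy grammar, the lexicon and the function name count_parses below are all my own assumptions, written only for this example sentence; the code counts the parse trees of a sentence with a CYK-style chart over a grammar in Chomsky normal form.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left, right).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "Aux", "VPing"),   # "are" + "hunting dogs" as a verb phrase
    ("VPing", "Ving", "NP"),  # "hunting" as a verb with an object
    ("VP", "Cop", "NP"),      # "are" as a copula + noun phrase
    ("NP", "Adj", "N"),       # "hunting" as a modifier of "dogs"
]

# Hypothetical lexicon: each word may carry several categories.
LEXICON = {
    "foreigners": ["NP"],
    "are": ["Aux", "Cop"],
    "hunting": ["Ving", "Adj"],
    "dogs": ["NP", "N"],
}


def count_parses(sentence, start="S"):
    """Count the distinct parse trees of a sentence under the toy grammar."""
    words = sentence.lower().split()
    n = len(words)
    # chart[(i, j)][symbol] = number of ways symbol derives words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for tag in LEXICON.get(word, []):
            chart[(i, i + 1)][tag] += 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):  # split point between left and right
                left, right = chart[(i, k)], chart[(k, j)]
                for parent, l_sym, r_sym in BINARY_RULES:
                    if l_sym in left and r_sym in right:
                        chart[(i, j)][parent] += left[l_sym] * right[r_sym]
    return chart[(0, n)][start]


if __name__ == "__main__":
    print(count_parses("Foreigners are hunting dogs"))
```

Under this toy grammar the sentence gets two parses, one for each reading, while a system still has to pick the intended one from context.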

In the mid-2010s, neural nets became successful in NLP. Why did it happen?
I'll describe the main ideas of deep learning techniques for NLP in the next post :)