Bootcamp
Bootcamp is meant to provide a machine learning, deep learning, and natural language processing context as fast as possible, given some basic prerequisites.
Requirements
Before diving in, it is a good idea to refresh your knowledge of:
- Python 3.x: Python Tutorial, A collection of not-so-obvious Python stuff
- Linear algebra: Computational Linear Algebra for Coders
- Probability and statistics: Book
Online tutorials and courses
Start with shorter tutorials to get an overall understanding of the area. I recommend going over these tutorials without getting into details.
- Udacity’s Deep Learning by Google
Basic and fast-paced, but you can take it in one or two days and get an overall feel for DL.
- Kaggle’s Machine Learning tutorial for beginners.
Once you master the essential elements, you are ready for something more challenging and can move into deeper waters to also grasp the NLP elements we need. For now, I recommend taking one of these:
- Oxford Deep NLP 2017 course
- Stanford’s CS224N Natural Language Processing with Deep Learning videos
So far I prefer Oxford’s, but I am still deciding.
Other tutorials to check
Books
In practice, it is unlikely that you will have the time to read whole books. I am listing some here that I personally like and that you can use as reference or support while taking the tutorials.
If you are in a “hurry”:
- Yoav Goldberg (2017) Neural Network Methods for Natural Language Processing. Synthesis Lectures on Human Language Technologies, Morgan and Claypool Publishers link
Books with online materials
- S. Raschka and V. Mirjalili (2017) Python Machine Learning, 2nd Edition GitHub repo
Reference books
- Hastie, Tibshirani and Friedman (2009) The Elements of Statistical Learning (2nd edition) Springer-Verlag. web
A solid book on the foundations of machine learning.
- Grégoire Montavon, Geneviève B. Orr and Klaus-Robert Müller (2012) Neural Networks: Tricks of the Trade (Second edition). Springer LNCS 7700.
Different applications of neural networks. I liked the book when I read it.
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT press. online in html-ish form and as pdf
It is a little dense for a beginner but very complete. It includes linear algebra and probability “refreshers” at the beginning.
- Dan Jurafsky and James H. Martin (2017) Speech and Language Processing, 3rd edition (in progress) available
A classic NLP book; the draft of the 3rd edition is online, with some chapters still missing.
- Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. 2008. online
Classical CL/IR book with all the basics.
Libraries to master at this level
- NumPy/SciPy
- Matplotlib/seaborn
- scikit-learn
- pandas
- TensorFlow and Keras
- PyTorch
- Jupyter/IPython notebooks
- Natural Language Toolkit
See Technical setup for details on installing.
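Once everything is installed, a quick smoke test like the sketch below can confirm that the core libraries import and work. This is only a sanity check over made-up data; TensorFlow, Keras, and PyTorch are left out since their setup is covered separately.

```python
# Smoke test: if this runs without errors, the core stack is in place.
import matplotlib.pyplot as plt
import nltk
import numpy as np
import pandas as pd
import sklearn

# NumPy: generate some noisy data.
x = np.linspace(0, 10, 100)
y = np.sin(x) + np.random.normal(scale=0.1, size=x.shape)

# pandas: wrap it in a DataFrame and summarize it.
df = pd.DataFrame({"x": x, "y": y})
print(df.describe())

# Matplotlib: plot the data and save the figure to disk.
df.plot(x="x", y="y")
plt.savefig("smoke_test.png")

print("scikit-learn", sklearn.__version__, "| NLTK", nltk.__version__)
```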
Machine learning skill checklist
- Machine Learning basics
- Linear regression
- Linear classification and logistic regression
- Experimental methodology: train/test/validation splits (see the scikit-learn sketch after this checklist)
- Cross-validation
- Performance assessment: Machine Learning and Performance Evaluation — Overcoming the Selection Bias, Video
- Reporting results and visualization
- Parameter optimization, grid search, and its challenges
- Multilayer Perceptrons (MLPs) (slides):
- The need for more than one layer of neurons
- Gradient descent and error backpropagation
- Stochastic gradient descent
- Designing neural networks: the choice of activation function for each layer (see the Keras sketch after this checklist)
- Deep Learning (slides):
- Why you can’t train a plain MLP with many layers: vanishing gradients
- Going deep one layer at a time: stacked auto-encoders.
- Convolutional Neural Networks (CNNs):
- Why MLPs can’t handle images.
- Notion of weight sharing.
- Convolutional layer.
- Pooling layer.
- Recurrent Neural Networks (RNNs):
- Basics from MLPs to RNNs.
- RNN challenges.
- Long short-term memories (LSTMs).
- Teacher forcing.
- Professor forcing.
- Recursive neural networks.
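To make the experimental-methodology items concrete, here is a minimal scikit-learn sketch, using synthetic data and an arbitrary parameter grid, that ties together a held-out test set, cross-validation, and grid search for logistic regression:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

# Synthetic binary classification data, only for illustration.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Hold out a test set that is never touched during model selection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# 5-fold cross-validation estimates performance on the training set.
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X_train, y_train, cv=5)
print("CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))

# Grid search over the regularization strength C.
grid = GridSearchCV(clf, param_grid={"C": [0.01, 0.1, 1, 10]}, cv=5)
grid.fit(X_train, y_train)
print("Best C:", grid.best_params_["C"])

# Report final performance on the untouched test set.
print("Test accuracy: %.3f" % grid.score(X_test, y_test))
```

Keeping the test set out of the grid search is exactly the selection-bias point made in the performance-assessment item above.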
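And for the MLP items, a minimal Keras sketch (assuming TensorFlow 2.x; the architecture, data, and hyperparameters are placeholders): a one-hidden-layer network trained with stochastic gradient descent, with backpropagation computing the gradients:

```python
import numpy as np
import tensorflow as tf

# Synthetic data: 20 features, binary labels, for illustration only.
X = np.random.randn(500, 20).astype("float32")
y = (X[:, 0] + X[:, 1] > 0).astype("float32")

# A small MLP: one hidden layer with a nonlinear activation,
# and a sigmoid output for binary classification.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Plain stochastic gradient descent; backpropagation computes the gradients.
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
              loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2)
```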
Computational linguistics and natural language processing checklist
Note: We have taken some comments and links from Steve’s Glossary.
- CL/NLP main concepts
- information retrieval
- NLP overview wikipedia
- bag of words
- n-grams: see Bigrams, except that n-grams are groups of any number of adjacent characters, words, etc. The larger the groups you analyze, the more specific the information you can extract; however, the corpus size you need grows even faster.
- Term frequency-inverse document frequency (TF-IDF); see the vectorizer sketch at the end of this checklist.
- Corpus (dataset) pre-processing.
- CL/NLP problems
- Sentence segmentation, part-of-speech tagging, and parsing: Natural language processing can be used to analyze the parts of a sentence and better understand its grammatical construction.
- Deep analytics: The application of advanced data processing techniques to extract specific information from large or multi-source data sets. It is particularly useful for precisely targeted or highly complex queries over unstructured and semi-structured data, and is common in the financial sector, the scientific community, and the pharmaceutical and biomedical industries. Increasingly, it is also used by organizations interested in mining business value from expansive sets of consumer data.
- Machine Translation: Natural language processing is increasingly being used for machine translation programs, in which text in one human language is automatically translated into another.
- Named entity extraction: In data mining, a named entity is a word or phrase that clearly identifies one item from a set of other items with similar attributes. Examples include first and last names, ages, geographic locations, addresses, phone numbers, email addresses, and company names. Named entity extraction, sometimes also called named entity recognition, makes it easier to mine data. NER.
- Co-reference resolution: In a chunk of text, co-reference resolution can be used to determine which words are used to refer to the same objects.
- Automatic summarization: Natural language processing can be used to produce a readable summary from a large chunk of text. For example, one might use automatic summarization to produce a short summary of a dense academic article.
See https://en.wikipedia.org/wiki/Natural-language_processing#Major_evaluations_and_tasks for a more detailed list of problems. Descriptions taken from https://www.kdnuggets.com/2015/12/natural-language-processing-101.html
- Application of statistical/machine learning for NLP
- Latent Dirichlet allocation
- Conditional Random Fields
- Hidden Markov Models (HMM)
- Support Vector Machines
- Naïve Bayes classifier; see the text classification sketch at the end of this checklist.
- Metric embedding
- Understanding word embeddings: http://blog.aylien.com/overview-word-embeddings-history-word2vec-cbow-glove/
- word2vec, resources (see the gensim sketch at the end of this checklist)
- GloVe,
- doc2vec,
- pre-trained word embeddings, Facebook embeddings,
- add yours here
- “Deep” NLP
- NER and the Road to Deep Learning
- Using CNNs for text classification, common architectures, and the meaning of 1-D convolution (see the Keras sketch at the end of this checklist).
- Recurrent neural networks and LSTMs for text classification (the LSTM variant in the same sketch).
- Character-level representation
- Limitations of word-based representations: typos, suffixes, etc.
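To ground the bag-of-words, n-gram, and TF-IDF items above, here is a minimal scikit-learn sketch over an invented toy corpus (it assumes scikit-learn 1.0+, where get_feature_names_out is available):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = ["the cat sat on the mat",
          "the dog sat on the log",
          "cats and dogs"]

# Bag of words: each document becomes a vector of raw term counts.
bow = CountVectorizer()
print(bow.fit_transform(corpus).toarray())
print(bow.get_feature_names_out())

# Word n-grams: unigrams and bigrams of adjacent words.
ngrams = CountVectorizer(ngram_range=(1, 2))
ngrams.fit(corpus)
print(ngrams.get_feature_names_out()[:10])

# TF-IDF: counts reweighted so terms common to every document matter less.
tfidf = TfidfVectorizer()
print(tfidf.fit_transform(corpus).toarray().round(2))
```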
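For the statistical-learning items, a sketch of the classic TF-IDF-plus-Naïve-Bayes text classifier; the texts and labels are invented:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# A tiny invented sentiment dataset, purely for illustration.
texts = ["great movie, loved it", "terrible film, awful acting",
         "wonderful and moving", "boring and bad"]
labels = ["pos", "neg", "pos", "neg"]

# TF-IDF features feeding a multinomial Naïve Bayes classifier.
clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(texts, labels)
print(clf.predict(["what a great film", "really awful and boring"]))
```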
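For the word-embedding items, a minimal word2vec sketch with gensim (assuming gensim 4.x, where the dimensionality parameter is vector_size); a real model needs far more text than this toy corpus:

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences. Real training needs
# millions of tokens to produce useful vectors.
sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "sat", "on", "the", "log"],
             ["cats", "and", "dogs", "are", "animals"]]

# sg=1 selects the skip-gram variant; sg=0 would be CBOW.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1)

# Each word now maps to a dense 50-dimensional vector.
print(model.wv["cat"].shape)
print(model.wv.most_similar("cat", topn=3))
```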
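Finally, for the “deep” NLP items, a sketch of the two standard neural text-classification architectures in Keras (assuming TensorFlow 2.x; vocabulary size, sequence length, data, and all hyperparameters are placeholders): an embedding layer feeding either a 1-D convolution or an LSTM:

```python
import numpy as np
import tensorflow as tf

VOCAB, MAXLEN = 1000, 40  # placeholder vocabulary size and sequence length

# Fake integer-encoded sentences with binary labels, for illustration only.
X = np.random.randint(1, VOCAB, size=(200, MAXLEN))
y = np.random.randint(0, 2, size=(200,)).astype("float32")

# CNN variant: the 1-D convolution slides along the word axis, detecting
# n-gram-like patterns; global max-pooling keeps the strongest match.
cnn = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB, 32),
    tf.keras.layers.Conv1D(64, kernel_size=3, activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# RNN variant: an LSTM reads the sequence word by word, carrying state.
rnn = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB, 32),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

for model in (cnn, rnn):
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```

The convolution here is 1-D because it slides over the word dimension only; this is the “meaning of 1-D convolution” point in the checklist, in contrast to the 2-D convolutions used for images.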