Natural Language Processing, 2nd Edition
By Bruno Goncalves
TIME TO COMPLETE: 5h 23m
TOPICS: Natural Language Processing
PUBLISHED BY: Addison-Wesley Professional
PUBLICATION DATE: October 2021
| 5 Hours of Video Instruction
| Overview
| Natural Language Processing LiveLessons covers the fundamentals of
| Natural Language Processing in a simple and intuitive way,
| empowering you to add NLP to your toolkit. Using the powerful NLTK
| package, it gradually moves from the basics of text representation,
| cleaning, topic detection, regular expressions, and sentiment
| analysis before moving on to the Keras deep learning framework to
| explore more advanced topics such as text classification and
| sequence-to-sequence models. After successfully completing these
| lessons you'll be equipped with a fundamental and practical
| understanding of state-of-the-art Natural Language Processing tools
| and algorithms.
| About the Instructor
| Bruno Goncalves is a senior data scientist working at the
| intersection of data science and finance who has been programming
| in Python since 2005. For the past 10 years, his work has focused
| on NLP, computational linguistics applications, and social
| networks.
| Skill Level
Intermediate
| Learn How To
Represent text
Clean text
Understand named entity recognition
Model topics
Conduct sentiment analysis
Utilize text classification
Understand word2vec word embeddings
Define GloVe
Apply transfer learning
Apply language detection
| Who Should Take This Course
Data scientists with an interest in natural language processing
| Course Requirements
Basic algebra, calculus, and statistics, plus programming experience
| Lesson Descriptions
| Lesson 1, Text Representations: The first step in any NLP
| application is the tokenization and representation of text through
| one-hot encodings and bag of words. Naturally, not all words are
| meaningful, so the next step is to remove meaningless stopwords
| and identify the most relevant words for your application using
| TF-IDF. You then move on to n-grams. Finally, you learn
| how word embeddings can be used as semantically meaningful
| representations and finalize things with a practical demo.
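| A minimal sketch of this pipeline, using NLTK for tokenization and
| stopword removal and scikit-learn for TF-IDF with n-grams; the toy
| corpus and variable names are illustrative, not taken from the
| course.

    import nltk
    from nltk.corpus import stopwords
    from nltk.tokenize import word_tokenize
    from sklearn.feature_extraction.text import TfidfVectorizer

    nltk.download("punkt", quiet=True)       # tokenizer models
    nltk.download("stopwords", quiet=True)   # stopword lists

    docs = ["The cat sat on the mat.", "The dog chased the cat."]

    # Tokenize, lowercase, and drop stopwords and punctuation.
    stop = set(stopwords.words("english"))
    tokens = [[w.lower() for w in word_tokenize(d)
               if w.isalpha() and w.lower() not in stop]
              for d in docs]
    print(tokens)

    # TF-IDF over unigrams and bigrams (n-grams).
    vectorizer = TfidfVectorizer(ngram_range=(1, 2))
    X = vectorizer.fit_transform(docs)
    print(vectorizer.get_feature_names_out())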
| Lesson 2, Text Cleaning: Lesson 2 builds on the text
| representations of Lesson 1 by applying stemming and lemmatization
| to identify the roots of words and reduce the size of the
| vocabulary. Next comes deploying regular expressions to identify
| words fitting specific patterns. The lesson finishes up by demoing
| these techniques.
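| As a sketch of these ideas, the snippet below contrasts NLTK's
| Porter stemmer with its WordNet lemmatizer and adds a simple
| regular-expression match; the example words are illustrative.

    import re
    import nltk
    from nltk.stem import PorterStemmer, WordNetLemmatizer

    nltk.download("wordnet", quiet=True)   # lemmatizer data

    stemmer = PorterStemmer()
    lemmatizer = WordNetLemmatizer()

    # Stemming chops suffixes; lemmatization maps to dictionary roots.
    for w in ["running", "studies", "mice"]:
        print(w, "->", stemmer.stem(w), "/", lemmatizer.lemmatize(w))

    # A regular expression that matches capitalized words.
    print(re.findall(r"\b[A-Z][a-z]+\b", "Alice met Bob in Paris."))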
| Lesson 3, Named Entity Recognition: In named entity recognition
| you develop approaches to tag words by the part of speech to which
| they correspond. You also identify meaningful groups of words by
| chunking and chinking before recognizing the named entities that
| are the subject of your text. The lesson ends with a demonstration
| of the entire pipeline from raw text to named entities.
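| A sketch of that pipeline with NLTK's built-in tagger and chunker;
| the sentence is an illustrative example, and the resource names
| assume a recent NLTK release.

    import nltk

    for pkg in ["punkt", "averaged_perceptron_tagger",
                "maxent_ne_chunker", "words"]:
        nltk.download(pkg, quiet=True)

    tokens = nltk.word_tokenize("Apple is opening a new office in London.")
    tagged = nltk.pos_tag(tokens)   # part-of-speech tags
    tree = nltk.ne_chunk(tagged)    # group tagged words into named entities

    # Subtrees carry entity labels; plain (token, tag) tuples do not.
    for node in tree:
        if hasattr(node, "label"):
            print(node.label(), " ".join(tok for tok, tag in node.leaves()))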
| Lesson 4, Topic Modeling: Lesson 4 is about developing ways of
| identifying what the main subject or subjects of a text are. It
| begins by exploring explicit semantic analysis to find documents
| mentioning a specific topic and then turns to clustering documents
| according to topics. Latent semantic analysis provides yet another
| powerful way to extract meaning from raw text, as does latent
| Dirichlet allocation. Non-negative matrix factorization enables
| you to identify latent dimensions in the text, make
| recommendations, and measure similarities. Finally, a hands-on demo
| guides you through the process of using all of these techniques.
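| The sketch below runs three of these techniques on a toy corpus
| with scikit-learn (gensim is another common choice); the documents
| and component counts are illustrative.

    from sklearn.feature_extraction.text import (CountVectorizer,
                                                 TfidfVectorizer)
    from sklearn.decomposition import (TruncatedSVD,
                                       LatentDirichletAllocation, NMF)

    docs = ["stocks fall as markets react",
            "team wins the championship game",
            "markets rally after strong earnings",
            "coach praises the winning team"]

    counts = CountVectorizer(stop_words="english").fit_transform(docs)
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)

    # Latent semantic analysis is a truncated SVD of the TF-IDF matrix.
    lsa = TruncatedSVD(n_components=2).fit_transform(tfidf)
    # LDA works on raw counts; NMF factorizes the TF-IDF matrix.
    lda = LatentDirichletAllocation(n_components=2,
                                    random_state=0).fit_transform(counts)
    nmf = NMF(n_components=2, random_state=0).fit_transform(tfidf)
    print(lda.round(2))   # per-document topic mixtures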
| Lesson 5, Sentiment Analysis: After identifying the topics covered
| in a document, the next step is to extract sentiment
| information. In other words, what kind of sentiments are being
| expressed? Are the words used positive or negative? You then
| consider how to handle negations and modifiers and use
| corpus-based approaches to define the valence of each word, as
| demonstrated in the lesson-ending demo.
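| One widely used corpus-based tool is NLTK's VADER analyzer, which
| assigns per-word valences and adjusts them for negations and
| intensity modifiers; the sketch and sentences below are
| illustrative.

    import nltk
    from nltk.sentiment import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)
    sia = SentimentIntensityAnalyzer()

    # Note how the negation flips and the intensifier strengthens the score.
    for s in ["The movie was great!",
              "The movie was not great.",
              "The movie was extremely great!"]:
        print(s, sia.polarity_scores(s)["compound"])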
| Lesson 6, Text Classification: In this lesson you learn how to use
| feed-forward networks and convolutional neural networks to
| classify the sentiment of movie reviews as a test case for how to
| deploy machine learning approaches in the context of NLP. It also
| discusses further applications of this approach before proceeding
| with a hands-on demo.
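| A sketch of a small convolutional classifier in Keras on its
| built-in IMDB movie-review dataset; the layer sizes and epoch count
| are illustrative defaults, not the course's exact architecture.

    from tensorflow import keras
    from tensorflow.keras import layers

    vocab, maxlen = 10000, 200
    (x_tr, y_tr), (x_te, y_te) = keras.datasets.imdb.load_data(num_words=vocab)
    x_tr = keras.preprocessing.sequence.pad_sequences(x_tr, maxlen=maxlen)
    x_te = keras.preprocessing.sequence.pad_sequences(x_te, maxlen=maxlen)

    model = keras.Sequential([
        layers.Embedding(vocab, 64),              # learned word embeddings
        layers.Conv1D(64, 5, activation="relu"),  # convolution over word windows
        layers.GlobalMaxPooling1D(),
        layers.Dense(1, activation="sigmoid"),    # positive/negative sentiment
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_tr, y_tr, epochs=2, batch_size=128,
              validation_data=(x_te, y_te))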
| Lesson 7, Sequence Modeling: Lesson 7 builds on the foundations
| laid in the previous lesson to explore the use of recurrent neural
| network architectures for text classification. It starts with the
| basic RNN architecture before moving on to gated recurrent units
| and long short-term memory. It also includes a discussion of
| auto-encoder models and text generation. The lesson wraps up with
| a demo.
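| The same IMDB setup, with the convolution swapped for a recurrent
| layer, serves as a sketch of these architectures; the
| hyperparameters are again illustrative.

    from tensorflow import keras
    from tensorflow.keras import layers

    vocab, maxlen = 10000, 200
    (x_tr, y_tr), _ = keras.datasets.imdb.load_data(num_words=vocab)
    x_tr = keras.preprocessing.sequence.pad_sequences(x_tr, maxlen=maxlen)

    model = keras.Sequential([
        layers.Embedding(vocab, 64),
        # Long short-term memory; swap in layers.GRU(64) for gated
        # recurrent units, or layers.SimpleRNN(64) for the basic RNN.
        layers.LSTM(64),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_tr, y_tr, epochs=2, batch_size=128, validation_split=0.2)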
| Lesson 8, Applications: This course has focused on some
| fundamental and not-so-fundamental tools of natural language
| processing. This final lesson considers specific applications and
| advanced topics. Perhaps one of the most important developments in
| NLP in recent years is the popularization of word embeddings in
| general and word2vec in particular. This enables you to delve
| deeper into vector representations of words and concepts and how
| semantic relations can be expressed through vector algebra. GloVe
| is the main competitor to word2vec, so this lesson also explores
| its advantages and disadvantages. Also discussed are the potential
| applications of transfer learning to NLP and the question of
| language detection. The lesson finishes with a demo.
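| As a sketch of vector algebra on word embeddings, the snippet below
| loads pretrained GloVe vectors through gensim's downloader; the
| model id is a standard gensim-data name, and any pretrained
| embedding would illustrate the same point.

    import gensim.downloader as api

    vectors = api.load("glove-wiki-gigaword-100")

    # Semantic relations as arithmetic: king - man + woman ~ queen.
    print(vectors.most_similar(positive=["king", "woman"],
                               negative=["man"], topn=1))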