Natural Language Processing for Semantic Search

James Briggs

Learn how to make machines understand language as people do. This free course covers everything you need to build state-of-the-art language models, from machine translation to question-answering, and more.

Introduction

Semantic search has long been a critical component in the technology stacks of giants such as Google, Amazon, and Netflix. The recent democratization of these tools has ignited a search renaissance, and once-guarded technologies are now being discovered and quickly adopted by organizations across every imaginable industry.

Why the explosion of interest in semantic search? It is an essential ingredient in many products and applications, the scope of which is broad and still growing. Search engines, autocorrect, translation, recommendation engines, and error logging already rely heavily on semantic search. Any tool that benefits from meaning-aware language search or clustering can be supercharged by it.

Two pillars support semantic search: vector search and NLP. In this course, we focus on the pillar of NLP and how it puts the ‘semantic’ in semantic search. We introduce concepts and theory throughout the course before backing them up with real, industry-standard code and libraries.

You will learn what dense vectors are and why they’re fundamental to NLP and semantic search. We cover how to build state-of-the-art language models for semantic similarity, multilingual embeddings, unsupervised training, and more. You will also learn how to apply these techniques in the real world, where suitable datasets and massive computing power are often lacking.
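
To give a concrete taste of dense vectors before diving in, here is a minimal sketch using the sentence-transformers library; the model choice and example sentences are illustrative assumptions, not course material:

```python
from sentence_transformers import SentenceTransformer, util

# Load a pretrained sentence transformer (model choice is illustrative)
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "Semantic search retrieves results by meaning.",
    "Vector search finds matches based on meaning, not keywords.",
    "It rained in London all week.",
]

# Each sentence becomes a 384-dimensional dense vector
embeddings = model.encode(sentences)

# Cosine similarity: the two paraphrases should score far higher
# than the unrelated weather sentence
print(util.cos_sim(embeddings[0], embeddings[1:]))
```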

In short, you will learn everything you need to know to begin applying NLP in your semantic search use cases.

Let’s begin!


Chapter 01

Dense Vectors

An overview of dense vector embeddings in NLP.

Chapter 02

Sentence Transformers and Embeddings

How sentence transformers and embeddings can be used for a range of semantic similarity applications.

Chapter 03

Training Sentence Transformers with Softmax Loss

The original way of training sentence transformers like SBERT for semantic search.

Chapter 04

Training Sentence Transformers with Multiple Negatives Ranking Loss

How to create sentence transformers by fine-tuning with MNR loss.
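
As a preview of what this chapter covers, the sketch below fine-tunes with sentence-transformers’ MultipleNegativesRankingLoss; the base checkpoint, training pairs, and hyperparameters are placeholder assumptions:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models

# Build a sentence transformer from a plain BERT checkpoint plus mean pooling
word_emb = models.Transformer("bert-base-uncased")
pooling = models.Pooling(word_emb.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_emb, pooling])

# MNR loss trains on (anchor, positive) pairs; every other positive in
# a batch serves as an in-batch negative for a given anchor
train_examples = [
    InputExample(texts=["what is semantic search?",
                        "semantic search retrieves documents by meaning"]),
    InputExample(texts=["how do transformers encode text?",
                        "transformers process tokens with self-attention"]),
]
loader = DataLoader(train_examples, shuffle=True, batch_size=2)
loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
```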

Chapter 05

Multilingual Sentence Transformers

How to create multilingual sentence transformers with knowledge distillation.

Chapter 06

Unsupervised Training for Sentence Transformers

How to create sentence transformer models without labelled data.

Chapter 07

An Introduction to Open Domain Question-Answering

An illustrated overview of open-domain question-answering.

Chapter 08

Retrievers for Question-Answering

How to fine-tune retriever models to find relevant contexts in vector databases.

Chapter 09

Readers for Question-Answering

How to fine-tune reader models to identify answers from relevant contexts.
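
For a rough sense of what a reader does, here is a minimal extractive QA sketch using the Hugging Face transformers pipeline; the model choice and context are illustrative assumptions, and in a full system the context would come from the retriever of Chapter 08:

```python
from transformers import pipeline

# Extractive QA reader (model choice is illustrative)
reader = pipeline("question-answering", model="deepset/roberta-base-squad2")

# A retrieved context passage; here hard-coded for the example
context = (
    "Sentence transformers produce dense vector embeddings that enable "
    "semantic search over large document collections."
)

answer = reader(question="What do sentence transformers produce?", context=context)
print(answer["answer"], answer["score"])
```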

Chapter 10

Data Augmentation with BERT

Augmented SBERT (AugSBERT) is a training strategy to enhance domain-specific datasets.

Chapter 11

Domain Transfer with BERT

Transfer information from an out-of-domain (or source) dataset to a target domain.

Chapter 12

Unsupervised Training with Query Generation (GenQ)

Fine-tune retrievers for asymmetric semantic search using GenQ.

Chapter 13

Generative Pseudo-Labeling (GPL)

A powerful technique for domain adaptation using unstructured text data.

Chapter 14

Training Sentence Transformers

The most popular methods for training sentence transformers, and tips for each.

Chapter 15

And more...
