Machine Learning and especially Deep Learning are playing increasingly important roles in the field of Natural Language Processing. Over the past few years, Transformer architectures have become the state-of-the-art (SOTA) approach and the de facto preferred route when performing language related tasks.
Although the architecture is not too difficult to follow once you are familiar with Transformers, the learning curve for getting started is steep. What’s more, the complexity of Transformer-based architectures makes it challenging to build them on your own using libraries like TensorFlow and PyTorch.
Fortunately, we now have HuggingFace Transformers, a library that democratizes Transformers by providing a variety of Transformer architectures (think BERT and GPT) for both understanding and generating natural language. What’s more, with a variety of pretrained models across many languages and interoperability with TensorFlow and PyTorch, using Transformers has never been easier.
“Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for Natural Language Understanding (NLU) and Natural Language Generation (NLG) with over 32+ pretrained models in 100+ languages and deep interoperability between TensorFlow 2.0 and PyTorch.”
HuggingFace (n.d.)
At MachineCurve, we offer a variety of articles for getting started with HuggingFace. This page structures all these articles around the question “How to get started with HuggingFace Transformers?”. It offers a go-to page for people who are just getting started with HuggingFace Transformers. In fact, I learned to use the Transformers library by writing the articles linked on this page. Going from intuitive understanding to advanced topics through easy, few-line implementations with Python, this should be a great place to start.
Have fun! 🚀🤗
Table of contents
- What are Transformers?
- Saying hello to HuggingFace Transformers
- Advanced topics
What are Transformers?
I’m a big fan of castle building. It means that when you want to understand something in great detail, it’s best to take a helicopter viewpoint first rather than diving in and looking at a large number of details. Castles are built brick by brick and on a solid foundation. On this website, my goal is to allow you to do the same, through the Collections series of articles. That’s why, when you want to get started, I advise you to start with a brief history of NLP-based Machine Learning and an introduction to the original Transformer architecture.
- From vanilla RNNs to Transformers: a history of Seq2Seq learning
- An Intuitive Explanation of Transformers in Deep Learning
- Differences between Autoregressive, Autoencoding and Seq2Seq models.
Saying hello to HuggingFace Transformers
Now that you understand the basics of Transformers, you have the knowledge to understand how a wide variety of Transformer architectures has emerged. Let’s now proceed with all the individual architectures. This is followed by implementing a few pretrained and fine-tuned Transformer based models using HuggingFace Pipelines. Slowly but surely, we’ll then dive into more advanced topics.
Looking at Transformer Architectures
- Funnel Transformer
- Transformer XL
Getting started with Transformer based Pipelines
Now that you know a bit more about the Transformer architectures that can be used in the HuggingFace Transformers library, it’s time to get started writing some code. Pipelines are a great place to start, because they allow you to use language models with just a few lines of code. They use pretrained and fine-tuned Transformers under the hood, allowing you to get started really quickly. In the articles, we’ll build an even better understanding of the specific Transformers, and then show you how a Pipeline can be created.
- Easy Sentiment Analysis with Machine Learning and HuggingFace Transformers
- Easy Text Summarization with HuggingFace Transformers and Machine Learning
- Easy Question Answering with Machine Learning and HuggingFace Transformers
- Easy Named Entity Recognition with Machine Learning and HuggingFace Transformers
- Easy Machine Translation with Machine Learning and HuggingFace Transformers
- Easy Masked Language Modeling with Machine Learning and HuggingFace Transformers
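To give a feel for what the articles above build towards, here is a minimal sketch of a sentiment analysis Pipeline. The task identifiers in the dictionary are the ones accepted by the pipeline function in Transformers; the helper names (analyze_sentiment, format_predictions) and the English-to-German translation pair are illustrative choices, not something the articles prescribe.

```python
from typing import Dict, List, Union

# Task identifiers understood by transformers.pipeline(), one per article above.
# The translation entry is just one language pair picked for illustration.
PIPELINE_TASKS = {
    "Sentiment Analysis": "sentiment-analysis",
    "Text Summarization": "summarization",
    "Question Answering": "question-answering",
    "Named Entity Recognition": "ner",
    "Machine Translation": "translation_en_to_de",
    "Masked Language Modeling": "fill-mask",
}


def analyze_sentiment(texts: List[str]) -> List[Dict[str, Union[str, float]]]:
    """Run the default sentiment-analysis pipeline on a list of texts."""
    # Imported inside the function so that format_predictions() can be tried
    # without transformers installed. The first call downloads a default
    # pretrained model, so it needs network access.
    from transformers import pipeline

    classifier = pipeline(PIPELINE_TASKS["Sentiment Analysis"])
    return classifier(texts)


def format_predictions(predictions: List[Dict[str, Union[str, float]]]) -> List[str]:
    """Turn [{'label': ..., 'score': ...}, ...] into readable strings."""
    return [f"{p['label']} ({p['score']:.2f})" for p in predictions]
```

Calling format_predictions(analyze_sentiment(["I love this library!"])) should return one string per input text with a label and a confidence score. The exact label names and scores depend on the default model that the library selects.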
Running other pretrained and fine-tuned models
The pipelines above are the easiest implementations of pretrained Transformer models. You have to do nothing more than import pipeline and then initialize it. It’s then ready to start converting data into summaries, translations, and more.
However, there are more pretrained models out there. The HuggingFace Model Hub contains many other pretrained and fine-tuned models, and their weights are shared. This means that you can also use these models in your own applications. Now that you understand a pipeline in more detail, it’s time to dive into the PreTrainedTokenizerFast tokenizers and the PreTrainedModel and TFPreTrainedModel pretrained models for PyTorch and TensorFlow, respectively. Let’s do that now.
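As a rough sketch of what working without a pipeline looks like, the example below loads a tokenizer and a model from the Model Hub and classifies one sentence with PyTorch. The checkpoint name is just one public sentiment model chosen for illustration, and the helpers softmax, top_label and classify are hypothetical names. AutoTokenizer returns a fast tokenizer where one is available, and the TensorFlow counterpart of the model class is TFAutoModelForSequenceClassification.

```python
import math
from typing import Dict, List, Tuple

# Assumption: any sequence classification checkpoint from the Model Hub
# could be used here; this one is a public English sentiment model.
MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"


def softmax(logits: List[float]) -> List[float]:
    """Convert raw model scores into probabilities that sum to 1."""
    highest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits: List[float], id2label: Dict[int, str]) -> Tuple[str, float]:
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(text: str, model_name: str = MODEL_NAME) -> Tuple[str, float]:
    """Tokenize the text, run the model once, and decode the prediction."""
    # Imported here so the pure-Python helpers above work without torch or
    # transformers installed. Loading downloads the weights on first use.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**inputs).logits[0].tolist()
    return top_label(logits, model.config.id2label)
```

Compared with a pipeline, you now handle tokenization, the forward pass and decoding of the logits yourself. That is more code, but it works with any compatible checkpoint on the Model Hub and gives you access to the raw model outputs.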
- Speech Recognition / Speech-to-Text with Wav2vec2
- Causal Language Modeling with GPT2
- Table Parsing / Table Question Answering with TAPAS
- Transformers for Long Text: Examples for using Longformer
- Building a Chatbot with DialoGPT
Advanced topics
- Coming later.
HuggingFace. (n.d.). Transformers — transformers 4.1.1 documentation. Hugging Face – On a mission to solve NLP, one commit at a time. https://huggingface.co/transformers/index.html
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30, 5998-6008.