What is 'fine-tuning' based training for NLP models?

Chris Staff asked 11 months ago
1 Answer
Best Answer
Chris Staff answered 11 months ago

In the fine-tuning based approach to training an NLP model (mostly used with Transformer architectures), training involves two steps:

  1. Pretrain a model on a large, unlabeled dataset. Specific language tasks are designed for this, such as language modeling, next sentence prediction and masked language modeling.
  2. Fine-tune the pretrained model on a small- to medium-sized labeled dataset. You start from the pretrained weights and effectively adapt a model that already has generic language understanding capabilities to your own dataset.
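
To make the two steps concrete, here is a toy sketch in plain Python. It is not an NLP model: a two-parameter linear model stands in for the network, and the datasets, the `sgd` and `mse` helpers and all numbers are made up for illustration. Real pretraining uses unlabeled text with a self-supervised objective, which this sketch does not attempt to reproduce. It only shows the mechanics: step 2 starts from the weights produced by step 1 instead of from a fresh initialisation.

```python
import random


def sgd(weights, data, lr, epochs):
    """Plain SGD on squared error for a linear model y = w * x + b."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b


def mse(weights, data):
    """Mean squared error of the linear model on a dataset."""
    w, b = weights
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)

# Step 1 data (assumption): a large, generic dataset following y = 2x + 1.
pretrain_data = [(x, 2.0 * x + 1.0)
                 for x in (rng.uniform(-1, 1) for _ in range(1000))]

# Step 2 data (assumption): a small task dataset for a related target, y = 2.2x + 0.8.
task_data = [(x, 2.2 * x + 0.8)
             for x in (rng.uniform(-1, 1) for _ in range(10))]

# Step 1: "pretrain" from an untrained starting point.
pretrained = sgd((0.0, 0.0), pretrain_data, lr=0.1, epochs=5)

# Step 2: fine-tune, i.e. continue training from the pretrained weights.
fine_tuned = sgd(pretrained, task_data, lr=0.05, epochs=3)

# Baseline for comparison: same small dataset and budget, but no pretraining.
from_scratch = sgd((0.0, 0.0), task_data, lr=0.05, epochs=3)

print("fine-tuned error:  ", mse(fine_tuned, task_data))
print("from-scratch error:", mse(from_scratch, task_data))
```

With only ten task examples and a few epochs, the fine-tuned weights should fit the task data much better than the weights trained from scratch, which mirrors why fine-tuning gets away with a small labeled dataset.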

For example, you can pretrain a model on a large corpus such as CommonCrawl and then fine-tune it using your own data, for instance a dataset tailored to answering engineering questions.
This approach contrasts with the feature-based approach for training NLP models, where the pretrained model is only used for generating features, which are then fed into a smaller model that is easier to train.
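
The difference between the two approaches comes down to which weights get updated. Below is a minimal sketch in plain Python, again with a toy numeric model rather than a real language model. The `train` function, the `update_encoder` flag, the starting weights and the data are all hypothetical names and values chosen for illustration. A single `tanh` unit plays the role of the pretrained model ("encoder") and a single weight plays the role of the smaller model ("head").

```python
import math


def train(enc_w, head_w, data, lr=0.1, epochs=50, update_encoder=True):
    """Train a tiny model: prediction = head_w * tanh(enc_w * x).

    update_encoder=False -> feature-based: the encoder is frozen and only
                            produces features for the head.
    update_encoder=True  -> fine-tuning: encoder and head are both updated.
    """
    for _ in range(epochs):
        for x, y in data:
            h = math.tanh(enc_w * x)        # feature produced by the encoder
            err = head_w * h - y            # prediction error of the head
            grad_head = err * h
            grad_enc = err * head_w * (1.0 - h * h) * x
            head_w -= lr * grad_head
            if update_encoder:
                enc_w -= lr * grad_enc
    return enc_w, head_w


# Assumption: stands in for weights obtained from pretraining.
PRETRAINED_ENC_W = 0.5

# Assumption: a small labeled task dataset.
data = [(i / 4.0, math.tanh(1.5 * i / 4.0)) for i in range(-4, 5)]

# Feature-based: pretrained weights stay fixed, only the head learns.
feat_enc, feat_head = train(PRETRAINED_ENC_W, 0.1, data, update_encoder=False)

# Fine-tuning: the pretrained weights themselves are adapted to the task.
ft_enc, ft_head = train(PRETRAINED_ENC_W, 0.1, data, update_encoder=True)

print("feature-based encoder weight:", feat_enc)
print("fine-tuned encoder weight:   ", ft_enc)
```

In the feature-based run the encoder weight is returned unchanged, while in the fine-tuning run it moves away from its pretrained value. In deep learning frameworks the same switch is usually made by freezing or unfreezing the pretrained layers.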
