Does adding Language Modeling as an objective help Transformer fine-tuning?

Chris Staff asked 11 months ago
1 Answer
Chris Staff answered 11 months ago

The answer is: sometimes, yes. The GPT paper, i.e. Radford et al. (2018), performed a set of experiments with an “auxiliary language modeling task added to fine-tuning”.
 
In other words, they used the following loss for optimization during fine-tuning:
 
\(L_{combined}(C) = L_{ft}(C) + \lambda \times L_{u}(C)\)
 
The combined loss function is composed of the task-specific fine-tuning loss \(L_{ft}\) and a \(\lambda\)-weighted unsupervised (language modeling) loss \(L_{u}\), both computed over the fine-tuning corpus \(C\).
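 
As a minimal sketch of what this combined objective could look like in code (assuming a PyTorch-style setup where the task loss and the language modeling loss have already been computed from the same forward pass; the function and tensor names are illustrative, not from the paper):
 
```python
import torch

def combined_loss(task_loss: torch.Tensor,
                  lm_loss: torch.Tensor,
                  lam: float = 0.5) -> torch.Tensor:
    """L_combined(C) = L_ft(C) + lambda * L_u(C).

    Radford et al. (2018) report using lambda = 0.5 in their experiments.
    """
    return task_loss + lam * lm_loss

# Dummy loss values standing in for losses computed during fine-tuning:
task_loss = torch.tensor(0.83)  # task-specific loss, e.g. classification cross-entropy
lm_loss = torch.tensor(2.41)    # auxiliary language modeling loss on the same inputs
loss = combined_loss(task_loss, lm_loss, lam=0.5)
print(loss)  # tensor(2.0350)
# loss.backward() would then propagate gradients from both objectives.
```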
 
Radford et al. (2018) show that adding this additional language modeling loss (and hence language modeling objective) during fine-tuning improves the results in some cases:

  • On Natural Language Inference (text entailment) tasks, performance is improved sometimes.
  • On Semantic Similarity inference (question pairs) tasks, performance is improved sometimes.
  • The benefit of the auxiliary language modeling objective is especially apparent on larger datasets, while smaller datasets benefit less.

 
Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training.
