What is 'feature-based' training for NLP models?

Chris Staff asked 2 months ago
1 Answer
Best Answer
Chris Staff answered 1 month ago

In feature-based training in NLP, a pretrained language model (such as a Transformer like BERT) is used to generate features from the input tokens. These features are then used to train a smaller, possibly architecturally different model, such as an LSTM or ConvNet, for a specific language task; the pretrained model's weights stay frozen throughout.
 
This contrasts with a fine-tuning based approach, where the pretrained model itself is used for the task and its weights are updated on your dataset.
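As a rough sketch of the fine-tuning side of this contrast (assuming the HuggingFace transformers library; the model name, toy example, and hyperparameters are illustrative only), the pretrained weights themselves receive gradient updates:

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Unlike the feature-based setup, all of BERT's weights receive gradients
# and are updated on your dataset.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

batch = tokenizer(["I loved this movie!"], return_tensors="pt")
labels = torch.tensor([1])

outputs = model(**batch, labels=labels)  # the model returns a loss when labels are passed
outputs.loss.backward()                  # backpropagate through the entire network
optimizer.step()
```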
 
In BERT, this can for example be achieved by taking C, the final hidden state of the special [CLS] token, as a joint, sentence-level representation of all the input tokens. Because BERT's self-attention mechanism lets every position attend to every other position, information from all tokens flows into C as well. For this reason, C can serve as a good sentence-level representation and can be used to generate sentence-level features for training a different model.
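Here is a minimal sketch of that feature-based setup (again assuming HuggingFace transformers, plus scikit-learn for the smaller downstream model; the toy data and model choices are illustrative only):

```python
import torch
from sklearn.linear_model import LogisticRegression
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
bert.eval()  # BERT acts as a frozen feature extractor and is never trained here

sentences = ["I loved this movie!", "What a terrible film."]  # toy dataset
labels = [1, 0]

with torch.no_grad():  # no gradients needed: we only extract features
    encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    outputs = bert(**encoded)
    # The final hidden state at position 0 corresponds to the [CLS] token:
    # this is the sentence-level representation C described above.
    features = outputs.last_hidden_state[:, 0, :].numpy()

# Train any smaller model on the fixed features, e.g. logistic regression here
# (an LSTM or ConvNet over the per-token features would work the same way).
clf = LogisticRegression(max_iter=1000).fit(features, labels)
print(clf.predict(features))
```

Because the expensive Transformer forward pass is run only once to produce features, only the small downstream model actually needs training, which is where the speed benefit mentioned below comes from.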
 
While the benefits of the feature-based approach lie mostly in training and inference speed (and hence lower computational cost), fine-tuning based approaches tend to perform slightly better (Devlin et al., 2018).
 
Source:
 
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
