Extract Word Timings with HuggingFace Wav2vec2

Oscar asked 2 weeks ago

I’m looking to extract word timing information from the HuggingFace Wav2vec2 transformer.
Ideally the output would be in JSON format such that every word has an associated start time and end time:
{
  "time_start":
  "time_end":
  "text":
}
Does anyone know if there is an easy approach to achieve this?
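
One possible approach, offered as a sketch rather than a definitive answer: Wav2Vec2 CTC models emit one prediction per audio frame, so word boundaries can be recovered by walking the per-frame tokens, collapsing repeats, dropping blanks, and splitting on the word delimiter. The function below is hypothetical (not part of the transformers library) and works on a plain list of frame-level tokens. It assumes a 20 ms frame step (a 320-sample stride at 16 kHz, as in the base English checkpoints), "<pad>" as the CTC blank, and "|" as the word delimiter; check these against your own model and tokenizer. Recent versions of transformers also appear to offer an output_word_offsets option when decoding with the Wav2Vec2 tokenizer, so it is worth checking the documentation for your installed version first.

```python
import json


def ctc_word_timings(frame_tokens, frame_duration=0.02, blank="<pad>", delimiter="|"):
    """Group per-frame CTC tokens into words with start/end times in seconds.

    frame_tokens: one token string per model output frame (the argmax over
    the logits, mapped through the vocabulary). The defaults for
    frame_duration, blank, and delimiter are assumptions; adjust as needed.
    """
    words = []
    chars = []    # characters of the word currently being built
    start = None  # index of the first frame of the current word
    end = None    # index of the last non-blank frame of the current word
    prev = None   # token at the previous frame, used to collapse repeats

    def flush():
        # Emit the current word (if any) and reset the accumulators.
        nonlocal chars, start, end
        if chars:
            words.append({
                "time_start": round(start * frame_duration, 3),
                "time_end": round((end + 1) * frame_duration, 3),
                "text": "".join(chars),
            })
        chars, start, end = [], None, None

    for i, tok in enumerate(frame_tokens):
        if tok == delimiter:
            flush()
        elif tok != blank:
            if tok != prev:
                # A new character (CTC collapses consecutive repeats).
                chars.append(tok)
                if start is None:
                    start = i
            end = i
        prev = tok
    flush()  # emit the last word if the audio does not end on a delimiter
    return words


# Toy frame sequence standing in for real model output.
frames = ["<pad>", "H", "H", "<pad>", "I", "|", "|",
          "<pad>", "Y", "O", "O", "<pad>", "U", "<pad>"]
print(json.dumps(ctc_word_timings(frames), indent=2))
```

With a real model, frame_tokens would come from taking the argmax of the logits along the vocabulary axis and converting the ids to tokens with the tokenizer. The times are only as precise as the frame step, and CTC output tends to be peaky, so the boundaries are approximate.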
