Is it possible to detect when a correct alignment is not possible? #302

zxul767 · 2023-08-08T06:35:20Z

I'm exploring possibilities on how to gauge whether a transcription algorithm did a good job when we have no supervision available (i.e., no annotated dataset).

It occurred to me that one way to do this might be to compute some kind of reconstruction score in the audio domain during forced alignment:

(audio) --> [transcribe] --> (text) --> [force-align] --> (alignment score)
 |                                        ^
 |                                        |
 +----------------------------------------+

Since I'm not very familiar with the implementation of aeneas, I tested what happens when I pass a completely erroneous transcription. I didn't see an error in the output, or anything in the resulting alignment that would help me detect automatically that the transcription was really bad.

After reading how the underlying algorithm works, I suspect this is because the alignment is constrained to a narrow band along the diagonal of the cost matrix, so even a completely erroneous transcription results in an alignment that appears reasonable (at least until a human has a look and realizes the transcription is totally wrong).

Is there a simple way to modify the algorithm to detect this case? I suspect it might be possible if we quantified how often the alignment path runs along the "fringe" of the diagonal's margin, but I'm not sufficiently familiar with DTW to know whether this would actually be a good idea.
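To make the "fringe" idea concrete, here is a minimal sketch of what I have in mind. It is a toy, not aeneas code: `banded_dtw` and `fringe_fraction` are hypothetical names of my own, the inputs are plain 1-D number sequences rather than MFCC frames, and the band is a simple `|i - j| <= margin` constraint that I am assuming is roughly comparable to the margin aeneas applies. It computes the warping path inside the band, then reports the fraction of path cells that sit on the band's edge, along with the mean per-cell cost as a second signal.

```python
import math


def banded_dtw(a, b, margin):
    """Toy DTW over 1-D sequences, restricted to cells with |i - j| <= margin.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    running from (0, 0) to (len(a) - 1, len(b) - 1).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    if abs(n - m) > margin:
        raise ValueError("margin too small: the end cell lies outside the band")

    cost = [[math.inf] * m for _ in range(n)]
    for i in range(n):
        # Only visit the cells of row i that fall inside the band.
        for j in range(max(0, i - margin), min(m, i + margin + 1)):
            d = abs(a[i] - b[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            best = math.inf
            if i > 0 and j > 0:
                best = min(best, cost[i - 1][j - 1])
            if i > 0:
                best = min(best, cost[i - 1][j])
            if j > 0:
                best = min(best, cost[i][j - 1])
            cost[i][j] = d + best

    # Backtrack from the end cell, always moving to the cheapest predecessor.
    # Cells outside the band still hold math.inf, so they are never chosen.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return cost[n - 1][m - 1], path


def fringe_fraction(path, margin):
    """Fraction of path cells lying exactly on the edge of the band."""
    on_edge = sum(1 for i, j in path if abs(i - j) == margin)
    return on_edge / len(path)


if __name__ == "__main__":
    margin = 3
    reference = list(range(20))
    # A sequence whose best match lies further off the diagonal than the band allows.
    shifted = [0] * 5 + list(range(15))

    for name, other in (("identical", reference), ("shifted", shifted)):
        total, path = banded_dtw(reference, other, margin)
        print(name, "fringe:", fringe_fraction(path, margin),
              "mean cost:", total / len(path))
```

In this toy case the identical pair stays on the diagonal, while the shifted pair pushes the path against the band edge because the best match lies outside it. Whether a wrong transcription of real audio behaves like that, or instead shows up mainly as a high mean cost, is exactly what I don't know, and any threshold on either number would have to be calibrated empirically.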

Your guidance and help are much appreciated.
