
transducer grad compute formula #37

Open

zh794390558 opened this issue Jul 25, 2022 · 9 comments

@zh794390558 commented Jul 25, 2022

The formula for the gradient in warprnnt_numba and in the warp_transducer CPU implementation is:

    T, U, _ = log_probs.shape
    grads = np.full(log_probs.shape, -float("inf"))
    log_like = betas[0, 0]  # == alphas[T - 1, U - 1] + betas[T - 1, U - 1]

    # gradient for the final blank transition
    grads[T - 1, U - 1, blank] = alphas[T - 1, U - 1]
    grads[: T - 1, :, blank] = alphas[: T - 1, :] + betas[1:, :]

    # gradient for the label transitions
    for u, l in enumerate(labels):
        grads[:, u, l] = alphas[:, u] + betas[:, u + 1]

    grads = -np.exp(grads + log_probs - log_like)

That is not the same as torchaudio, optimized_transducer, and the warp_transducer GPU implementation.
But you said that the warp_transducer CPU gradient is the same as optimized_transducer and torchaudio; how is that achieved?
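For reference, here is a self-contained NumPy sketch of my understanding (my own code, not taken from any of the libraries): it computes alphas and betas with the standard forward-backward recursions for a single utterance and then applies the quoted formula, so the numbers can be checked directly. The function name and the toy inputs are made up for illustration.

    import numpy as np

    def rnnt_alphas_betas_grads(log_probs, labels, blank=0):
        # log_probs: (T, U, V) log-probabilities for one utterance, with U = len(labels) + 1
        # returns alphas, betas, and the gradient w.r.t. log_probs
        T, U, _ = log_probs.shape
        alphas = np.zeros((T, U))
        betas = np.zeros((T, U))

        # forward variables
        for t in range(1, T):
            alphas[t, 0] = alphas[t - 1, 0] + log_probs[t - 1, 0, blank]
        for u in range(1, U):
            alphas[0, u] = alphas[0, u - 1] + log_probs[0, u - 1, labels[u - 1]]
        for t in range(1, T):
            for u in range(1, U):
                alphas[t, u] = np.logaddexp(
                    alphas[t - 1, u] + log_probs[t - 1, u, blank],
                    alphas[t, u - 1] + log_probs[t, u - 1, labels[u - 1]],
                )

        # backward variables
        betas[T - 1, U - 1] = log_probs[T - 1, U - 1, blank]
        for t in range(T - 2, -1, -1):
            betas[t, U - 1] = betas[t + 1, U - 1] + log_probs[t, U - 1, blank]
        for u in range(U - 2, -1, -1):
            betas[T - 1, u] = betas[T - 1, u + 1] + log_probs[T - 1, u, labels[u]]
        for t in range(T - 2, -1, -1):
            for u in range(U - 2, -1, -1):
                betas[t, u] = np.logaddexp(
                    betas[t + 1, u] + log_probs[t, u, blank],
                    betas[t, u + 1] + log_probs[t, u, labels[u]],
                )

        # gradient w.r.t. log_probs, exactly the quoted formula
        grads = np.full(log_probs.shape, -float("inf"))
        log_like = betas[0, 0]  # == alphas[T - 1, U - 1] + betas[T - 1, U - 1]
        grads[T - 1, U - 1, blank] = alphas[T - 1, U - 1]
        grads[: T - 1, :, blank] = alphas[: T - 1, :] + betas[1:, :]
        for u, l in enumerate(labels):
            grads[:, u, l] = alphas[:, u] + betas[:, u + 1]
        grads = -np.exp(grads + log_probs - log_like)
        return alphas, betas, grads

    # toy check: the posterior of the final blank must be 1, so its gradient is -1
    T, U, V, blank = 4, 3, 5, 0
    rng = np.random.default_rng(0)
    log_probs = np.log(rng.dirichlet(np.ones(V), size=(T, U)))  # normalized per (t, u)
    alphas, betas, grads = rnnt_alphas_betas_grads(log_probs, labels=[1, 2], blank=blank)
    assert np.isclose(grads[T - 1, U - 1, blank], -1.0)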

@csukuangfj (Owner)

> but you said that the warp_transducer CPU gradient is the same as optimized_transducer and torchaudio

Where did you find that?

@csukuangfj (Owner)

The README.md says:

> Therefore, optimized_transducer produces the same alpha and beta as warp-transducer for the same input.

It only says alpha and beta, not grad.

@zh794390558 (Author) commented Jul 25, 2022

> It borrows the methods of computing alpha and beta from warp-transducer. Therefore, optimized_transducer produces the same alpha and beta as warp-transducer for the same input.
>
> However, warp-transducer produces different gradients for CPU and CUDA when using the same input. See https://github.com/HawkAaron/warp-transducer/issues/93. I also created a [colab notebook](https://colab.research.google.com/drive/1vMkH8LmiCCOiCo4KTTEcv-NU8_OGn0ie?usp=sharing) to reproduce that issue.
>
> This project produces consistent gradient on CPU and CUDA for the same input, just like what torchaudio is doing. (We borrow the gradient computation formula from torchaudio.)

Sorry, I got it wrong. So the known conclusion is that torchaudio is aligned with optimized_transducer. Will warp_transducer GPU have the same gradient result as optimized_transducer, while warp_transducer CPU does not, because its gradient formula is not right?
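A quick way to check the torchaudio side of this (a sketch of mine, assuming torchaudio >= 0.10, which provides torchaudio.functional.rnnt_loss, and an available CUDA device) is to compare its CPU and CUDA gradients on the same random input:

    import torch
    import torchaudio

    torch.manual_seed(0)

    B, T, U, V = 2, 6, 4, 5  # batch, frames, len(labels) + 1, vocab size
    blank = 0

    logits = torch.randn(B, T, U, V, dtype=torch.float32)
    targets = torch.randint(1, V, (B, U - 1), dtype=torch.int32)
    logit_lengths = torch.full((B,), T, dtype=torch.int32)
    target_lengths = torch.full((B,), U - 1, dtype=torch.int32)

    def loss_and_grad(device):
        # leaf copy of the logits on the requested device
        x = logits.clone().to(device).requires_grad_(True)
        loss = torchaudio.functional.rnnt_loss(
            x,
            targets.to(device),
            logit_lengths.to(device),
            target_lengths.to(device),
            blank=blank,
        )
        loss.sum().backward()
        return loss.detach().cpu(), x.grad.detach().cpu()

    cpu_loss, cpu_grad = loss_and_grad("cpu")
    if torch.cuda.is_available():
        cuda_loss, cuda_grad = loss_and_grad("cuda")
        print("loss close:", torch.allclose(cpu_loss, cuda_loss, atol=1e-5))
        print("grad close:", torch.allclose(cpu_grad, cuda_grad, atol=1e-5))

The same pattern could be repeated for the other implementations, but their APIs differ, so they are not included here.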

@zh794390558 (Author) commented Jul 25, 2022

[screenshot]

Why are the CPU and GPU losses for warp_transducer not equal in the colab?

I think my wrong conclusion above came from here:

[screenshot]

@csukuangfj (Owner)

> warp_transducer GPU will have the same gradient result as optimized_transducer

No. You can find the conclusions in the colab (listed in the README.md).


> Why are the CPU and GPU losses for warp_transducer not equal in the colab?

Please ask the author of warp-transducer.

@zh794390558 (Author)

[screenshot]

I used the case from the colab with espnet's RNN-T, and the results are consistent. Is there something wrong with how I am using it?

[screenshot]

[screenshot]

@csukuangfj (Owner)

I just ran the colab notebook above again and found that I can no longer reproduce the previous results. I am not sure what went wrong.

@zh794390558 (Author) commented Jul 29, 2022

So does this issue still exist? Could it be a CUDA version problem?

BTW, can the torch version in the colab be pinned? Last time I ran it, it failed to run.

@csukuangfj (Owner)

> colab

The colab notebook given in README.md used a Tesla K80 GPU.

The colab notebook I tried today was assigned a Tesla T4, so the test environment is different.

If you can reproduce the issue on a Tesla K80 GPU, then it still exists; if not, it probably does not exist anymore.

(I will try later on a local V100 GPU to see whether I can reproduce it.)


> BTW, can the torch version in the colab be pinned? Last time I ran it, it failed to run.

Yes, that is doable.
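For example, the first cell of the notebook could look like the sketch below (the version numbers are placeholders, not necessarily the versions the original notebook used): pin the packages and record the runtime that was actually assigned, so results can be compared across runs.

    # first run `pip install torch==1.10.0 torchaudio==0.10.0` (placeholder versions),
    # then print the environment that colab actually assigned
    import torch

    print("torch:", torch.__version__, "cuda:", torch.version.cuda)
    if torch.cuda.is_available():
        print("gpu:", torch.cuda.get_device_name(0))
    else:
        print("no GPU assigned")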
