Support additional source separation models #19

JeffreyCA · 2020-10-11T21:53:08Z

Summary of other models

Model	Supported?	Paper	Source code	Vocals (SDR)	Drums (SDR)	Bass (SDR)	Other (SDR)	Avg (SDR)	Notes
Spleeter	Yes	Link	Yes	6.55	5.93	5.10	4.24	5.46
Demucs	Yes	Link	Yes	6.29	6.08	5.83	4.12	5.58
Conv-Tasnet	Yes	Link	Yes	6.81	6.08	5.66	4.37	5.73	Worse perceived quality than Demucs
X-UMX	Yes	Link	Yes	5.53	6.33	4.54	6.50	5.73	Slow CPU separation
D3Net	Yes	Link	Yes	7.24	7.01	5.25	4.53	6.01	Slow CPU separation
MMDenseLSTM	No	Link	Yes	6.6	6.43	5.16	4.15	5.59	No pretrained models
Meta-TasNet	No	Link	Yes	6.4	5.91	5.58	4.19	5.52	Issues with higher frequencies (sum of sources do not equal original) (pfnet-research/meta-tasnet#4)
Nachmani et al.	No	Link	No	6.92	6.15	5.88	4.32	5.82
LaSAFT	No	Link	Yes	7.33	5.68	5.63	4.87	5.88	Looks promising! Sum of sources do not equal original (ws-choi/Conditioned-Source-Separation-LaSAFT#3 (comment))
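
The Avg (SDR) column appears to be the unweighted mean of the four per-stem scores. A minimal Python sketch that recomputes it from the rows above follows; the `SCORES` dictionary and the `average_sdr` helper are names made up for this illustration, not part of any of the listed projects.

```python
# Per-stem SDR scores (vocals, drums, bass, other) copied from the table above.
SCORES = {
    "Spleeter": (6.55, 5.93, 5.10, 4.24),
    "Demucs": (6.29, 6.08, 5.83, 4.12),
    "Conv-Tasnet": (6.81, 6.08, 5.66, 4.37),
    "X-UMX": (5.53, 6.33, 4.54, 6.50),
    "D3Net": (7.24, 7.01, 5.25, 4.53),
    "MMDenseLSTM": (6.6, 6.43, 5.16, 4.15),
    "Meta-TasNet": (6.4, 5.91, 5.58, 4.19),
    "Nachmani et al.": (6.92, 6.15, 5.88, 4.32),
    "LaSAFT": (7.33, 5.68, 5.63, 4.87),
}


def average_sdr(stems):
    """Unweighted mean of the per-stem SDR values (assumed definition of Avg)."""
    return sum(stems) / len(stems)


# Print the models from highest to lowest average SDR.
ranked = sorted(SCORES, key=lambda name: average_sdr(SCORES[name]), reverse=True)
for name in ranked:
    print(f"{name:16s} {average_sdr(SCORES[name]):.2f}")
```

The recomputed means agree with the Avg column to within rounding, which suggests the column is a plain average rather than a weighted one.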

JeffreyCA · 2020-12-22T03:16:17Z

I will prioritize adding the following models:

Demucs (v2.0.0 - Demucs integration #47)
Tasnet (v2.0.0 - Demucs integration #47)
X-UMX
D3Net

ws-choi · 2021-02-01T02:32:06Z

Hi Jeffrey! I recommend postponing the addition of LaSAFT features. We are going to reorganize the code structure to align it with the camera-ready version of our paper, which was accepted to ICASSP 2021. This might cause conflicts. We'll also upload checkpoints of models trained at a larger scale (n_fft of 4096; currently we only support 2048). We will finish refactoring by March. Thank you.

JeffreyCA · 2021-02-01T03:11:15Z

Hi @ws-choi, thanks for the update! I meant to comment earlier but I intend to only support models where the separated sources closely add up to the original source. Will your changes help with this?
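
One way to make "the separated sources closely add up to the original" concrete is to measure how much energy is left after subtracting the summed stems from the mixture. The sketch below is only an illustration in plain Python: the function names and the -40 dB threshold are arbitrary assumptions, not what Spleeter Web or any of the models actually use, and real audio would normally be handled as multi-channel arrays rather than lists.

```python
import math


def residual_db(mixture, stems):
    """Energy of (mixture - sum of stems) relative to the mixture, in dB.

    Very negative values mean the stems reconstruct the mixture closely.
    """
    if any(len(stem) != len(mixture) for stem in stems):
        raise ValueError("all stems must have the same length as the mixture")
    summed = [sum(samples) for samples in zip(*stems)]
    residual_energy = sum((m - s) ** 2 for m, s in zip(mixture, summed))
    mixture_energy = sum(m ** 2 for m in mixture)
    if mixture_energy == 0.0:
        raise ValueError("mixture is silent")
    if residual_energy == 0.0:
        return float("-inf")  # perfect reconstruction
    return 10.0 * math.log10(residual_energy / mixture_energy)


def stems_add_up(mixture, stems, threshold_db=-40.0):
    """True if the residual is at or below the (assumed) threshold."""
    return residual_db(mixture, stems) <= threshold_db


# Toy mono signals: the two stems sum exactly to the mixture.
mixture = [0.5, -0.25, 0.75, 0.0]
vocals = [0.25, -0.25, 0.5, 0.0]
accompaniment = [0.25, 0.0, 0.25, 0.0]
print(stems_add_up(mixture, [vocals, accompaniment]))
```

A model whose outputs lose energy (for example in the higher frequencies, as noted for Meta-TasNet above) would show a residual well above the threshold.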

I'm not very familiar with these conferences, so this is my first time hearing about ICASSP, and I see it's being hosted in Toronto this year (although virtually)! Do you know what other conferences there are related to this research field?

ws-choi · 2021-02-01T03:27:29Z

Will your changes help with this? =>
This update will not support it, but future updates might.
Since I have to change the overall training structure for it, I need more time.
I'll let you know if LaSAFT-Net provides such a feature :)

What other conferences are there related to this research field? =>
The International Society for Music Information Retrieval (ISMIR) conference is the most relevant one.

Other ML conferences such as NeurIPS, ICLR, ICML, AAAI, IJCAI, IJCNN, and ECAI, or signal processing conferences such as ICASSP and Interspeech, might also include state-of-the-art papers in this domain.

JeffreyCA · 2021-02-01T03:34:36Z

Awesome, thanks!

JeffreyCA · 2021-06-21T06:15:23Z

D3Net support is coming very soon!

jacksongoode · 2021-10-30T08:44:53Z

Would love to see LaSAFT!

Ma5onic · 2022-04-26T05:56:39Z

@JeffreyCA, could you please add support for the kuielab MDX-Net models, for both leaderboard A and leaderboard B?
Their best model scored an SDR of 9.00 for vocal separation, compared to the Hybrid Demucs model, which scored an SDR of 8.13.

Model Comparison:
https://paperswithcode.com/sota/music-source-separation-on-musdb18
That list is a good reference, as it lists open-source models with better scores than the ones you mentioned in your original comment:

Summary of other models

Model	Supported?	Paper	Source code	Vocals (SDR)	Drums (SDR)	Bass (SDR)	Other (SDR)	Avg (SDR)	Notes
Spleeter	Yes	Link	Yes	6.55	5.93	5.10	4.24	5.46
Demucs	Yes	Link	Yes	6.29	6.08	5.83	4.12	5.58
Conv-Tasnet	Yes	Link	Yes	6.81	6.08	5.66	4.37	5.73	Worse perceived quality than Demucs
X-UMX	Yes	Link	Yes	5.53	6.33	4.54	6.50	5.73	Slow CPU separation
D3Net	Yes	Link	Yes	7.24	7.01	5.25	4.53	6.01	Slow CPU separation
MMDenseLSTM	No	Link	Yes	6.6	6.43	5.16	4.15	5.59	No pretrained models
Meta-TasNet	No	Link	Yes	6.4	5.91	5.58	4.19	5.52	Issues with higher frequencies (sum of sources do not equal original) (pfnet-research/meta-tasnet#4)
Nachmani et al.	No	Link	No	6.92	6.15	5.88	4.32	5.82
LaSAFT	No	Link	Yes	7.33	5.68	5.63	4.87	5.88	Looks promising! Sum of sources do not equal original (ws-choi/Conditioned-Source-Separation-LaSAFT#3 (comment))

Could you also update the Demucs installer to include their hybrid model?
I've used both before and will try to help out, but I can't guarantee that I can get it integrated.

JeffreyCA · 2022-04-26T06:16:52Z

Thanks for the suggestion, I'll check that out. The latest Spleeter Web already supports Demucs v3, which is the Hybrid version.
I'm always open to contributions 🙂

Ma5onic · 2022-04-26T06:49:33Z

Awesome! I'll look at the way you deploy your containers and try to follow the same structure.

Here is a presentation that breaks down how it works:
https://ws-choi.github.io/personal/presentations/slide/2021-08-21-aicrowd

The readme was updated since I forked it:
kuielab/mdx-net-submission@80f5983
They finally added notes for adding custom models and it seems that someone already trained an improved version using the UVR dataset:
kuielab/mdx-net-submission@3dc5581
https://github.com/Anjok07/ultimatevocalremovergui/releases/tag/MDX-Net-B
The model achieved a 9.708 SDR score on aicrowd's private testset
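
For readers new to the metric quoted throughout this thread: in its simplest form, the signal-to-distortion ratio (SDR) compares the energy of the reference stem with the energy of the error between the reference and the estimate, expressed in dB. The sketch below shows only that basic definition. Evaluation toolkits and leaderboards may compute and aggregate SDR differently (for example per frame or per track), so this is not necessarily how the scores above were produced; `sdr_db` and its `eps` parameter are hypothetical names for this illustration.

```python
import math


def sdr_db(reference, estimate, eps=1e-10):
    """Basic SDR in dB: 10 * log10(||reference||^2 / ||reference - estimate||^2).

    eps is a small constant (an assumption here) that avoids division by zero
    and log(0) for silent or perfectly reconstructed signals.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy example: the estimate is the reference scaled by 0.9,
# so the error energy is 1% of the signal energy.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]
print(round(sdr_db(reference, estimate), 2))
```

Higher is better: each additional 10 dB corresponds to a tenfold reduction in error energy relative to the reference.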

JeffreyCA · 2022-04-30T20:54:41Z

It's also a bit more complex as it requires Demucs 2, and Spleeter Web uses v3.

Ma5onic · 2022-05-10T20:22:51Z

I'll try to get an isolated container working for the default kuielab code, then I'll see if it will work with Demucs v3 by changing the requirements.txt to the latest demucs pip release. (I highly doubt that it'll be that easy, but I'll try nonetheless)
I do have hope, however, because the Demucs v3 README mentions that model a couple of times and makes direct comparisons to it:

When trained only on MusDB HQ, Hybrid Demucs achieved a SDR of 7.33 on the MDX test set, and 8.11 dB with 200 extra training tracks. It is particularly efficient for drums and bass extraction, although KUIELAB-MDX-Net performs better for vocals and other accompaniments.

Ma5onic · 2022-10-02T17:27:52Z

After further investigation, I found that mdx-net uses the Demucs v2 code but downloads the Demucs v3 model. It can be installed without conflict by using anaconda/miniconda.
I also just realized that @ws-choi is one of the main contributors to that project (mdx-net).

dts350z · 2022-12-22T17:40:25Z

Can we have the Spleeter model with Piano (5 stems instead of 4)?

Ma5onic · 2023-01-15T22:17:13Z

@dts350z, I think that @JeffreyCA implemented the changes that you asked for:
See pull #458
The 5 Stem Spleeter model got merged to the main branch 😄

JeffreyCA changed the title ~~Support other source separation models~~ Support other source separation models (e.g. Demucs) Oct 11, 2020

JeffreyCA pinned this issue Oct 12, 2020

JeffreyCA added the enhancement New feature or request label Dec 18, 2020

JeffreyCA added this to the 1.2 milestone Dec 20, 2020

JeffreyCA changed the title ~~Support other source separation models (e.g. Demucs)~~ Support additional source separation models Dec 22, 2020

JeffreyCA modified the milestones: 1.2, 1.3 Dec 23, 2020

JeffreyCA removed this from the 2.0 milestone Jan 9, 2021
