Support for FLAC compressed samples? #605
Comments
SF4 seems to be very unknown. So far we only have one reply: https://lists.nongnu.org/archive/html/fluid-dev/2020-01/msg00007.html At least it's a positive one. I'm also positive because it should integrate into the current code base very easily. In case anybody is willing to draft a PR, you're welcome. (Not sure when/if I find time for that.) |
As already written in the thread on fluid-dev, I retract my positive reply. I thought I could reduce my loading times with FLAC compressed soundfonts, but that doesn't actually work. @mxmilkb Do you have a good use-case for FLAC compressed soundfonts? I agree that it seems like a sensible idea in principle... but I think that's not enough to create yet another non-standard SF2 variant. |
So great to catch live talk related to sfz-ext standards. As far as I understand, adding compressed samples to sfz is just a matter of grep/sed-ing through its text description, e.g. to replace extensions. Just look at this FR: davy7125/polyphone#7 Unfortunately, the idea of supporting every possible format via libsndfile was not even commented on. Any pros and cons about that? Why should it ever require a separate (sub-)format version just to add a single audio file format? |
This thread is about extending SF2, not SFZ. |
There's further discussion on divideconcept/FluidLite#16 I really don't mind by what mechanism, but the use case for FLAC sound banks is plain and simple; smaller yet still lossless files. |
Just visited the sf2convert sources page. Now I'm lost as to what sf3 is. MuseScore uses that name for an extended sfz, differing only by Ogg support. Edit: Another probably good format for sf2/sfz support, which just came to my mind: Opus. |
SF3 and the proposed SF4 are SF2 extensions. It has nothing to do with SFZ whatsoever. |
Well, thanks. Meanwhile I visited musescore soundfonts page, and it states same :/ (how could I miss that) |
Given that we went from SF2 (no compression) to SF3 (Vorbis compression), it seems a bit weird to then go to SF4 (FLAC compression). This means an application could conceivably support SF2 and SF4, but not SF3, just because it doesn't support Vorbis compression. Also, where do we go for Opus compression? SF5 anyone? So we're already thinking about SF5, and yet nothing has changed since SF2 except the type of compression! Perhaps it would be better to do something like SF2X where X is the format. E.g.:
This makes it almost like It's unfortunate that SF3 already breaks this trend, but it's non-standard anyway so it doesn't really matter. |
As already posted on the mailing list: This is not about bumping the Soundfont's major version every single time we want to support a new compression! Bumping to version 3 was necessary due to breaking changes in the sample indexing. No further bumps are required just to support additional compressions. Please scratch that thought from your brains. I changed the title to clarify that. What we're really looking for are people who have a use-case for FLAC, OPUS, or what-so-ever compressed samples.
The compression is sample-specific. That is, each sample may be compressed differently than the others. Thus, you cannot name the whole file after its compression. |
When I experimented with converting my banks (sorry, again SFZ) to FLAC, the essential trade-off was just that compression costs some CPU time during loading, similar to the Linux kernel loading scheme. Why it really matters for me: much lower space usage. Just as WAV is the worst way to store music, a sampler bank collection also needs at least some compression. Even the 2x to 3x reduction in space that is most typical for FLAC looks valuable. |
We are looking for real-world use cases for FLAC (or opus, or whatever) compressed samples in SF3 soundfonts. I think we all agree that compression is a good concept in general. But so far nobody has come forward stating an actual problem that FLAC would solve (again: for SF3 soundfonts). I thought it might solve my loading time problem, but that turned out to be not true. |
For sf2 there could be SF2C (sf2 compressed) or so file ext. Edit:
Ok, but I noticed some discussion about possible solutions and proposed own). |
Thanks for that. But the technical aspect is not really the issue here. Adding FLAC support to SF3 is nearly trivially easy to implement. The question is: does anyone have a really good use-case that justifies creating yet another non-standard extension to the already non-standard and undocumented SF3 format? |
Hm... If sf3 is not standard, then why worry. But even then - considering it's like standard version (v2 for sf2, v3 for sf3), it could be like v3.1. Still sf3, but even further improved. Adding more sample format support doesn't look breaking. At least unless it's baked as standard. |
Because we are not alone in the world but part of a larger community of projects that read and write SoundFonts. Of course we could simply create our own format and write a tool that converts SF2 to FLAC compressed SF3. But even for internal use we would probably want to properly document and specify it. And maintain it for the foreseeable future. Regardless of the standardisation question... the question is: why should we do that, why spend time and effort on it, what is the real-world use-case here? And limiting it to a FluidSynth-only extension doesn't remove that question. Now the question would be: who has a use-case for FLAC compressed SF3 SoundFonts that only work with FluidSynth? |
Oh, interesting to see this. I’ve requested FLAC-compressed samples (for its lossless property) in SF3 soundfonts on the MuseScore side years ago. This could be done in SF3 because you can put FLAC into an Ogg container, and SF3 basically specifies Ogg containers (or could be read as doing so). Sadly, nobody on the MuseScore side even understood why I would wish for that… Use case is easy: soundfonts get huge (MuseScore_General_HQ is currently 467 MiB as SF2, 82 MiB as Vorbis-compressed SF3). I’d estimate it to need less than 200 MiB as FLAC-compressed SF3, and FLAC can be used as source format for editing, whereas Vorbis is lossy. But, of course, MuseScore (Cc’ing @anatoly-os because he could spread the word on their side) would have to support it in their internal fluid synthesiser as well. (Also, Vorbis currently adds the hard-coded string |
But the question remains: what is the use-case of having a 200 MB vs. 467 MB Soundfont? I mean: why do you need smaller soundfonts? Do you have 200 different Soundfonts on your machine and have run out of disk space? Or do you need to transfer the soundfonts via low-bandwidth connections? (Not trying to be difficult here... I just want to make sure there is an actual use-case and not just the "smaller is always better" argument). |
Marcus Weseloh dixit:
Or do you need to transfer the soundfonts via low-bandwidth
connections?
Ever been to Germany? ;-) So, yes.
(Also, been helping out some people stuck on crap devices
like the Raspberry Pi recently. Anything to reduce size
may help.)
bye,
//mirabilos
|
I see, it supports libsndfile. Just like linuxsampler, which simply takes what libsndfile can read. If sndfile fails, then that's a real obstacle. |
Yes, that's where I mostly stay. :-)
Not sure I follow: how would reducing on-disk size of Soundfonts help the Raspberry Pi? |
Marcus Weseloh dixit:
Not sure I follow: how would reducing on-disk size of Soundfonts help
the Raspberry Pi?
I’ve hopes that a mode will be implemented in which not all samples
for all instruments need to be uncompressed ahead of time (only for
instruments actually needed).
Other than that, package download and SD card size of course, though
that applies to other devices as well.
bye,
//mirabilos
|
If your goal is to reduce RAM consumption, we already have such a feature: |
For the Pi specifically, yes. Network/disc usage is a more general goal. Both are somewhat orthogonal, admittedly.
Oh, nice! @anatoly_os we need that in MuseScore. |
After testing out #652 and needing to download a 4GB soundfont, I do see some value in FLAC compression to reduce file size :-) |
Ok... in my opinion, before we can even think about extending SF3 to support different encoding formats, we need a specification for the current SF3 format. I've created a first draft in the FluidSynth wiki here: https://github.com/FluidSynth/fluidsynth/wiki/SoundFont3Format Please note that this page is meant to document the current SF3 format, not about extensions for other encoders or other changes. Any comments, clarifications, error corrections highly welcome. |
Marcus Weseloh dixit:
Please note that this page is meant to document the *current* SF3
format, not about extensions for other encoders or other changes. Any
Sure. We might wish to get @anatoly_os and @lasconic at the very
least from the MuseScore side into the boat.
comments, clarifications, error corrections highly welcome. @derselbst
is there a way to make this page editable for everybody? Or do you
think there would be a better place to discuss and document the SF3
format?
Unsure if there is one, but the musescore/sftools repository also
contains hidden knowledge, and both @ChurchOrganist (I think) and
@mrbumpy409 (definitely) have worked with and tweaked it.
|
Thanks for the wiki page @mawe42. I think that's a good place. "All samples in an SF3 file MUST BE Ogg Vorbis compressed, there is no support for mixing uncompressed (SF2) and compressed (SF3) samples." Marcus, it's not clear to me what makes you think so? Using
Every authenticated GH user should already be able to edit it. |
One more question @derselbst: is this issue the best place to discuss the SF3 spec/description? Ideally we would want to get input from even more people, MuseScore, Polyphone, and others. Not sure what the best approach would be here... |
I understand your point. But keep in mind that this tool was designed by Werner with the sole intention to reduce file size of a SF2. Thus it's only natural to compress all samples. Also, adding support for mixed sample compression would have tremendously increased development effort.
I think this is not quite correct. My argument is that, starting with major version 3, indexing was changed to bytes regardless of the compression used. Look at this section If
Yeah, I admit, that's tricky. To me, this seems to be underspecified in SF3, so we do have some room for defining this case.
For fluidsynth, we usually discuss on the mailing list. But since this issue has already received quite a lot of interest here on GH and because the community of the mentioned projects also (partly) lives on GH, I think it's better to keep the discussion here. |
You are right. But the only other software I know that can write SF3 files (Polyphone) also compresses all samples. Then again, there is no reason to prevent mixing... only the fact that SF3 is not specified and how programs like Polyphone, MuseScore, FluidSynth deal with mixed files might differ currently (see below).
Well, that is open to interpretation (as nearly everything in SF3). The behaviour of the sf3convert tool is not a clear indicator. As it only writes SF3 when compression is enabled, and then compresses every sample, the fact that byte indices are used could simply be because all samples are compressed, not because SF3 means all indices are byte indices. And in fact, our SF3 implementation only uses byte indices for samples with OGG_VORBIS sample type. So if we were to load a mixed SF3 now, things would probably break. But you are right, of course. My "All samples in an SF3 file MUST BE Ogg Vorbis compressed" is also an interpretation. Hm.... maybe we should approach this differently. Decide how things should be and make sure that the final spec is compatible with currently existing SF3 files and implementations?
Ok, then tonight I'll write in the MuseScore forum and invite the devs here and also contact Polyphone and the fluid-dev mailing list. |
For discussion about possible new features for the SoundFont format, see the proposal for an upgraded SoundFont spec by the author of Polyphone here. |
S. Christian Collins dixit:
For discussion about possible new features for the SoundFont format,
see the proposal for an upgraded SoundFont spec by the author of
Polyphone [here](https://github.com/davy7125/soundfont-standard-v3).
Interesting. I commented on some things of it, but github makes
that harder than it should be, needing to do line comments on
commits, which are almost not discoverable by anyone else.
|
Marcus Weseloh dixit:
You are right. But the only other software I know that can write SF3
files (Polyphone) also compresses all samples. Then again, there is no
Polyphone just uses a fork of (possibly an old version of) the MuseScore
sftools code: davy7125/polyphone#105
Well, that is open to interpretation (as nearly everything in SF3). The
behaviour of the sf3convert tool is not a clear indicator. As it only
The MuseScore code to read SF3 codes only uses byte offsets if the
*sample* type is Ogg Vorbis:
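    if (sampletype & FLUID_SAMPLETYPE_OGG_VORBIS) {
        // compressed sample: start is used as a byte offset
        if (!fd.seek(sf->samplePos() + start))
            return;
    }
    else {
        // uncompressed sample: start counts 16-bit sample points
        if (!fd.seek(sf->samplePos() + start * sizeof(short)))
            return;
    }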
But maybe this can be changed, if SF3 is indeed extended to support
multiple formats.
And in fact, our SF3 implementation only uses byte indices for samples
with OGG_VORBIS sample type. So if we were to load a mixed SF3 now,
things would probably break.
Ah, fluid as well then.
Ok, then tonight I'll write in the MuseScore forum and invite the devs
here and also contact Polyphone and the fluid-dev mailing list.
That’s probably for the best.
|
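The branch quoted above can be restated as a small Python sketch. It assumes, as in the quoted snippet, that `start` is a byte offset for Ogg Vorbis samples and a count of 16-bit sample points otherwise. The flag value `0x10` and the function name are illustrative assumptions, not the API of MuseScore or FluidSynth.

```python
# Assumed flag value marking an Ogg Vorbis compressed sample in the
# sample type field (hypothetical constant name).
SAMPLETYPE_OGG_VORBIS = 0x10
BYTES_PER_SAMPLE_POINT = 2  # uncompressed SF2 sample data is 16-bit PCM


def sample_file_offset(sample_pos: int, start: int, sampletype: int) -> int:
    """Return the absolute file offset at which a sample's data begins.

    sample_pos is the file offset of the sample data chunk; start is the
    sample header's start field, interpreted according to the sample type.
    """
    if sampletype & SAMPLETYPE_OGG_VORBIS:
        # Compressed sample: start is already a byte offset.
        return sample_pos + start
    # Uncompressed sample: start counts 16-bit sample points.
    return sample_pos + start * BYTES_PER_SAMPLE_POINT
```

A file mixing compressed and uncompressed samples only loads correctly if the writer and the reader agree on this per-sample interpretation, which is the ambiguity discussed above.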
Very interesting, thanks! That looks like quite a large and comprehensive proposal. I'm not sure how much it will actually help in our case, though... as Pandora's box is already wide open. There are .sf3 files already out there and have been for quite some time now. Quite a few software packages have already implemented support for them. My personal aim here was to "rein in" those implementations by actually specifying the current SF3 format. And then maybe extend it with additional encoders/compression algorithms. But try to keep the changes relative to SF2 to a minimum. Definitely don't change the synthesis model of SoundFont 2... but again, this is just my personal opinion. |
I got lost in the "SF Version 3 Draft Specification". The fact that this exists makes it even more important to specify the SF3 format we and all the other projects out there already support. My initial idea was to document what is currently implemented and then fill in the gaps. But thinking about it more (and having read what other people think SF3 should or could be), I feel that there are so many gaps that the discussion might lose focus. So I propose that we at FluidSynth create a draft spec for the current SF3 that is more refined than what is currently on the Wiki page and come back here once it's done, for feedback, discussion and refinement. |
Not sure if this is the best place to discuss this, but today I found out about SF3/SF4 and the sad state this unofficial extension is in. Here are some observations from my side about ambiguities I found. Comments on how they should be treated are very welcome:
|
It was already a year ago that Marcus, JJC and I had a private conversation about this. At first we wanted to write down the current de-facto SF3 standard before going for public discussion. There was one last open point: we could not agree on the size constraints of the SM24 chunk. Our status quo is already in the wiki. It should answer your questions: https://github.com/FluidSynth/fluidsynth/wiki/SoundFont3Format |
Thanks, I did overlook the mention of the padding the first time I read it. I think it would be good if that article could acknowledge the issues with cognitone/sf2convert: I'm not sure if there is any other tool at the moment that would write SF2+FLAC files, so any SF2 files with FLAC samples that currently exist will not be using the 0x10 sample type. While this situation is ugly in many ways, maybe the specification could at least assume that all samples are compressed if the major version is 4. That way, at the time of writing, compliant SF2+Vorbis files can be written with MuseScore's tool, Vorbis support in cognitone's tool remains broken, but at least their SF2+FLAC files could be read properly. I know it's ugly, but so is telling people that all their files are broken and that no tool can handle them. |
IMHO, SF4 does not exist. It's a quick hack entirely made up by cognitone's tool. The major bump from 2 to 3 was necessary because of the semantic changes in the SMPL chunk. But bumping the major version every time you just want to support a new compression is pointless and poorly engineered. People who were using cognitone's tool should be aware that this is a highly experimental, unstable, likely-to-change and - by now - unmaintained solution. As the tool is apparently broken, all its files are broken - If it helps, I can mention this on the wiki page, no problem. |
I fully agree that it's badly engineered - but that sadly doesn't change the fact that such files might be floating around in the wild, and people might want to be able to open them. Anyway, in the meantime I have found that this fork of the cognitone tool introduces SF3+FLAC support: https://github.com/KKQ-KKQ/sf2convert/ - so I hope that makes SF4 less relevant all in all. |
SF3+FLAC support is not really necessary. FLAC already supports compressing SF2 files really well. I'm getting 4:1 compression (75% reduction) SF2->FLAC. Contrast that to only 30% smaller when using macOS "compress" in Finder. This is 99.99% as good as FLAC gets. Yes, I am compressing the SF2 header with FLAC! Lossless has its perks =) The SF2 header only takes up like 0.01% of the compressed FLAC file, which is negligible. Input SF2: 25.7 MB (akai_steinway.sf2)
FLAC: 7.3 MB (akai_steinway.flac)
Reconstructed SF2: 25.7 MB (/akai_steinway_decompressed.sf2) Yes. The SF2 files are really the same still.
|
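As a quick check of the figures quoted above, the ratio and reduction can be computed directly from the two file sizes. This is only arithmetic on the numbers given in the comment (25.7 MB in, 7.3 MB out); the function name is made up for illustration.

```python
def compression_stats(original_mb: float, compressed_mb: float) -> tuple[float, float]:
    """Return (compression ratio, fractional size reduction)."""
    ratio = original_mb / compressed_mb
    reduction = 1.0 - compressed_mb / original_mb
    return ratio, reduction


# File sizes as quoted for akai_steinway.sf2 and akai_steinway.flac.
ratio, reduction = compression_stats(25.7, 7.3)
print(f"{ratio:.2f}:1, {reduction:.1%} smaller")  # prints "3.52:1, 71.6% smaller"
```

For this particular file the quoted sizes work out to roughly 3.5:1, a little below the rounded 4:1 (75%) figure.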
Chip Weinberger dixit:
SF3+FLAC support is not necessary. FLAC already gives 4:1 lossless
compression on SF2 files.
**Input SF2: 25.7 MB**
`flac --force-raw-format --endian=little --sign=signed --channels=1 --bps=16 --sample-rate=48000 /akai_steinway.sf2 -o /akai_steinway.flac`
**FLAC: 7.3 MB**
But the resulting FLAC file is not a valid SF2 file, which
makes it unusable.
bye,
//mirabilos
|
I am now joining the discussion after a long period of silent reading. I am the author of the Sobanth engine, which since FLStudio 20.9 is also the engine of the FLStudio soundfont player, and thus probably now the most widespread SF2-compatible engine on the end-user side. And now to the actual topic: Sobanth supports Ogg Vorbis and FLAC sample data more or less according to the current SF3 concept, even if still only unofficially. However, in Sobanth the header at the beginning of the sample raw data ultimately decides whether a sample in an SF3 SMPL chunk is loaded as Ogg Vorbis or FLAC. That way it would in principle also be possible to support e.g. Opus later on without changing the SF3 file format itself any further, as long as the audio codec is distinguished by the header at the beginning of the sample raw data. |
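Distinguishing the codec by the header at the start of the sample raw data could look roughly like the sketch below. It is not Sobanth's code. It relies only on well-known stream markers: native FLAC streams start with `fLaC`, Ogg pages start with `OggS`, and the first packet in an Ogg stream identifies Vorbis (`\x01vorbis`), Opus (`OpusHead`) or Ogg-encapsulated FLAC (`\x7fFLAC`). The function name and the returned labels are assumptions for illustration.

```python
def sniff_sample_codec(data: bytes) -> str:
    """Guess the codec of one sample's raw data from its leading bytes."""
    if data.startswith(b"fLaC"):
        return "flac"
    if data.startswith(b"OggS") and len(data) >= 27:
        # An Ogg page header is 27 bytes; byte 26 holds the number of
        # entries in the segment table that follows the header.
        n_segments = data[26]
        payload = data[27 + n_segments:]
        if payload.startswith(b"\x01vorbis"):
            return "vorbis"
        if payload.startswith(b"OpusHead"):
            return "opus"
        if payload.startswith(b"\x7fFLAC"):
            return "ogg-flac"
        return "ogg-unknown"
    return "unknown"
```

With a check like this, the sample type flag only has to say "compressed"; which decoder to use follows from the data itself.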
Benjamin Rosseaux dixit:
However, at Sobanth the beginning "header" inside the sample raw data
is then ultimately decisive for the distinction whether a sample with
SF3-SMPL chunk is then loaded as OGG Vorbis or FLAC. Because so it
would be in principle later also possible to support e.g. Opus and so
on, for which the SF3 file format itself would not have to be changed
any more, if the "audio codec" distinction takes place by beginning
header inside the sample raw data.
True. I’d kinda hoped to be able to do away with fixed headers and
magic numbers to determine the format eventually. Perhaps we could
even store only the part of each sample beyond a fixed header; that is
something I had hoped for to reduce their size… I’m especially on the
war path against the Vorbis encoder putting a *long* string into
*each single sample*, namely:
| Xiph.Org libVorbis I 20200704 (Reducing Environment)
This is a whopping 52 bytes saving per sample! In a typical soundfont
the string occurs 2202 times, ending up eating about 128 KiB of extra
disc a̲n̲d̲ RAM space!
Okay, let me dream 😾
bye,
//mirabilos
|
Of course. But what I mean is that it's much simpler from a standardization standpoint to just FLAC encode the SF2, and have synths support a *.sf2flac format. Standards are best when they are simple. Even if "proper" SF3+FLAC became a thing, 99% of synths will still decompress all the samples when loaded, using the same amount of RAM either way. In other words, "proper" support is not worth the complexity. Just standardize using *.sf2flac! In just 2 lines of terminal code, I've described the entire format. Compare that to SF3+FLAC! |
Chip Weinberger dixit:
> But the resulting FLAC file is not a valid SF2 file, which makes it unusable.
Of course.
But what I mean is that it's much simpler from a **standardization**
standpoint to just FLAC encode the SF2, and have synths support a
*.sf2.flac format
Absolutely not, because then you first have to decode the ENTIRE
thing to memory, instead of just the instrument(s) you actually
need (which is usually less than 1% of the soundfont).
standards are best when they are simple.
Without compromising usability, then I agree.
bye,
//mirabilos
|
It's also not exactly helpful when an application supports both soundfonts and FLAC samples as instrument sources. How should it then decide how that FLAC file should be treated? Just because the first few bytes look like a RIFF header after decompression doesn't have to mean anything. |
I also have quite a few SF2 files that are larger than 500 MB, some even several GB, of which I almost always use only one or a few instruments out of the hundreds or thousands of instrument patches in these files. It would be very counterproductive to have to load the whole soundfont into RAM and decompress it just to use a few instrument patches from it. And besides, I can only agree with what sagamusix has already said about the resulting file format recognition headache. In other words, your idea is well-intentioned, but poorly thought out and not practical in the real world, which is why it is not feasible. |
Who says you need to load it all in memory? After you get your original SF2, you can just load the instruments you need. Programs don't even need to keep the entire SF2 file in storage, if they want to make that optimization. Just discard bytes for instruments you don't need as you decode. Flac decode is quick and requires little ram. *.sf2flac files cover all your needs, and are a very simple standard.
The *.sf2flac file extension, of course! The main way we know how to treat most files. A more complicated flac standard would really need to justify its existence in light of sf2flac already existing. (yes it does exist now, as of my first comment :) |
To give some perf numbers,
|
Chip Weinberger dixit:
flac decompression time: 1.35 s (M1 Macbook Air)
Now this is unfair.
Do it on a RPi 2 with 1 GiB RAM in total.
Without swapping, because that kills the card.
This is the scenario I was looking for (and which
I was actually supporting an actual user on, with
MuseScore).
bye,
//mirabilos
|
A lot of well-written applications do not need to rely on file extensions. Not every data stream that you may want to feed into an application is going to have a filename associated with it. Your "simple" design is making software that doesn't care about filenames needlessly complex or impossible to write. If you really want an SF2-derived format which stores all samples in a single continuous FLAC stream, it already exists: It's called SF2PACK. No need to re-invent it. |
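Recognising a SoundFont from its content rather than its filename takes only a few bytes of the stream. A minimal sketch, assuming the standard RIFF layout in which an SF2 file starts with `RIFF`, a 4-byte size, and the form type `sfbk`; the function name is hypothetical.

```python
import io
from typing import BinaryIO


def looks_like_soundfont(stream: BinaryIO) -> bool:
    """Check the 12-byte RIFF header for the SoundFont form type 'sfbk'."""
    header = stream.read(12)
    return (
        len(header) == 12
        and header[0:4] == b"RIFF"
        and header[8:12] == b"sfbk"
    )


# Works on any byte stream, with or without an associated filename.
riff = b"RIFF" + (4).to_bytes(4, "little") + b"sfbk"
print(looks_like_soundfont(io.BytesIO(riff)))          # True
print(looks_like_soundfont(io.BytesIO(b"fLaC" * 3)))   # False
```

The check consumes 12 bytes, so on a non-seekable stream the caller has to keep them for the actual parser. A whole-file FLAC wrapper would defeat this kind of check, since the stream then starts with `fLaC` like any other FLAC audio.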
And besides, it is still the completely wrong approach, file format design-wise. This only makes loading more complicated, because you have to put a FLAC decoder in front of it that is also able to skip byte ranges. And what if an SF2 loader implementation also wants to do random seeking, e.g. to reload samples on demand? Side info: I have implemented my own FLAC decoder from scratch for Sobanth (and for my own WIP DAW). But not every SF2-compatible engine developer is able to implement their own FLAC decoder, or to extend or hook an existing FLAC decoder so that it can skip byte ranges accordingly. And besides, FLAC decoding then still costs CPU time unnecessarily, also for the byte ranges to be skipped. The bottom line is that this idea is simply not practical, even if it was well intended. |
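The trade-off argued over in the last few comments can be illustrated with a small sketch. Python's standard library has no FLAC decoder, so `zlib` is used here purely as a stand-in for any sequentially compressed stream; the function name and chunk size are assumptions. Extracting one byte range does not require keeping the whole decoded file in memory, but everything before that range still has to be decoded.

```python
import zlib


def extract_range(compressed: bytes, start: int, length: int,
                  chunk_size: int = 4096) -> tuple[bytes, int]:
    """Return (wanted bytes, total bytes decoded) from a zlib stream.

    Decoded data outside [start, start + length) is discarded immediately,
    yet it still has to pass through the decoder first.
    """
    decoder = zlib.decompressobj()
    wanted = bytearray()
    decoded = 0  # number of decoded bytes seen so far
    end = start + length
    for i in range(0, len(compressed), chunk_size):
        block = decoder.decompress(compressed[i:i + chunk_size])
        lo = max(start - decoded, 0)
        hi = min(end - decoded, len(block))
        if lo < hi:
            wanted += block[lo:hi]
        decoded += len(block)
        if decoded >= end:
            break  # nothing after the wanted range needs decoding
    return bytes(wanted), decoded
```

Fetching 16 bytes near the end of a 1 MB payload returns a `decoded` count at least as large as the end offset of the range. That is the CPU cost referred to above, even though memory use stays small.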
SF3 uses the Vorbis lossy format, which, whilst offering a reduced file size, comes with a reduced sample quality that would often be unsuitable for a professional audio environment.
SF4 is essentially a similar modern variation on SF2, albeit using losslessly compressed FLAC audio samples, offering a balanced middle-ground between SF2 and SF3.
(The SFZ format already supports FLAC samples.)
There is scant support for SF4 as of yet - there is a conversion script available, https://github.com/cognitone/sf2convert - but it does seem an eminently sensible idea.