
Increase maximum FFT resolution #176

Open · wants to merge 3 commits into base: develop
Conversation

@jwt27 commented Sep 13, 2020

The FFT resolution is currently limited by the maximum width of a canvas element, which is 32768 in most browsers. From what I read online, Internet Explorer even sets the limit as low as 8192.

To work around this, the waterfall can be split up over multiple canvas elements drawn side by side.
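The splitting idea can be sketched as follows (a hypothetical helper, not the actual PR code): divide one logical waterfall of `fftSize` pixels into chunks that each fit within the browser's per-canvas width limit.

```javascript
// Sketch (hypothetical names): compute the widths of the side-by-side
// canvases needed to cover `fftSize` pixels when no single canvas may be
// wider than `maxCanvasWidth`.
function canvasWidths(fftSize, maxCanvasWidth) {
    var widths = [];
    for (var remaining = fftSize; remaining > 0; remaining -= maxCanvasWidth) {
        widths.push(Math.min(remaining, maxCanvasWidth));
    }
    return widths;
}

// e.g. a 262144-point FFT with a 32768px limit needs 8 canvases.
```

For FFT sizes at or below the limit this degenerates to a single canvas, so the common case is unchanged.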

@jketterl (Owner)

Wow. What are you doing that you need more than 32k FFT points?

This will take some time to review; my knowledge of these code parts is limited, and I will also need to see if this impacts the rendering performance across various browsers.

The rendering performance could use some improvement in general. I'm no canvas expert, but I'm pretty sure that moving a more-or-less static image across the screen shouldn't take that much CPU power...

BTW: Internet Explorer is on my blacklist, I don't really care about that, it's probably mostly broken for some time now. I can't check it either, I don't have any running Windows installs.

@jwt27 (Author) commented Sep 13, 2020

I just like a high-resolution, high-contrast display. My server can easily handle it, and adpcm compression keeps the bandwidth down to reasonable levels. I imagine it's most useful for wideband SDRs like the RSPdx.

So far I haven't noticed any performance issues on my (reasonably fast) machine, the only weird thing I found is that at 256k resolution, clicking the auto-adjust color button freezes the UI. No idea why that happens...
If you do find that using multiple canvas elements has some performance impact, and IE compatibility is not an issue, then canvas_max_width can be increased to 32768. Then this patch will have no effect for FFT sizes less than that.

@jketterl (Owner)

What's happening when you're clicking that button is... on the next line coming from the server, the data is passed to Math.min() / Math.max() to determine the min/max levels. Since that happens using Function.prototype.apply(), it's a function call with about 200k arguments. You may have hit another limit there...

@jwt27 (Author) commented Sep 13, 2020

Thanks for the pointer, I worked around it by slicing the waterfall data into fixed size chunks and computing min/max in two passes. The slice size is set to 100k so it shouldn't affect performance on lower FFT sizes at all.

Is it necessary? Probably not, but I don't like arbitrary limits :)
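The workaround described above could look roughly like this (a sketch; the function name is assumed, and the chunk size is parametrized here purely for illustration):

```javascript
// Sketch of chunked min/max (not the exact PR code): slice the waterfall
// line and run Math.min/max per slice, so no single apply() call receives
// more than `chunk` arguments and engine argument limits are never hit.
function minMax(data, chunk) {
    chunk = chunk || 100000;  // the PR uses a 100k slice size
    var min = Infinity, max = -Infinity;
    for (var i = 0; i < data.length; i += chunk) {
        var slice = data.slice(i, i + chunk);
        min = Math.min(min, Math.min.apply(Math, slice));
        max = Math.max(max, Math.max.apply(Math, slice));
    }
    return [min, max];
}
```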

@jwt27 (Author) commented Sep 13, 2020

Also, I think it's better if the maximum zoom level and number of zoom steps were calculated from the configured FFT size (and perhaps the profile bandwidth), instead of hardcoding it in JS. Or alternatively, these settings could be placed in config_webrx.py where users can easily edit them. Let me know if you think that's a good idea or not.

@jketterl (Owner)

Yes, more flexibility is always a good idea, but I'm currently deferring most config-related stuff. The web configuration is a very important feature, but it's completely stalled right now. Anything that is added to the configuration right now is only going to make life harder there.

Also... I'm still intending to rework the waterfall and scale code. It's still mostly in its original, unmodified form, but the code structure makes it very hard to perform any actual modifications without breaking stuff. I have been refactoring many parts already, but that's still ahead of me.

@jwt27 (Author) commented Sep 17, 2020

Some feedback on your recent changes: I pulled in the latest commits and rebased my local changes on top of that. I found that I had to revert fa08f1e since it completely tanks performance, and OWRX becomes unusable with a large FFT size. When I run the performance profiler in my browser, I see that all CPU time is spent in waterfall_mkcolor(). I don't know what benefit chroma.js is supposed to bring; to me it seems it only introduces more overhead. Maybe if waterfall coloring could be done in multiple threads it would be better, but I don't know if that's possible in JS.

@jketterl (Owner)

chroma.js is a color library that probably does a better job at color interpolation than the former custom code, which appears to do a simplistic linear interpolation. The colors of the waterfall are defined by a series of color values, and since that series is typically shorter than the range of expected values, intermediate values need to be interpolated.
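For reference, that simplistic linear interpolation between gradient stops amounts to something like this (a simplified sketch, not the project's actual code):

```javascript
// Simplified sketch of linear color interpolation: `stops` is an array of
// [r, g, b] colors, and `t` in [0, 1] selects a point along the gradient.
function interpolate(stops, t) {
    var pos = t * (stops.length - 1);
    var i = Math.min(Math.floor(pos), stops.length - 2);
    var frac = pos - i;                  // position between stop i and i+1
    var a = stops[i], b = stops[i + 1];
    return [
        Math.round(a[0] + (b[0] - a[0]) * frac),
        Math.round(a[1] + (b[1] - a[1]) * frac),
        Math.round(a[2] + (b[2] - a[2]) * frac),
    ];
}

// interpolate([[0, 0, 0], [255, 255, 255]], 0.5) -> [128, 128, 128]
```

A dedicated library can interpolate in perceptually uniform color spaces instead of raw RGB, which is presumably the appeal of chroma.js here.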

I was experimenting with other color maps, and it seemed a simple move to integrate it into the waterfall since it was already present in the project (it's used for the map square colors). It comes with an implementation of the cubehelix colormap that I was interested in.

Even though I did anticipate it, I didn't notice any performance impact while working on it. I haven't done any measurements, though.

I will see if there's some replacement with better performance...

@jketterl (Owner)

results are baaaaaaaaad 🙁

[Screenshot: profiling results, 2020-09-18]

@jwt27 (Author) commented Sep 18, 2020

Yep, that doesn't look so good. I think it's better to leave the old code in for now, there is very little (if any) visual difference but the overhead is much less.

I did some googling and found some things that may be worth looking into. There are "web workers" which is basically multi-threading for JS. If you split up the canvas as I did in this PR, you could have multiple workers doing mkcolor in parallel. Also maybe the websocket I/O could be made to run on a different worker, to eliminate audio stutter when the UI thread is busy. Downside is that worker code has to be in a separate js file, you can't manipulate html from a worker, and data can not be shared between workers, only copied.
Another option, and this seems ideally suited for color mapping, is running mkcolor on the GPU with gpu.js. But this will not benefit clients that do not have a GPU (do those still exist?)
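Since workers can't share data, the per-worker split above boils down to index arithmetic; a hypothetical sketch of how the FFT line could be divided among n workers (names assumed, not part of the PR):

```javascript
// Sketch (hypothetical): assign each of `n` workers a contiguous
// [start, end) slice of an `fftSize`-wide line, distributing any
// remainder across the first few workers.
function workerSlices(fftSize, n) {
    var slices = [];
    var base = Math.floor(fftSize / n), rem = fftSize % n;
    var start = 0;
    for (var i = 0; i < n; i++) {
        var len = base + (i < rem ? 1 : 0);
        slices.push([start, start + len]);
        start += len;
    }
    return slices;
}
```

Each slice could then be copied to its worker via postMessage and colored independently.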

@jketterl (Owner)

There is already an AudioWorklet implementation in place because the ScriptProcessorNode is officially deprecated. The latter is still in use in most installations simply because the AudioWorklet requires a SecureContext, i.e. an SSL / https connection. That's a major drawback since most OpenWebRX installations don't implement SSL, and some are not even allowed to use it (amateur radio laws in Germany prohibit the use of encryption). So that doesn't see much use. And I'm quite happy about it, since whenever I use it, I experience audio dropouts, not just with OpenWebRX. I had spent many hours trying to find out how my implementation was wrong until I went to the demo page and all the demos had the same problem.

I don't want to end up with the same dilemma for the waterfall. I don't know the drawbacks or the state of the Web Workers API. There's another kind called "Service Workers" that requires a SecureContext as well. I'm not convinced this is the way to go.

The same applies to GPU-specific code: there are still enough clients that don't have GPU acceleration. I can even imagine that this relies on GPU-specific code in the browser, which is probably not always present either. I'm not going to go down that route if it means I will need to maintain two implementations.

Either way, this is completely going off topic by now.

@jketterl (Owner)

[Screenshot: benchmark results, 2020-09-20]

Pushed it around a bit; didn't expect to gain much, but it looks like it's about twice as fast now (results vary on consecutive runs, it's anything but scientific... but the new variant always comes out on top). It still uses chroma, but only on init. Test runs a million iterations.

@jwt27 (Author) commented Sep 20, 2020

Oh nice, that's a lot better. I rebased this PR on top of your latest changes now.

Quite surprising that it's faster now without the alpha channel. If this were C++ I'd keep the loop size per pixel at 4 so that the compiler can easily auto-vectorize it. Guess we don't have that luxury in JS.

@jketterl (Owner)

I suppose JavaScript wouldn't be able to vectorize since it's not compiled ahead of time. Not too sure, though.

Given the old interpolation code though, I don't think the omission of the alpha channel was the most relevant change.

    for (var i = 0; i < 4; i++) {
        var add = ((((first & (0xff << (i * 8))) >>> 0) * percent) + (((second & (0xff << (i * 8))) >>> 0) * (1 - percent))) & (0xff << (i * 8));
        output |= add >>> 0;
    }

I'm guessing it's the amount of multiplications that made it slow. The shifting should be rather cheap, but all in all... this does way too much, and given that first and second always come from the same range of values, this is certainly repetitive. The new interpolation is down to a single multiplication, no shifting, no loop. I don't really think that the alpha channel is of much use in this application, anyway.

chroma.js has an additional cache on the scaling, that seemed to speed things up (I tried the test with the cache off, results were about tripled in time). So there's an idea to make this even faster. I'm not too bothered right now, though.
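Using chroma only on init essentially turns per-pixel coloring into a palette lookup. A sketch under that assumption (the gradient function and the 256-entry table size are stand-ins, not the actual implementation):

```javascript
// Sketch: precompute a fixed-size color table once at init, then map each
// FFT sample to a table index at draw time. `gradient` stands in for
// whatever expensive color function (e.g. a chroma.js scale) is in use.
function buildPalette(gradient, size) {
    var palette = new Array(size);
    for (var i = 0; i < size; i++) {
        palette[i] = gradient(i / (size - 1));  // expensive, but done once
    }
    return palette;
}

// Per-pixel work is then a clamp, one multiplication and an array index.
function colorFor(palette, value, min, max) {
    var t = (value - min) / (max - min);
    t = Math.max(0, Math.min(1, t));
    return palette[Math.round(t * (palette.length - 1))];
}
```

The trade-off is quantization to the table size, which is invisible for typical waterfall dynamic ranges.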

@jketterl (Owner)

Alright, finally found some time to review this. Sorry it took a while, but I did want to get the release out the door, since people were becoming impatient about the new features.

I have tested this on my local machine, and for some reason it makes my waterfall somewhat "jumpy". It looks somewhat like on an unstable connection where data would get buffered, except that this shouldn't happen on localhost (and doesn't happen without the change).

I've done some profiling, but the before and after results are pretty much the same... Not sure what to make of this. Is there any kind of rendering back-buffer that gets triggered by this change?

@jwt27 (Author) commented Oct 17, 2020

The FFT data still goes straight to the canvas, there's no additional buffering involved as far as I'm aware.
As I said before, I haven't noticed any difference in performance with this patch. Do you see this "jumpiness" even with the default FFT size (or <= 8192)? You could try increasing canvas_max_width to 32768 to reduce the number of canvases being drawn.

@jketterl (Owner) commented Nov 4, 2020

Alright, this has once again slipped off the radar. There's too much going on.

I know that the data is going straight in, just as before, that's why I was asking. I don't know what could be the cause of this, and as I said, it doesn't show up in any profiling reports.

I have performed all my testing with the default 4096 FFT bins so far, I wanted to see if the default behavior would be impacted.

Given that this is a purely subjective impression, I really don't know what to do. I can repeat the test, but other than that...
