decode, computing efficiency #9

Open
ea4gmz opened this issue Feb 19, 2024 · 11 comments


ea4gmz commented Feb 19, 2024

Hello,
This is not really an issue or a bug, but a question. On a Linux computer with an i5 CPU, the decode command takes 50 ms. I have also compiled your program on a Raspberry Pi 1 Model B with a BCM2835 (ARM1176JZF-S, 700 MHz). There, decode takes exactly 10 seconds to process an audio file. I must say the program works and gives a nice result. Do you think this is a reasonable time, or could it be improved, perhaps by compiling it differently? It puzzles me that the RPi is 200 times slower than a regular computer.
The point of using an RPi is to run a digipeater and HTTP gateway non-stop, instead of using a desktop PC.
Thank you and best regards.

Member

xdsopl commented Feb 20, 2024

That CPU does not have NEON, so NEON has to be emulated for the Polar list decoder. It also has very little cache.
You could try lowering the list size or replacing list decoding with normal decoding, and also lowering the OSD decoder order or replacing it with RS decoding, but all of this will reduce receiver performance by a large margin.
Instead, maybe you should get a more modern CPU with NEON, like the one in the Raspberry Pi 400: it needs only 150 ms to decode Rattlegram messages with the untouched code from the short branch, and only 20 ms when the OSD order is lowered to 3 instead of the default 4.

Member

xdsopl commented Feb 20, 2024

You should probably play with the OSD ORDER first and set it to two:
CODE::OrderedStatisticsDecoder<255, 71, ORDER> osddec;

And then change the list SIZE to maybe four:
typedef SIMD<code_type, SIZE> mesg_type;

See if that helps.
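
To get a feel for why the OSD order matters so much, here is a rough sketch that counts how many candidate error patterns a textbook ordered statistics decoder would test for a given order. It assumes that the template arguments `<255, 71, ORDER>` mean code length 255, message length 71 and reprocessing order, and that order-`o` reprocessing tries every pattern of up to `o` bit flips among the 71 most reliable positions. The function names `binomial` and `osd_candidates` are made up for this illustration and are not part of the CODE repository; the real decoder may organize its search differently. It needs C++14 or newer.

```cpp
#include <cassert>
#include <cstdint>

// Binomial coefficient C(n, k). Each intermediate product is exactly
// divisible by (i + 1), so integer division loses nothing.
constexpr std::uint64_t binomial(unsigned n, unsigned k) {
    if (k > n)
        return 0;
    std::uint64_t result = 1;
    for (unsigned i = 0; i < k; ++i)
        result = result * (n - i) / (i + 1);
    return result;
}

// Number of test patterns with at most `order` flipped bits among `k`
// positions: sum of C(k, i) for i = 0 .. order.
// Assumption: textbook OSD reprocessing, not necessarily what CODE does.
constexpr std::uint64_t osd_candidates(unsigned k, unsigned order) {
    std::uint64_t total = 0;
    for (unsigned i = 0; i <= order; ++i)
        total += binomial(k, i);
    return total;
}

// Checked at compile time for k = 71.
static_assert(osd_candidates(71, 4) == 1031347, "order 4 (default)");
static_assert(osd_candidates(71, 3) == 59712, "order 3");
static_assert(osd_candidates(71, 2) == 2557, "order 2");
```

Under these assumptions, going from order 4 to order 3 cuts the candidate count by a factor of about 17, and going from order 4 to order 2 by a factor of about 400. Actual run time will not scale exactly like this, since the rest of the decoder also takes time.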

Author

ea4gmz commented Feb 20, 2024 via email

Member

xdsopl commented Feb 20, 2024

decode.cc of course ;-)

Author

ea4gmz commented Feb 22, 2024 via email

Member

xdsopl commented Feb 23, 2024

keep me posted

Author

ea4gmz commented Feb 29, 2024

Hello,

My findings so far.

As I posted above, decode modified according to your notes now runs very fast on an RPi 1. I wanted to compare how robust the modified version is, so I planned to record signals and have them processed by both versions of decode. I wanted to do that on a PC because it runs faster. Now I see that the modified program ends with a segmentation fault after showing the Es/N0 data. This only happens on the PC; it runs fine on the RPi. I also tried downloading the latest version of "modem" and compiling it again, unmodified, and it also segfaults on the PC, but it works on the RPi.
Summary:
modem code, weeks old, unmodified. PC: ok. RPi 1: ok, but takes 10 s to decode.
modem code, weeks old, modified. PC: segfault. RPi 1: ok, fast, robustness not yet compared.
modem code, cloned today, unmodified. PC: segfault. RPi 1: ok, but takes 10 s to decode.
I can provide more detailed information such as logs, scripts, etc.
Regards

Member

xdsopl commented Mar 1, 2024

You should also update the code repository. I made a lot of SIMD-related changes lately that might be causing the problems you see. If that does not help, please add more info here and give me the output of:
grep -m 1 flags /proc/cpuinfo
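
For anyone who would rather check this from a program, here is a minimal sketch that finds the CPU feature line and tests it for one feature name. The helper names `first_feature_line`, `has_feature` and `cpu_has` are hypothetical and not part of the modem code. It accepts a line labeled `flags` (as on x86) or `Features` (the label ARM kernels usually use, so the `grep` above would typically print nothing on the Pi).

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Return the first line that starts with "flags" or "Features",
// or an empty string if there is none.
inline std::string first_feature_line(std::istream &in) {
    std::string line;
    while (std::getline(in, line)) {
        if (line.rfind("flags", 0) == 0 || line.rfind("Features", 0) == 0)
            return line;
    }
    return "";
}

// True if `name` appears as a whole whitespace-separated token after
// the ':' of the feature line (so "avx" does not match "avx2").
inline bool has_feature(const std::string &line, const std::string &name) {
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos)
        return false;
    std::istringstream tokens(line.substr(colon + 1));
    std::string token;
    while (tokens >> token) {
        if (token == name)
            return true;
    }
    return false;
}

// Convenience wrapper for Linux; returns false if /proc/cpuinfo
// cannot be read.
inline bool cpu_has(const std::string &name) {
    std::ifstream in("/proc/cpuinfo");
    return has_feature(first_feature_line(in), name);
}
```

For example, `cpu_has("neon")` would be expected to return false on a CPU without NEON, and `cpu_has("avx2")` tells you whether a PC advertises AVX2.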

Author

ea4gmz commented Mar 13, 2024

Hello
The modified program runs fast on a Raspberry Pi 1, and its sensitivity, i.e. its ability to decode weak signals, seems just as good as that of the original program. I am doing more experiments. I plan to set up a permanent digipeater and RF-to-HTTP gateway running on this old RPi.
regards

Member

xdsopl commented Mar 14, 2024

That's good to hear. I made more improvements in the DSP and CODE repositories and would be interested to know how well they work on the old Pi as well.


LieBtrau commented May 13, 2024

Hi,
Maybe slightly off-topic, but I've tested decoding on an ESP32: rattlegram-openmodem. It works, but it's very slow: it takes about 37 s to decode a 512-byte packet (next branch).
I came here to find some answers.
Your suggestions yield dramatic improvements.
Decoding the 48 kHz sample (512 bytes) now takes 6.4 s instead of 37 s. When the sample rate is reduced to 8 kHz, decoding takes only 1.5 s. It's still not real time, which is what I had hoped for, but it may be good enough.
