New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Pinwheel Expand 1D Effect #3961
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me, except the use of sinf/cosf which should be replaced by sin_t/cos_t.
BTW I found out that there is no need to use int_fast32_t
unless you strictly need 32 bits and the code is ported to 16 bit environments (or int_fast16_t
). int
would be cleaner (or unsigned
where appropriate) and the compiler treats it as such, but that's just my preference.
Hi, let me answer as the one who did the optimization part of the new mapping mode.
sin_t/cos_t are slow and less accurate. I guess you intend to reduce 8266 memory footprint, by not pulling in the dependency to std math lib? We might be able to implement your suggestion at least in the getpixelcolor() part, which is inaccurate any way.
Not sure. In fact we do fixed-point math in setPixelColor(). As the x and y position is incremented until we hit the border, 9 bit of "fraction" just leaves about 1-2bit of slack when one of the matrix dimensions is close to the maximum of 256 (imagine a matrix 196x4). Unsigned is not possible, because we have negative increments when sin() gets negative. So considering that x and y can be at most 256, we'd need 18bit of resolution for good results (one bit for sign, 8bit integer part, 9bit fraction). Maybe we can squeeze out 2bits of fraction so everything fits into 16bit. I'd need to do some more testing to see if reducing the resolution creates holes in the mapping. I'm not sure if we can expect a benefit from going to int16 for x and y. What do you think? Edit: just re-read your question - maybe I misunderstood part of it 😄 yes in principle we can change |
(Changing hat - now speaking as a maintainer 🎩 ) the pinwheel mapping works with many effects, however it has a few open ends:
However I think these two limitations are still acceptable for a first shot, just there is room for improvement in the future. |
|
Ok - just let me add that sin_t and cos_t are not just inaccurate - on esp32 platforms they are noticeably slower than sinf/cosf. @Brandon502 please make the changes as request by @blazoncek.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved from my side, assuming that changes already requested by @blazoncek get implemented.
Adding the "documentation" tag, because I think that limitations of the pinwheel mapping - it may not work well with some effects - should be documented in the KB. But that's a different task.
Nobody said they were faster. 😁 |
cosf/sinf changed to cos_t/sin_t. int_fast32_t and int_fast_16_t changed to int.
Made the requested changes. Everything still seems to work the same on my matrix. Have not had the chance to test different size matrices yet. |
@blazoncek do you think this can go into the upcoming 0.15.0-b3, or should we better wait and merge this PR after -b3 is out? |
I think this will be included in b3 release this weekend. |
@softhack007 I never tested large matrices after your rework just looked at the code. I loaded up latest MM version on a virtual 64x64 matrix. Up to 30x30 works fine. 31x31 and 32x32 produces holes. 33x33 up to 50x50 also works correctly. 51x51 and above also produces holes on my end. Not sure if I'm setting up something wrong or if some things need to be tweaked a bit. This applies to v15 version as well for me. |
@Brandon502 thanks for testing. Actually, my main test environments are 24x32 and 52x52 where I didn't see holes. I can take a look on Sunday. If you have time, maybe test out these ideas:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit perplexed by complexity of what should be a simple expansion.
I did not try to compile so can't judge the efficiency of the code but for testing I'd recommend setting up 4 segments (16x16) with pinwheel expansion and see what FPS you get.
Otherwise the code looks ok.
EDIT: Do you have a video of the expansion in action?
32x32.mp4virtual matrix so liveview is a bit laggy. Shows a stable 20fps running 4 16x16 segments (Rainbow, Two Dots, Rainbow Runner, Washing Machine). A single 32x32 effect runs around ~28fps. |
Removed holes on 31x31 and 32x32 grids.
I thought it would be slow. Unfortunate as that is. @Brandon502 & @softhack007 if you can spot a place for speed optimisation this would need it. A word about missing pixels: Unless you increase the "resolution" of your expansion (i.e. slow it down), you'll always get them on certain dimensions. Larger dimensions more chance for them as you are dealing with sin/cos. |
Yes - the limiting factor is the number of setPixelColor() calls needed for the full pinwheel, so I see two options:
I'm always from my esp32 today - most likely can test that tomorrow. |
Created a custom effect so I can control rays and turn rounding on/off. Rounding seems to be much worse, at least the method you mentioned above. Here's the amount of rays needed for various sizes without rounding. 8x8 30 rays This may not be 100% accurate, but it's a decent baseline. |
Added small pinwheel size. Adjusts medium and large values.
Dear Lord! 😁 For each pixel you draw 200 lines on 32x32? No wonder it is slow. You need to optimize loop to only call |
I added small pinwheel option for matrices 16x16 and below. The 4 16x16 effects on 32x32 grid went from ~20 fps to ~33 fps. Pinwheel seems to be higher fps then arc on effects that I've tested. |
Yes arc is slow as it iterates in polar space (increasing angle) instead of using Bresenham's circle drawing algorithm as that produces empty pixels. That is also in need of optimisation. |
200 lines total on 32x32. setPixelColor seems to be called 3604 times on 32x32. So that can definitely be reduced by avoiding overwriting near the center like you said. I'll play around with trying to reduce it. |
Changed method for drawing odd numbered rays.
Here's the fps on my setup. 38 FPS on 4 16x16 effects Experimental method: Combining your lastX/Y and my odd ray method I pushed my odd ray code. You can likely change the math and make it much quicker. It still goes outward but I just increment it a few times before entering the loop so it starts about midway to the edge instead of the center. When you keep track of the prevRay number you can use this method without breaking effects that don't cover full screen.
This is how I'm jumping away from the center. On square matrices you can go even further from the center, but rectangles will start producing holes. Can likely find a better spot to jump and make it quicker. |
Thanks, I think that's a great result - we are almost twice as fast now, compared to the original code. And still no holes 😃 I'll look at your "odd ray" code later tonight - and maybe add some tuning for the math. |
I'll wait with b3. |
* minor cleanup, moved prevRay into setPixelColor * removed experimental code (too slow) * comments cleanup
fixing holes that appeared during testing * at 52x52 (big 296 -> 304) * at 24x32 (medium 192 -> 208) * at 12x16 (small 72 -> 82) ... there is still one hole at 14x16 ... well wtf
@softhack007 can't test the 52x52 hole yet, but the holes on 24x32, 12x16, and 14x16 are from the prevRay method it seems. Need to tweak the jump distance a bit more. Edit: Fixed on my end for the smaller matrices. Just messing around with the highest possible jump and I'll push. |
Jump distance for odd rays fixed. Removed holes on rectangular matrices.
May have messed up while committing. Forgot to pull before committing. 9e0b91a contains the fix. Still need to test on 52x52 though. The jump needed to be the same for each axis. This improves fps slightly for a single 32x32 effect. Pinwheel_Steps_Small = 72 |
* setPixelColor: ensure that 0/0 is used * getPixelColor: accuracy improvements unfortunately, "scrolling" audioreactive effects are still not working properly - they end after 1/4 of the circle. Could be due to limited resolution of getPixelColor.
Fixed getPixelColor.
@softhack007 fixed music mode effects. Got tired of trying ways to fix the weird rounding issues causing wrong pixel to be used in getPixelColor so I just used the exact same method as setting. Seems like it works. Changed small and medium pinwheel sizes back to the smaller values. I haven't spotted any missing holes. Waterfall and DJ Lights (scrolling effects) are running at 55+ fps on 32x32. |
Hi @Brandon502, maybe that's the better solution 👍 . The other benefit is that we get higher resolution in getpPixelColor now, because we always use an edge pixel not a circle inside the segment frame - that's an improvement especially on non-square layouts. @blazoncek I think we are good now for beta3. Maybe just wait if something comes up until the evening, then merge the PR. |
reducing code duplication between sPC and gPC
there were still two holes in my 52x52 setup --> added "XL" size for bigger than 50x50 - achieves 18fps on 52x52
Hi guys, for me it's good now 😃 An intersting goodie (and we should keep that) is the behaviour of "Wavesins" with pinwheel mapping. |
Hey @softhack007, I did a run through of all the effects last night. The only "broken" effects are now ones that use a "stripe" pattern. Like wavesins (above) and traffic light along with a few others. With these you can see the jump on odd rays. Even if you remove the jump they still are somewhat broken since close rays still overwrite each other. Playing with the sliders on wavesins does produce so nice patterns with the jump visible. I'm not sure why you're getting holes on your large setup. If you want to reduce the count of rays even further can you try these values....
I've doubled check all of these values, so if they aren't working for you on >32x32 I can't trust my setup anymore lol. |
If you like data here's the optimal rays for every grid size and the amount of setPixelColor calls with and without jump needed for each. This data wasn't verified like the numbers above but they seem to all line up. If you ever wanted to make this even more efficient for the very large grids I think I have a way that uses this same method, just can't think of how to code it quite yet. But I'm sure there are even better ways.
|
One difference is I'm using an -S3 in my setup, which requires a newer framework version (esp-idf 4.4.4 +arduino-esp 2.0.9). Could be that my additional holes comes from slightly different accuracies of sinf/cosf in the new framework. I had two holes in 51x52 and 52x52.
Thanks 👍 will put them into excel later tonight - maybe there's a pattern. I'd say let's do a second optimization round after -beta3 is out. |
Sounds good. Just pushed a small change to format getPinwheelAngle the same as getPinwheelLength removing the long single line method. Final FPS numbers: |
Added Pinwheel Expand 1D ->2D effect mapping mode
Requested on discord to bring this effect from MM to main branch. Reworked and optimized by @softhack007.