Skip to content

Conversation

@WanliZhong
Copy link
Member

@WanliZhong WanliZhong commented Jul 9, 2024

This PR tries to fix the compilation error introduced by #24941
Error detail: #24941 (comment)
Merged after #26109

Additionally, this PR adds the missing documentation for the v_log function.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=Custom
build_image:Custom=simd-emulator
buildworker:Custom=linux-1,linux-4

@WanliZhong WanliZhong marked this pull request as ready for review July 9, 2024 17:03
@WanliZhong WanliZhong requested a review from asmorkalov July 9, 2024 17:03
@WanliZhong WanliZhong added this to the 4.11.0 milestone Jul 9, 2024
@WanliZhong
Copy link
Member Author

@asmorkalov I removed the macros OPENCV_HAL_MATH_HAVE_XXX defined in intrin_cpp.hpp. This change will fix the error. I passed the compilation and the tests in the local machine with -DCV_FORCE_SIMD128_CPP=1

@vpisarev
Copy link
Contributor

@opencv-alalek, could you please review it? Frankly speaking, I don't understand logic behind all those myriads of flags that we have, so you are the best person to review it

@WanliZhong
Copy link
Member Author

WanliZhong commented Aug 10, 2024

@opencv-alalek @vpisarev @asmorkalov Hi, sorry for the delayed fix. I finally found out why this error occurs. That's because the intrin_cpp.hpp will only generate functions for SIMD128 (v_float32x4 and v_float64x2).

However, on AVX2, AVX512, and LASX platforms, SIMD256 or SIMD512 are used, which has more lanes. As a result, tests or other codes cannot find the corresponding functions. In this case, I made the general implementation of v_float work, and then fixed this error.

@asmorkalov asmorkalov changed the title Resolve Compilation Error for v_func Function in SIMD Emulator WIP: Resolve Compilation Error for v_func Function in SIMD Emulator Aug 30, 2024
@asmorkalov
Copy link
Contributor

@opencv-alalek could you take a look again?

@WanliZhong
Copy link
Member Author

This PR includes #26109. For this PR,you can only focus on intrin_math.hpp file.

@asmorkalov
Copy link
Contributor

@WanliZhong I merged related PR. Please rebase and fix conflicts.

@WanliZhong
Copy link
Member Author

OK, I will fix it later

@WanliZhong
Copy link
Member Author

@vpisarev @asmorkalov I still don't know how to use the template to rewrite the code. The problem is vector datatype can use the template to represent, but v_setall() and v_setzero() are dependent on the specific vector width functions. So how can we generate different functions for different backends by using a template? Currently my code still use macro functions

@asmorkalov asmorkalov changed the title WIP: Resolve Compilation Error for v_func Function in SIMD Emulator Resolve Compilation Error for v_func Function in SIMD Emulator Sep 17, 2024
Copy link
Contributor

@asmorkalov asmorkalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@WanliZhong
Copy link
Member Author

As the discussion with Vadim, I will try to create new functions v_setall, v_setzero, v_load etc. The implementation will be like v_setall(float vlaue, float32x4 /*unused*/) for 128-bit, v_setall(float, float32x8 /*unused*/) for 256-bit ... Using the unused parameters to distinguish among platforms. Then use template to rewrite the v_func.

@WanliZhong WanliZhong changed the title Resolve Compilation Error for v_func Function in SIMD Emulator WIP: Resolve Compilation Error for v_func Function in SIMD Emulator Sep 18, 2024
@WanliZhong WanliZhong changed the title WIP: Resolve Compilation Error for v_func Function in SIMD Emulator Resolve Compilation Error for v_func Function in SIMD Emulator Sep 19, 2024
@WanliZhong
Copy link
Member Author

I have changed the implementation by using templates, how about this implementation?

@opencv-alalek
Copy link
Contributor

See #15839

@vpisarev vpisarev self-requested a review September 27, 2024 19:37
@vpisarev
Copy link
Contributor

excellent job! 👍

inline _Tpvec v_setzero(_Tpvec /*unused*/) \
{ return v256_setzero_##suffix(); } \
inline _Tpvec v_setall(_Tp v, _Tpvec /*unused*/) \
{ return v256_setall_##suffix(v); } \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you copy-paste code 3+ times, when we have anti-pattern and design should be changed.

2. replaced 'v_setall(LaneType x, VecType dummy)' with 'v_setall_<VecType>(LaneType x)'
3. added tests for the new v_setzero_<> and v_setall_<>.
@vpisarev vpisarev merged commit 783fe72 into opencv:4.x Oct 2, 2024
29 of 30 checks passed
@vpisarev
Copy link
Contributor

vpisarev commented Oct 2, 2024

Let's move forward

asmorkalov pushed a commit that referenced this pull request Oct 10, 2024
Add support for v_sin and v_cos (Sine and Cosine) #25892

This PR aims to implement `v_sincos(v_float16 x)`, `v_sincos(v_float32 x)` and `v_sincos(v_float64 x)`. 
Merged after #25891 and #26023

**NOTE:** 
Also, the patch changes already added `v_exp`, `v_log` and `v_erf` to pass parameters by reference instead of by value, to match API of other universal intrinsics.

TODO:
- [x] double and half float precision
- [x] tests for them
- [x] doc to explain the implementation

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
@asmorkalov asmorkalov mentioned this pull request Oct 23, 2024
ting-xin pushed a commit to ting-xin/opencv that referenced this pull request Nov 17, 2024
…v#25891)

* use 2 parms for now to identify the error

* Revert "use 2 parms for now to identify the error"

This reverts commit 86faf99.

* replace += with =

* add v_log ref

* refactor intrin_math code

* Add include guard to `intrin_math.hpp` to prevent multiple inclusions

* rename VX to V; make fp64 impl in neon be optional

* add v_setall, v_setzero for all backends; rewrite the intrin_math

* fix error on rvv_scalable

* let v_erf use v_exp_default_32f function

* 1. replaced 'v_setzero(VecType dummy)' with 'v_setzero_<VecType>()'
2. replaced 'v_setall(LaneType x, VecType dummy)' with 'v_setall_<VecType>(LaneType x)'
3. added tests for the new v_setzero_<> and v_setall_<>.

* gcc does not seem to like static_assert in functions even when they are not used

* trying to fix compile errors in Debug mode on Linux

---------

Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>
ting-xin pushed a commit to ting-xin/opencv that referenced this pull request Nov 17, 2024
Add support for v_sin and v_cos (Sine and Cosine) opencv#25892

This PR aims to implement `v_sincos(v_float16 x)`, `v_sincos(v_float32 x)` and `v_sincos(v_float64 x)`. 
Merged after opencv#25891 and opencv#26023

**NOTE:** 
Also, the patch changes already added `v_exp`, `v_log` and `v_erf` to pass parameters by reference instead of by value, to match API of other universal intrinsics.

TODO:
- [x] double and half float precision
- [x] tests for them
- [x] doc to explain the implementation

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
thewoz pushed a commit to CobbsLab/OPENCV that referenced this pull request Feb 13, 2025
…v#25891)

* use 2 parms for now to identify the error

* Revert "use 2 parms for now to identify the error"

This reverts commit 86faf99.

* replace += with =

* add v_log ref

* refactor intrin_math code

* Add include guard to `intrin_math.hpp` to prevent multiple inclusions

* rename VX to V; make fp64 impl in neon be optional

* add v_setall, v_setzero for all backends; rewrite the intrin_math

* fix error on rvv_scalable

* let v_erf use v_exp_default_32f function

* 1. replaced 'v_setzero(VecType dummy)' with 'v_setzero_<VecType>()'
2. replaced 'v_setall(LaneType x, VecType dummy)' with 'v_setall_<VecType>(LaneType x)'
3. added tests for the new v_setzero_<> and v_setall_<>.

* gcc does not seem to like static_assert in functions even when they are not used

* trying to fix compile errors in Debug mode on Linux

---------

Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>
thewoz pushed a commit to CobbsLab/OPENCV that referenced this pull request Feb 13, 2025
Add support for v_sin and v_cos (Sine and Cosine) opencv#25892

This PR aims to implement `v_sincos(v_float16 x)`, `v_sincos(v_float32 x)` and `v_sincos(v_float64 x)`. 
Merged after opencv#25891 and opencv#26023

**NOTE:** 
Also, the patch changes already added `v_exp`, `v_log` and `v_erf` to pass parameters by reference instead of by value, to match API of other universal intrinsics.

TODO:
- [x] double and half float precision
- [x] tests for them
- [x] doc to explain the implementation

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants