HPX support #473

Open
wants to merge 157 commits into
base: master
8667212
several fixes
Feb 3, 2023
4279897
fixed success CI task
svenwoop Mar 8, 2023
bccec3f
turning EMBREE_LEVEL_ZERO on in one CI test
svenwoop Mar 10, 2023
7da36c4
added arm emulation testing to nightly (#2520)
dopitz Mar 10, 2023
ece6335
Use all models from repo
tpyra Mar 10, 2023
9166e2e
Minor fixes to performance CI (#2525)
tpyra Mar 13, 2023
cb9c871
added EMBREE_ISA_NEON2X to embree-config.cmake (#2526)
dopitz Mar 14, 2023
b869cb4
added arm support to changelog and doc (#2524)
dopitz Mar 14, 2023
5880151
introduced triangle format, quad format, and vertex format in ze_rayt…
svenwoop Mar 8, 2023
eb32547
removed rthwifGetSupportedFeatures from ze_raytracing build API
svenwoop Mar 8, 2023
f604f2d
added stype and pNext to ze_raytracing build API implementation descr…
svenwoop Mar 9, 2023
7d171e1
implemented build flag to skip duplicate any hit shader invokation
svenwoop Mar 9, 2023
2703930
implemented zeRaytracingAccelFormatCompatibilityExt in ze_raytracing …
svenwoop Mar 9, 2023
f221efd
fixed interface of some ze_raytracing API functions
svenwoop Mar 15, 2023
bba874e
all names of ze_raytracing build API implementation match spec
svenwoop Mar 15, 2023
d14c565
ci: update gfx-driver internal to ci-neo-025905
kraszkow Mar 15, 2023
7a2a3fa
properly destroying parallel build operations
svenwoop Mar 16, 2023
1b9aa70
updates to compile ze_raytracing builder in isolation in separate rep…
svenwoop Mar 16, 2023
5298842
fixed success job
svenwoop Mar 16, 2023
864ecf4
build a deliverable testing package (#2532)
dopitz Mar 29, 2023
ba2f591
fixed issue that caused allocation of RTDispatchGlobals by default
svenwoop Mar 30, 2023
04a7a2a
using proper RT allocation function to allocate RTDispatchGlobals
svenwoop Mar 31, 2023
8092c0d
fix tarball package unpacks into single folder (#2535)
dopitz Apr 5, 2023
b9221ca
fix config error in generated embree-addtest.cmake (#2536)
dopitz Apr 14, 2023
4450418
update dpcpp to 20230417 and fix deprecation warnings
freibold Apr 18, 2023
0fb15b6
do not use exception mechanism for BVH build retry due to TBB 2020.3
freibold Apr 18, 2023
33f8ac4
Added EMBREE_BACKFACE_CULLING_SPHERES option
stefanatwork Apr 14, 2023
686d953
* Added CMake options for sphere and curve backface culling
stefanatwork Apr 17, 2023
f21a990
using L0 ray tracing API to build acceleration structure for cornell_…
svenwoop Apr 24, 2023
4798eda
introduced rtas builder concept to L0 ray tracing API
svenwoop Apr 24, 2023
6b52e28
adopted build and size query functions of L0 ray tracing build API
svenwoop Apr 24, 2023
62226c3
removed not required update of build arguments
svenwoop Apr 25, 2023
119e5ed
adjusted property query for parallel operation object in L0 ray traci…
svenwoop Apr 25, 2023
6a33c14
implemented rtas device properties query for L0 ray tracing build API
svenwoop Apr 25, 2023
c289ce5
adjusted Level Zero ray tracing Build API function names to match spec
svenwoop Apr 25, 2023
3971cba
adjusted Level Zero ray tracing Build API struct member names to matc…
svenwoop Apr 25, 2023
e3d581e
adding flags to rtas builder properties
svenwoop Apr 26, 2023
3fd5e8f
returning proper error codes in rtas builder
svenwoop Apr 26, 2023
7839837
renamed rtbuild and rttrace API header files
svenwoop Apr 26, 2023
1f17217
only including rttrace.h
svenwoop Apr 26, 2023
775f10b
cleanups to cornell_box rtas build example
svenwoop Apr 26, 2023
9b06766
Do/remove macosx embree tag (#2543)
dopitz Apr 26, 2023
b16e4be
removing ze_raytracing CI tests
svenwoop Apr 27, 2023
0f6a668
using standard Level Zero results in Level Zero rtas build API
svenwoop Apr 27, 2023
2d4b773
using argument structure in bounds callback for Level Zero ray tracin…
svenwoop Apr 27, 2023
326d1c0
updates to compile Level Zero ray tracing builder API to match spec
svenwoop Apr 27, 2023
3c23e7e
adding Level Zero rtas header from spec
svenwoop Apr 27, 2023
50d2a61
properly checking for stype of builder properties and descriptor
svenwoop Apr 28, 2023
33a7887
returning ZE_RESULT_ERROR_INVALID_ENUMERATION error when stype is not…
svenwoop Apr 28, 2023
13cbef5
- changed version to 4.1.0
svenwoop May 3, 2023
b0499a4
using proper ZE_STRUCTURE_TYPE_RAYTRACING_MEM_ALLOC_EXT_DESC stype fo…
svenwoop May 3, 2023
cc29d2c
checking if rtas allocation is in 48 bit address range
svenwoop May 3, 2023
3153139
updated to latest L0 rtas build API spec
svenwoop May 3, 2023
d7d0383
making ze_rtas_builder_geometry_exp_flag_t opaque by default
svenwoop May 3, 2023
959ee31
make EMBREE_NO_SPLASH a proper CMake option.
freibold May 3, 2023
378f2f0
adding always_inline for Embree SYCL API functions
svenwoop May 9, 2023
05ae8c4
changed package filenames (#2550)
tpyra May 9, 2023
5b29a8b
Do/release (#2551)
dopitz May 10, 2023
d756352
rebase release branch into devel (#2553)
dopitz May 11, 2023
9a04f54
Do/release (#2554)
dopitz May 11, 2023
cc22661
L0 ray tracing support tests can use rtcore SW simulation
svenwoop May 4, 2023
5ff0aaf
limiting triangle pair primID delta to 5 bits
svenwoop May 4, 2023
f83f2fc
prepared L0 rtas builder to support multiple rtas formats
svenwoop May 4, 2023
44b9dea
checking that number of geometries is in range
svenwoop May 9, 2023
244e385
removed zeRTASInitExp/zeRTASExitExp functions
svenwoop May 10, 2023
0bd483e
using ZE_RESULT_EXP_ERROR_RETRY_RTAS_BUILD error code to re-try rtas …
svenwoop May 15, 2023
33d2abb
replace insecure fopen with secure std::ifstream (#2557)
dopitz May 16, 2023
1c99e92
Do/fix fopen (#2558)
dopitz May 16, 2023
81417ab
added filter files for test execution (#2541)
dopitz May 17, 2023
62cd7b3
adding Impl to RTAS builder to have separate symbols to API function …
svenwoop May 22, 2023
74742dc
Do/integration (#2562)
dopitz May 23, 2023
62766a2
load ze_loader library during runtime with dlopen/LoadLibrary. Remove
freibold May 23, 2023
0265075
fix rthwif tests for test package (#2567)
kraszkow May 30, 2023
f8df5ae
fixing experimental disable of deviceID check
svenwoop May 30, 2023
8d9fea3
adding MTL support
svenwoop May 31, 2023
e7718d6
fix rthwif tests with new level zero runtime loading change.
freibold May 31, 2023
7a8678b
init ZeWrapper in rthwif tests.
freibold May 31, 2023
0309da8
using decltype to get function types in ze_wrapper
svenwoop May 31, 2023
07ae638
moving ze_rtas.h to level_zero folder
svenwoop May 31, 2023
f5cb3ff
adding RTASBuilder support to ze_wrapper
svenwoop Jun 1, 2023
646bbf5
- using new rtas API header
svenwoop Jun 1, 2023
d75dab7
using zeDeviceGetProperties to query RTAS device properties
svenwoop Jun 1, 2023
34e2770
Improved BVH build performance on many core machines by avoiding spin…
svenwoop May 24, 2023
5f94281
New model repository address (#2575)
tpyra Jun 5, 2023
6a3ad6e
fix to get dispatch globals allocation working again for debugging pu…
svenwoop Jun 5, 2023
ae9ab7e
passing dispatch globals for through debug RTAS build API extension
svenwoop Jun 12, 2023
de4c2ac
Do/release test package (#2579)
dopitz Jun 13, 2023
523c13d
Do/fix release package name sycl (#2583)
dopitz Jun 15, 2023
7e9e319
Added CI feature to store results in database (#2584)
tpyra Jun 15, 2023
cc04821
Added rtcGetGeometryTransformFromScene API function that can get used…
svenwoop Jun 15, 2023
73e94ce
clarified that rtcOccluded and occlusion filter cannot get used to ga…
svenwoop Jun 19, 2023
39258c5
Added PVC runner to perf CI (#2588)
tpyra Jun 20, 2023
dd39f11
updated changelog
svenwoop Jun 21, 2023
77cecda
SYCL version of Embree with GPU support is no longer in beta phase.
svenwoop Jun 21, 2023
dc66f70
setting version to 4.2.0
svenwoop Jun 21, 2023
8b3120e
removed extra comma (#2591)
tpyra Jun 21, 2023
4252dc8
fix typo (#2592)
dopitz Jun 22, 2023
6b67cef
Do/release (#2597)
dopitz Jun 30, 2023
af29815
Do/release (#2601)
dopitz Jul 4, 2023
68a0fb8
checking ze_rtas_builder_procedural_geometry_info_exp_t reserved memb…
svenwoop Jul 4, 2023
1bdc285
adding check if rtas extension present and properly initialized
svenwoop Jul 13, 2023
9c1c335
fix imgui empty label complaint on Windows
freibold Jul 12, 2023
bcb51a8
add memory monitor to GPU BVH build
freibold Jul 13, 2023
33074fe
Do/update driver (#2607)
dopitz Jul 18, 2023
f6dad7c
added configuration to run rtas builder tests using internal or level…
svenwoop Jul 19, 2023
e69f52b
Add support for ARM64 windows platform cmake
anthony-linaro Jul 19, 2023
fb6c562
matching rcp math between rthwif_test and BVH builder
svenwoop Jul 19, 2023
038ae14
implemented varying version of rtcGetGeometryTransform for ISPC
svenwoop Jul 20, 2023
a6acf29
using L0 RTAS build extension only when available
svenwoop Jul 21, 2023
29f96e1
rthwif tests use RTAS build extension only when available
svenwoop Jul 24, 2023
8c4c530
- updating to dpcpp compiler sycl-nightly/20230724
svenwoop Jul 25, 2023
eeb4834
fixed sub_group related compile warning
svenwoop Jul 25, 2023
e5ac939
cleanups to sycl namespace usage
svenwoop Jul 25, 2023
ec04c51
docker GPU image for perf CI to embree/ubuntu:22.04 (#2615)
tpyra Jul 26, 2023
3c4b936
CI-perf sycl build on Ubuntu 22.04 (#2617)
tpyra Jul 26, 2023
a7782ef
use DPCPP_SETUP_SCRIPT environment variable in test.py script
freibold Jul 31, 2023
b1f1d8e
updated graphics drivers in CI
svenwoop Jul 31, 2023
63d433e
Revert "use DPCPP_SETUP_SCRIPT environment variable in test.py script"
svenwoop Aug 1, 2023
2768271
consistently using main version of reusable workflow
svenwoop Aug 1, 2023
456c8b3
do not source dpcpp environment manually
freibold Aug 3, 2023
7e7877b
use ubuntu 20.04 for release again and update release driver
freibold Aug 3, 2023
79f1141
Do/update compiler cmakepreset (#2621)
dopitz Aug 14, 2023
e8c1b94
remove using sycl::fmax and using sycl::fmin
freibold Aug 30, 2023
18bd626
update to intel-llvm 20230830
freibold Aug 30, 2023
77873cb
do not sign the zip package after build for windows any more (#2624)
dopitz Aug 31, 2023
ea3aa94
cherry pick from carlocab (#2623)
dopitz Aug 31, 2023
8cc689e
fixed deviceID check for PVC
svenwoop Aug 31, 2023
ed56a60
removed unused file
svenwoop Aug 31, 2023
67a5584
not using rtas_builder extension by default yet
svenwoop Sep 6, 2023
81d04cf
not using rtas_builder extension by default in rthwif tests
svenwoop Sep 7, 2023
7a28f06
properly checking if rtas extension can get loaded
svenwoop Sep 7, 2023
fe31059
added cmake option to enable usage of L0 rtas builder
svenwoop Sep 7, 2023
7d55d75
disabling L0 rtas builder only for release
svenwoop Sep 7, 2023
58212ad
disabling L0 RTAS builder by default
svenwoop Sep 12, 2023
e194380
integration of coverity
dopitz Sep 13, 2023
9e821ab
added argument to set resolution support for cornell_box test
svenwoop Sep 15, 2023
d14fed9
validating that rtasFormat is not ZE_RTAS_FORMAT_EXP_INVALID
svenwoop Sep 15, 2023
7deef5d
WA for changed sycl include headers in oneAPI DPC++ 2024.0
freibold Sep 21, 2023
37ca584
add ICX release candidate tests
freibold Sep 21, 2023
7220e84
Update changelog.md
freibold Sep 15, 2023
2ded3b4
add tests for sycl-nightly RK version
freibold Sep 22, 2023
2faa8dd
disable nightly bezier_round furball tests on 2 windows configs for now
dopitz Sep 25, 2023
e1d0051
add instance array geometry type.
freibold May 29, 2023
6ed2106
update compilation docu and version number
freibold Sep 26, 2023
ec3ab17
use rockylinux for release and split linux sycl
freibold Sep 26, 2023
3606b69
update documentation
freibold Sep 26, 2023
8187f71
add pi_win_proxy_loader.dll to release package
freibold Sep 26, 2023
502e7d5
add also weird libsycl.so version (libsycl.so.7.0.0-8 ?!?!?)
freibold Sep 26, 2023
203232c
add old include/lib directory structure for ICX 2024, too.
freibold Sep 27, 2023
03bbc7b
updates
freibold Sep 27, 2023
49a0114
rebase
ct-clmsn Jan 23, 2024
824051c
completed hpx port
ct-clmsn Jan 23, 2024
3277f16
improved hpx parallel_for use
ct-clmsn Jan 23, 2024
86acf6b
Merge branch 'master' into hpx
ct-clmsn Jan 23, 2024
4467c25
added license
ct-clmsn Jan 24, 2024
ffc0479
upated readme
ct-clmsn Jan 24, 2024
e9389f8
several fixes for linking error
ct-clmsn Jan 27, 2024
29 changes: 29 additions & 0 deletions CMakeLists.txt
@@ -17,6 +17,10 @@ SET(EMBREE_PROJECT_COMPILATION ON)

include(CMakeDependentOption)

set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)

# We use our own strip tool on macOS to sign during install. This is required as CMake modifies RPATH of the binary during install.
IF (APPLE AND EMBREE_SIGN_FILE)
SET(EMBREE_STRIP ${CMAKE_STRIP})
@@ -235,9 +239,11 @@ OPTION(EMBREE_MIN_WIDTH "Enables min-width feature to enlarge curve and point th
IF (APPLE AND CMAKE_SYSTEM_NAME STREQUAL "Darwin" AND (CMAKE_SYSTEM_PROCESSOR STREQUAL "arm64" OR CMAKE_OSX_ARCHITECTURES MATCHES "arm64"))
MESSAGE(STATUS "Building for Apple silicon")
SET(EMBREE_ARM ON)
SET(EMBREE_ISA_AVX512SKX OFF)
ELSEIF(CMAKE_SYSTEM_PROCESSOR STREQUAL "aarch64" OR CMAKE_SYSTEM_PROCESSOR STREQUAL "ARM64")
MESSAGE(STATUS "Building for AArch64")
SET(EMBREE_ARM ON)
SET(EMBREE_ISA_AVX512SKX OFF)
ENDIF()

SET(EMBREE_TASKING_SYSTEM "TBB" CACHE STRING "Selects tasking system")
@@ -253,14 +259,33 @@ IF (EMBREE_TASKING_SYSTEM STREQUAL "TBB")
SET(TASKING_TBB ON )
SET(TASKING_INTERNAL OFF)
SET(TASKING_PPL OFF )
SET(TASKING_HPX OFF )
ADD_DEFINITIONS(-DTASKING_TBB)
LIST(APPEND ISPC_DEFINITIONS -DTASKING_TBB)
ELSEIF (EMBREE_TASKING_SYSTEM STREQUAL "PPL")
SET(TASKING_PPL ON )
SET(TASKING_TBB OFF )
SET(TASKING_HPX OFF )
SET(TASKING_INTERNAL OFF)
ADD_DEFINITIONS(-DTASKING_PPL)
LIST(APPEND ISPC_DEFINITIONS -DTASKING_PPL)
ELSEIF (EMBREE_TASKING_SYSTEM STREQUAL "HPX")
IF(NOT HPX_DIR AND HPX_ROOT)
SET(HPX_DIR ${HPX_ROOT}/lib/cmake/HPX)
ENDIF()

IF(NOT HPX_DIR AND EXISTS "$ENV{HPX_DIR}")
SET(HPX_DIR $ENV{HPX_DIR})
ENDIF()

UNSET(CMAKE_CXX_STANDARD)
SET(CMAKE_CXX_STANDARD 20)
SET(TASKING_HPX ON )
SET(TASKING_PPL OFF )
SET(TASKING_TBB OFF )
SET(TASKING_INTERNAL OFF)
ADD_DEFINITIONS(-DTASKING_HPX)
LIST(APPEND ISPC_DEFINITIONS -DTASKING_HPX)
ELSE()
SET(TASKING_INTERNAL ON )
SET(TASKING_TBB OFF)
@@ -379,8 +404,10 @@ ELSE()
ENDIF()

IF (EMBREE_ARM)
message(STATUS "NEON, NEON2X")
SET_PROPERTY(CACHE EMBREE_MAX_ISA PROPERTY STRINGS NONE NEON NEON2X)
ELSE()
message(STATUS "SSE2 SSE4.2 AVX AVX2 AVX512")
SET_PROPERTY(CACHE EMBREE_MAX_ISA PROPERTY STRINGS NONE SSE2 SSE4.2 AVX AVX2 AVX512 DEFAULT)
ENDIF()

@@ -390,9 +417,11 @@ IF (EMBREE_MAX_ISA STREQUAL "NONE")
IF (APPLE)
OPTION(EMBREE_ISA_NEON "Enables NEON ISA." OFF)
OPTION(EMBREE_ISA_NEON2X "Enables NEON ISA double pumped." ON)
TRY_COMPILE(COMPILER_SUPPORTS_ARM "${CMAKE_BINARY_DIR}" "${PROJECT_SOURCE_DIR}/common/cmake/check_isa.cpp" COMPILE_DEFINITIONS ${FLAGS_ARM})
ELSE()
OPTION(EMBREE_ISA_NEON "Enables NEON ISA." ON)
OPTION(EMBREE_ISA_NEON2X "Enables NEON ISA double pumped." OFF)
TRY_COMPILE(COMPILER_SUPPORTS_ARM "${CMAKE_BINARY_DIR}" "${PROJECT_SOURCE_DIR}/common/cmake/check_isa.cpp" COMPILE_DEFINITIONS ${FLAGS_ARM})
ENDIF()
ELSE()
TRY_COMPILE(COMPILER_SUPPORTS_AVX "${CMAKE_BINARY_DIR}" "${PROJECT_SOURCE_DIR}/common/cmake/check_isa.cpp" COMPILE_DEFINITIONS ${FLAGS_AVX})
11 changes: 8 additions & 3 deletions README.md
@@ -305,7 +305,7 @@ macOS M1

- Apple Clang 12.0.5 (macOS 11.7.1)

IMPORTANT: Unfortunatlly, latest version of the Intel® oneAPI DPC++/C++
IMPORTANT: Unfortunately, latest version of the Intel® oneAPI DPC++/C++
Compiler (2023.2.1), has a bug that doesn't allow Embree to run correctly with
ISAs >= AVX2. Please wait for 2024.0.0, which will be released soon after
Embree 4.3.0.
@@ -325,6 +325,11 @@ installation, put the path to `ispc` permanently into your `PATH` environment
variable or you set the `EMBREE_ISPC_EXECUTABLE` variable to point at the ISPC
executable during CMake configuration.

Embree supports using the HPX runtime system as the tasking system. To enable
it, set `EMBREE_TASKING_SYSTEM=HPX` and set either `HPX_DIR` or `HPX_ROOT`.
`HPX_DIR` is the directory that contains `HPXConfig.cmake`; `HPX_ROOT` is the
HPX installation prefix, from which `HPX_DIR` is derived as
`${HPX_ROOT}/lib/cmake/HPX`.

You additionally have to install CMake 3.1.0 or higher and the developer
version of [GLFW](https://www.glfw.org/) version 3.

@@ -818,8 +823,8 @@ parameters that can be configured in CMake:

+ `EMBREE_TASKING_SYSTEM`: Chooses between Intel® Threading
Building Blocks (TBB), Parallel Patterns Library (PPL) (Windows
only), or an internal tasking system (INTERNAL). By default, TBB is
used.
only), HPX (HPX), or an internal tasking system (INTERNAL). By default,
TBB is used.

+ `EMBREE_TBB_ROOT`: If Intel® Threading Building Blocks (TBB)
is used as a tasking system, search the library in this directory
2 changes: 1 addition & 1 deletion common/CMakeLists.txt
@@ -5,4 +5,4 @@ ADD_SUBDIRECTORY(sys)
ADD_SUBDIRECTORY(math)
ADD_SUBDIRECTORY(simd)
ADD_SUBDIRECTORY(lexers)
ADD_SUBDIRECTORY(tasking)
ADD_SUBDIRECTORY(tasking)
32 changes: 32 additions & 0 deletions common/algorithms/parallel_for.h
@@ -8,6 +8,11 @@
#include "../math/emath.h"
#include "../math/range.h"

#if defined(TASKING_HPX)
#include <hpx/algorithm.hpp>
#include <hpx/modules/iterator_support.hpp>
#endif

namespace embree
{
/* parallel_for without range */
@@ -46,6 +51,20 @@ namespace embree
concurrency::parallel_for(Index(0),N,Index(1),[&](Index i) {
func(i);
});
#elif defined(TASKING_HPX)
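std::vector<hpx::future<void>> futures;
if (N > 1) futures.reserve(N-1); /* avoids unsigned underflow of N-1 when N == 0 */

hpx::threads::run_as_hpx_thread([N, &func, &futures]()
{
/* spawn one task per index 1..N-1; index 0 runs on the calling HPX thread */
for(Index i = 1; i < N; ++i) {
futures.push_back( hpx::async([i, &func]() { func(i); }) );
}

if (N > 0) func(0);
hpx::wait_all(futures);
});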

#else
# error "no tasking system enabled"
#endif
@@ -84,7 +103,20 @@ namespace embree
concurrency::parallel_for(first, last, Index(1) /*minStepSize*/, [&](Index i) {
func(range<Index>(i,i+1));
});
#elif defined(TASKING_HPX)
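auto irange = hpx::util::counting_shape(last-first);

hpx::future<void> fut =
hpx::threads::run_as_hpx_thread([first, last, minStepSize, &irange, &func]() -> hpx::future<void> {
/* *i is an offset in [0, last-first); each call covers one block of up to minStepSize indices */
hpx::experimental::for_loop_strided(hpx::execution::par, hpx::util::begin(irange), hpx::util::end(irange), minStepSize,
[first, last, minStepSize, &func](auto i) {
const Index begin = first + *i;
func(range<Index>(begin, std::min<Index>(begin + minStepSize, last)));
});

return hpx::make_ready_future<void>();
});

fut.wait();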
#else
# error "no tasking system enabled"
#endif
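
For reference, the contract that every tasking backend of the ranged `parallel_for` presumably has to meet can be sketched without HPX: the ranges passed to `func` should tile `[first, last)` so that each index is visited exactly once. The snippet below is a minimal sketch under that assumption. `IndexRange`, `blocked_parallel_for`, and `visit_counts` are hypothetical names that do not appear in the PR, and `std::async` stands in for the tasking system purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for embree::range<Index>: a half-open interval [begin, end).
struct IndexRange {
  std::size_t begin;
  std::size_t end;
};

// Calls func once per block of at most `step` indices so that the blocks
// tile [first, last) exactly. std::async is an assumption made for this
// sketch; it is not what the HPX, TBB, or PPL branches use.
template <typename Func>
void blocked_parallel_for(std::size_t first, std::size_t last, std::size_t step, const Func& func)
{
  assert(step > 0);
  std::vector<std::future<void>> tasks;
  for (std::size_t i = first; i < last; i += step) {
    const IndexRange r{i, std::min(i + step, last)};
    tasks.push_back(std::async(std::launch::async, [r, &func]() { func(r); }));
  }
  for (auto& t : tasks) t.get();  // wait for every block and rethrow any exception
}

// Counts how often each index in [0, n) is visited for the given block size.
inline std::vector<int> visit_counts(std::size_t n, std::size_t step)
{
  std::vector<int> hits(n, 0);
  blocked_parallel_for(0, n, step, [&hits](const IndexRange& r) {
    // Blocks are disjoint, so concurrent writes touch different elements.
    for (std::size_t i = r.begin; i < r.end; ++i) hits[i] += 1;
  });
  return hits;
}
```

A check of this kind, run against each backend with a block size larger than 1, is one way to notice a branch that skips indices or ignores `first`.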
58 changes: 57 additions & 1 deletion common/algorithms/parallel_reduce.h
@@ -5,6 +5,11 @@

#include "parallel_for.h"

#if defined(TASKING_HPX)
#include <numeric>
#include <hpx/parallel/algorithms/transform_reduce.hpp>
#endif

namespace embree
{
template<typename Index, typename Value, typename Func, typename Reduction>
@@ -69,7 +74,7 @@ namespace embree
throw std::runtime_error("task cancelled");
return v;
#endif
#else // TASKING_PPL
#elif defined(TASKING_PPL)
struct AlignedValue
{
char storage[__alignof(Value)+sizeof(Value)];
@@ -107,6 +112,57 @@ namespace embree
};
const Value v = concurrency::parallel_reduce(Iterator_Index(first), Iterator_Index(last), AlignedValue(identity), range_reduction, reduction);
return v;
#elif defined(TASKING_HPX)
/*
binner = parallel_reduce(begin,end,blockSize,binner,
[&](const range<size_t>& r) -> BinInfoT { BinInfoT binner(empty); binner.bin(prims + r.begin(), r.size(), mapping); return binner; },
[&](const BinInfoT& b0, const BinInfoT& b1) -> BinInfoT { BinInfoT r = b0; r.merge(b1, mapping.size()); return r; });
*/
struct AlignedValue
{
char storage[__alignof(Value)+sizeof(Value)];
static uintptr_t alignUp(uintptr_t p, size_t a) { return p + (~(p - 1) % a); };
Value* getValuePtr() { return reinterpret_cast<Value*>(alignUp(uintptr_t(storage), __alignof(Value))); }
const Value* getValuePtr() const { return reinterpret_cast<Value*>(alignUp(uintptr_t(storage), __alignof(Value))); }
AlignedValue(const Value& v) { new(getValuePtr()) Value(v); }
AlignedValue(const AlignedValue& v) { new(getValuePtr()) Value(*v.getValuePtr()); }
AlignedValue(const AlignedValue&& v) { new(getValuePtr()) Value(*v.getValuePtr()); };
AlignedValue& operator = (const AlignedValue& v) { *getValuePtr() = *v.getValuePtr(); return *this; };
AlignedValue& operator = (const AlignedValue&& v) { *getValuePtr() = *v.getValuePtr(); return *this; };
operator Value() const { return *getValuePtr(); }
};

std::function<AlignedValue(AlignedValue, AlignedValue)> red = [&](AlignedValue x, AlignedValue y) -> AlignedValue {
return AlignedValue(reduction(x, y));
};

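std::function<AlignedValue(Index)> xfm = [&](Index i) -> AlignedValue {
/* i is an offset in [0, last-first); map it to the one-element range starting at first+i */
return AlignedValue(func(range<Index>(first+i, first+i+1)));
};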

const Index sz = last-first;
auto irange = hpx::util::counting_shape(sz);
auto beg = hpx::util::begin(irange);
auto end = hpx::util::end(irange);

Value v =
hpx::threads::run_as_hpx_thread([&red, &xfm, &beg, &end, &identity]() -> Value
{

Value v = hpx::transform_reduce(
hpx::execution::par,
beg, end,
AlignedValue(identity),
red,
xfm
);

return v;
});

return v;
#else
# error "no tasking system enabled"
#endif
}
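
The map-then-fold shape that the `TASKING_HPX` branch expresses with `hpx::transform_reduce` can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the PR's implementation: `blocked_parallel_reduce` and `sum_range` are hypothetical names, the block callback takes plain `(begin, end)` bounds instead of `range<Index>`, and `std::async` replaces the tasking system.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Maps each block of [first, last) with `func`, then folds the partial results
// with `reduction`, starting from `identity`. The scheduling via std::async is
// an assumption made for this sketch only.
template <typename Value, typename Func, typename Reduction>
Value blocked_parallel_reduce(std::size_t first, std::size_t last, std::size_t step,
                              const Value& identity, const Func& func, const Reduction& reduction)
{
  assert(step > 0);
  std::vector<std::future<Value>> partials;
  for (std::size_t b = first; b < last; b += step) {
    const std::size_t e = std::min(b + step, last);
    partials.push_back(std::async(std::launch::async, [b, e, &func]() { return func(b, e); }));
  }
  Value result = identity;
  for (auto& p : partials) result = reduction(result, p.get());  // fold in block order
  return result;
}

// Example use: sum of the integers in [first, last).
inline long long sum_range(std::size_t first, std::size_t last, std::size_t step)
{
  return blocked_parallel_reduce<long long>(
      first, last, step, 0LL,
      [](std::size_t b, std::size_t e) {
        long long s = 0;
        for (std::size_t i = b; i < e; ++i) s += static_cast<long long>(i);
        return s;
      },
      [](long long a, long long b) { return a + b; });
}
```

An empty range handed to the map step contributes nothing, which is why the bounds given to `func` matter as much as the reduction itself.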

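
The `alignUp` helper inside `AlignedValue` relies on unsigned wrap-around: `~(p - 1)` equals `-p`, so `~(p - 1) % a` is the distance from `p` to the next multiple of `a`. The standalone sketch below restates that arithmetic; `align_up` is a hypothetical free-function name, and the power-of-two precondition is an assumption added here rather than something the diff checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rounds p up to the next multiple of a. In unsigned arithmetic ~(p - 1) == -p,
// so (~(p - 1) % a) is the gap between p and the next multiple of a, and zero
// when p is already aligned. Assumes a is a power of two.
inline std::uintptr_t align_up(std::uintptr_t p, std::size_t a)
{
  assert(a != 0 && (a & (a - 1)) == 0);
  return p + (~(p - 1) % a);
}
```

Because the gap is always smaller than the alignment, a buffer of `__alignof(Value) + sizeof(Value)` bytes, as used for `storage`, leaves room for an aligned `Value`.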
4 changes: 4 additions & 0 deletions common/cmake/check_isa.cpp
@@ -26,6 +26,10 @@ char const *info_isa = "ISA" ":" "AVX";
char const *info_isa = "ISA" ":" "SSE42";
#elif defined(__aarch64__) || defined(__arm__)
char const *info_isa = "ISA" ":" "ARM";
#else // defined(__SSE2__)
char const *info_isa = "ISA" ":" "SSE2";
#endif

int main(int argc, char **argv)
8 changes: 7 additions & 1 deletion common/cmake/clang.cmake
@@ -127,7 +127,9 @@ ELSE()
SET(CMAKE_CXX_FLAGS_RELWITHDEBINFO "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} -O3") # enable full optimizations

IF (APPLE)
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mmacosx-version-min=10.7") # makes sure code runs on older MacOSX versions
IF(NOT CMAKE_OSX_DEPLOYMENT_TARGET)
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mmacosx-version-min=10.7") # makes sure code runs on older MacOSX versions
ENDIF()
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -stdlib=libc++") # link against libc++ which supports C++11 features
ELSE(APPLE)
IF (NOT EMBREE_ADDRESS_SANITIZER) # for address sanitizer this causes link errors
@@ -140,6 +142,10 @@ ELSE()
SET(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -z noexecstack") # we do not need an executable stack
ENDIF()
ENDIF()

SET(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -fno-aligned-allocation")
SET(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -fno-aligned-allocation")
SET(CMAKE_CXX_FLAGS_RELWITHDEBINFO "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} -fno-aligned-allocation")
ENDIF(APPLE)


31 changes: 26 additions & 5 deletions common/sys/CMakeLists.txt
@@ -1,8 +1,10 @@
## Copyright 2009-2021 Intel Corporation
## SPDX-License-Identifier: Apache-2.0

SET(CMAKE_THREAD_PREFER_PTHREAD TRUE)
FIND_PACKAGE(Threads REQUIRED)
IF(NOT TASKING_HPX)
SET(CMAKE_THREAD_PREFER_PTHREAD TRUE)
FIND_PACKAGE(Threads REQUIRED)
endif()

ADD_LIBRARY(sys STATIC
sysinfo.cpp
@@ -20,9 +22,28 @@ ADD_LIBRARY(sys STATIC
SET_PROPERTY(TARGET sys PROPERTY FOLDER common)
SET_PROPERTY(TARGET sys APPEND PROPERTY COMPILE_FLAGS " ${FLAGS_LOWEST}")

TARGET_LINK_LIBRARIES(sys ${CMAKE_THREAD_LIBS_INIT} ${CMAKE_DL_LIBS})
IF (EMBREE_SYCL_SUPPORT)
TARGET_LINK_LIBRARIES(sys ${SYCL_LIB_NAME})
IF(TASKING_HPX)
IF(HPX_FOUND)
TARGET_INCLUDE_DIRECTORIES(sys PUBLIC "${HPX_INCLUDE_DIRS}")
TARGET_LINK_LIBRARIES(sys PUBLIC ${CMAKE_DL_LIBS} HPX::hpx)
ELSE()
find_package(HPX REQUIRED)
IF(HPX_FOUND)
TARGET_INCLUDE_DIRECTORIES(sys PUBLIC "${HPX_INCLUDE_DIRS}")
IF (EMBREE_SYCL_SUPPORT)
TARGET_LINK_LIBRARIES(sys PUBLIC ${CMAKE_DL_LIBS} ${SYCL_LIB_NAME} HPX::hpx)
ELSE()
TARGET_LINK_LIBRARIES(sys PUBLIC ${CMAKE_DL_LIBS} HPX::hpx)
ENDIF()
ELSE()
message("-- Not found HPX")
ENDIF()
ENDIF()
ELSE()
TARGET_LINK_LIBRARIES(sys ${CMAKE_THREAD_LIBS_INIT} ${CMAKE_DL_LIBS})
IF (EMBREE_SYCL_SUPPORT)
TARGET_LINK_LIBRARIES(sys ${SYCL_LIB_NAME})
ENDIF()
ENDIF()

IF (EMBREE_STATIC_LIB)
16 changes: 15 additions & 1 deletion common/sys/barrier.cpp
@@ -101,7 +101,7 @@ namespace embree
__forceinline void wait()
{
mutex.lock();
count++;
count+=1;

if (count == barrierSize) {
count = 0;
@@ -128,19 +128,33 @@ namespace embree
namespace embree
{
BarrierSys::BarrierSys (size_t N) {
#if defined(TASKING_HPX)
b = std::make_shared<hpx::barrier<>>(N);
#else
opaque = new BarrierSysImplementation(N);
#endif
}

BarrierSys::~BarrierSys () {
#if !defined(TASKING_HPX)
delete (BarrierSysImplementation*) opaque;
#endif
}

void BarrierSys::init(size_t count) {
#if defined(TASKING_HPX)
b.reset(new hpx::barrier<>(count));
#else
((BarrierSysImplementation*) opaque)->init(count);
#endif
}

void BarrierSys::wait() {
#if defined(TASKING_HPX)
b->arrive_and_wait();
#else
((BarrierSysImplementation*) opaque)->wait();
#endif
}

LinearBarrierActive::LinearBarrierActive (size_t N)
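
The counting scheme visible in the non-HPX `wait()` context (lock, increment `count`, reset when it reaches `barrierSize`) can be sketched with standard C++ primitives. This is only an illustration of the technique: `CountingBarrier` and `barrier_separates_phases` are hypothetical names, and the generation counter is an addition of this sketch to make the barrier reusable and robust against spurious wake-ups; it is not taken from the Embree sources.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Reusable counting barrier: the last of `size_` arriving threads resets the
// counter, advances the generation, and releases everyone waiting.
class CountingBarrier {
public:
  explicit CountingBarrier(std::size_t n) : size_(n) { assert(n > 0); }

  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    const std::size_t my_generation = generation_;
    if (++count_ == size_) {
      count_ = 0;
      ++generation_;
      cond_.notify_all();
    } else {
      cond_.wait(lock, [&] { return generation_ != my_generation; });
    }
  }

private:
  std::mutex mutex_;
  std::condition_variable cond_;
  std::size_t size_;
  std::size_t count_ = 0;
  std::size_t generation_ = 0;
};

// Runs n threads through two phases separated by the barrier and returns true
// if every thread observed all n phase-one increments after passing it.
inline bool barrier_separates_phases(std::size_t n)
{
  CountingBarrier barrier(n);
  std::atomic<std::size_t> arrived{0};
  std::atomic<bool> ok{true};
  std::vector<std::thread> threads;
  for (std::size_t t = 0; t < n; ++t) {
    threads.emplace_back([&] {
      arrived.fetch_add(1);
      barrier.wait();
      if (arrived.load() != n) ok.store(false);
    });
  }
  for (auto& th : threads) th.join();
  return ok.load();
}
```

The `arrive_and_wait()` call used in the `TASKING_HPX` branch provides the same rendezvous, with the blocking handled by the HPX scheduler rather than an operating-system condition variable.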
10 changes: 10 additions & 0 deletions common/sys/barrier.h
@@ -7,6 +7,11 @@
#include "sysinfo.h"
#include "atomic.h"

#if defined(TASKING_HPX)
#include <memory>
#include <hpx/barrier.hpp>
#endif

namespace embree
{
/*! system barrier using operating system */
@@ -31,7 +36,12 @@ namespace embree
void wait();

private:

#if defined(TASKING_HPX)
std::shared_ptr< hpx::barrier<> > b;
#else
void* opaque;
#endif
};

/*! fast active barrier using atomic counter */