Each SYCL implementation should have its own namespace to co-exist #228

Open
keryell opened this issue Dec 14, 2018 · 5 comments

@keryell
Member

keryell commented Dec 14, 2018

While SYCL is a standard, some implementations are better suited to some devices, so it makes sense on a full platform to take advantage of multiple SYCL implementations in the same C++ program.
While the OpenCL ICD manages this with dynamic libraries and proxy functions, a higher-level language like modern C++ could manage it with namespaces, generic programming, concepts...
After this first step we can think about other extensions, such as how to share buffers between implementations (we already have a mutex that could be used for this), how to mix dependency graphs...
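
As a rough illustration of the namespace idea, here is a minimal sketch. The namespaces `impl_a` and `impl_b`, the `queue` stand-ins and the `describe` helper are all hypothetical and are not taken from any real SYCL implementation; the point is only that two implementations kept in separate top-level namespaces can be used side by side, with generic code choosing between them through a template parameter.

```cpp
#include <string>

// Hypothetical stand-ins for two SYCL implementations. Each one lives in
// its own top-level namespace, so identically named types do not collide.
namespace impl_a {
struct queue {
  std::string name() const { return "impl_a::queue"; }
};
}

namespace impl_b {
struct queue {
  std::string name() const { return "impl_b::queue"; }
};
}

// Generic code selects an implementation through its template parameter
template <typename Queue>
std::string describe(const Queue& q) {
  return "running on " + q.name();
}

// Use both implementations in the same translation unit
std::string use_both() {
  impl_a::queue qa;
  impl_b::queue qb;
  return describe(qa) + ", " + describe(qb);
}
```

This only covers name coexistence at the C++ level; sharing buffers or dependency graphs between implementations, as mentioned above, is a separate problem.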

@keryell keryell added the extension SYCL extension label Dec 14, 2018
@keryell keryell self-assigned this Dec 14, 2018
@keryell keryell added this to Needs triage in Extensions via automation Dec 14, 2018
@keryell
Member Author

keryell commented Mar 26, 2019

It would be easier to implement #227 as a side effect of this feature...

@keryell keryell moved this from Needs triage to High priority in Extensions Jun 24, 2019
@keryell
Member Author

keryell commented Jun 24, 2019

I have started some cleaning with #248 before diving into the renaming.

keryell added a commit to keryell/triSYCL that referenced this issue Jun 25, 2019
This is the start of implementing
triSYCL#228
and the ability to have several SYCL implementation to coexist and
having layered implementations.

First looked at clang-rename to rename cl::sycl to trisycl but it does
not work because it is more a renaming + namespace fusion. But since
we are using C++17 namespaces now, a simple regex solution works. :-)

  find . -type f -name '*.hpp' -exec sed -i 's/cl::sycl/trisycl/g' {} +

Some manual work is required to qualify better some names because
there are more namespace ambiguity, such as trisycl,
trisycl::vendor::trisycl, trisycl::detail and
trisycl::vendor::trisycl::scop::detail
@MathiasMagnus
Contributor

How do you imagine compiling a program with multiple SYCL implementations? With SYCL, there has to be some sort of compiler that takes care of the magic. I wouldn't expect things to work if multiple SYCL implementations are used in the same TU. But if I cannot mix implementations in the same TU, what use is having multiple namespaces?

@agozillon
Contributor

agozillon commented Jun 25, 2019

I don't think @keryell's original point was that namespaces would fix it all; I believe this is intended as only one part of the larger puzzle.

Although, as far as I'm aware it should be possible to mix multiple implementations in the same TU, perhaps not as of yet with today's SYCL specification and triSYCL/triSYCL compiler but with some work I'm not sure why it wouldn't be? Which is part of the reason why we're experimenting a little. I could be missing some critical piece of the picture though (or misconstruing what you mean, sorry if I am), so I'd be interested in knowing why you think it's not possible!

@keryell
Member Author

keryell commented Jun 25, 2019

A simple starting point is to have different SYCL implementations act on different TUs; at the very least, you do not want them to collide in the SYCL execution runtime library.
Obviously this is very forward-looking, but with several SYCL implementations flying around and targeting different devices, it has some value.

keryell added a commit that referenced this issue Jan 16, 2020
The list of commits since last publication:

commit 3b16711c81300d14879cc488a2a28efb63828f46
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Thu Jan 16 13:24:01 2020 -0800

    Split Doxygen comment
    
    Otherwise the class description was taken for a Doxygen group description.

commit 41dc05093fc571e6260567639a077a1f0b1d85e1
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Thu Jan 16 13:16:48 2020 -0800

    Fix description of Doxygen group vendor_trisycl_scope
    
    It needs to be on one line otherwise it is truncated.

commit 7e32082eaa0f8ed1337578c8cb5ffd1c7eaeba32
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Thu Jan 16 13:05:28 2020 -0800

    Happy new year 2020

commit 43f4a949a188a20770e6b3ae2741c46438037484
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Wed Dec 4 18:49:24 2019 -0800

    Add SYCL keynote presentation at SC19/H2RC19

commit 151572a318a6e6644759a19f15187f918c5fc8a6
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Sat Nov 2 14:48:53 2019 +0000

    Add last presentation at Association of C and C++ Users - San Francisco Bay Area

commit b72cc55b2580800438a9544d137d0766f23fb07c
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Oct 14 07:10:52 2019 -0700

    Detail linear_id computation in unit test comment

commit e75687ec96047d6352d1b76415e693915424ef1c
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Oct 11 14:56:37 2019 -0700

    Clarify and shorten comments for C++ meetup presentation
    
    Also replace an explicit computation by an un-evaluated expression
    since the compiler made some progress since SC17 presentation.

commit 55659e68f95f4aeeb6b1ff4e9c7aaf2a300faff8
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Thu Oct 10 09:08:04 2019 -0700

    Fix indentation

commit ac60c5dcddaea3bc957701a18d2bc35dc6a73115
Merge: 5219fa34 19d37c4f
Author: agozillon <andrew.gozillon@uws.ac.uk>
Date:   Thu Oct 3 14:41:47 2019 -0700

    Merge pull request #258 from keryell/master
    
    Make the definition of alternate namespace more liberal

commit 19d37c4f920ecae45ea94cb4008bc2311898c268
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Tue Oct 1 15:25:25 2019 -0700

    Make the definition of alternate namespace more liberal
    
    Declare ::cl and ::cl::sycl in a different way.
    This should fix https://github.com/triSYCL/triSYCL/issues/256 in case
    there is already an existing ::cl::sycl or ::sycl
    This sounds like a better neighborhood behavior.

commit 5219fa349f430a909158c2629d5c3c145597fb77
Merge: c8891bab 9cb9f6b6
Author: agozillon <andrew.gozillon@uws.ac.uk>
Date:   Tue Oct 1 17:46:57 2019 -0700

    Merge pull request #257 from keryell/modernize-capture
    
    Update lambda captures to C++17 and C++20

commit 9cb9f6b629971d66adb56d2a6b7152faa2cc4397
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Sep 30 15:29:54 2019 -0700

    Update lambda captures to C++17 and C++20
    
    C++17 allows now std::move capture.
    C++20 deprecated implicit capture of "this", so add some explicit captures.
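
A small sketch of the two capture styles this commit refers to. The `accumulator` type and the `make_reader` and `demo_captures` helpers are hypothetical and are not taken from triSYCL. Init-captures such as `owned = std::move(p)` have been part of the language since C++14, and naming `this` in the capture list avoids the implicit capture through `[=]` that C++20 deprecates.

```cpp
#include <memory>
#include <utility>

struct accumulator {
  int total = 0;

  // Name `this` explicitly instead of relying on [=] to capture it implicitly
  auto adder() {
    return [this](int x) {
      total += x;
      return total;
    };
  }
};

// Move a move-only object into the closure with an init-capture
auto make_reader(std::unique_ptr<int> p) {
  return [owned = std::move(p)] { return *owned; };
}

// Exercise both lambdas: 2 + 3 accumulated, plus the value owned by the reader
int demo_captures() {
  accumulator acc;
  auto add = acc.adder();
  add(2);
  add(3);
  auto read = make_reader(std::make_unique<int>(40));
  return acc.total + read();
}
```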

commit c8891bab865d2d27e5469c271b65064f2d7f7555
Merge: bda1067d 795f0648
Author: agozillon <andrew.gozillon@uws.ac.uk>
Date:   Mon Sep 30 17:02:59 2019 -0700

    Merge pull request #255 from keryell/master
    
    Modernize Clang & GCC versions to be used

commit 795f064847457a9980a5119e79bf037d8bff0261
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Sep 27 18:53:42 2019 -0700

    Modernize Clang & GCC versions to be used

commit bda1067d6f8d8c0cdfcec0a86e318566cc816d9a
Merge: d6abe3c6 6a5c7173
Author: Ronan Keryell <rkeryell@xilinx.com>
Date:   Sun Aug 25 13:45:18 2019 +0200

    Merge pull request #254 from jeffamstutz/disable_tests
    
    only build tests subdirectory if BUILD_TESTING is enabled

commit 6a5c71732ca71fa654c35265367b5c104450ced2
Author: Jefferson Amstutz <jefferson.d.amstutz@intel.com>
Date:   Tue Aug 6 15:00:59 2019 -0500

    only build tests subdirectory if BUILD_TESTING is enabled

commit d6abe3c60250ed22891904e685c425c039360f5d
Merge: dc50058e 856d6d66
Author: agozillon <andrew.gozillon@uws.ac.uk>
Date:   Fri Jul 19 10:02:40 2019 -0700

    Merge pull request #252 from keryell/tbb_openmp_doc
    
    Clarify documentation about OpenMP & TBB mode for the host kernel back-end

commit 856d6d6606e091e61db8e3aea343de3320226cbd
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Thu Jul 18 19:54:22 2019 -0700

    Clarify documentation about OpenMP & TBB mode for the host kernel back-end

commit 980e937ef768a1798e9ccd27733f7e366a519211
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Thu Jul 18 19:44:08 2019 -0700

    Fix alignment issues and typos in comments

commit dc50058eb5d83b8cdddd6a43c994f76ea04a8b0e
Merge: 39e340fd da281b04
Author: agozillon <andrew.gozillon@uws.ac.uk>
Date:   Thu Jul 18 20:28:11 2019 -0700

    Merge pull request #251 from keryell/master
    
    Streamline header organization to expose clearer coexisting namespaces

commit da281b0459a2aad59f75364549b14507b4ca6f2a
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Thu Jul 18 18:17:55 2019 -0700

    Update the documentation to describe new SYCL headers and namespaces
    
    The main library is defined in include/CL/sycl.hpp which
    introduces the API inside the ::cl::sycl namespace as defined by
    the SYCL_ 1.2.1 standard.
    
    As a convenience extension, there are 2 other include files that can
    be used instead:
    
    - include/SYCL/sycl.hpp to have the SYCL API defined inside
      ::sycl as a shortcut to save 4 letters. It does not define
      anything inside the ::cl::sycl namespace, allowing another
      SYCL implementation to coexist there;
    
    - include/triSYCL/sycl.hpp which is where the triSYCL
      implementation resides. It defines all the API under ::trisycl,
      leaving the ::cl and ::sycl namespaces free to be used by
      some other implementations and to have triSYCL to coexist with them.
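
The layering described in this commit message can be sketched as follows. This is an illustration under assumptions, not the actual triSYCL headers: `mysycl` is a hypothetical stand-in for the implementation namespace, `vec` is a toy type, and forwarding with a using-directive is only one possible mechanism. In a real layout each forwarding namespace would live in its own header, so that including one of them leaves the other namespace untouched.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical implementation namespace (plays the role of ::trisycl)
namespace mysycl {
template <typename T, std::size_t N>
struct vec {
  T data[N];
  static constexpr std::size_t size() { return N; }
};
}

// What a CL/sycl.hpp-style header might add: expose the API as ::cl::sycl
namespace cl::sycl {
using namespace ::mysycl;
}

// What a SYCL/sycl.hpp-style header might add instead: expose it as ::sycl
namespace sycl {
using namespace ::mysycl;
}

// Both spellings name the very same type from the implementation namespace
static_assert(std::is_same_v<cl::sycl::vec<int, 4>, sycl::vec<int, 4>>,
              "both namespaces forward to the same implementation");

std::size_t total_lanes() {
  return cl::sycl::vec<int, 4>::size() + sycl::vec<float, 2>::size();
}
```

Code that includes only the implementation header would see `mysycl::vec` and nothing in `::cl` or `::sycl`, which is what leaves those names free for another implementation.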

commit 12e6f0d61644f1477ab0bf5e10f587b9c5542c32
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Thu Jul 18 17:18:57 2019 -0700

    Rename sycl.hpp to SYCL/sycl.hpp
    
    This limits the conflicts of includes at the top level.
    
    introduce triSYCL in the sycl:: namespace and do not use cl::sycl::
    leaving room for other implementations to coexist in cl::sycl:: or
    just provide a short cut saving 4 letters.
    
    Also, while it is an extension, using SYCL/sycl.hpp instead of
    sycl.hpp remains more in the spirit of the SYCL 1.2.1 specification
    along CL/sycl.hpp.

commit f4ea52c26bf3e4bfc522e09d1af0d5fdc7374484
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Thu Jul 18 16:56:29 2019 -0700

    Rename trisycl directory to triSYCL and trisycl.hpp to triSYCL/sycl.hpp
    
    This limits the conflicts of includes at the top level.
    
    introduce triSYCL in the trisycl:: namespace and do not use cl::sycl::
    or sycl::, leaving room for other implementations to coexist.
    
    Also, while it is an extension, using triSYCL/sycl.hpp instead of
    trisycl.hpp remains more in the spirit of the SYCL 1.2.1 specification
    along CL/sycl.hpp.

commit 39e340fdc8102332b1f1df4a4a3d30fcf11f7dd4
Merge: 8cb0b3fe dead7b65
Author: agozillon <andrew.gozillon@uws.ac.uk>
Date:   Mon Jul 8 11:55:10 2019 -0700

    Merge pull request #250 from keryell/function-try-block
    
    Use function-try-block to simplify code

commit dead7b650a5dd1adb00883a45c2ce7f8e2763949
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Jul 8 10:38:30 2019 -0700

    Use function-try-block to simplify code
    
    Remove useless {} since the try/catch is at the function top-level.
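
For readers unfamiliar with the construct, a function-try-block lets the `try` replace the function body's outer braces. The sketch below uses a hypothetical `parse_positive` function that is unrelated to the triSYCL code touched by this commit.

```cpp
#include <stdexcept>
#include <string>

// The try keyword comes right after the parameter list, so the whole
// function body is the try block and no extra level of braces is needed.
int parse_positive(const std::string& s) try {
  int v = std::stoi(s);  // throws std::invalid_argument on non-numeric input
  if (v <= 0)
    throw std::out_of_range { "not positive" };
  return v;
} catch (const std::exception&) {
  // Returning from the handler is allowed for ordinary functions
  return -1;
}
```

For example, `parse_positive("12")` yields 12, while `parse_positive("abc")` and `parse_positive("-3")` both end up in the handler and yield -1.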

commit 8cb0b3fef5ae1ca6c1d47a2d364cd9c2d5cc03e7
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Jul 1 13:56:53 2019 -0700

    Allow empty device scope by default
    
    This is about the scope storage extension.

commit 239e8c80e1cca4327754911fc7994da77c513acf
Merge: fedf3210 9d9f32cb
Author: agozillon <andrew.gozillon@uws.ac.uk>
Date:   Fri Jun 28 17:59:59 2019 -0700

    Merge pull request #249 from keryell/trisycl-namespace
    
    Introduce new ::trisycl & ::sycl namespaces with new top-level include files.

commit 9d9f32cb7a30d9496b67158ec0768dc9a6703046
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Jun 28 15:56:15 2019 -0700

    Improve formatting of the headers for Doxygen

commit a247d39834f6f735ed94f6c2c03ae43efc01ca10
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Jun 28 15:39:38 2019 -0700

    Add #include guards inside the top include files too

commit d0bd957b62eaafdcde704b878c88bb3393b7e8ef
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Jun 28 15:37:37 2019 -0700

    Update namespace inside C++ generated by the kernel internalizer

commit 6b7a0cb6b8bc1eb761c362c231808b5a98da6c91
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Jun 28 15:11:20 2019 -0700

    Add new sycl.hpp header to introduce ::sycl API as an extension
    
    This implements extension https://github.com/triSYCL/triSYCL/issues/227

commit 549f8b0d3de469e0d3a3374b1ed5399b2a3fac84
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Jun 28 12:19:16 2019 -0700

    Improve example testing a triSYCL feature to use #include "trisycl.hpp"

commit 6484662adf8566a35b445a2dbfaee1a05893dc62
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Jun 28 12:04:18 2019 -0700

    Improve scope example to expose ::cl::sycl use with a triSYCL layered extension

commit 6bf60118af05fd30a090e772be709ac343d77ddc
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Jun 28 11:54:42 2019 -0700

    Fix internal includes to use trisycl/ directory instead of CL/sycl/
    
    Apply
    find include -type f -name '*.hpp' -exec sed -i 's,CL/sycl/,trisycl,g' {} +

commit 6f1d609596bc1639c2b4b0d9a8cfc2bf20e63a26
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Jun 28 11:30:36 2019 -0700

    Move triSYCL implementation from CL/sycl directory to trisycl
    
    Add a new trisycl.hpp to use triSYCL within ::trisycl namespace.

commit e3a6eac7050fdc8466cdc2eb61eb6bae61d7a46f
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Jun 24 18:18:15 2019 -0700

    Rename cl::sycl:: to trisycl:: and add a cl::sycl:: compatibility mode
    
    This is the start of implementing
    https://github.com/triSYCL/triSYCL/issues/228
    and the ability to have several SYCL implementation to coexist and
    having layered implementations.
    
    First looked at clang-rename to rename cl::sycl to trisycl but it does
    not work because it is more a renaming + namespace fusion. But since
    we are using C++17 namespaces now, a simple regex solution works. :-)
    
      find . -type f -name '*.hpp' -exec sed -i 's/cl::sycl/trisycl/g' {} +
    
    Some manual work is required to qualify better some names because
    there are more namespace ambiguity, such as trisycl,
    trisycl::vendor::trisycl, trisycl::detail and
    trisycl::vendor::trisycl::scop::detail

commit 4ba836c585c727180595dfed1bba697a2347e477
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Jun 24 17:51:31 2019 -0700

    Do not access internal header file
    
    While this still does test for an internal data structure, at least
    simplify the future header & namespace renaming.

commit fedf3210a462d810d881e07c2197f3d20cf0c328
Merge: 4ce08f86 00a310c8
Author: agozillon <andrew.gozillon@uws.ac.uk>
Date:   Mon Jun 24 16:28:16 2019 -0700

    Merge pull request #248 from keryell/c++17-nested-namespace
    
    C++17 nested namespace

commit 00a310c8a7d6a0a857d898f84d27da6b6bb048a6
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Jun 24 15:46:54 2019 -0700

    Use C++17 nested namespace
    
    Mainly use
    clang-tidy-9 --fix-errors --checks='-*,modernize-concat-nested-namespaces'
    and fix a few errors manually.
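
For reference, this is the syntax change that the clang-tidy check applies. The namespace names below are hypothetical. Once a nested namespace is opened in a single `namespace a::b {` declaration, the qualified name appears as one token sequence on one line, which is presumably why the later `cl::sycl` to `trisycl` rename could be done with a plain regex.

```cpp
// Pre-C++17 spelling: one namespace keyword and one brace pair per level
namespace legacy { namespace vendor { namespace detail {
constexpr int magic = 40;
} } }

// C++17 nested namespace definition: same nesting, single declaration
namespace modern::vendor::detail {
constexpr int magic = 2;
}

// Both forms are reached through the same kind of qualified name
int nested_sum() {
  return legacy::vendor::detail::magic + modern::vendor::detail::magic;
}
```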

commit 83be4bd5de820e847c0dbb68ccd45c1436f1ed39
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Jun 24 11:55:40 2019 -0700

    Mention -DCMAKE_EXPORT_COMPILE_COMMANDS=1 for using some refactoring tools

commit 92cd1818e93b995744eee63bd28022f01e62b626
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Jun 24 11:21:22 2019 -0700

    Compile using just as many threads as CPU cores
    
    Previous raw `-j` was doing a thread spamming, typically acting as a
    denial-of-service on the machine because of memory limitations.
    Note that it might be still too high for small-memory systems...

commit 4ce08f861adcda88a0396cc929c38fd87aa7fc37
Merge: 0374a80b 9cf027fa
Author: agozillon <andrew.gozillon@uws.ac.uk>
Date:   Mon Jun 3 16:05:49 2019 -0700

    Merge pull request #246 from keryell/doc-update
    
    Update the documentation

commit 9cf027fab8efcd22c2474f53917b617cb0ff207a
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Jun 3 09:42:31 2019 -0700

    Add link to triSYCL + Intel SYCL fusion

commit 86d88d8ab0d1d5782e14d5b648633fd2bb080a7e
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Jun 3 09:41:53 2019 -0700

    Add link to Intel SYCL open-source implementation

commit c9e384733b2c3d15d9ca990ee913e20b88a40ea4
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Jun 3 09:41:19 2019 -0700

    The Khronos SYCL CTS is now open-source

commit 6302c24031a351aec5867fa2c133f6597c6f5eb4
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Jun 3 09:40:52 2019 -0700

    More content about hipSYCL project and CUDA

commit 402321642ce12b8849126f262b12ea4281f5f907
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Jun 3 09:38:54 2019 -0700

    Modernize descriptions

commit 0374a80b8e486554f7c4c0c25595796f4e67a92a
Merge: 4318ee44 30262fe2
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Wed May 29 13:48:38 2019 -0700

    Merge pull request #245 from nazavode/develop
    
    Isolate cxx build settings into interface target

commit 30262fe26f7b08a3ce7c0f9ac29e0c995225d6e2
Author: Federico Ficarelli <federico.ficarelli@gmail.com>
Date:   Fri May 17 12:11:32 2019 +0200

    Fix cxx std level for partially supported toolchains

commit e8a3c1fd3e11e4dc895be5182eb4a0959efdc131
Author: Federico Ficarelli <federico.ficarelli@gmail.com>
Date:   Thu May 16 23:19:14 2019 +0200

    Isolate cxx build settings into interface target

commit 4318ee44647956a02409079b83c2e4b38fba868c
Merge: 5192ca3a ac023b6b
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Thu May 16 14:09:15 2019 -0700

    Merge pull request #244 from nazavode/fix/cmake
    
    Fix CMake implicit conversion warning

commit ac023b6b5aa9ab039d58a78a8bf0eafee7c512e5
Author: Federico Ficarelli <federico.ficarelli@gmail.com>
Date:   Thu May 16 22:27:12 2019 +0200

    Fix CMake implicit conversion warning
    
    Explicit data type must be specified when setting a value
    into cache in order to avoid implicit conversions that trigger
    warnings like this:
    
    ```
    CMake Warning (dev) at cmake/FindTriSYCL.cmake:138 (set):
      implicitly converting 'VERSION' to 'STRING' type.
    Call Stack (most recent call first):
      CMakeLists.txt:13 (include)
    This warning is for project developers.  Use -Wno-dev to suppress it.
    ```

commit 5192ca3acf6f8da4268c1489caf8f98ade98d35a
Merge: f397c52e 3a95a30b
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Tue Apr 9 14:29:46 2019 -0700

    Merge pull request #242 from agozillon/buffer_fix
    
    set_final_data overload fix

commit 3a95a30b45b261257cdcf2c3fb295bdb83a44c1b
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Tue Apr 9 10:07:06 2019 -0700

    Fixing indentation

commit 383b19bc7c0bef7a23b7bd0b3a161b5012021880
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Fri Apr 5 16:37:46 2019 -0700

    set_final_data overload fix
    
    Fix to something I broke in a prior patch that was causing build/compile
    errors for tests in the SYCLParallelSTL.
    
    Didn't account for the need to remove the reference on the rvalue
    referenced iterator when passed in, so the iterator_traits class could
    not find the value_type member!
    
    One of those little hard to catch move/reference semantic errors.
    
    Also some minor spelling fixes.
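
The class of error described in this commit can be reproduced in a few lines. The sketch below is not the triSYCL `set_final_data` code; `value_type_of`, `first_value` and `demo_first_value` are hypothetical names. When a template parameter is deduced as a reference type, `std::iterator_traits` on that type has no `value_type`, so the reference has to be stripped first.

```cpp
#include <iterator>
#include <type_traits>
#include <vector>

// Strip the reference before asking iterator_traits for the value type
template <typename Iterator>
using value_type_of =
  typename std::iterator_traits<std::remove_reference_t<Iterator>>::value_type;

template <typename Iterator>
auto first_value(Iterator&& it) {
  // For an lvalue argument, Iterator is deduced as a reference type here.
  // Without remove_reference_t the value_type lookup would fail to compile.
  value_type_of<Iterator> v = *it;
  return v;
}

// Call once with an lvalue iterator and once with a temporary
int demo_first_value() {
  std::vector<int> v { 7, 8, 9 };
  auto it = v.begin();
  return first_value(it) + first_value(v.begin() + 1);
}
```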

commit f397c52e39fcaa8fb50086471e78b6dcafa743bb
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Mar 29 07:39:30 2019 -0700

    Move Xilinx FPGA extensions in their own directory
    
    Clearer to separate them from other on-coming architectures.

commit 3b64d71263a362f0b75204c5fbcf35e9d60cb908
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Mar 4 15:13:18 2019 -0800

    Update documentation

commit 8952f19ff74e7f32896348fd311115c889fc42f3
Merge: 55b8eb6a db4f3a66
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Mon Feb 18 22:19:52 2019 -0800

    Merge pull request #233 from agozillon/slambench_modifications
    
    Slambench modifications - math.hpp additions
    
    A few new math functions fmin, fmax, max, min, clamp, cross, length, floor and dot that are intended for use with the sycl::vec class.

commit 55b8eb6a2d38b67e9b6df90b2c74479986ab9391
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Feb 1 13:38:22 2019 -0800

    Happy new year 2019!

commit 26131979cc8d720414ede49d5747fb4bc86da539
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Wed Jan 30 17:14:43 2019 -0800

    Pass more environment variables through LIT tests
    
    For example to test with own Clang/LLVM or C++ experimental md_span.

commit 0a728295876013648c9b05553e1546c82416ec06
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Wed Jan 30 17:14:16 2019 -0800

    Update lit.cfg to latest LLVM

commit db4f3a667624d703dc2974ab0994b65eb7ac456e
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Thu Jan 10 18:32:08 2019 -0800

    Fixing requested changes to indentation and floor implementation, also adding auto instead of explicit return type

commit a699105033cb78ce74556fb82caa51b456b5ca8f
Merge: 6b834531 1d61648f
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Fri Jan 11 15:49:22 2019 -0800

    Merge pull request #226 from agozillon/device_compiler_tests
    
    Additional Device Compiler Tests
    
    Several new device compiler tests that work with POCL when they're
    compiled using a newer version of the triSYCL device compiler (after the
    commit: [OpenCL] Add generic AS to 'this' pointer).
    
    However, the buffer_of_objects_members.cpp test works with the
    sycl/release_70/master branch of the triSYCL device compiler.
    
    The buffer of objects tests are related to:
    https://github.com/triSYCL/triSYCL/issues/212
    
    The ctor_and_dtor tests are related to:
    https://github.com/triSYCL/triSYCL/issues/213

commit 6b8345318dd3cc62020bb1624ef63cc10d9b5d02
Merge: 10c5f962 ee3d918b
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Fri Jan 11 15:42:26 2019 -0800

    Merge pull request #165 from airlied/cts-queue
    
    Initial implementation of queue get_info and properties.

commit 10c5f962451f3c3179bec7783f90dce14e8a3af5
Merge: 2606ffb9 a98d1cd4
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Fri Jan 11 15:29:03 2019 -0800

    Merge pull request #168 from airlied/cts-event
    
    event code implementation

commit 1d61648fe4d0f5084cfc35ec0e5ee2c6a0e8557f
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Fri Jan 11 09:49:12 2019 -0800

    Modifying the comments to remove status. Apart from a note in the one failing test.

commit ee3d918b4af330e7976d5200f93c155e18f87c01
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Aug 7 16:02:49 2018 +1000

    queue: initial implementation of queue (v4)
    
    Implement queue get_info and properties support.
    
    We define queue to be derived from the property
    list class, as while it would work as a friend
    for queues, buffers require templates so it
    doesn't work so well there.
    
    Notes:
    this currently defaults to host selector (when it
    should be default selector), until we have a useful
    device compiler.
    
    v2: fix queue context check
    (passes cts queue test now)
    v3: fix some asyncHandler references + fix default explicit init
    v4: change get_info return cast, drop useless device cast, use has_value,
    fix comment (Ronan)

commit 539c7690a32c202b39c23796cd1183c8f9bace11
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Aug 7 16:00:11 2018 +1000

    initial property_list implementation. (v1.1)
    
    Since we have no properties yet, this doesn't actually implement much
    it's more of the skeleton that we fill the properties into later.
    
    has_property and get_property are in the property_list implementation
    as you can't specialize a member function without explicitly specializing
    the containing class, so for buffer properties we do that in here
    not in the buffer class.
    
    To handle the pack parameter, we create an iterative addproperty method
    that we implement for each base property type using the CREATE macro.
    
    the all_true is required to disambiguate queue constructors, so that
    a property_list can only be constructed from a list of classes
    derived from the base property detail class.
    
    Property list users are expected to derive this class.
    
    v1.1: use has_value + cleanup indents

commit ce5fd8fd3e80c03601398cbcefecd10492df2890
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri Jul 6 14:45:20 2018 +1000

    add an all_true implementation to detail
    
    we need this for the property list constructor
    so we can restrain it to only having valid
    pack parameter constructors for sets of object
    derived from the property base class.
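
A possible shape for such a helper, written with C++17 fold expressions. This is a sketch and not the implementation added by this commit: only the names `all_true` and `property_list` come from the commit messages, while `property_base`, `enable_profiling`, `in_order` and `not_a_property` are hypothetical.

```cpp
#include <type_traits>

// True only when every value of the pack is true (and for an empty pack)
template <bool... Values>
struct all_true : std::bool_constant<(Values && ...)> {};

// Hypothetical property hierarchy
struct property_base {};
struct enable_profiling : property_base {};
struct in_order : property_base {};
struct not_a_property {};

class property_list {
public:
  // The variadic constructor only participates in overload resolution
  // when every argument type derives from property_base
  template <typename... Props,
            typename = std::enable_if_t<
              all_true<std::is_base_of_v<property_base, Props>...>::value>>
  property_list(Props...) {}
};

// Compile-time checks gathered in one function for convenience
constexpr bool demo_property_checks() {
  return std::is_constructible_v<property_list, enable_profiling, in_order>
    && !std::is_constructible_v<property_list, in_order, not_a_property>
    && all_true<>::value;
}
```

Constraining the pack this way keeps a variadic constructor from swallowing unrelated arguments, which is the disambiguation role the commit messages describe for the queue constructors.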

commit f4ccec2957591c861c2617718efba8fd6ffbbc25
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Aug 7 15:27:40 2018 +1000

    queue: drop handler_event, fix queue submit interfaces.
    
    The queue submission interfaces in the spec take events

commit a98d1cd48f3ab19832353bef4d2b01317da3b0f9
Author: Dave Airlie <airlied@redhat.com>
Date:   Thu Jul 5 16:41:56 2018 +1000

    event: implement more of event class (v3.1)
    
    then creates the usual host/opencl detailed
    implementations for events, along with
    the get_info and get_profiling_info
    implementations.
    
    v2: collapse namespaces, add some basic docs,
    fix some other minor bits
    v3: update for master
    v3.1: fixup comments

commit 75ee35e73145ce078894742af3ecc96f74673e5d
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Thu Jan 10 11:21:28 2019 -0800

    Some minor requested vec math tweaks and implementation try 2 for generic functor applier

commit 2606ffb94eaf088a88f37f7364140a2d04a1339f
Merge: 09ac39e5 70f57613
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Thu Jan 10 09:44:06 2019 -0800

    Merge pull request #234 from agozillon/vec_overloads
    
    Additional vec accessor overloads allowing swizzle on the LHS of an assignment

commit 7b1f983f57e24e380d6c299b0b3aaf314518218d
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Wed Jan 9 16:59:55 2019 -0800

    Changes to vec math functions, new test case and vec apply_functor
    
    Added a new test case tests/math/vector_math.cpp testing the new math
    functions for vec.
    
    Made some requested qol changes to the new vec math functions. The main
    one involving the addition of a template function that folds across
    vecs, applies a lambda for each index and then returns a vec containing
    the new data. Perhaps the implementation is a little too generic?

commit 70f576130ae2c5cd96e1bf7779fc02e93a680207
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Tue Jan 8 17:09:04 2019 -0800

    Cleaning vec.hpp a little as requested
    
    Applying the requested changes related to wrapping/formatting older code
    to the 80 character limit.
    
    And removing some unrequired parenthesis inside the macro for the hex
    indexer/accessors.

commit 47c9dae68f0c84577b890d20f567b55de4b64730
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Tue Jan 8 10:23:17 2019 -0800

    Fixing mistake made to hi and lo functions for vec4
    
    Was accessing the wrong elements when creating a swizzle of the vec.
    Silly error made from trying to copy paste vec3 -> vec4 functions as
    they share most of the same calls. Double checked other vec's incase I
    did other silly mistakes but they seem fine.

commit ddc2b69dd302850a09506e99bebd23e1dd27a06c
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Tue Jan 8 09:46:36 2019 -0800

    Fixing incorrect indentation

commit a0a79dd0710ceb18b9e82b0896d0ea5a6fe21c64
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Mon Jan 7 19:03:25 2019 -0800

    Additional vec accessor overloads
    
    Modifying accessors of type x() const {} to be const x() const {} and
    adding accessors of type x() {}, which should cover the two use cases
    for swizzles.

commit 63ab8edd4fbc465baa2888aef40aa56f14be84dd
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Mon Jan 7 18:45:50 2019 -0800

    Adding max and min overloads for SYCL vec

commit 5767eb894c2212844e9dbe5ef96aa8fc8d298108
Merge: 7aa30cea 09ac39e5
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Thu Jan 3 08:50:00 2019 -0800

    Merge branch 'master' into slambench_modifications

commit 09ac39e52c365e25276dda942012ed35c7c1368f
Merge: dd39f990 92ce2ffb
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Thu Jan 3 12:54:36 2019 +0100

    Merge pull request #231 from agozillon/small_array_fix
    
    Empty base class fix to small_array
    
    Summary:
    small_array's empty base class optimization was not working correctly
    causing the vec class (and possibly others) to bloat by a few extra
    bytes. A large problem for cl::sycl::vec as alignas gets invoked so that
    it's correctly aligned to the data it contains, however in the case
    where its larger by a few bytes alignas was rounding it up to double the
    size it should be. The fix is basically chaining the boost operation
    classes that were causing the EBCO issue (chaining boost operation
    classes is an inbuilt boost workaround for this very issue it seems,
    link in the comments of small_array.hpp).
    
    Added a new test to check that vec's are appropriately sized and
    contiguous (vecmemlayoutalign.cpp) and modified the CMakeList.txt to
    include it in the test suite.
    
    I also tweaked the vec.hpp classes get_size method as it was incorrect
    for vec's of 3 elements (special case for it to be treated as sizeof(T)
    x 4 in SYCL Spec 1.2.1 Section 4.10.2.6). Also fixing the vecacc.cpp
    test to reflect this change.
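
The interaction between `alignas` and stray bytes described in this summary can be shown with a toy type. The sketch below is not the triSYCL `vec`; `storage_elements`, `aligned_vec` and `bloated_vec4` are hypothetical, and the byte counts assume a platform where `sizeof(float)` is a power of two (typically 4).

```cpp
#include <cstddef>

// A 3-element vector occupies the storage of 4 elements
constexpr std::size_t storage_elements(std::size_t n) {
  return n == 3 ? 4 : n;
}

// Align the vector on the size of the data it is meant to contain
template <typename T, std::size_t N>
struct alignas(sizeof(T) * storage_elements(N)) aligned_vec {
  T data[N];
};

// sizeof is rounded up to a multiple of the alignment, so vec3 is padded to 4
static_assert(sizeof(aligned_vec<float, 3>) == 4 * sizeof(float), "");
static_assert(sizeof(aligned_vec<float, 4>) == 4 * sizeof(float), "");
static_assert(alignof(aligned_vec<float, 4>) == 4 * sizeof(float), "");

// One stray byte past the payload doubles the size of an aligned 4-vector,
// the same kind of bloat a failed empty base class optimization can cause
struct alignas(4 * sizeof(float)) bloated_vec4 {
  float data[4];
  char extra;
};
static_assert(sizeof(bloated_vec4) == 8 * sizeof(float), "");

constexpr std::size_t vec3_bytes() { return sizeof(aligned_vec<float, 3>); }
```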

commit 92ce2ffbdd7ce90fe0938d8ca4c8daf3b87961c2
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Wed Jan 2 12:39:22 2019 -0800

    Making requested modifications
    
    Mostly spelling, punctuation and indentation fixes. Although, there is
    the addition of a template type (value?) alias for the alignment type
    trait and then usage of that in the relevant areas.

commit dd39f99071e23e90a1798e65d9270a9aec029902
Merge: 437b906d b7a52a45
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Fri Dec 21 12:28:23 2018 +0100

    Merge pull request #232 from pkeir/master
    
    Added the unary overload to the item::get_range method.

commit 28cbb4048fe7156c3d26fe19b28fcbbc29ff989a
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Thu Dec 20 16:27:34 2018 -0800

    Addition of alignment type trait and alignas on detail::vec
    
    Created an alignment type trait and added it to a new header called
    alignment_helper.hpp for the moment, perhaps there is a better location.
    
    Aligning cl::sycl::detail::vec now via the trait.
    
    Replaced previous alignof(sizeof(DataType) * ElementSize) in
    cl::sycl::vec with type trait.
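
A minimal sketch of what such an alignment trait could look like follows. All names here (`vec_alignment`, `vec_alignment_v`, `simple_vec`) are hypothetical and do not come from alignment_helper.hpp; the sketch assumes the alignment equals the padded byte size of the vector, as the replaced `sizeof(DataType) * ElementSize` expression suggests, with 3 elements treated as 4.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical trait: alignment in bytes for a vector of N elements of T.
// Assumption: alignment equals the padded byte size of the vector.
template <typename T, std::size_t N>
struct vec_alignment {
  static constexpr std::size_t value = sizeof(T) * (N == 3 ? 4 : N);
};

// Variable-template alias for the trait (requires C++17 for `inline`).
template <typename T, std::size_t N>
inline constexpr std::size_t vec_alignment_v = vec_alignment<T, N>::value;

// Hypothetical stand-in for a vec class aligned through the trait.
template <typename T, std::size_t N>
struct alignas(vec_alignment_v<T, N>) simple_vec {
  T data[N];
};

static_assert(alignof(simple_vec<float, 4>) == 4 * sizeof(float),
              "a 4-element vector is aligned to its own size");
static_assert(sizeof(simple_vec<float, 3>) == 4 * sizeof(float),
              "a 3-element vector is padded up to its alignment");
```

Keeping the computation in one trait means the class definition and any tests can share the same value instead of repeating the expression.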

commit b7a52a459fb5ab40ff2968ca6644a41a28254a1e
Author: Paul Keir <pkeir@outlook.com>
Date:   Wed Dec 19 15:33:35 2018 +0000

    Added the unary overload to the item::get_range method.

commit efee845c8ce0c286b3b87e322d8b12327f418b1a
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Tue Dec 18 18:02:42 2018 -0800

    Trying to fix a runaway return statement

commit 71a7fb1afa72a02a4a5f39705dd8f530bfd06dd7
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Tue Dec 18 17:56:41 2018 -0800

    Empty base class fix to small_array
    
    Related to: https://github.com/triSYCL/triSYCL/issues/230
    
    small_array's empty base class optimization was not working correctly
    causing the vec class (and possibly others) to bloat by a few extra
    bytes. This is a large problem for cl::sycl::vec, as alignas gets invoked
    so that it's correctly aligned to the data it contains; however, in the
    case where it's larger by a few bytes, alignas was rounding it up to double
    the size it should be. The fix is basically chaining the boost operation
    classes that were causing the EBCO issue (chaining boost operation
    classes is an inbuilt boost workaround for this very issue it seems,
    link in the comments of small_array.hpp).
    
    Added a new test to check that vecs are appropriately sized and
    contiguous (vecmemlayoutalign.cpp) and modified the CMakeLists.txt to
    include it in the test suite.
    
    I also tweaked the vec.hpp class's get_size method as it was incorrect
    for vecs of 3 elements (special case for it to be treated as sizeof(T)
    * 4 in SYCL Spec 1.2.1 Section 4.10.2.6). Also fixing the vecacc.cpp
    test to reflect this change.
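
The empty-base-class issue described above can be reproduced without Boost. The sketch below is an illustration under stated assumptions, not the small_array code: `empty_base`, `addable` and `subtractable` are hypothetical stand-ins for operator helper templates that each end in a common empty base unless they are chained through a second template parameter. Exact sizes depend on the ABI.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for operator helper templates. Each one ends its
// inheritance chain in empty_base<T> unless another base B is supplied.
template <typename T> struct empty_base {};
template <typename T, typename B = empty_base<T>> struct addable : B {};
template <typename T, typename B = empty_base<T>> struct subtractable : B {};

// Unchained: the class contains two empty_base<unchained_vec> subobjects.
// Two subobjects of the same type may not share an address, so one of them
// is typically placed after the data and the class grows.
struct unchained_vec : std::array<float, 4>,
                       addable<unchained_vec>,
                       subtractable<unchained_vec> {};

// Chained: a single inheritance chain with only one empty_base<chained_vec>,
// so the empty bases can all overlap the start of the object.
struct chained_vec : std::array<float, 4>,
                     addable<chained_vec, subtractable<chained_vec>> {};
```

With GCC or Clang on an Itanium-ABI platform such as x86-64 Linux, `sizeof(chained_vec)` is expected to match `sizeof(std::array<float, 4>)` while `sizeof(unchained_vec)` is expected to be larger; adding an `alignas` equal to the data size would then round the larger size up further, which is the doubling described in the commit message.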

commit 7aa30cea5b1957ce848a2721c689bec224bae991
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Tue Dec 18 17:34:05 2018 -0800

    Adding vec.h to math.hpp as it now contains vec related math operations.

commit 437b906d787b4ec532dc61ad161c818637885350
Merge: f30eb388 122244cb
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Tue Dec 18 15:02:39 2018 -0800

    Merge pull request #225 from agozillon/nd_range_kernel
    
    parallel_for ND_RANGE_KERNEL support
    
    Adding support for OpenCL nd_range_kernel launches using parallel_for
    when TRISYCL_USE_OPENCL_ND_RANGE is specified. Related to issue:
    #146
    
    Modifications:
    
    parallelism.hpp: new parallel_for implementation that is conditionally
    compiled in place of the normal parallel_for when
    TRISYCL_USE_OPENCL_ND_RANGE is specified. Does mostly the same thing,
    gets the type of the kernel's argument (item, id, etc.). However, instead
    of forwarding it to another overloaded parallel_for to create a
    recursive call graph of the kernel that can be scheduled, it forwards it
    to create_parallel_for_arg, which generates an appropriate index class
    (for host or for device depending on the compile stage) and passes it to
    the kernel for invocation. This parallel_for function is invoked from
    the parallel for schedule kernel call.
    
    handler.hpp: Invokes the new parallel_for using a new schedule function
    called schedule_parallel_for_kernel which works similarly to
    schedule_single_task at the moment except that it queues up an nd_range
    kernel using the parallel_for invocation of the opencl_kernel
    implementation.
    
    architecture.rst, environment.rst, macros.rst: Adding text to explain
    how to activate the ND_RANGE_KERNEL through the Makefile and the define
    that you need to enable for it (TRISYCL_USE_OPENCL_ND_RANGE)
    
    Makefile: Searches for additional environment variable which defines
    TRISYCL_USE_OPENCL_ND_RANGE for the appropriate compile steps. And
    moving the -req-work-group-size-1 optimization step onto the XILINX_SDK
    compile path as it currently doesn't play nice with the OpenCL runtime
    (it's illegal in OpenCL to have a nullary specified work-group size for
    nd_range_kernel's and have this attribute specified).
    
    Added:
    
    opencl_spir_req.h: A minimalist file created to only include the OpenCL
    SPIR intrinsics that we currently use. Some of the intrinsics from the
    main file require certain OpenCL compiler flags to be flipped to fully
    work (the half type for example).
    
    opencl_spir_helpers.hpp: contains some helper functions that generate
    SYCL index container information for OpenCL devices using SPIR
    intrinsics. Alongside a function for forwarding to the appropriate
    container generation function (used inside detail::parallel_for invoked
    in the scheduler).
    
    It's still a WIP and needs groups and nd_item added and it's possible
    that we may reduce the number of factory functions down to just id and
    range (perhaps item?) and invoke these from inside the index container
    classes.
    
    parallel_for_overloads.cpp: new test file that currently just checks if
    the appropriate data has been set in one of the kernel launches and that
    all kernels have been successfully launched. Extensions to the test will
    be added as the implementation becomes more fleshed out (e.g. testing
    different intrinsics like the get_global_size contained in ranges).

commit 122244cb50b07477838c60ffd7ea6e7d91882315
Merge: b50ffd96 f30eb388
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Tue Dec 18 15:01:00 2018 -0800

    Merge branch 'master' into nd_range_kernel

commit f30eb3887f663a23850db76b99c17997609bac20
Merge: fb89e457 bf127c0c
Author: agozillon <andrew.gozillon@uws.ac.uk>
Date:   Mon Dec 17 09:33:17 2018 -0800

    Merge pull request #229 from keryell/master
    
    Update documentation for latest device compiler

commit bf127c0cbe66e42a30349a0296cae337e6366b1f
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Dec 14 17:31:12 2018 -0800

    Update documentation for latest device compiler

commit 9fbd6d92e5f04e7e163dd4e586a7adea52d985c8
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Fri Dec 14 10:30:37 2018 -0800

    Capture by reference fix for new device compiler tests

commit b50ffd96a6df94f8b7b3dee04aff9947d58cee2d
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Fri Dec 14 09:41:34 2018 -0800

    Almost there (part 2)...

commit 8560da5bd00ed677097fcfc48c2c7e7720125ce3
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Thu Dec 13 14:01:22 2018 -0800

    Additional Device Compiler Tests
    
    Several new device compiler tests that work with POCL when they're
    compiled using a newer version of the triSYCL device compiler (after the
    commit: [OpenCL] Add generic AS to 'this' pointer).
    
    However, the buffer_of_objects_members.cpp test works with the
    sycl/release_70/master branch of the triSYCL device compiler.
    
    The buffer of objects tests are related to:
    https://github.com/triSYCL/triSYCL/issues/212
    
    The ctor_and_dtor tests are related to:
    https://github.com/triSYCL/triSYCL/issues/213

commit 75a16858d76f71faa4f0b7c1e63e0387e0f715f4
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Tue Dec 11 19:35:10 2018 -0800

    Modifications based on last set of requested changes
    
    opencl_spir_helpers.hpp:
    
    renamed gen_* functions to make_*
    
    renamed fill_id_or_range to make_array_with_func
    
    Added a static_assert in create_parallel_for_arg in the cases that it's
    being used incorrectly.
    
    opencl_spir_req.h:
    
    added reference to SPIR-Tools repository.
    
    changed size_t -> std::size_t and removed the std namespace.
    
    added \todo relating to cl::sycl::types
    
    moved the cstddef header

commit fb89e4575a1c9cda66a2b0c85413c16a42a1603b
Merge: 7f160f45 3a57d737
Author: agozillon <andrew.gozillon@uws.ac.uk>
Date:   Tue Dec 11 18:35:42 2018 -0800

    Merge pull request #224 from keryell/scopes
    
    First implementation of an extension to have storage scope for SYCL runtime objects

commit 6fdb22d7144785dc7d1ebccd7a6ed7ddae5b4e66
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Tue Dec 11 17:07:09 2018 -0800

    Remove commented out test code

commit 3375b751d9c2b42304df7664f6c76b6544579803
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Tue Dec 11 16:58:07 2018 -0800

    WIP parallel_for ND_RANGE_KERNEL review update
    
    A patch aimed at addressing all of the prior change requests in some
    way. Also adding two new device_compiler tests and a slight modification
    to id.hpp.
    
    Modifications:
    
    id.hpp: Added dimensionality as a public variable similarly to range.hpp
    as there appears to be no other simple way to introspect id's
    dimensionality without type deduction at the moment. Unsure if
    dimensionality was intentionally left out of id.hpp and if so it can be
    removed (I'd have to readjust opencl_spir_helpers.hpp, which shouldn't
    be an issue however).
    
    environment.rst, architecture.rst and macros.rst: Documentation
    updates/corrections
    
    handler.hpp: tidied in a previous commit and fixed some comments with
    this commit.
    
    opencl_spir_req.h: Added ifndef/define even if it's not needed it
    probably doesn't hurt to have it! Also removed the top-level data types
    and changed the email/name in the documentation to my own.
    
    opencl_spir_helpers.hpp: Changing the email/name. Removing old
    gen_spir_id/gen_spir_range and enum class then replacing with
    fill_id_or_range (is there a better name?) which takes a function and
    invokes it for each dimension of the id or range passing the dimension
    to the function. Modified the other functions accordingly and added an
    else if constexpr rather than two separate if constexpr's.
    
    parallelism.hpp: Addition of new template function capture_arg_v which
    captures the argument of a simple one-argument lambda passed in and
    returns an instance of the captured argument. It replaces what mem_fn was
    previously doing in parallel_for as the member functions we rely on will
    be deprecated soon (which weirdly looks like mem_fn is left as a
    redundant implementation of std::invoke). I also added comments
    describing its usage and why it's used in this context so that it's
    possible to read and understand the reasoning a little better.
    
    Makefile: Fixing indenting of several things I've added that are
    enclosed in ifdef's so that they suit the formatting that existed
    before. Adding the XILINX_SPECIFIC_PASS variable and appending
    -reqd-workgroup-size-t to it as suggested. Also removing a commented out
    piece of code that I think is no longer required.
    
    Added:
    
    parallel_for_ND_range.cpp & parallel_for_ranges.cpp: Two new tests. The
    latter checks that range data is correct on a basic level and the former
    checks that ND range kernels are executing in more than 1D and tests
    get_linear_id and the ranges further.
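
The fill_id_or_range idea described above (call a function once per dimension and collect the results) can be sketched as follows. This is an assumption-laden illustration, not the code in opencl_spir_helpers.hpp: `make_array_with_func_sketch`, `make_array_impl` and `fake_global_id` are hypothetical names, and `fake_global_id` merely stands in for a per-dimension query such as a SPIR intrinsic.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Expand the index pack into one call of f per dimension, in order.
template <std::size_t... Is, typename F>
constexpr std::array<std::size_t, sizeof...(Is)>
make_array_impl(F f, std::index_sequence<Is...>) {
  return { static_cast<std::size_t>(f(Is))... };
}

// Hypothetical sketch: build a Dims-element array by calling f(dimension)
// for dimension = 0 .. Dims - 1.
template <std::size_t Dims, typename F>
constexpr std::array<std::size_t, Dims> make_array_with_func_sketch(F f) {
  return make_array_impl(f, std::make_index_sequence<Dims>{});
}

// Stand-in for a per-dimension device query (hypothetical).
inline std::size_t fake_global_id(std::size_t dim) { return 10 * (dim + 1); }
```

For example, `make_array_with_func_sketch<3>(fake_global_id)` yields an array holding the results for dimensions 0, 1 and 2, which an id or range could then be constructed from.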

commit 4462cdebdb341feb7945c7b447b47276efd75870
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Mon Dec 10 17:24:26 2018 -0800

    amending a comment

commit 8e4884ffa08c6f62ea629cc13f1a0c2671408ba8
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Mon Dec 10 17:14:13 2018 -0800

    tidying handler.hpp

commit 0456e2739b6a32e52ff8e3007d1d615607016300
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Mon Dec 10 09:39:29 2018 -0800

    WIP parallel_for ND_RANGE_KERNEL support
    
    Adding support for OpenCL nd_range_kernel launches using parallel_for
    when TRISYCL_USE_OPENCL_ND_RANGE is specified. Related to issue:
    https://github.com/triSYCL/triSYCL/issues/146
    
    Modifications:
    
    parallelism.hpp: new parallel_for implementation that is conditionally
    compiled in place of the normal parallel_for when
    TRISYCL_USE_OPENCL_ND_RANGE is specified. Does mostly the same thing,
    gets the type of the kernel's argument (item, id, etc.). However, instead
    of forwarding it to another overloaded parallel_for to create a
    recursive call graph of the kernel that can be scheduled, it forwards it
    to create_parallel_for_arg, which generates an appropriate index class
    (for host or for device depending on the compile stage) and passes it to
    the kernel for invocation. This parallel_for function is invoked from
    the parallel for schedule kernel call.
    
    handler.hpp: Invokes the new parallel_for using a new schedule function
    called schedule_parallel_for_kernel which works similarly to
    schedule_single_task at the moment except that it queues up an nd_range
    kernel using the parallel_for invocation of the opencl_kernel
    implementation.
    
    architecture.rst, environment.rst, macros.rst: Adding text to explain
    how to activate the ND_RANGE_KERNEL through the Makefile and the define
    that you need to enable for it (TRISYCL_USE_OPENCL_ND_RANGE)
    
    Makefile: Searches for additional environment variable which defines
    TRISYCL_USE_OPENCL_ND_RANGE for the appropriate compile steps. And
    moving the -req-work-group-size-1 optimization step onto the XILINX_SDK
    compile path as it currently doesn't play nice with the OpenCL runtime
    (it's illegal in OpenCL to have a nullary specified work-group size for
    nd_range_kernel's and have this attribute specified).
    
    Added:
    
    opencl_spir_req.h: A minimalist file created to only include the OpenCL
    SPIR intrinsics that we currently use. Some of the intrinsics from the
    main file require certain OpenCL compiler flags to be flipped to fully
    work (the half type for example).
    
    opencl_spir_helpers.hpp: contains some helper functions that generate
    sycl index container information for opencl devices using spir
    intrinsics. Alongside a function for forwarding to the appropriate
    container generation function (used inside detail::parallel_for invoked
    in the scheduler).
    
    It's still a WIP and needs groups and nd_item added and it's possible
    that we may reduce the number of factory functions down to just id and
    range (perhaps item?) and invoke these from inside the index container
    classes.
    
    parallel_for_overloads.cpp: new test file that currently just checks if
    the appropriate data has been set in one of the kernel launches and that
    all kernels have been successfully launched. Extensions to the test will
    be added as the implementation becomes more fleshed out (e.g. testing
    different intrinsics like the get_global_size contained in ranges).

commit 3a57d737816dd9bae4cd7b488becda1b856645dd
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Dec 10 16:09:56 2018 -0800

    Fix scope example by removing some parallelism
    
    The example was wrong because a pipe cannot be written by different
    work-items.
    Replace the parallel_for by a single_task instead of diving into
    pipe reservation station for this contrived example.

commit db9a72de1a78e42a2d028584438723111dbe530f
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Dec 10 15:49:45 2018 -0800

    Fix typos and remove constexpr platform WIP
    
    This should address remarks from Andrew Gozillon in
    https://github.com/triSYCL/triSYCL/pull/224

commit ee880d46136940113d39d33ef032e91ddacce3e4
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Dec 7 16:31:06 2018 -0800

    Add CMake support for scope extension example

commit d9318fbbf26b0a8c23c3b3fb84d8b9044ed9be2c
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Dec 7 13:47:33 2018 -0800

    Remove example about compile-time platform extension from scope extension
    
    It will go in another branch.

commit 179b69967570353d8abffd2584eff7cbfb525c15
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Dec 7 13:40:11 2018 -0800

    Rename the test directory for the scope extension
    
    It started as a `constexpr` extension but now it is more scope-related.

commit 623fc382f87d181453388b9b3dfc918fafe0b819
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Dec 7 13:29:03 2018 -0800

    Clean scoped queue example

commit 15365ad28d331b009bdae9cda0e8cf2420d1d31a
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Dec 7 13:20:20 2018 -0800

    Clean scope example and debug implementation

commit ca98f317ffe662ce59681056570564cdcb827d69
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Dec 7 11:22:31 2018 -0800

    Create a separate scoped queue implementation and a global header for scopes

commit b7030bb68bcd35edc89ec112830c5ada49b48e64
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Dec 7 10:59:34 2018 -0800

    Add a type trait to compute the scope storage type of a SYCL class

commit 7f160f45f41ce70ec7d4a0628bad8a6164186432
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Tue Dec 4 16:30:57 2018 -0800

    Add links to TBB

commit fe29d31f3eda34003d0633022616e048acfeb6db
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Tue Dec 4 15:26:15 2018 -0800

    Update presentations related to SYCL

commit 963044ce45433f67894884e8dcdf27b7b48f2396
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Dec 3 18:53:39 2018 -0800

    Simplify naming of triSYCL scope storage extension
    
    Remove ``extension::`` from the namespace

commit 8c64762ac8085134d2be24bf3da826a4b7c356f5
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Dec 3 18:12:27 2018 -0800

    Implement a device with device-scoped scoped storage
    
    Make a cleaner implementation with value semantics outside of the example.
    Clean up also the platform.

commit 171fb6de02516c2f74d68a66d8683df2bd376c80
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Dec 3 15:54:33 2018 -0800

    Mock-up of SYCL_VENDOR_TRISYCL_EXTENSION_PLATFORM_SCOPE extension
    
    Fix example.
    Make a platform implementation outside of the example.

commit 621b4f485b430abe6219f0729f1c234282c7d41a
Merge: 32e896ff 4d05936e
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Mon Dec 3 12:37:31 2018 -0800

    Merge pull request #215 from agozillon/agozillon-test
    
    Refactored `parallel_for_workitem()` implementation with OpenMP.
    Add SIMD OpenMP execution when no barrier.
    Simplify existing tests and adding new tests.

commit 4d05936e3e8ab4f09f8f017f3e21a87a074445ed
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Tue Nov 13 16:31:46 2018 -0800

    Refactored `parallel_for_workitem()` implementation
    
    I replaced the older OpenMP modulo/division loop with two suggested newer
    implementations that use OpenMP parallel collapses and OpenMP SIMD
    operations to speed things up. There are now two separate
    implementations that use OpenMP. One that is active when
    TRISYCL_NO_BARRIER is defined and the other when it's not defined.
    The implementation when TRISYCL_NO_BARRIER is defined is a little
    less strict and makes use of SIMD operations.
    
    I made use of constexpr rather than template specializations to
    specialise the OpenMP loops for each dimension.
    
    Modifying this, however, makes the results non-deterministic enough that
    printing output and matching it with a regex is no longer a sufficient testing
    mechanism for hierarchical.cpp and hierarchical_new.cpp (previously
    TRISYCL_NO_BARRIER allowed it to default to a non-OpenMP implementation).
    A new implementation for these tests has been created that assigns values
    to buffers and checks the output is correct.
    
    Related to triSYCL issue: https://github.com/triSYCL/triSYCL/issues/43
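
The combination of `if constexpr` per dimension and an OpenMP `collapse` clause mentioned above can be sketched like this. The helper `for_each_index` is hypothetical and much simpler than `parallel_for_workitem()`: it covers only 1D and 2D, shows only the collapse variant (not the SIMD one), and assumes the callback is safe to run concurrently. Without an OpenMP-enabled compiler flag the pragmas are ignored and the loops run serially.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: call f(i, j) for every index of a 1D or 2D range.
// The loop nest is selected at compile time from the dimension count.
template <std::size_t Dims, typename F>
void for_each_index(const std::array<std::size_t, Dims>& r, F f) {
  static_assert(Dims == 1 || Dims == 2, "sketch covers 1D and 2D only");
  if constexpr (Dims == 1) {
    const std::size_t n0 = r[0];
#pragma omp parallel for
    for (std::size_t i = 0; i < n0; ++i)
      f(i, std::size_t{0});
  } else {
    const std::size_t n0 = r[0];
    const std::size_t n1 = r[1];
    // collapse(2) lets OpenMP distribute the whole 2D iteration space
    // instead of only the outer loop.
#pragma omp parallel for collapse(2)
    for (std::size_t i = 0; i < n0; ++i)
      for (std::size_t j = 0; j < n1; ++j)
        f(i, j);
  }
}
```

Because iteration order is unspecified once the loops run in parallel, a test in this style writes each result to its own buffer slot and checks the buffer afterwards rather than comparing printed output, which matches the testing change described in the commit message.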

commit 36fb424066531144c207620ae16a6ec8a886687d
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Nov 30 14:20:53 2018 -0800

    Add example of SYCL feature testing macros

commit 539f4a9831e86dbe40006e9b845c1a62c8e5dfcc
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri Nov 30 14:20:29 2018 -0800

    Add platform scope example

commit bda71ee5eec9b8660663a7ad73656b4b25c0bf21
Merge: 2315dc7e 32e896ff
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Fri Nov 30 10:07:24 2018 -0800

    Merge branch 'master' into slambench_modifications

commit 32e896ff919b58e9b966c465e9c1eb195bce95b5
Merge: 51b4273b 247c39ef
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Wed Nov 28 14:21:13 2018 -0800

    Merge pull request #223 from jeffamstutz/tbb_threading
    
    TBB as a CPU threading option

commit 247c39efb6e0de9db0a9908fec18a9b49ed272f9
Author: Jefferson Amstutz <jefferson.d.amstutz@intel.com>
Date:   Tue Nov 27 15:50:53 2018 -0600

    remove unused headers, cleanup comment style, address other PR feedback

commit f289e018c56c8166f17262000ad45930c841e67a
Author: Jefferson Amstutz <jefferson.d.amstutz@intel.com>
Date:   Tue Nov 27 15:50:15 2018 -0600

    add notes on TBB in documentation

commit 2268dc553752d20656894357cb700be011fb9a30
Author: Jefferson Amstutz <jefferson.d.amstutz@intel.com>
Date:   Tue Nov 27 15:37:23 2018 -0600

    use C++17's terse namespace syntax

commit 5c1fe77a5ee33f1c97bc719a104751389038cad7
Author: Jefferson Amstutz <jefferson.d.amstutz@intel.com>
Date:   Tue Nov 27 14:40:20 2018 -0600

    fix group parallel_for() wrapper function, use general parallelism header

commit f44934a6c4f568250b5476d9f97a19caf5f7c98e
Author: Jefferson Amstutz <jefferson.d.amstutz@intel.com>
Date:   Tue Nov 27 13:16:54 2018 -0600

    add TBB option for building tests, cleanup repetitive CMake commands

commit 7b0ae0c480c91c4da0a5b80ad4addc6e737e2a5f
Author: Jefferson Amstutz <jefferson.d.amstutz@intel.com>
Date:   Tue Nov 27 13:10:24 2018 -0600

    disable OpenMP code paths from compiling when disabled (quiets warnings)

commit 3a11fa6d1e1a972a6d4c12f66baa921ea9ce0577
Author: Jefferson Amstutz <jefferson.d.amstutz@intel.com>
Date:   Tue Nov 27 12:44:33 2018 -0600

    add tbb parallelism implementation (opt-in)

commit 2315dc7e677f270bcad883e480f9a4e4f805f354
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Mon Nov 26 10:35:26 2018 -0800

    Adding Some Vec Math Functions
    
    These were the functions required to get SYCL SLAMBench functioning,
    they're intended for use with the Vec class. They don't function exactly
    like Codeplay's implementation at the moment, these are a little stricter
    with the specification (Codeplay lets you pass in vectors for minval and
    maxval on clamp for instance, which doesn't appear to be in the SYCL
    spec at the moment. It could be a handy addition though.).

commit 51b4273b4a2e1ff69403a599b20fbc14724538ef
Merge: 98d778e8 7a84684d
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Wed Nov 21 10:36:40 2018 -0800

    Merge pull request #219 from agozillon/makefile_mod_sw+hw_emu
    
    Modifying the Makefile for Xilinx FPGA hw_emu/sw_emu working with latest SDx & XRT

commit 5073712989db96db6fb8a46ea0e3e9e03cd46825
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Tue Nov 20 15:57:05 2018 -0800

    Clarify compilation errors by using empty classes for device and platform scopes

commit 7ebecd77b700c1b54b6d132a7e6bbd2076017b94
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Tue Nov 20 15:35:59 2018 -0800

    Device & platform scopes need to be l-value
    
    It was wrongly an r-value before.
    Now it compiles with Clang++ again. :-)

commit 9fc3cb11289773d4f8a68cf2cb259c0419fd0f8b
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Tue Nov 20 15:19:24 2018 -0800

    Prototype of device & platform scopes
    
    Compiles with g++-8 but not clang++-8.

commit 7a84684dfdba660d27de51666cfd626ea4e18c80
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Mon Nov 19 19:29:07 2018 -0800

    Modifying the Makefile for hw_emu/sw_emu
    
    Modifying the previous Xilinx_SDK .bin generation step which output a PK
    binary file to a newer method that outputs an xclbin binary that the xrt
    runtime seems to accept for the moment for sw_emu and hw_emu.
    
    Also adding some missing/new clean steps to support the new generation
    steps.
    
    Adding environment variables used to change target device and emulation
    mode so that the makefile doesn't need to be constantly modified to change
    this. If unspecified they default to hw and xilinx_vcu1525_dynamic_5_1.
    
    Adding notes to environment.rst for new environment variables used to
    change target device and emulation mode.

commit f5de1217fae9171c7d5c656991150f6d0fb3ecb1
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Nov 19 15:46:53 2018 -0800

    Add range-based parallel_for to the conceptual queue

commit 09a03740bdb016e13f8b265c6c1bd9628e7317de
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Mon Nov 19 11:22:22 2018 -0800

    Add prototype of a more conceptual device and queue

commit 98d778e8756d7be6cdf359cd5a2d90237d7c212b
Merge: e3d29b5c 591847de
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Thu Nov 8 20:05:15 2018 -0800

    Merge pull request #209 from agozillon/agozillon-test
    
    Remove limitation on number of programs managed by triSYCL

commit 591847decfa4ce6ad0f81b3107ffadc6fbd4a6c9
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Thu Nov 8 15:52:11 2018 -0800

    Remove limitation on number of programs managed by triSYCL
    
    Removing the static storage for cl_programs when compiling using SDx for
    the Xilinx runtime.
    
    This relates to issue 145: https://github.com/triSYCL/triSYCL/issues/145

commit e3d29b5cf53640e38e37a7308a3618fd63aaa20e
Merge: 08ce59bb 16b1bcd1
Author: Ronan Keryell <ronan@keryell.fr>
Date:   Thu Nov 8 14:52:06 2018 -0800

    Merge pull request #208 from agozillon/agozillon-test
    
    Removal of buffer::set_final_data(shared_ptr_class<T> finalData): issue #18

commit 16b1bcd1de869494b2f35dc5ff733b113efe3209
Author: Andrew Gozillon <andrew.gozillon@yahoo.com>
Date:   Wed Nov 7 11:12:13 2018 -0800

    Removal of buffer::set_final_data(shared_ptr_class<T> finalData)
    
    I removed the set_final_data functions related to the
    shared_ptr_class/shared_ptr. However, to keep the implicit casting from
    shared_ptr to weak_ptr I had to make the set_final_data call for
    Iterators more specific using the std::iterator_traits class (the same
    method is used on line 271 of SYCL/buffer.hpp). Otherwise invocations
    using shared_ptr's would fall into the too generic Iterator call and
    cause a compile error (no iterator trait members).
    
    The issue link: https://github.com/triSYCL/triSYCL/issues/18
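
The overload-resolution point made above can be sketched as follows. `final_data_sink` and its `last_overload` member are hypothetical and exist only to show which overload is chosen; this is not the buffer.hpp code. The sketch assumes a C++17 standard library, where `std::iterator_traits` has no member types for non-iterator arguments, so the constrained template drops out of overload resolution for a `shared_ptr`.

```cpp
#include <cassert>
#include <iterator>
#include <memory>
#include <vector>

// Hypothetical class used only to observe which overload is selected.
struct final_data_sink {
  int last_overload = 0;

  // Overload taking a weak_ptr; a shared_ptr converts to it implicitly.
  void set_final_data(std::weak_ptr<int>) { last_overload = 1; }

  // Iterator overload, constrained through std::iterator_traits so that
  // types without iterator traits (such as shared_ptr) do not match it.
  template <typename Iterator,
            typename = typename std::iterator_traits<
                Iterator>::iterator_category>
  void set_final_data(Iterator) { last_overload = 2; }
};
```

Without the `iterator_traits` constraint, a `shared_ptr` argument would be an exact match for the unconstrained template and would never reach the `weak_ptr` overload.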

commit 4eae97e3215672c851a036ec5bdb533184a0b2fc
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Sat Sep 15 08:43:11 2018 +0200

    Add use-case for constexpr-like queue extension

commit d0541453a975a38c40a9a5e0048c5a44a81d2303
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Fri May 25 17:23:24 2018 -0700

    WIP on constexpr platform

commit 0359d8b6b1c1424008aebc7b8432a19064a828fe
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Wed Apr 18 14:16:54 2018 -0700

    Add a small constexpr platform architecture

commit 5bc111e687a4b4027a052e9c3590567f1a41175e
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Tue Apr 17 16:40:17 2018 -0700

    Basic implementation of a list of constexpr platform names

commit 2b52dd2586ffb8790030131ab9693f9f6a316534
Author: Ronan Keryell <ronan.keryell@xilinx.com>
Date:   Tue Apr 17 15:08:13 2018 -0700

    Start a mock-up of a constexpr extension of platform