Merge pull request #473 from ptycho/dev
Release 0.8
bjoernenders committed Feb 5, 2024
2 parents 4cce5ae + 1798bd1 commit e412764
Showing 209 changed files with 16,613 additions and 581 deletions.
16 changes: 5 additions & 11 deletions .github/workflows/test.yml
@@ -24,30 +24,24 @@ jobs:
max-parallel: 10
fail-fast: false
matrix:
python-version: ['3.7','3.8','3.9','3.10']
mpi: ['mpich', 'openmpi']
name: Testing with ${{ matrix.mpi }} and Python ${{ matrix.python-version }}
python-version: ['3.8','3.9','3.10', '3.11']
name: Testing with Python ${{ matrix.python-version }}
steps:
- name: Checkout
uses: actions/checkout@v3
- name: Set up MPI
uses: mpi4py/setup-mpi@v1
with:
mpi: ${{ matrix.mpi }}
uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Add conda to system path
run: |
# $CONDA is an environment variable pointing to the root of the miniconda directory
echo $CONDA/bin >> $GITHUB_PATH
conda update -n base conda
conda --version
- name: Install dependencies
run: |
# replace python version in core dependencies
sed -i 's/python=3.9/python=${{ matrix.python-version }}/' dependencies_core.yml
sed -i 's/python/python=${{ matrix.python-version }}/' dependencies_core.yml
conda env update --file dependencies_core.yml --name base
conda list
- name: Prepare ptypy
1 change: 1 addition & 0 deletions .gitignore
@@ -28,3 +28,4 @@ ghostdriver*
.DS_Store
.ipynb_checkpoints
.clang-format
pip-wheel-metadata/
30 changes: 12 additions & 18 deletions CONTRIB.rst
@@ -26,46 +26,40 @@ Please ensure you satisfy most of PEP8_ recommendations. We are not dogmatic abo
Testing
^^^^^^^

Not much testing exists at the time of writing this document, but we are aware that this is something that should change. If you want to contribute code, it would be very good practice to also submit related tests.
All tests are in the ``/test/`` folder and our CI pipeline runs these tests for every commit. Please note that tests which require GPUs are disabled in the CI pipeline. Make sure to supply tests for new code or drastic changes to the existing code base. Smaller commits or bug fixes don't require an extra test.
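
A minimal sketch of such a test, assuming the ``unittest`` style of the existing ``/test/`` folder (file, class and method names here are purely illustrative)::

    import unittest

    import numpy as np


    class MyFeatureTest(unittest.TestCase):
        # hypothetical test for a newly contributed helper
        def test_array_sum(self):
            data = np.ones((4, 4))
            self.assertAlmostEqual(data.sum(), 16.0)


    if __name__ == "__main__":
        unittest.main()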

Branches
^^^^^^^^

We follow the Gitflow (https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow) development model, where a development branch (``dev``) is merged into the master branch for every release. Individual features are developed on topic branches from the development branch and squash-merged back into it when the feature is mature.

The important permanent branches are:
- ``master``: the current cutting-edge but functional package.
- ``stable``: the latest release, recommended for production use.
- ``target``: target for a next release. This branch should stay up-to-date with ``master``, and contain planned updates that will break compatibility with the current version.
- other thematic and temporary branches will appear and disappear as new ideas are tried out and merged in.
- ``master``: (protected) the current release plus bugfixes / hotpatches.
- ``dev``: (protected) current branch for all developments. Features are branched off this branch and merged back into it upon completion.


Development cycle
^^^^^^^^^^^^^^^^^

There have been only two releases of the code up to now, so what we can tell about the *normal development cycle* for |ptypy| is rather limited. However, the plan is as follows:
- Normal development usually happens on thematic branches. These branches are merged back to master when it is clear that (1) the feature is sufficiently debugged and tested and (2) no current functionality will break.
- At regular interval admins will decide to freeze the development for a new stable release. During this period, development will be allowed only on feature branches but master will accept only bug fixes. Once the stable release is done, development will continue.
|ptypy| does not follow a rigid release schedule. Releases are prepared for major events or when a set of features has reached maturity.

- Normal development usually happens on thematic branches from the ``dev`` branch. These branches are merged back to ``dev`` when it is clear that (1) the feature is sufficiently debugged and tested and (2) no current functionality will break.
- For a release, the ``dev`` branch is merged back into ``master`` and that merge is tagged as a release.


3. Pull requests
----------------

Most likely you are a member of the |ptypy| team, which gives you access to the full repository but no right to commit changes directly. The proper way to contribute changes is via *pull requests*. You can read about how this is done in GitHub's `pull requests tutorial`_.

Pull requests can be made against one of the feature branches, or against ``target`` or ``master``. In the latter cases, if your changes are deemed a bit too substantial, the first thing we will do is create a feature branch for your commits, and we will let it live for a little while, making sure that it is all fine. We will then merge it onto ``master`` (or ``target``).

In principle bug fixes can be requested on the ``stable`` branch.

3. Direct commits
-----------------

If you are one of our power-users (or power-developers), you can be given rights to commit directly to |ptypy|. This makes things much simpler of course, but with great power comes great responsibility.
Pull requests shall be made against one of the feature branches, or against ``dev`` or ``master``. For PRs against ``master`` we will only accept bugfixes or smaller changes. Every other PR should be made against ``dev``. Your PR will be reviewed and discussed amongst the core developer team. The more you touch core libraries, the more scrutiny your PR will face. However, we created two folders in the main source folder where you have more freedom to try out things. For example, if you want to provide a new reconstruction engine, place it into the ``custom/`` folder. A new ``PtyScan`` subclass that prepares data from your experiment is best placed in the ``experiment/`` folder.
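
The sketch below shows roughly what such a subclass could look like; the exact ``PtyScan`` loading API should be checked against the documentation, and all names and paths here are illustrative only::

    import numpy as np
    from ptypy.core.data import PtyScan


    class MyBeamlineScan(PtyScan):
        """Hypothetical loader for frames stored as a numpy stack."""

        def __init__(self, pars=None, **kwargs):
            super().__init__(pars, **kwargs)
            self.frames = np.load("frames.npy")  # illustrative data source

        def load(self, indices):
            # PtyScan expects dicts keyed by frame index; empty dicts for
            # positions and weights fall back to the defaults
            raw = {i: self.frames[i] for i in indices}
            return raw, {}, {}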

To make sure that things are done cleanly, we encourage all the core developers to create thematic remote branches instead of committing always onto master. Merging these thematic branches will be done as a collective decision during one of the regular admin meetings.
If you develop a new feature on a topic branch, it is your responsibility to keep it current with the ``dev`` branch to avoid merge conflicts.


.. |ptypy| replace:: PtyPy


.. _PEP8: https://www.python.org/dev/peps/pep-0008/

.. _`pull requests tutorial`: https://help.github.com/articles/using-pull-requests/
.. _`pull requests tutorial`: https://help.github.com/articles/using-pull-requests/
1 change: 1 addition & 0 deletions archive/cuda_extension/engines/DM_gpu.py
@@ -57,6 +57,7 @@ class DMGpu(DMNpy):
default = 'linear'
type = str
help = Subpixel interpolation; 'fourier','linear' or None for no interpolation
choices = ['fourier','linear',None]
[update_object_first]
default = True
1 change: 1 addition & 0 deletions archive/cuda_extension/engines/DM_npy.py
@@ -55,6 +55,7 @@ class DMNpy(DM):
default = 'linear'
type = str
help = Subpixel interpolation; 'fourier','linear' or None for no interpolation
choices = ['fourier','linear',None]
[update_object_first]
default = True
2 changes: 1 addition & 1 deletion archive/cuda_extension/python/gpu_extension.pyx
@@ -153,7 +153,7 @@ def abs2(input):
cdef np.float32_t [:,:,::1] cout_3c
cdef np.float64_t [:,::1] cout_d2c
cdef np.float64_t [:,:,::1] cout_d3c
cdef int n = np.product(cin.shape)
cdef int n = np.prod(cin.shape)

cdef np.float32_t [:, ::1] cin_f2c
cdef np.complex64_t [:, ::1] cin_c2c
1 change: 1 addition & 0 deletions archive/engines/DM.py
@@ -55,6 +55,7 @@ class DM(PositionCorrectionEngine):
default = 'linear'
type = str
help = Subpixel interpolation; 'fourier','linear' or None for no interpolation
choices = ['fourier','linear',None]
[update_object_first]
default = True
1 change: 1 addition & 0 deletions benchmark/diamond_benchmarks/moonflower_scripts/i08.py
@@ -28,6 +28,7 @@
p.io.autoplot = u.Param(active=False)
p.io.interaction = u.Param()
p.io.interaction.server = u.Param(active=False)
p.io.benchmark = "all"

# max 200 frames (128x128px) of diffraction data
p.scans = u.Param()
1 change: 1 addition & 0 deletions benchmark/diamond_benchmarks/moonflower_scripts/i13.py
@@ -28,6 +28,7 @@
p.io.autoplot = u.Param(active=False)
p.io.interaction = u.Param()
p.io.interaction.server = u.Param(active=False)
p.io.benchmark = "all"

# max 200 frames (128x128px) of diffraction data
p.scans = u.Param()
1 change: 1 addition & 0 deletions benchmark/diamond_benchmarks/moonflower_scripts/i14_1.py
@@ -28,6 +28,7 @@
p.io.autoplot = u.Param(active=False)
p.io.interaction = u.Param()
p.io.interaction.server = u.Param(active=False)
p.io.benchmark = "all"

# max 200 frames (128x128px) of diffraction data
p.scans = u.Param()
1 change: 1 addition & 0 deletions benchmark/diamond_benchmarks/moonflower_scripts/i14_2.py
@@ -29,6 +29,7 @@
p.io.autoplot = u.Param(active=False)
p.io.interaction = u.Param()
p.io.interaction.server = u.Param(active=False)
p.io.benchmark = "all"

# max 200 frames (128x128px) of diffraction data
p.scans = u.Param()
4 changes: 2 additions & 2 deletions benchmark/mpi_allreduce_speed.py
@@ -11,7 +11,7 @@
}

def run_benchmark(shape):
megabytes = np.product(shape) * 8 / 1024 / 1024 * 2
megabytes = np.prod(shape) * 8 / 1024 / 1024 * 2

data = np.zeros(shape, dtype=np.complex64)

@@ -39,4 +39,4 @@ def run_benchmark(shape):
print('Final results for {} processes'.format(parallel.size))
print(','.join(['Name', 'Duration', 'MB', 'MB/s']))
for r in res:
print(','.join([str(x) for x in r]))
print(','.join([str(x) for x in r]))
2 changes: 1 addition & 1 deletion cufft/dependencies.yml
@@ -2,7 +2,7 @@ name: ptypy_cufft
channels:
- conda-forge
dependencies:
- python=3.9
- python
- cmake>=3.8.0
- pybind11
- compilers
2 changes: 1 addition & 1 deletion cufft/extensions.py
@@ -4,7 +4,6 @@
import os, re
import subprocess
import sysconfig
import pybind11
from distutils.unixccompiler import UnixCCompiler
from distutils.command.build_ext import build_ext

@@ -98,6 +97,7 @@ def __init__(self, *args, **kwargs):
self.LD_FLAGS = [archflag, "-lcufft_static", "-lculibos", "-ldl", "-lrt", "-lpthread", "-cudart shared"]
self.NVCC_FLAGS = ["-dc", archflag]
self.CXXFLAGS = ['"-fPIC"']
import pybind11
pybind_includes = [pybind11.get_include(), sysconfig.get_path('include')]
INCLUDES = pybind_includes + [self.CUDA['lib64'], module_dir]
self.INCLUDES = ["-I%s" % ix for ix in INCLUDES]
1 change: 1 addition & 0 deletions cufft/setup.py
@@ -39,6 +39,7 @@
description='Extension of CuFFT to include pre- and post-filters using callbacks',
packages=package_list,
ext_modules=ext_modules,
install_requires=["pybind11"],
cmdclass=cmdclass
)

2 changes: 1 addition & 1 deletion dependencies_core.yml
@@ -1,6 +1,6 @@
name: ptypy_core
dependencies:
- python=3.9
- python
- numpy
- scipy
- h5py
2 changes: 1 addition & 1 deletion dependencies_dev.yml
@@ -2,7 +2,7 @@ name: ptypy_full
channels:
- conda-forge
dependencies:
- python=3.9
- python
- numpy
- scipy
- matplotlib
2 changes: 1 addition & 1 deletion dependencies_full.yml
@@ -2,7 +2,7 @@ name: ptypy_full
channels:
- conda-forge
dependencies:
- python=3.9
- python
- numpy
- scipy
- matplotlib
7 changes: 6 additions & 1 deletion ptypy/__init__.py
@@ -78,11 +78,16 @@

# Convenience loader for GPU engines
def load_gpu_engines(arch='cuda'):
if arch=='cuda':
if arch in ['cuda', 'pycuda']:
from .accelerate.cuda_pycuda.engines import projectional_pycuda
from .accelerate.cuda_pycuda.engines import projectional_pycuda_stream
from .accelerate.cuda_pycuda.engines import stochastic
from .accelerate.cuda_pycuda.engines import ML_pycuda
if arch in ['cuda', 'cupy']:
from .accelerate.cuda_cupy.engines import projectional_cupy
from .accelerate.cuda_cupy.engines import projectional_cupy_stream
from .accelerate.cuda_cupy.engines import stochastic
from .accelerate.cuda_cupy.engines import ML_cupy
if arch=='serial':
from .accelerate.base.engines import projectional_serial
from .accelerate.base.engines import projectional_serial_stream
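
With the change above, ``arch='cuda'`` loads both the PyCUDA and CuPy engine variants, while the new values ``'pycuda'`` and ``'cupy'`` select a single backend. A minimal usage sketch (which engines actually import depends on the locally installed GPU stack):

    import ptypy

    # register only the CuPy-based engines; 'pycuda' or 'cuda' work analogously
    ptypy.load_gpu_engines(arch='cupy')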
1 change: 1 addition & 0 deletions ptypy/accelerate/base/engines/ML_serial.py
@@ -348,6 +348,7 @@ def engine_finalize(self):
prep = self.diff_info[d.ID]
float_intens_coeff[label] = prep.float_intens_coeff
self.ptycho.runtime["float_intens"] = parallel.gather_dict(float_intens_coeff)
super().engine_finalize()


class BaseModelSerial(BaseModel):
52 changes: 43 additions & 9 deletions ptypy/accelerate/base/kernels.py
@@ -62,7 +62,7 @@ def fourier_error(self, b_aux, addr, mag, mask, mask_sum):

## Actual math ##

# build model from complex fourier magnitudes, summing up
# build model from complex fourier magnitudes, summing up
# all modes incoherently
tf = aux.reshape(maxz, self.nmodes, sh[1], sh[2])
af = np.sqrt((np.abs(tf) ** 2).sum(1))
@@ -86,7 +86,7 @@ def fourier_deviation(self, b_aux, addr, mag):

## Actual math ##

# build model from complex fourier magnitudes, summing up
# build model from complex fourier magnitudes, summing up
# all modes incoherently
tf = aux.reshape(maxz, self.nmodes, sh[1], sh[2])
af = np.sqrt((np.abs(tf) ** 2).sum(1))
@@ -109,7 +109,7 @@ def error_reduce(self, addr, err_sum):
## Actual math ##

# Reduces the Fourier error along the last 2 dimensions.
#err_sum[:] = ferr.astype(np.double).sum(-1).sum(-1).astype(np.float)
#err_sum[:] = ferr.astype(np.double).sum(-1).sum(-1).astype(float)
err_sum[:] = ferr.sum(-1).sum(-1)
return

@@ -136,12 +136,12 @@ def fmag_all_update(self, b_aux, addr, mag, mask, err_sum, pbound=0.0):

## As opposed to DM we use renorm to differentiate the cases.

# pbound >= g_err_sum
# pbound >= g_err_sum
# fm = 1.0 (as renorm = 1, i.e. renorm[~ind])
# pbound < g_err_sum :
# fm = (1 - g_mask) + g_mask * (g_mag + fdev * renorm) / (af + 1e-10)
# fm = (1 - g_mask) + g_mask * (g_mag + fdev * renorm) / (af + 1e-10)
# (as renorm in [0,1])
# pbound == 0.0
# pbound == 0.0
# fm = (1 - g_mask) + g_mask * g_mag / (af + 1e-10) (as renorm=0)

ind = err_sum > pbound
@@ -192,7 +192,7 @@ def log_likelihood(self, b_aux, addr, mag, mask, err_phot):
# batch buffers
aux = b_aux[:maxz * self.nmodes]

# build model from complex fourier magnitudes, summing up
# build model from complex fourier magnitudes, summing up
# all modes incoherently
tf = aux.reshape(maxz, self.nmodes, sh[1], sh[2])
LL = (np.abs(tf) ** 2).sum(1)
@@ -516,7 +516,7 @@ def _build_exit_alpha_tau(self, b_aux, addr, ob, pr, ex, alpha=1, tau=1):
ex[exc[0], exc[1]:exc[1] + rows, exc[2]:exc[2] + cols] + \
(1 - tau * (1 + alpha)) * \
ob[obc[0], obc[1]:obc[1] + rows, obc[2]:obc[2] + cols] * \
pr[prc[0], prc[1]:prc[1] + rows, prc[2]:prc[2] + cols]
pr[prc[0], prc[1]:prc[1] + rows, prc[2]:prc[2] + cols]

ex[exc[0], exc[1]:exc[1] + rows, exc[2]:exc[2] + cols] += dex
aux[ind, :, :] = dex
@@ -660,6 +660,40 @@ def pr_norm_local(self, addr, pr, prn):
pr[prc[0], prc[1]:prc[1] + rows, prc[2]:prc[2] + cols]).real
return

def ob_update_wasp(self, addr, ob, pr, ex, aux, ob_sum_nmr, ob_sum_dnm, alpha=1):
sh = addr.shape
flat_addr = addr.reshape(sh[0] * sh[1], sh[2], sh[3])
rows, cols = ex.shape[-2:]

for ind, (prc, obc, exc, mac, dic) in enumerate(flat_addr):
pr_conj = pr[prc[0], prc[1]:prc[1] + rows, prc[2]:prc[2] + cols].conj()
pr_abs2 = abs2(pr[prc[0], prc[1]:prc[1] + rows, prc[2]:prc[2] + cols])
deltaEW = ex[exc[0], exc[1]:exc[1] + rows, exc[2]:exc[2] + cols] - aux[ind, :, :]

ob[obc[0], obc[1]:obc[1] + rows, obc[2]:obc[2] + cols] += 0.5 * pr_conj * deltaEW / (pr_abs2.mean() * alpha + pr_abs2)

ob_sum_nmr[obc[0], obc[1]:obc[1] + rows, obc[2]:obc[2] + cols] += pr_conj * ex[exc[0], exc[1]:exc[1] + rows, exc[2]:exc[2] + cols]
ob_sum_dnm[obc[0], obc[1]:obc[1] + rows, obc[2]:obc[2] + cols] += pr_abs2

def pr_update_wasp(self, addr, pr, ob, ex, aux, pr_sum_nmr, pr_sum_dnm, beta=1):
sh = addr.shape
flat_addr = addr.reshape(sh[0] * sh[1], sh[2], sh[3])
rows, cols = ex.shape[-2:]

for ind, (prc, obc, exc, mac, dic) in enumerate(flat_addr):
ob_conj = ob[obc[0], obc[1]:obc[1] + rows, obc[2]:obc[2] + cols].conj()
ob_abs2 = abs2(ob[obc[0], obc[1]:obc[1] + rows, obc[2]:obc[2] + cols])
deltaEW = ex[exc[0], exc[1]:exc[1] + rows, exc[2]:exc[2] + cols] - aux[ind, :, :]

pr[prc[0], prc[1]:prc[1] + rows, prc[2]:prc[2] + cols] += ob_conj * deltaEW / (beta + ob_abs2)

pr_sum_nmr[prc[0], prc[1]:prc[1] + rows, prc[2]:prc[2] + cols] += ob_conj * ex[exc[0], exc[1]:exc[1] + rows, exc[2]:exc[2] + cols]
pr_sum_dnm[prc[0], prc[1]:prc[1] + rows, prc[2]:prc[2] + cols] += ob_abs2

def avg_wasp(self, arr, nmr, dnm):
is_zero = np.isclose(dnm, 0)
arr[:] = np.where(is_zero, nmr, nmr / dnm)


class PositionCorrectionKernel(BaseKernel):
from ptypy.accelerate.base import address_manglers
@@ -775,7 +809,7 @@ def log_likelihood(self, b_aux, addr, mag, mask, err_sum):
# batch buffers
aux = b_aux[:maxz * self.nmodes]

# build model from complex fourier magnitudes, summing up
# build model from complex fourier magnitudes, summing up
# all modes incoherently
tf = aux.reshape(maxz, self.nmodes, sh[1], sh[2])
LL = (np.abs(tf) ** 2).sum(1)
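
The ``avg_wasp`` helper added above guards its normalisation with ``np.where`` so that pixels whose denominator received no contributions keep the raw numerator. A standalone numpy sketch of that behaviour, using toy arrays:

    import numpy as np

    nmr = np.array([2.0, 3.0, 4.0])
    dnm = np.array([2.0, 0.0, 4.0])
    arr = np.empty_like(nmr)

    # where dnm is ~0, the numerator is passed through unchanged;
    # errstate suppresses the divide-by-zero warning nmr / dnm would emit
    with np.errstate(divide="ignore"):
        arr[:] = np.where(np.isclose(dnm, 0), nmr, nmr / dnm)
    print(arr)  # [1. 3. 1.]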
File renamed without changes.
@@ -5,8 +5,7 @@
* - OUT_TYPE: can be float/double
*/

#include <thrust/complex.h>
using thrust::complex;
#include "common.cuh"

extern "C" __global__ void abs2sum(const IN_TYPE* a,
const int n,
