Releases: OpenMathLib/OpenBLAS

OpenBLAS 0.2.18 version

12 Apr 19:34

Version 0.2.18
12-Apr-2016

common:

  • If MAKE_NB_JOBS is set to a value less than or equal to zero, make runs without -j.
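
As a rough illustration of the rule above, the following shell sketch derives the -j argument from a job count. It is not the actual OpenBLAS Makefile logic, and the helper name jobs_flag is hypothetical.

```shell
# Hypothetical helper: mirrors the documented rule, not OpenBLAS's Makefile code.
# Prints "-j N" when N is a positive integer, and nothing otherwise.
jobs_flag() {
    nb_jobs="$1"
    if [ "$nb_jobs" -gt 0 ] 2>/dev/null; then
        printf '%s\n' "-j $nb_jobs"
    fi
}

jobs_flag 4     # prints "-j 4"
jobs_flag 0     # prints nothing, so make would run without -j
jobs_flag -1    # prints nothing
```

In a real build the option is passed on the make command line, e.g. make MAKE_NB_JOBS=0 to run make without -j.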

x86/x86_64:

ARM:

  • Provide DGEMM 8x4 kernel for Cortex-A57 (Thanks, Ashwin Sekhar T K)

POWER:

  • Optimize S and C BLAS3 on Power8
  • Optimize BLAS2/1 on Power8

md5sum
4ca49eb1c45b3ca82a0034ed3cc2cef1 OpenBLAS-0.2.18.zip
805e7f660877d588ea7e3792cda2ee65 OpenBLAS-0.2.18.tar.gz
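
The checksums listed for each release can be compared against a downloaded archive. The sketch below assumes the GNU coreutils md5sum tool is available and that the archive sits in the current directory; the helper name verify_md5 is hypothetical.

```shell
# Hypothetical helper: succeeds only if the file's MD5 digest equals the expected one.
# Assumes GNU coreutils md5sum is installed.
verify_md5() {
    expected="$1"
    file="$2"
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Check the 0.2.18 tarball if it has been downloaded to the current directory.
if [ -f OpenBLAS-0.2.18.tar.gz ]; then
    if verify_md5 805e7f660877d588ea7e3792cda2ee65 OpenBLAS-0.2.18.tar.gz; then
        echo "checksum OK"
    else
        echo "checksum MISMATCH"
    fi
fi
```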

OpenBLAS 0.2.17 version

21 Mar 00:55

Version 0.2.17
20-Mar-2016

common:

  • Enable BUILD_LAPACK_DEPRECATED=1 by default.

md5sum
5f04c53183a5bc785b9900eb9bab44ac /tmp/OpenBLAS-0.2.17.zip
664a12807f2a2a7cda4781e3ab2ae0e1 /tmp/OpenBLAS-0.2.17.tar.gz


OpenBLAS 0.2.16 version

15 Mar 19:03

Version 0.2.16
15-Mar-2016

common:

  • Upgrade LAPACK to 3.6.0 version.
    Add BUILD_LAPACK_DEPRECATED option in Makefile.rule to build
    LAPACK deprecated functions.
  • Add MAKE_NB_JOBS option in Makefile.
    It forces the number of make jobs. This is particularly
    useful when using distcc. (#735. Thanks, Jerome Robert.)
  • Redesign unit test. Run unit/regression test at every build (Travis-CI and Appveyor).
  • Disable multi-threading for small size swap and ger. (#744. Thanks, Jerome Robert)
  • Improve small zger, zgemv, ztrmv using stack allocation (#727. Thanks, Jerome Robert)
  • Let openblas_get_num_threads return the number of active threads.
    (#760. Thanks, Jerome Robert)
  • Support illumos(OmniOS). (#749. Thanks, Lauri Tirkkonen)
  • Fix LAPACK Dormbr, Dormlq bug. (#711, #713. Thanks, Brendan Tracey)
  • Update scipy benchmark script. (#745. Thanks, John Kirkham)
  • Avoid potential getenv segfault. (#716)
  • Import LAPACK svn bugfix #142-#147,#150-#155

x86/x86_64:

  • Optimize trsm kernels for AMD Bulldozer, Piledriver, Steamroller.
  • Detect Intel Avoton.
  • Detect AMD Trinity, Richland, E2-3200.
  • Fix gemv performance bug on Mac OSX Intel Haswell.
  • Fix some bugs with CMake and Visual Studio
  • Optimize c/zgemv for AMD Bulldozer, Piledriver, Steamroller
  • Fix bug with scipy linalg test.

ARM:

  • Support and optimize Cortex-A57 AArch64.
    (#686. Thanks, Ashwin Sekhar T K)
  • Fix Android build on ARMV7 (#778. Thanks, Paul Mustiere)
  • Update ARMV6 kernels.
  • Improve DGEMM for ARM Cortex-A57. (Thanks, Ashwin Sekhar T K)

POWER:

  • Fix detection of POWER architecture
    (#684. Thanks, Sebastien Villemot)
  • Optimize D and Z BLAS3 functions for Power8.

md5sum
8fae7cebfefa073c8640e99c4454dc03 OpenBLAS-0.2.16.zip
fef46ab92463bdbb1479dcec594ef6dc OpenBLAS-0.2.16.tar.gz


OpenBLAS 0.2.15 version

27 Oct 20:51

Version 0.2.15
27-Oct-2015

common:

  • Support cmake on x86/x86-64, including native compilation with MS Visual Studio.
    (Experimental. Thanks to Hank Anderson for the initial cmake porting work.)

      On Linux and Mac OSX, OpenBLAS cmake supports assembly kernels.
      e.g. cmake .
           make
           make test (Optional)
    
      On Windows MS Visual Studio, OpenBLAS cmake only supports C kernels.
      (OpenBLAS uses AT&T style assembly, which is not supported by MSVC.)
      e.g. cmake -G "Visual Studio 12 Win64" .
           Open OpenBLAS.sln and build.
    
  • Enable MAX_STACK_ALLOC flags by default.
    Improve ger and gemv for small matrices.

  • Improve parallel gemv for the small m, large n case.

  • Improve ?imatcopy when lda==ldb (#633. Thanks, Martin Koehler)

  • Add vecLib benchmarks (#565. Thanks, Andreas Noack.)

  • Fix LAPACK lantr for row major matrices (#634. Thanks, Dan Kortschak)

  • Fix LAPACKE lansy (#640. Thanks, Dan Kortschak)

  • Import bug fixes for LAPACKE s/dormlq, c/zunmlq

  • Raise the signal when pthread_create fails (#668. Thanks, James K. Lowden)

  • Remove g77 from compiler list.

  • Enable AppVeyor Windows CI.

x86/x86-64:

  • Support pure C generic kernels for x86/x86-64.
  • Support Intel Broadwell and Skylake by Haswell kernels.
  • Support AMD Excavator by Steamroller kernels.
  • Optimize s/d/c/zdot for Intel SandyBridge and Haswell.
  • Optimize s/d/c/zdot for AMD Piledriver and Steamroller.
  • Optimize s/d/c/zaxpy for Intel SandyBridge and Haswell.
  • Optimize s/d/c/zaxpy for AMD Piledriver and Steamroller.
  • Optimize d/c/zscal for Intel Haswell, dscal for Intel SandyBridge.
  • Optimize d/c/zscal for AMD Bulldozer, Piledriver and Steamroller.
  • Optimize s/dger for Intel SandyBridge.
  • Optimize s/dsymv for Intel SandyBridge.
  • Optimize ssymv for Intel Haswell.
  • Optimize dgemv for Intel Nehalem and Haswell.
  • Optimize dtrmm for Intel Haswell.

ARM:

  • Support Android NDK armeabi-v7a-hard ABI (-mfloat-abi=hard)

      e.g. make HOSTCC=gcc CC=arm-linux-androideabi-gcc NO_LAPACK=1 TARGET=ARMV7
    
  • Fix lock, rpcc bugs (#616, #617. Thanks, Grazvydas Ignotas)

POWER:

  • Support ppc64le platform (ELF ABI v2. #612. Thanks, Matthew Brandyberry.)
  • Support POWER7/8 by POWER6 kernels. (#612. Thanks, Fábio Perez.)

md5sum
c9c181a981897e1cb0192f8589130a6c OpenBLAS-0.2.15.zip
b1190f3d3471685f17cfd1ec1d252ac9 OpenBLAS-0.2.15.tar.gz

OpenBLAS 0.2.14 Version

24 Mar 20:12

Version 0.2.14
24-Mar-2015

common:

  • Improve OpenBLASConfig.cmake. (#474, #475. Thanks, xantares.)
  • Improve ger and gemv for small matrices by stack allocation.
    e.g. make -DMAX_STACK_ALLOC=2048 (#482. Thanks, Jerome Robert.)
  • Introduce openblas_get_num_threads and openblas_get_num_procs. (#497. Thanks, Erik Schnetter.)
  • Add ATLAS-style ?geadd function. (#509. Thanks, Martin Köhler.)
  • Fix c/zsyr bug with negative incx. (#492.)
  • Fix race condition during shutdown causing a crash in gotoblas_set_affinity(). (#508. Thanks, Ton van den Heuvel.)

x86/x86-64:

  • Support AMD Steamroller.

ARM:

  • Add Cortex-A9 and Cortex-A15 targets.

md5sum
a57e197a34ba8b651347e6453961baab OpenBLAS-0.2.14.zip
53cda7f420e1ba0ea55de536b24c9701 OpenBLAS-0.2.14.tar.gz

OpenBLAS 0.2.13 version

03 Dec 15:18

Version 0.2.13
3-Dec-2014

common:

  • Add SYMBOLPREFIX and SYMBOLSUFFIX makefile options
    for adding a prefix or suffix to all exported symbol names
    in the shared library. (#459, Thanks Tony Kelman)
  • Provide OpenBLASConfig.cmake at installation.
  • Fix Fortran compiler detection on FreeBSD.
    (#470, Thanks Mike Nolta)
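
To make the effect of SYMBOLPREFIX and SYMBOLSUFFIX concrete, the sketch below shows how an exported name would be decorated. It assumes the prefix and suffix wrap the complete symbol name (so dgemm_ with suffix 64_ becomes dgemm_64_); the helper decorate_symbol and the prefix my_ are hypothetical and only illustrate the naming.

```shell
# Hypothetical helper: shows the expected shape of a decorated symbol name.
# Assumption: prefix and suffix are applied around the full exported name.
decorate_symbol() {
    prefix="$1"
    name="$2"
    suffix="$3"
    printf '%s%s%s\n' "$prefix" "$name" "$suffix"
}

decorate_symbol "" dgemm_ 64_         # prints "dgemm_64_"
decorate_symbol my_ cblas_dgemm ""    # prints "my_cblas_dgemm"

# Corresponding build invocations would look like:
#   make SYMBOLSUFFIX=64_
#   make SYMBOLPREFIX=my_
```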

x86/x86-64:

  • Add generic kernel files for x86-64. make TARGET=GENERIC
  • Fix a bug in the sgemm kernel on Intel Sandy Bridge.
  • Fix c_check bug on some amd64 systems. (#471, Thanks Mike Nolta)

ARM:

  • Support APM's X-Gene 1 AArch64 processors.
    Optimize trmm and sgemm. (#465, Thanks Dave Nuechterlein)

md5sum
8147e68c9e3e294a1a26280050bcf6a2 OpenBLAS-0.2.13.zip
bba7b37b5a5b6674fda91dcb2faab145 OpenBLAS-0.2.13.tar.gz

OpenBLAS 0.2.12 version

13 Oct 09:12

Version 0.2.12
13-Oct-2014

common:

  • Added CBLAS interface for ?omatcopy and ?imatcopy.
  • Enabled ?gemm3m functions.
  • Added benchmark for ?gemm3m.
  • Optimized multithreading lower limits.
  • Disabled SYMM3M and HEMM3M functions because of segmentation violations.

x86/x86-64:

  • Improved axpy and symv performance on AMD Bulldozer.
  • Improved gemv performance on modern Intel and AMD CPUs.

md5sum
4889408c500aa7fa818d92c0c71d9098 /tmp/OpenBLAS-0.2.12.zip
5df7f175b2db6a2b02ca4ff932e39bc7 /tmp/OpenBLAS-0.2.12.tar.gz

OpenBLAS 0.2.11 version

26 Aug 08:19

OpenBLAS ChangeLog

Version 0.2.11
18-Aug-2014

common:

  • Added some benchmark codes.
  • Fix link error on Linux/musl. (Thanks Isaac Dunham)

x86/x86-64:

  • Improved s/c/zgemm performance for Intel Haswell.
  • Improved s/d/c/zgemv performance.
  • Support large NUMA machines. (EXPERIMENTAL)

ARM:

  • Fix detection when cpuinfo uses "Processor". (Thanks Isaiah)

md5sum
946434ece1d7a12ba938902665b47434 OpenBLAS-0.2.11.zip
c456f3c5e84c3ab69ef89b22e616627a OpenBLAS-0.2.11.tar.gz

OpenBLAS 0.2.10 version

17 Jul 07:23

OpenBLAS ChangeLog

Version 0.2.10
16-Jul-2014

common:

  • Added the following BLAS extensions:
    s/d/c/zaxpby, s/d/c/zimatcopy, s/d/c/zomatcopy.
  • Added OPENBLAS_CORETYPE environment for dynamic_arch. (a86d349)
  • Added NO_AVX2 flag for old binutils. (#401)
  • Support outputting the CPU core name at runtime. (#407)
  • Patched LAPACK to fix bug 114, 117, 118.
    (http://www.netlib.org/lapack/bug_list.html)
  • Disabled ?gemm3m for a work-around fix. (#400)
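
The OPENBLAS_CORETYPE variable above, together with OPENBLAS_VERBOSE from 0.2.9, is set in the environment of the program that loads OpenBLAS. The sketch below is a minimal wrapper under several assumptions: the library was built with dynamic_arch support, Haswell is an accepted core name, and verbosity level 2 is the level that reports the selected core. The wrapper name run_with_core is hypothetical, and the demo command only echoes the variable rather than running a real BLAS program.

```shell
# Hypothetical wrapper: runs a command with a forced OpenBLAS core type.
# Assumptions: dynamic_arch build, and OPENBLAS_VERBOSE=2 reports the chosen core.
run_with_core() {
    core="$1"
    shift
    OPENBLAS_VERBOSE=2 OPENBLAS_CORETYPE="$core" "$@"
}

# Demo: the child process sees the variable. Replace the sh command with a
# program linked against OpenBLAS to apply the setting for real.
run_with_core Haswell sh -c 'echo "$OPENBLAS_CORETYPE"'    # prints "Haswell"
```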

x86/x86-64:

ARM:

  • Improved LAPACK testing.

md5sum
0bafb7fe70769bb00071d7170737e66d OpenBLAS-0.2.10.zip
eb3ff035734ce5f4e1ff9d3e5c7ac734 OpenBLAS-0.2.10.tar.gz

OpenBLAS 0.2.9 version

10 Jun 14:02

OpenBLAS Changelog

Version 0.2.9
10-Jun-2014
common:

Version 0.2.9.rc2
06-Mar-2014
common:

  • Added OPENBLAS_VERBOSE environment variable. (#338)
  • Make OpenBLAS thread-pool resilient to fork via pthread_atfork. (#294, Thanks, Olivier Grisel)
  • Rewrote rotmg
  • Fixed sdsdot bug.

x86/x86-64:

  • Detect Intel Haswell for new MacBook.

Version 0.2.9.rc1
13-Jan-2014
common:

  • Update LAPACK to 3.5.0 version
  • Fixed compatibility issues with Clang and Pathscale compilers.

x86/x86-64:

  • Optimization on Intel Haswell.
  • Enable optimization kernels on AMD Bulldozer and Piledriver.

ARM:

  • Support ARMv6 and ARMv7 ISA.
  • Optimization on ARM Cortex-A9.

md5sum
84dd50ec05e6f6ebdcdd578a09b146ef OpenBLAS-0.2.9.zip
395052b930ec553a5cd6207dadaaef4a OpenBLAS-0.2.9.tar.gz