
Hidet v0.3.1

Released by @yaoyaoding on 03 Apr 15:21 · commit 33d8bdd

What's Changed

  • [Version] Bump version to v0.3.1.dev by @yaoyaoding in #361
  • [Option] Add an option to disable imperative execution by @serach24 in #362
  • [Graph][Benchmark] Update benchmark function by @Aalanli in #363
  • [Compile Server] Update deps for compilation server by @xinli-git in #365
  • [Utils] Changed the multiprocessing context by @destefy in #367
  • [Dynamo] Refactoring code for Hidet remote compilation by @destefy in #369
  • [Graph][Dynamo Backend] Lshift/Rshift/Mod by @Aalanli in #371
  • [Graph][Operator] Fix reduce bug, add uint8x4 by @Aalanli in #372
  • [CompiledGraph] Add option to store the dispatch table by @destefy in #377
  • [Graph][Tensor] Remove unnecessary synchronization by @xiaocenxiaocen in #374
  • [Graph][Dynamo Backend] Minor imperative run bug fix by @Aalanli in #383
  • [Graph] Fix CompiledGraph aliasing bug by @Aalanli in #384
  • [Frontend] Add mapping for torch.sqrt by @yaoyaoding in #387 (see the PyTorch frontend sketch after this list)
  • [Fix][Graph] Write compiled graph to tempfile first by @destefy in #392
  • [Operators] Improving fp32 matrix multiplication on x86 CPUs by @BolinSNLHM in #378
  • [Fixbug] Fix a bug related to c/c++ integer promotion by @yaoyaoding in #391
  • [Option] Add option to set class Var id attribute to 0 by default by @destefy in #393
  • [CI] Add CI workflow and scripts by @hjjq in #394
  • [CI] Fix deadlock by @hjjq in #395
  • [Operator] Enhancements to Reduce by @hjjq in #366
  • [CI] Launch and stop compile server via workflow by @hjjq in #396
  • [Operator] Support advanced options for pooling operators by @yaoyaoding in #399
  • [Torch] Implements torch_func protocol by @yaoyaoding in #400
  • [Docs] Add more documentation by @yaoyaoding in #401
  • [Fixbug] Fix a performance bug in auto-scheduler by @yaoyaoding in #402
  • [Library] Add cublas library by @yaoyaoding in #404
  • [Operator] Add hidet.ops.matmul_cublas operator by @yaoyaoding in #405 (see the cuBLAS sketch after this list)
  • [Fusion] Allow shallow fusion of cublas operator by @yaoyaoding in #407
  • [CI] Clear op cache by @hjjq in #406
  • [Runtime] Add a new compiled format CompiledApp by @yaoyaoding in #408
  • CPU AVX implementation for Softmax, Norm by @fishingguy456 in #357
  • [CI] Reduce scope of secrets by @hjjq in #413
  • [Operator] Add an opaque operator base class by @yaoyaoding in #414
  • [IR] Support inplace operators by @yaoyaoding in #416
  • [Graph][Quantization] Multi-stage software pipelining and updated parallel-k rule by @Aalanli in #364
  • [CI] Trigger workflow by @hjjq in #417
  • [Scheduler] Add the fused task name to auto-scheduled kernels by @yaoyaoding in #418
  • [CI] Use cudagraph for benchmarks by @hjjq in #419
  • [CI] Remove unnecessary synchronization by @hjjq in #420
  • Update Netron viewer link by @KTong821 in #421
  • [Operator] Add cublas to matmul tune space by @hjjq in #422
  • [IR] Support integer subbyte by @xiaocenxiaocen in #403
  • [README] Fix ONNX link by @dbabokin in #425
  • [cuBLAS] Add cublas_gemm_batched and use cublasSetStream to bind all cuBLAS API calls to the current stream by @yudi0201 in #423
  • [Fixbug] Fix dynamic memcpy bug by @KTong821 in #427
  • [Compile Server] Fetch repo before checking out by @hjjq in #429
  • [CI] Use slurm for runners by @hjjq in #430
  • [CI] CI migration by @hjjq in #433
  • [Fixbug] Fix graph metadata hash by @KTong821 in #428
  • [CI] Add back tests by @hjjq in #436
  • [Fix] Skip a failed test due to huggingface transformers update by @yaoyaoding in #439
  • [RC] Release candidate for version 0.3.1 by @yaoyaoding in #442
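
Several entries above touch the PyTorch frontend: the torch.sqrt mapping (#387), the torch_func protocol work (#400), and the Dynamo backend fixes. A minimal usage sketch, assuming a CUDA build of Hidet and PyTorch 2.x, where importing hidet makes the 'hidet' backend available to torch.compile:

```python
import torch
import hidet  # importing hidet registers the 'hidet' backend for torch.compile


def f(x: torch.Tensor) -> torch.Tensor:
    # torch.sqrt is one of the ops mapped by the hidet frontend (#387)
    return torch.sqrt(x * x + 1.0)


x = torch.randn(1024, device='cuda')
f_opt = torch.compile(f, backend='hidet')  # route the captured graph through hidet
y = f_opt(x)

torch.testing.assert_close(y, f(x))
```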

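The cuBLAS-backed matmul from #404/#405 is exposed as hidet.ops.matmul_cublas. A rough sketch of how it can be called; the exact signature is an assumption based on the PR title and the existing hidet.ops.matmul:

```python
import hidet

# two fp32 matrices on the GPU
a = hidet.randn([1024, 1024], dtype='float32', device='cuda')
b = hidet.randn([1024, 1024], dtype='float32', device='cuda')

# dispatch the multiplication to cuBLAS through the new operator
c = hidet.ops.matmul_cublas(a, b)
print(c.shape)  # expected: [1024, 1024]
```
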
New Contributors

Full Changelog: v0.3.0...v0.3.1