
Hidet v0.2.4

Released by @yaoyaoding on 21 Jun · commit 289377a

What's Changed

  • [Version] Bump version to v0.2.4.dev by @yaoyaoding in #188
  • [Dynamo] module tests + operator support by @AndreSlavescu in #148 (see the torch.compile example below)
  • Refactor compilation workflow to support CPU without CUDA by @LDY1998 in #189
  • [Stack] Allow the ulimit stack size to be less than expected by @yaoyaoding in #195
  • [Readme] Add platform requirements by @yaoyaoding in #196
  • [DataType] Add complex64 and complex128 data type by @yaoyaoding in #200
  • [Example] Add an example of running GPT-2 model by @yaoyaoding in #203
  • [Fusion] Use inline pass in fusion to allow template call functions with kernel params by @yaoyaoding in #197
  • [Frontend][Operator] Add missing operators for dinov2 by @yaoyaoding in #206
  • [Backend] Add OpenMP support by @yaoyaoding in #208
  • [Operator] Update batch_matmul to use Hidet Script by @hjjq in #207
  • [Cache] Add cache management command line interface by @yaoyaoding in #212
  • [IR] Creation-time constant fold for constant expressions by @yaoyaoding in #209
  • [Torch][Operator] Allow changing torch tensor device when possible by @yaoyaoding in #214
  • [Torch][Operator] Add op mapping for torch.min/max/minimum/maximum by @yaoyaoding in #216
  • [Typo] Fix a typo in resnext.py by @eltociear in #210
  • [Operator] Adding missing operators for llama by @yaoyaoding in #219
  • [IR] Adding more support for dynamic shape on Task and FlowGraph level by @yaoyaoding in #220
  • [Torch] Add mapping for torch.ops.aten.add and torch.ops.aten.cos by @yaoyaoding in #223
  • [Operator][Backend] Add nvcc flags for faster math and update Attention schedule by @hjjq in #221
  • [CI] Always clear the cache before tests by @yaoyaoding in #224
  • Fix batch_matmul for invalid mma config for sm < 80 by @xinli-git in #227
  • [Dynamic Shape] Adding more dynamic shape support by @yaoyaoding in #228
  • [CI] Add importlib_metadata to requirements-dev.txt by @yaoyaoding in #233
  • [Script] Add list comprehension support in hidet script by @yaoyaoding in #235
  • [Refactor][Dynamic Shape] Introduce SymbolVar to implement dynamic shape by @yaoyaoding in #236 (see the dynamic-shape sketch below)
  • [Script] Add pointer arithmetic by @yaoyaoding in #237
  • [Operator][Torch] Add causal fmha and torch sdpa mapping by @hjjq in #238
  • [Fixbug][Pass] Fix a bug in the inline_let_stmt pass by @yaoyaoding in #240
  • [Options] Add option for controlling parallel build with number of jobs or memory reserved for each job by @xinli-git in #230 (see the options sketch below)
  • [Typo] Fix a typo by @BolinSNLHM in #245
  • [Typo] Fix minor spelling mistake by @Aalanli in #246
  • [Fixbug] Fix a bug in StmtRewriter which discards declare scope information by @yaoyaoding in #248
  • [Refactor] Adding support for compiled model by @yaoyaoding in #247
  • [Operator] batch_matmul: Remove duplicate smem declaration by @hjjq in #249
  • [Operator] Adding CPU support for matrix multiplication by @BolinSNLHM in #251
  • [Hidet Script] Allow bind_tuple argument in mapping.on(...) and grid(...) by @yaoyaoding in #254
  • [Hidet Script] Add in and not in expressions in hidet script by @yaoyaoding in #255
  • [Codegen] Include header files as needed by @yaoyaoding in #256
  • [Operator] Add new operator "normalize" that makes a group of layers (layer norm, group norm and instance norm) faster using hidet script by @xinli-git in #257
  • [Testing][Models] Add gpt2 module in testing models by @yaoyaoding in #252
  • [Fixbug] Fix test warnings and the incompatibility of two recent PRs by @yaoyaoding in #258
  • [Operator] Add sm75 support for attention by @hjjq in #259
  • [Operator] batch_matmul: Remove unroll and reduce tuning space by @hjjq in #260
  • [Fixbug] Fix a bug when fused operator has no input by @yaoyaoding in #263
  • [Graph] Translate softmax and reduce to hidet script by @Aalanli in #242
  • [Fixbug] batch_matmul: move cc checking inside schedule by @hjjq in #264
  • [Refactor] Refactor building system and adding compiled products by @yaoyaoding in #261
  • [Fixbug] Reduce the default unroll factor to 4 by @yaoyaoding in #266
  • [Torch] Add some torch frontend mappings for roberta-base by @hjjq in #267
  • [Refactor] Remove schedules submodule under hidet.graph.ops by @yaoyaoding in #269
  • [Device] Add support for mixed cpu and cuda kernels in the same flow graph by @yaoyaoding in #270
  • [Dynamic Shape] Adding dynamic shape support for reduce by @Aalanli in #268
  • [Complex Type] Add more support for complex data type by @yaoyaoding in #271
  • [Tools] Model translator by @Aalanli in #273
  • [Model] Llama model implementation in hidet by @Aalanli in #243
  • [Operator] Add support for cross attention by @hjjq in #275
  • [Operator] Add dynamic shape support and tests for Operators. by @Aalanli in #274
  • [Fusion] Enhance the prologue epilogue fusion by @yaoyaoding in #277
  • [Drivers] Suppress OSError by @hjjq in #278
  • [Dynamic Shape] More correctness guards by @Aalanli in #276
  • [Operator] Make Convolution gemms fusible by resolving to batch_matmul by @hjjq in #279
  • [External Tasks] Move task build into method call for external kernel support by @xinli-git in #282
  • [Distributed] Add NCCL primitives by @soodoshll in #280
  • [Operators] Conv2d fp16 implicit gemm kernel by @Aalanli in #283
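
Examples

A few of the changes above are easier to see with short snippets. The sketches below are illustrative only; model choices, shapes, and paths are placeholders rather than part of this release.

Several PRs extend the TorchDynamo frontend (#148, #214, #216, #223, #238, #267). A minimal sketch of compiling a PyTorch module through the hidet backend, assuming torchvision is installed and a CUDA device is available:

```python
import torch
import torchvision
import hidet  # importing hidet makes the 'hidet' backend available to torch.compile

# resnet18 is only an illustration; any Dynamo-traceable module works.
model = torchvision.models.resnet18().cuda().eval()
x = torch.randn(1, 3, 224, 224, device='cuda')

model_opt = torch.compile(model, backend='hidet')  # compile via TorchDynamo
y = model_opt(x)
```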
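
The dynamic-shape work (#220, #228, #236, #268, #274, #276) is built on SymbolVar. A minimal sketch, assuming the string-dimension form of hidet.symbol for marking a dimension as symbolic; the operator and shapes are placeholders:

```python
import hidet
from hidet.graph import ops

# 'batch' marks a symbolic (dynamic) dimension; the remaining dims are fixed.
x = hidet.symbol(['batch', 3, 224, 224], dtype='float32', device='cuda')
w = hidet.randn([16, 3, 3, 3], device='cuda')

y = ops.conv2d(x, w)                      # build the graph symbolically
graph = hidet.trace_from(y, inputs=[x])   # FlowGraph with a dynamic batch dim

# The same traced graph can be run with different batch sizes.
y1 = graph(hidet.randn([1, 3, 224, 224], device='cuda'))
y8 = graph(hidet.randn([8, 3, 224, 224], device='cuda'))
```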
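
PR #230 adds control over parallel compilation (number of jobs and memory reserved per job) and #212 adds cache management. The exact knobs for job count and per-job memory may differ from what is shown; the sketch below only uses the generic hidet.option setters and should be read as an assumption, not the precise new API:

```python
import hidet

# General pattern for configuring hidet before building kernels.
hidet.option.cache_dir('./hidet_cache')   # where compiled kernels are cached
hidet.option.parallel_build(True)         # build candidate kernels in parallel
hidet.option.search_space(0)              # 0: fastest compile, 2: largest tuning space
```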

New Contributors

Full Changelog: v0.2.3...v0.2.4