InternLM-v0.2.1dev20240102

@sunpengsdu sunpengsdu released this 02 Jan 05:05
5539f9d

What's Changed

  • fix(timeout): larger timeout by @JiaoPL in #495
  • feat(doc): add GPU memory info for 7B & 20B models by @li126com in #507
  • feat(model): add rope_base interface by @00INDEX in #512
  • Feat(QA): Check loss when swapping micro_num and micro_bsz && Check grad norm by @li126com in #510
  • Fix(QA): the py name in main is wrong by @li126com in #514
  • fix/feat: small fix and enhancement by @SolenoidWGT in #515
  • test(workflow): add workflow for loss test and change trigger event by @kkscilife in #513
  • fix(ci): fix test model ckpt ci test by @SolenoidWGT in #518
  • test(workflow): add unit test case by @kkscilife in #524
  • feat(storage): use multipart upload when using oss by @li126com in #520
  • Fix (QA checkpoint): fix test_model_checkpoint singleton import by @li126com in #526
  • fix(model): add IS_SEQUENCE_PARALLEL check for norm module by @yingtongxiong in #528
  • feat(model): add output embedding tf32 option by @JiaoPL in #523
  • feat(grad_norm): vocab grad norm profiling by @JiaoPL in #519
  • fix(data): fix the unpack for type_ids when use_flash_attn=False by @yingtongxiong in #516
  • fix(storage): unify the name of AK and SK by @li126com in #527
  • fix(test): fix type_ids unpack bug by @SolenoidWGT in #530
  • feat(model): support llama model with checkpoint loading by @li126com in #532
  • fix(metric): add metric dtype control by @Pryest in #533
  • feat(ckpt): support auto resume in Volc and Ali by @li126com in #529
  • fix(sequence_parallel): fix norm all-reduce in seq_parallel when not overlapping by @yingtongxiong in #534
  • fix(pp): fix no-packed dataset load micro batch error by @SolenoidWGT in #538
  • fix(model): change model_type LLAMA to LLAMA2 by @li126com in #539
  • fix(moe): fix moe zero mode bug by @blankde in #548
  • fix(grad_norm): token grad norm with tp by @JiaoPL in #547
  • test(workflow): change into reserved by @kkscilife in #550
  • fix(model): add ckpt_type constraint when loading ckpts by @li126com in #542
  • feat(logger): add tensorboard key value buffer by @SolenoidWGT in #549
  • fix(metrics): remove redundant cuda memory in metric calculations by @SolenoidWGT in #557
  • fix(lr_scheduler): fix when resuming lr_scheduler without loading optimizer by @gaoyang07 in #565

Full Changelog: v0.2.1dev20231121...v0.2.1dev20240102