PaddleVideo v2.0.0

@huangjun12 huangjun12 released this 01 Mar 12:09
4706814

Release Note

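PaddleVideo is implemented on the Paddle 2.0 dynamic graph and uses a modular design that decouples functionality into separate components. Components can be easily combined, configured, and customized to build video algorithm models quickly.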

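Basic capabilities

  1. Supports more datasets and model architectures, including the Kinetics-400, UCF-101, YouTube-8M, and ActivityNet datasets.
  2. Releases several video classification and video action localization models, including TSN, TSM, SlowFast, AttentionLSTM, and BMN.
  3. Provides a complete end-to-end deployment workflow.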

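Highlights

  1. Releases the 2D SOTA model ppTSM: Top-1 accuracy of 73.5% on the Kinetics-400 dataset, 3.5% higher than the standard TSM, with the same number of parameters and faster training and inference.
  2. Releases several training acceleration schemes: SlowFast trains 100% faster than the original implementation, and TSN+DALI trains 3.6x faster than the original implementation.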

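Featured applications

  1. Releases VideoTag, a large-scale video classification model: a pre-trained video tagging model trained on a dataset on the order of ten million samples, supporting 3,000 practical tags drawn from industry practice.
  2. Releases FootballAction, a football action detection algorithm: efficiently locates the start and end times of football actions in a video and identifies the category of each action.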

Release Note

This release supports the dynamic graph programming paradigm and is adapted to Paddle 2.0. It includes:

  1. Various datasets. PaddleVideo supports various datasets, including Kinetics-400, UCF-101, and YouTube-8M.
  2. Various architectures. PaddleVideo supports more architectures, including video recognition models such as TSN, TSM, SlowFast, and AttentionLSTM, and action localization models such as BMN.
  3. Deployable. PaddleVideo is powered by Paddle Inference.
  4. Higher performance. PP-TSM, which is based on the standard TSM, achieves the best performance among 2D recognition networks: it has the same number of parameters but improves Top-1 accuracy to 73.5%.
  5. Faster training strategy. PaddleVideo supports faster training strategies: SlowFast training is accelerated by 100% compared with the standard SlowFast version, and TSN+DALI speeds up training by 3.6x.
  6. VideoTag. A large-scale video classification model with 3k tags.
  7. FootballAction. A football action detection model.
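
The training speed-ups above are quoted in two different units: a percentage gain for SlowFast and a multiplier for TSN+DALI. The sketch below puts them on one scale. It assumes "accelerated by 100%" means throughput doubled and "3.6x" means 3.6 times the original throughput; the helper names are hypothetical and are not part of PaddleVideo.

```python
def multiplier_from_percent_gain(percent_gain: float) -> float:
    """Convert an 'X% faster' figure into a throughput multiplier (100% -> 2.0)."""
    return 1.0 + percent_gain / 100.0


def relative_time(multiplier: float) -> float:
    """Fraction of the original training time needed at a given multiplier."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return 1.0 / multiplier


# Assumption: "accelerates by 100%" is a throughput gain of 100%.
slowfast_multiplier = multiplier_from_percent_gain(100.0)
# Assumption: "3.6x" is already a throughput multiplier.
tsn_dali_multiplier = 3.6

for name, m in [("SlowFast", slowfast_multiplier), ("TSN+DALI", tsn_dali_multiplier)]:
    print(f"{name}: {m:.1f}x throughput, {relative_time(m):.0%} of original time")
```

Under these assumptions SlowFast needs about half of the original training time and TSN+DALI a little over a quarter.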