
Releases: PaddlePaddle/PaddleVideo

PaddleVideo v2.1.0

20 May 12:31
3e49bee

Release Note

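PaddleVideo v2.1.0 includes the following upgrades:

Framework

  1. Refactored the framework architecture to unify the forward interface for single-card and multi-card training.
  2. Refactored the inference architecture to support prediction with different models.
  3. Added interfaces for mixed-precision training and distributed training.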

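Model

  1. PP-TSM
    (1) Added training tricks, raising accuracy under the uniform evaluation strategy from 73.5 to 74.54.
    (2) Added a dense training strategy; accuracy with distillation reaches 76.16, exceeding SlowFast with the same ResNet50 backbone.
  2. SlowFast
    (1) Added the multigrid training acceleration strategy; training 358 epochs on the Kinetics-400 dataset takes only 6.7 days.
    (2) Improved evaluation accuracy from 74.35 to 75.84.
  3. BMN
    (1) Added inference support.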

Dataset

  1. Provided download links for the Kinetics-400 dataset, via both Baidu Netdisk and a download script.

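Application

  1. FootballAction:
    (1) Replaced the base feature model TSN with ppTSM, raising accuracy from 84% to 94%.
    (2) Along with the accuracy gain, both precision and recall improved substantially, and the F1-score rose from 0.57 to 0.82.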

Release Note

Framework

  1. Refactored the code of model.framework to unify the forward interface for single-card and multi-card training.
  2. Refactored the code of utils.inference to support prediction with different models.
  3. Added interfaces for automatic mixed precision (AMP) training and distributed training.

Model

  1. PP-TSM
    (1) Improved accuracy from 73.5 to 74.54 with the uniform sampling method.
    (2) Improved accuracy to 76.16 with the dense sampling method.
  2. SlowFast
    (1) Added the multigrid training strategy. Training 358 epochs on the Kinetics-400 dataset takes only 6.7 days on V100 GPUs.
    (2) Improved accuracy from 74.35 to 75.84.
  3. BMN
    (1) Added inference support.
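
The PP-TSM items above distinguish uniform and dense frame sampling without describing either. As a rough illustration only, the sketch below follows a common TSN/TSM-style convention: uniform sampling takes one frame from the middle of each equal-length segment of the video, while dense sampling takes consecutive frames at a fixed stride from a starting offset. The function names, parameters, and wrap-around behavior are assumptions made for this example and are not taken from the PaddleVideo code.

```python
from typing import List


def uniform_indices(num_frames: int, num_segments: int) -> List[int]:
    """Hypothetical helper: pick the middle frame of each equal-length segment."""
    seg_len = num_frames / num_segments
    return [int(seg_len * i + seg_len / 2) for i in range(num_segments)]


def dense_indices(num_frames: int, num_segments: int,
                  stride: int, start: int = 0) -> List[int]:
    """Hypothetical helper: pick frames at a fixed stride from `start`.

    Indices wrap around the end of the video (an assumption of this sketch).
    """
    return [(start + i * stride) % num_frames for i in range(num_segments)]


if __name__ == "__main__":
    # An 80-frame video sampled with 8 frames per clip.
    print(uniform_indices(80, 8))
    print(dense_indices(80, 8, stride=8, start=4))
```

Uniform sampling covers the whole video with a single clip, whereas dense sampling covers a shorter window at finer temporal resolution, so dense evaluation is typically run over several clips per video.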

Dataset

  1. Provided download links for the Kinetics-400 dataset, via both Baidu Netdisk and a download script.

Application

  1. FootballAction
    (1) Replaced TSN with PP-TSM, improving accuracy from 84% to 94%.
    (2) Improved the F1 score from 0.57 to 0.82.
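
For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below shows that formula and computes the relative gain implied by the two F1 values reported above. The release notes do not give the underlying precision and recall figures, so the helper names are hypothetical and no precision or recall values are implied.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_gain(old: float, new: float) -> float:
    """Relative improvement of `new` over `old`, as a fraction of `old`."""
    return (new - old) / old


if __name__ == "__main__":
    # F1 values reported for FootballAction in this release.
    gain = relative_gain(0.57, 0.82)
    print(f"Relative F1 gain: {gain:.1%}")
```

Going from 0.57 to 0.82 is an absolute gain of 0.25, or roughly a 44% relative improvement.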

PaddleVideo v2.0.0

01 Mar 12:09
4706814

Release Note

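PaddleVideo is implemented on the Paddle 2.0 dynamic graph and uses a modular design that decouples functionality into separate components. Components can be easily combined, configured, and customized to quickly build video algorithm models.

Basic capabilities

  1. Supports more datasets and model architectures; datasets include Kinetics-400, UCF-101, YouTube-8M, and ActivityNet.
  2. Released several models for video classification and video action localization, including TSN, TSM, SlowFast, AttentionLSTM, and BMN.
  3. Completed the full end-to-end deployment pipeline.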

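Highlights

  1. Released ppTSM, a state-of-the-art 2D model: Top-1 accuracy on the Kinetics-400 dataset is 73.5%, an improvement of 3.5% over standard TSM, with the same parameter count and faster training and inference.
  2. Released several training acceleration schemes: SlowFast trains 100% faster than the original implementation, and TSN+DALI trains 3.6x faster than the original implementation.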

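Featured applications

  1. Released VideoTag, a large-scale video classification model: a pre-trained video tagging model trained on a ten-million-scale dataset, supporting 3,000 practical tags drawn from industry practice.
  2. Released FootballAction, a football action detection algorithm: it efficiently locates the start and end times of football actions in a video and identifies each action's category.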

Release Note

Supports the dynamic graph programming paradigm and is adapted to Paddle 2.0. This release includes:

  1. Various datasets. PaddleVideo supports datasets including Kinetics-400, UCF-101, and YouTube-8M.
  2. Various architectures. PaddleVideo supports more architectures, including video recognition models such as TSN, TSM, SlowFast, and AttentionLSTM, and action localization models such as BMN.
  3. Deployable. PaddleVideo is powered by Paddle Inference.
  4. Higher performance. PP-TSM, which is based on the standard TSM, achieves the best performance among 2D recognition networks: it has the same number of parameters but improves Top-1 accuracy to 73.5%.
  5. Faster training strategies. SlowFast training is accelerated by 100% compared with the standard SlowFast implementation, and TSN+DALI speeds up training by 3.6x.
  6. VideoTag. A large-scale video classification model with 3k tags.
  7. FootballAction. A football action detection model.
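
The training speedups above are stated in two different forms, a percentage and a multiplier. The short sketch below converts both to a common scale, assuming that "accelerated by 100%" means twice the training throughput and that "3.6x" is a throughput multiplier. That interpretation and the helper names are assumptions for illustration, not statements from the release notes.

```python
def speedup_factor(percent_faster: float) -> float:
    """Convert 'N% faster' into a throughput multiplier (assumed meaning)."""
    return 1.0 + percent_faster / 100.0


def time_fraction(factor: float) -> float:
    """Fraction of the original wall-clock time left after a `factor`x speedup."""
    return 1.0 / factor


if __name__ == "__main__":
    slowfast = speedup_factor(100.0)  # reported as a 100% acceleration
    tsn_dali = 3.6                    # reported directly as 3.6x
    print(f"SlowFast: {slowfast:.1f}x, {time_fraction(slowfast):.0%} of original time")
    print(f"TSN+DALI: {tsn_dali:.1f}x, {time_fraction(tsn_dali):.0%} of original time")
```

Under these assumptions, the SlowFast figure corresponds to halving the training time, and the TSN+DALI figure to cutting it to a little over a quarter.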