GitHub - ZijiaLewisLu/CVPR2024-FACT: Official Repo for CVPR 2024 Paper "FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Fully-Supervised Action Segmentation"

FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Supervised Action Segmentation

In this work, we propose an efficient Frame-Action Cross-attention Temporal modeling (FACT) framework that (i) performs temporal modeling on the frame and action levels in parallel and (ii) leverages this parallelism to achieve iterative bidirectional information transfer between action and frame features and refine them.

We achieve state-of-the-art results on four datasets at a lower computational cost.

Preparation

1. Install the Requirements

pip3 install -r requirements.txt

2. Prepare the Code

mkdir FACT_actseg
cd FACT_actseg
git clone https://github.com/ZijiaLewisLu/CVPR2024-FACT.git
mv CVPR2024-FACT src
mkdir data
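
The rename to src matters because the later commands run python3 -m src.train and python3 -m src.eval from inside FACT_actseg and pass configs as src/configs/<dataset>.yaml. As a minimal sketch, the layout can be checked before training. The helper name missing_paths and the REQUIRED list are illustrative and not part of the repository; the paths themselves are taken from the commands in this README.

```python
from pathlib import Path

# Paths that the README's own commands rely on, relative to FACT_actseg.
REQUIRED = ["src/train.py", "src/eval.py", "src/configs", "data"]


def missing_paths(workspace, required=REQUIRED):
    """Return the required paths that do not exist under workspace."""
    root = Path(workspace)
    return [rel for rel in required if not (root / rel).exists()]
```

Calling missing_paths(".") from inside FACT_actseg should return an empty list once the steps above have been completed.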

3. Prepare Data

Download the Breakfast and GTEA data from link1 or link2, and place them in FACT_actseg/data.
Download the EgoProceL and Epic-Kitchens data from here, and place them in FACT_actseg/data.
Features for Epic-Kitchens can be downloaded via this script and extracted with utils/extract_epic_kitchen.py.
After this, FACT_actseg/data should contain four folders, one for each dataset.
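
As a quick sanity check for this step, the sketch below counts the immediate subfolders of the data directory. The function names are hypothetical, and the check deliberately does not assume any folder names, since this README does not state them.

```python
from pathlib import Path


def list_dataset_dirs(data_root):
    """Return the sorted names of the immediate subfolders of data_root."""
    root = Path(data_root)
    if not root.is_dir():
        return []  # a missing data folder is reported as "no datasets"
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def check_data_root(data_root, expected_count=4):
    """Return (ok, names); ok is True when exactly expected_count folders exist."""
    names = list_dataset_dirs(data_root)
    return len(names) == expected_count, names
```

Running check_data_root("data") from inside FACT_actseg returns the folder names it found, which makes it easy to spot an archive that was extracted one level too deep.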

Training

Training is configured with YAML files, and all configurations are in the configs folder. Use the following commands to run the experiments.

cd FACT_actseg
# breakfast
python3 -m src.train --cfg src/configs/breakfast.yaml --set aux.gpu 0 split "split1"
# gtea
python3 -m src.train --cfg src/configs/gtea.yaml --set aux.gpu 0 split "split1"
# egoprocel
python3 -m src.train --cfg src/configs/egoprocel.yaml --set aux.gpu 0 split "split1"
# epic-kitchens
python3 -m src.train --cfg src/configs/epic-kitchens.yaml --set aux.gpu 0 split "split1"
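
In the commands above, --set takes alternating key/value pairs, and a dotted key such as aux.gpu addresses a nested entry of the YAML config. The sketch below illustrates that idea on a plain dictionary. It is an assumption about the behavior, not the repository's actual parser, and apply_overrides, parse_value, and the starting values in cfg are made up for illustration.

```python
import ast


def parse_value(text):
    """Interpret a command-line string as a Python literal when possible."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text  # plain words such as split1 stay strings


def apply_overrides(cfg, pairs):
    """Apply a flat [key, value, key, value, ...] list to a nested dict in place."""
    if len(pairs) % 2 != 0:
        raise ValueError("overrides must come in key/value pairs")
    for key, raw in zip(pairs[0::2], pairs[1::2]):
        *parents, leaf = key.split(".")
        node = cfg
        for name in parents:
            # Walk down the nesting, creating levels that do not exist yet.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return cfg


# Hypothetical starting config; the shell has already removed the quotes around split1.
cfg = {"aux": {"gpu": 1}, "split": "split2"}
apply_overrides(cfg, ["aux.gpu", "0", "split", "split1"])
print(cfg)  # {'aux': {'gpu': 0}, 'split': 'split1'}
```

Under this reading, any entry of a config file can be changed from the command line without editing the YAML, for example to select another GPU or split.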

By default, logs are saved to FACT_actseg/log/<experiment-path>. Evaluation results are saved as Checkpoint objects, defined in utils/evaluate.py. Loss and metrics are also visualized with wandb.

Pre-Trained Models

Pre-trained model weights can be downloaded from here. You can place the files under FACT_actseg/ckpts and test the models with the following command.

python3 -m src.eval

We lost the original data and model weights in a disk failure. The models provided here were retrained afterward, so their performance differs slightly from the numbers reported in the paper.

Breakfast models
GTEA models
EgoProceL models
Epic-Kitchens models

Citation

@inproceedings{
    lu2024fact,
    title={{FACT}: Frame-Action Cross-Attention Temporal Modeling for Efficient Supervised Action Segmentation},
    author={Zijia Lu and Ehsan Elhamifar},
    booktitle={Conference on Computer Vision and Pattern Recognition 2024},
    year={2024},
}

