Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations

zjukg/Structure-CLIP

Structure-CLIP

Structure-CLIP is an end-to-end framework that integrates scene graph knowledge to enhance multi-modal structured representations. This repository accompanies the paper of the same name.

🔔 News

🌈 Model Architecture

[Figure: Structure-CLIP model architecture]

📚 Dataset Download

Training datasets are available here (Code: 33ri).

📕 Code Path

Code Structure

The code is organized into four parts:

  • model: contains the main files for the Structure-CLIP network.
  • data: contains the pre-training data splits and the downstream datasets.
  • checkpoints: stores saved checkpoints for reloading.
  • script: contains the training scripts for Structure-CLIP.

🔬 Dependencies

  • Python 3
  • PyTorch >= 1.8.0
  • Transformers >= 4.11.3
  • NumPy
  • All experiments are performed with one A100 GPU.
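
Before training, it may help to confirm that the installed packages meet the minimum versions listed above. The following is a minimal sketch, not part of the repository: the helper names `parse` and `check` are hypothetical, and it assumes the packages are installed under the distribution names `torch`, `transformers`, and `numpy`.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the dependency list; NumPy has no stated minimum.
REQUIREMENTS = {"torch": "1.8.0", "transformers": "4.11.3", "numpy": None}


def parse(v):
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1).

    Local build tags and non-numeric suffixes are ignored.
    """
    parts = []
    for piece in v.split("+")[0].split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def check(requirements=REQUIREMENTS):
    """Return {package: status} using package metadata, without importing the packages."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum is not None and parse(installed) < parse(minimum):
            report[name] = f"too old ({installed} < {minimum})"
        else:
            report[name] = f"ok ({installed})"
    return report


for name, status in check().items():
    print(f"{name}: {status}")
```

The check reads installed package metadata only, so it runs quickly and does not require a GPU.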

🚀 Train & Eval

The training script:

bash script/run.sh
[--train_path TRAIN_PATH] [--test_path TEST_PATH] [--nepoch NEPOCH] [--batch_size BATCH_SIZE] [--manualSeed MANUAL_SEED]
[--lr LEARNING-RATE] [--weight_decay WEIGHT_DECAY] [--knowledge_weight KNOWLEDGE_WEIGHT] [--transformer_layer_num NUMBER] [--model_name MODEL_NAME] [--neg_loss_weight NEG_LOSS_WEIGHT] 

Note:

  • You can edit the .sh file (script/run.sh) to change these parameters.
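
To make the flag list above easier to read, here is a sketch of an `argparse` parser that accepts the same flag names. It is an illustration only and is not taken from the repository: the function name `build_parser` is hypothetical, the argument types are assumptions inferred from the flag names, and no defaults are set because the usage line does not state any.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags in the usage line above."""
    parser = argparse.ArgumentParser(
        description="Illustrative mirror of the script/run.sh flags"
    )
    # Flag names are copied from the usage line; every type below is an assumption.
    parser.add_argument("--train_path", type=str)
    parser.add_argument("--test_path", type=str)
    parser.add_argument("--nepoch", type=int)
    parser.add_argument("--batch_size", type=int)
    parser.add_argument("--manualSeed", type=int)
    parser.add_argument("--lr", type=float)
    parser.add_argument("--weight_decay", type=float)
    parser.add_argument("--knowledge_weight", type=float)
    parser.add_argument("--transformer_layer_num", type=int)
    parser.add_argument("--model_name", type=str)
    parser.add_argument("--neg_loss_weight", type=float)
    return parser


# Placeholder values for illustration only; they are not recommended settings.
args = build_parser().parse_args(["--nepoch", "1", "--batch_size", "2"])
for key, value in vars(args).items():
    if value is not None:
        print(f"{key} = {value}")
```

Flags that are not supplied stay `None` here, which makes it easy to see which settings were overridden on the command line.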

🤝 Cite:

Please consider citing this paper if you use the code or data from our work. Thanks a lot :)

@inproceedings{DBLP:conf/aaai/StructureCLIP,
  author       = {Yufeng Huang and
                  Jiji Tang and
                  Zhuo Chen and
                  Rongsheng Zhang and
                  Xinfeng Zhang and
                  Weijie Chen and
                  Zeng Zhao and
                  Zhou Zhao and
                  Tangjie Lv and
                  Zhipeng Hu and
                  Wen Zhang},
  title        = {Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations},
  booktitle    = {{AAAI}},
  publisher    = {{AAAI} Press},
  year         = {2024}
}
