Usage for CUMT_2D_PhotoSpeaker🚀🚀🚀

We have tested the docker image on Ubuntu 18.04, and it runs successfully.

If you have any problems during the test, please email aokzyj@126.com, zyj2000@cumt.edu.cn, or zyj2000@sjtu.edu.cn.


0. Requirements and Recommendations 📑

A machine with a GPU and with Docker installed. (If you have problems installing Docker, please check the following link: Install Docker Engine.) A quick sanity check is sketched below.

We recommend using VS Code (with extensions such as Remote-SSH, Docker, and Dev Containers) to test the image.
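
Before continuing, the check below confirms that the NVIDIA driver and the Docker daemon are both visible on the host. This is only an optional sketch; the version numbers in the output will differ from machine to machine.

# Check that the NVIDIA driver can see the GPU
nvidia-smi
# Check that Docker is installed and the daemon is running
sudo docker --version
sudo docker info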

1. Get the Docker Image 💡

There are two ways to get the docker image.

  1. Docker Hub

    We have uploaded the docker image to zyj2000/cumt_photo_speaker on Docker Hub, and you can download it with the command below:

sudo docker pull zyj2000/cumt_photo_speaker:v4

🌟 **If your network connection is good, you can download the image directly from Docker Hub. If you get the image from Docker Hub, please skip '2. Validate the MD5 for the image'.**

  2. Baidu yunpan

    Download the image from the Baidu yunpan LINK (extraction code: cumt).

2. Validate the MD5 for the image

Please MAKE SURE the MD5 of cumt_photo_speaker.tar is 0746b9b2c8b77335c1e46d649ae26f86; otherwise the image may fail to load.

# Windows
certutil -hashfile cumt_photo_speaker.tar MD5
# Linux
md5sum cumt_photo_speaker.tar

If the MD5 is not 0746b9b2c8b77335c1e46d649ae26f86, please delete the downloaded tar file and download the docker image again.
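
On Linux, the download can also be checked in a single step. This is only an optional convenience, assuming the GNU coreutils md5sum with its -c option:

# Prints 'cumt_photo_speaker.tar: OK' and exits with status 0 when the checksum matches
echo "0746b9b2c8b77335c1e46d649ae26f86  cumt_photo_speaker.tar" | md5sum -c -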

3. Load the image in Linux

# load the image
sudo docker load -i cumt_photo_speaker.tar
# check the image
sudo docker images

After running the commands, you should see information like:

REPOSITORY                   TAG       IMAGE ID       CREATED        SIZE
zyj2000/cumt_photo_speaker   v4        61b180f6d268   2 months ago   69.5GB

If you encounter the problem: invalid diffID for layer xx: expected "sha256:xxx", got "sha: xxx", please refer to '2. Validate the MD5 for the image'. A cleanup sketch is shown below.
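
If a failed load leaves a partial image behind, the following cleanup is a hedged sketch before re-downloading the tar file; it assumes the image was tagged as in the docker images output above.

# Remove the partially loaded image if it appears in `sudo docker images`
sudo docker rmi zyj2000/cumt_photo_speaker:v4
# Delete the corrupted tar file, then download it again and re-check the MD5
rm cumt_photo_speaker.tar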

4. Run the image for demos or tests in the CLI

# Start the container for CLI testing
sudo docker run --gpus all --name test5 -idt zyj2000/cumt_photo_speaker:v4
# Enter the running container
sudo docker exec -it test5 /bin/bash

--gpus all: our model runs on GPUs, so '--gpus all' is necessary.

If you encounter the following problem when adding --gpus all:

Error response from daemon: could not select device driver "" with capabilities: [[gpu]]

please follow the instructions below; otherwise you can skip them.

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
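
After restarting Docker, you can confirm that GPU passthrough works with a quick check. This is a hedged sketch; it assumes the image has no custom entrypoint and that nvidia-smi is injected by the NVIDIA container runtime, which is the usual behaviour once --gpus all is accepted.

# Should print the same GPU table as nvidia-smi on the host
sudo docker run --rm --gpus all zyj2000/cumt_photo_speaker:v4 nvidia-smi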

When entering the container, you will see a prompt like:

(base) root@<container id>:/code# 
# examples: if your container id is 39f89f6a6e1e
(base) root@39f89f6a6e1e:/code# 

'/code' is the work path. We provide 11 demos in this folder, and you can use the 'ls' command to list all the files in it:

(base) root@39f89f6a6e1e:/code# ls
BasicVSR_PlusPlus-master  DCT-Net      SadTalker_V2  curl2response.py  demo_10.py  demo_3.py  demo_6.py  demo_9.py    source_3.py  ttt.py        web
CUMT_Photospeaker         MockingBird  VITON-HD      demo1.py          demo_11.py  demo_4.py  demo_7.py   static       uploads       web.py
ChatGLM-6B                SAM          VSFA          demo_1.py         demo_2.py   demo_5.py  demo_8.py  model        templates    utils_web.py

All the demos are preset, and you can run them with a command like the one below:

(base) root@39f89f6a6e1e:/code# python demo_x.py
# examples: x can be 1 to 11
(base) root@39f89f6a6e1e:/code# python demo_1.py
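
If you want to run all of the demos in one go, a plain shell loop inside the container works. This is only a sketch and assumes the demos are independent and can run back to back:

# Run demo_1.py through demo_11.py in sequence (inside the container, in /code)
for i in $(seq 1 11); do python demo_${i}.py; done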

On the other hand, if you come across the following error when running demo_2.py:

nvcc fatal   : Unsupported gpu architecture 'compute_86'

it means the CUDA toolkit inside the image does not support your GPU's compute capability (8.6). You can work around this by compiling for a lower architecture:

# Open the settings file
vim ~/.bashrc
# Add the command at the bottom of the file
export TORCH_CUDA_ARCH_LIST="8.0"  # compile for compute capability 8.0
# Update
source ~/.bashrc
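
To check which compute capability your GPU reports, you can query it from PyTorch inside the container. This is a small hedged check and assumes PyTorch is importable in the active conda environment:

# Prints a tuple such as (8, 6) for an RTX 30-series GPU
python -c "import torch; print(torch.cuda.get_device_capability())"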

5. Get the results

The digital human generation results are saved in the corresponding folders named "test_demo_x". If you use VS Code, you can browse them in the sidebar after installing the Dev Containers and Docker extensions. Otherwise, you can copy the results to your host with the following commands, run on the host:

sudo docker cp <CONTAINER ID>:/code/<test_demo_x> <Destination directory>
# example:
sudo docker cp 39f89f6a6e1e:/code/test_demo_1 .
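
If you do not remember the container ID, you can look it up on the host first; this is standard Docker usage rather than anything specific to this image:

# Lists running containers with their IDs and names (e.g. test5)
sudo docker ps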

6. More information and Q&A ☕

  • If you encounter the following error while running demo_2.py:

  File "/opt/conda/envs/chat/lib/python3.10/site-packages/requests/adapters.py", line 547, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

it means the container failed to download the ChatGLM-6B model over the network. Please download the model on the host and copy it into the container:

# The download link of ChatGLM-6B
https://huggingface.co/THUDM/chatglm-6b/tree/main
# Copy the downloaded model into the container
sudo docker cp <folder/to/models/in/host> test5:/code
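
One way to fetch the model on the host is with git and Git LFS; this is only a sketch, assuming git-lfs is installed on the host and that demo_2.py expects the model folder under /code:

# Requires git-lfs on the host; downloads the full chatglm-6b weights
git lfs install
git clone https://huggingface.co/THUDM/chatglm-6b
# Copy the model folder into the running container
sudo docker cp chatglm-6b test5:/code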

Related Works 🌟🌟🌟

| No. | Module                    | Related Works |
|-----|---------------------------|---------------|
| 1   | Language Model            | ChatGLM-6B    |
| 2   | Text2Speech Conversion    | espeaker      |
| 3   | Speech Clone              | MockingBird   |
| 4   | Super Resolution          | BasicVSR++    |
| 5   | Quality Assessment        | VSFA          |
| 6   | Style Transfer            | DCT-Net       |
| 7   | Age Transformer           | SAM           |
| 8   | Person Independent Driver | SadTalker     |
| 9   | Cloth Modification        | VITON-HD      |

License

This work is made available under Creative Commons BY-NC 4.0. You can use, redistribute, and adapt the material for non-commercial purposes, as long as you give appropriate credit by citing our paper and indicate any changes that you have made. In addition, please follow the original LICENSE of each work listed above.

Citation 😸😸😸

If you find our paper useful, please cite our work as:

@article{zhou2023implementation,
  title={An Implementation of Multimodal Fusion System for Intelligent Digital Human Generation},
  author={Zhou, Yingjie and Chen, Yaodong and Bi, Kaiyue and Xiong, Lian and Liu, Hui},
  journal={arXiv preprint arXiv:2310.20251},
  year={2023}
}

About

Official Repo for National Industrial Software Congress 2023: "An Implementation of Multimodal Fusion System for Intelligent Digital Human Generation"
