Skip to content

cvpaperchallenge/CVPR2023_Survey

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CVPR 2023 Survey

MIT License Code style: black Code style: flake8 Imports: isort Typing: mypy DOI

Prerequisites

NOTE: Example codes in the README.md are written for Docker Compose v2.

Prerequisites installation

Here, we show example prerequisites installation codes for Ubuntu. If prerequisites are already installed in your environment, please skip this section. If you want to install it in another environment, please follow the official documentation.

Install Docker and Docker Compose

# Set up the repository
$ sudo apt update
$ sudo apt install ca-certificates curl gnupg lsb-release
$ sudo mkdir -p /etc/apt/keyrings
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
$ echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker and Docker Compose
$ sudo apt update
$ sudo apt install docker-ce docker-ce-cli containerd.io docker-compose-plugin

If sudo docker run hello-world works, the installation succeeded.

Download all CVPR 2023 papers

(Optional) Parse CVF page

NOTE: JSON file created by this step is already included in the repo. So this step is optional.

Please run the following command inside the "core" container (This command generates papers.json under the data directory). This process will take 10-20 minutes.

$ poetry run python3 src/scripts/parse_cvf_page.py 

Download PDF

Please run the following command from inside the "core" container.

$ poetry run python3 src/scripts/download_papers.py 

Generate summaries of CVPR 2023 papers

Setup environmental variables

We use the following APIs to generate summaries:

  • Mathpix API: To convert PDF into Latex format text.
  • OpenAI API: To use LLM (GPT).

To use the above APIs, we need to set the following environmental variables:

  • MATHPIX_API_ID
  • MATHPIX_API_KEY
  • OPENAI_API_KEY

So please run the following command to create an envs.env file and replace sample values with actual ones.

% cp environments/envs.env.sample environments/envs.env

Values written in the envs.env file are automatically loaded by docker and stored as environmental variables in the container.

Convert PDF to Latex format text

Here convert PDF to Latex format using Mathpix API. This makes it possible to extract the original structure of papers.

Please run the following command from inside the "core" container.

$ poetry run python3 src/scripts/convert_to_latex.py

Generate summaries

Now we are ready to generate summaries by using LLM (GPT). Please run the following command from inside the "core" container.

% poetry run python3 src/scripts/generate_summaries.py