Bad matches running on GPU (related to non_blocking parameter) #99

swengeler · 2024-01-05T10:30:33Z

Hi, first off thanks for your work and releasing it in such a "nicely packaged" format!

While I managed to resolve my issue (described below), I figured it would be useful to document it in case others encounter it as well. In addition, if you have any further insight into why this might be happening (perhaps on my machine configurations specifically) that would be appreciated as well.

Encountered issue

I was getting strange/incorrect outputs running LightGlue on GPU. Using the two images below and the match_pair function gives the following output:

When running on CPU instead, I get the following output:

The code used for this minimal example is the following:

import matplotlib.pyplot as plt
import torch

from lightglue import LightGlue, SuperPoint, viz2d
from lightglue.utils import load_image, match_pair

torch.set_grad_enabled(False)

# load models
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # or just "cpu" for the second example
extractor = SuperPoint(max_num_keypoints=2048).eval().to(device)
matcher = LightGlue(features="superpoint").eval().to(device)

print(f"Using device: {device}")

# load images
image0 = load_image("cup_image_0.jpg")
image1 = load_image("cup_image_1.jpg")

# extract features + correspondences
feats0, feats1, matches01 = match_pair(
    extractor, matcher, image0.to(device), image1.to(device), non_blocking=True
)
kpts0, kpts1, matches = feats0["keypoints"], feats1["keypoints"], matches01["matches"]
m_kpts0, m_kpts1 = kpts0[matches[..., 0]], kpts1[matches[..., 1]]

# visualize results
viz2d.plot_images([image0, image1])
viz2d.plot_matches(m_kpts0, m_kpts1, color="lime", lw=0.2)
viz2d.add_text(0, f'Stop after {matches01["stop"]} layers')
plt.show()
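
To put a number on the difference between the CPU and GPU runs instead of comparing plots by eye, the matched keypoint coordinates of the two runs can be compared directly. The following is only a sketch: `matched_pairs` and `match_overlap` are hypothetical helpers that are not part of LightGlue, the tensors are made-up stand-ins for `m_kpts0` and `m_kpts1`, and rounding to whole pixels assumes both runs detect the same keypoints up to small floating-point differences.

```python
import torch


def matched_pairs(m_kpts0: torch.Tensor, m_kpts1: torch.Tensor) -> set:
    """Turn two (N, 2) tensors of matched keypoints into a set of (x0, y0, x1, y1) tuples."""
    # Round to whole pixels so tiny CPU/GPU floating-point differences do not matter.
    coords = torch.cat([m_kpts0, m_kpts1], dim=-1).round().long()
    return {tuple(row) for row in coords.cpu().tolist()}


def match_overlap(pairs_a: set, pairs_b: set) -> float:
    """Fraction of the matches in `pairs_a` that also appear in `pairs_b`."""
    if not pairs_a:
        return 0.0
    return len(pairs_a & pairs_b) / len(pairs_a)


# Made-up stand-ins for (m_kpts0, m_kpts1) from a CPU run and a GPU run.
cpu_kpts0 = torch.tensor([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])
cpu_kpts1 = torch.tensor([[12.0, 21.0], [33.0, 41.0], [52.0, 63.0]])
gpu_kpts0 = cpu_kpts0.clone()
gpu_kpts1 = torch.tensor([[12.0, 21.0], [90.0, 5.0], [52.0, 63.0]])

overlap = match_overlap(
    matched_pairs(cpu_kpts0, cpu_kpts1), matched_pairs(gpu_kpts0, gpu_kpts1)
)
print(f"Shared matches: {overlap:.2f}")
```

An overlap close to 1 would suggest the two runs agree; a much lower value would point to a problem like the one described here.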

Possible solutions

I eventually figured out that this was caused by the batch_to_device function called by match_pair, or more specifically the non_blocking=True parameter. The three solutions I found are:

1. Not using match_pair (as is done, e.g., in the demo notebook) and moving the outputs I wanted to use to the CPU "manually".
2. Setting non_blocking=False (the default).
3. Adding the following two lines after the call to match_pair (in the minimal example code above):
   ```
   stream = torch.cuda.current_stream()
   stream.synchronize()
   ```
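
The synchronization workaround can also be wrapped in a small helper. This is a minimal sketch, not LightGlue's implementation: `to_cpu_synced` is a hypothetical name, and the sketch assumes the bad matches come from reading CPU tensors before an asynchronous GPU-to-CPU copy has finished, which this issue does not confirm. In PyTorch, `tensor.to("cpu", non_blocking=True)` may return before the data has actually been copied, so the result should only be read after the stream has been synchronized.

```python
import torch


def to_cpu_synced(batch: dict) -> dict:
    """Copy every tensor in `batch` to the CPU and wait until the copies are done."""
    out = {
        key: value.to("cpu", non_blocking=True) if isinstance(value, torch.Tensor) else value
        for key, value in batch.items()
    }
    # A non-blocking GPU-to-CPU copy may return before the data has arrived,
    # so wait for all work queued on the current CUDA stream before reading `out`.
    if torch.cuda.is_available():
        torch.cuda.current_stream().synchronize()
    return out


# Made-up feature dictionary; on a machine without a GPU this is a plain CPU copy.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
feats = {"keypoints": torch.rand(5, 2, device=device), "name": "demo"}
cpu_feats = to_cpu_synced(feats)
print(cpu_feats["keypoints"].device)
```

Compared with setting non_blocking=False, this keeps the copies of all tensors asynchronous with respect to each other and pays for a single synchronization at the end.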

Input data

`cup_image_0.jpg`	`cup_image_1.jpg`

Environment info

Ubuntu 22.04.3 (WSL)
conda environment with Python 3.10, torch==2.0.1, torchvision==0.15.2
CUDA version 12.2, NVIDIA driver version 537.13
NVIDIA Quadro P620


Phil26AT · 2024-01-10T17:15:52Z

Hi @swengeler, thank you for reporting your issue and the solution to it! I could not reproduce it on my end, but I will keep this issue open in case someone else faces this problem.

swengeler changed the title ~~Problems with matches running on GPU (related to non_blocking parameter)~~ Bad matches running on GPU (related to non_blocking parameter) Jan 5, 2024
