Alternative implementation in Refiners #127

Closed
hugojarkoff opened this issue Mar 21, 2024 · 0 comments

Hello everyone, and thank you for the fantastic work!

We are building Refiners, an open-source, PyTorch-based micro-framework made to easily train and run adapters on top of foundation models. We just wanted to let you know that HQ-SAM is now natively supported on top of our SAM implementation!

A minimal working example (MWE) in Refiners, similar to demo_hqsam.py, would look like this:

  • First, install Refiners by following our install guide (a typical setup is sketched just below);
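
For reference, a setup along these lines should work (note: the exact steps live in the install guide; the repository URL below is the project's current home, and the weight-preparation snippet in the next step assumes you run it from a source checkout):

git clone https://github.com/finegrain-ai/refiners.git
cd refiners
pip install -e .

If the conversion scripts complain about missing dependencies, check the install guide for the relevant extras.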

  • Then, download the weights for SAM and HQ-SAM (ViT-H models) and convert them to the Refiners format using the following snippet:

# These helpers live in the Refiners repository (scripts/prepare_test_weights.py),
# so run this from the root of a source checkout.
from scripts.prepare_test_weights import convert_hq_sam, convert_sam, download_hq_sam, download_sam

# Download the original SAM and HQ-SAM (ViT-H) checkpoints...
download_sam()
download_hq_sam()

# ...then convert them to safetensors files in the Refiners format.
convert_sam()
convert_hq_sam()
  • Finally, run the snippet below to do some inference using HQ-SAM:
import torch
from PIL import Image

from refiners.fluxion.utils import load_from_safetensors, tensor_to_image
from refiners.foundationals.segment_anything import SegmentAnythingH
from refiners.foundationals.segment_anything.hq_sam import HQSAMAdapter

# Instantiate SAM model
sam_h = SegmentAnythingH(
    device=torch.device("cuda"),
    dtype=torch.float32,
    multimask_output=False,  # Multi-mask output is not supported by HQ-SAM
)
sam_h.load_from_safetensors("tests/weights/segment-anything-h.safetensors")

# Instantiate HQ-SAM adapter, with downloaded and converted weights
hq_sam_adapter = HQSAMAdapter(
    sam_h,
    hq_mask_only=True,
    weights=load_from_safetensors("tests/weights/refiners-sam-hq-vit-h.safetensors"),
)

# Patch SAM with HQ-SAM by “injecting” the adapter
hq_sam_adapter.inject()

# Define the image to segment and the prompt
tennis_image = Image.open("tests/foundationals/segment_anything/test_sam_ref/tennis.png")
box_points = [[(4, 13), (1007, 1023)]]

# Run inference
high_res_masks, *_ = sam_h.predict(input=tennis_image, box_points=box_points)

predicted_mask = tensor_to_image(high_res_masks)
predicted_mask.save("predicted_mask.png")

You should now have generated the following mask (note: the image has been downsized by 50% in postprocessing to fit on GitHub):

(image: predicted_mask.png, the predicted segmentation mask)
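
As a quick visual sanity check (plain Pillow, nothing Refiners-specific; the file names match the snippet above), you can overlay the predicted mask on the input image:

from PIL import Image

image = Image.open("tests/foundationals/segment_anything/test_sam_ref/tennis.png").convert("RGBA")
mask = Image.open("predicted_mask.png").convert("L").resize(image.size)

# Paint the masked region red at 50% opacity on top of the input image.
red = Image.new("RGBA", image.size, (255, 0, 0, 128))
transparent = Image.new("RGBA", image.size, (0, 0, 0, 0))
overlay = Image.composite(red, transparent, mask)
Image.alpha_composite(image, overlay).save("overlay.png")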

A few more things:

  • Refiners' built-in training utils can be used to train/fine-tune HQ-SAM (see the 101 guide for an overview);

  • Adapters compose easily in Refiners, e.g. you may experiment with injecting and training LoRAs in various parts of SAM (see the sketch after this list).
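
To make that concrete, here is a minimal, self-contained PyTorch sketch of what "injecting a LoRA" boils down to: wrapping a frozen Linear layer with a trainable low-rank update. This is purely illustrative and is not Refiners' adapter API (Refiners ships its own LoRA adapters; see its docs):

import torch
from torch import nn

class LinearWithLora(nn.Module):
    """Wrap a frozen Linear layer with a trainable low-rank (LoRA) update."""

    def __init__(self, linear: nn.Linear, rank: int = 4, scale: float = 1.0) -> None:
        super().__init__()
        self.linear = linear.requires_grad_(False)  # freeze the base weights
        self.down = nn.Linear(linear.in_features, rank, bias=False)  # A: d_in -> r
        self.up = nn.Linear(rank, linear.out_features, bias=False)   # B: r -> d_out
        nn.init.zeros_(self.up.weight)  # standard LoRA init: starts as a no-op
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x) + self.scale * self.up(self.down(x))

# Usage sketch: adapt a single layer, then train only the LoRA parameters.
adapted = LinearWithLora(nn.Linear(64, 64), rank=8)
out = adapted(torch.randn(2, 64))
trainable = [p for p in adapted.parameters() if p.requires_grad]  # just down/up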

Feedback welcome!
