Allow embeddings and loras to have different names #6053

Open

wants to merge 6 commits into main from allow-embeddings-and-loras-to-have-different-names

Conversation

@brandonrising (Collaborator) commented Mar 26, 2024

Summary

Currently, LoRA folders will only load if the weights file is named precisely pytorch_lora_weights.bin or pytorch_lora_weights.safetensors, and embedding folders will only load if the file is named "learned_embeds.bin". This PR is an attempt to support differently named lora/embedding files when they are installed as a folder. Even though we have this naming convention for the files in the folder, our backend supports many more model types for embeddings, and LoRAs on HuggingFace are rarely named "correctly".
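
As a rough sketch of the relaxed lookup (not the PR's actual diff; the helper name and suffix set below are assumptions), the folder probe could accept any single weight file instead of one hard-coded name:

```python
from pathlib import Path
from typing import Optional

# Hypothetical suffix set; the real probe may accept a different list.
WEIGHT_SUFFIXES = {".safetensors", ".bin", ".pt", ".ckpt"}

def find_weight_file(folder: Path) -> Optional[Path]:
    """Return the lone weight file in a model folder, whatever it is named.

    Returns None when zero or more than one candidate is found, so a caller
    can fall back to the old hard-coded-name behavior.
    """
    candidates = [f for f in folder.iterdir() if f.suffix in WEIGHT_SUFFIXES]
    return candidates[0] if len(candidates) == 1 else None
```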

QA Instructions

Test this by copying one of your existing embeddings into a folder under the same directory in your models. Ensure it's scanned correctly on startup and is usable for generation.

Merge Plan

Can be merged if approved by code owners. @lstein may know some reason why we had this naming requirement. If so, we can look for different ways to handle this.

@github-actions github-actions bot added the python (PRs that change python files) and backend (PRs that change backend files) labels Mar 26, 2024
@brandonrising brandonrising force-pushed the allow-embeddings-and-loras-to-have-different-names branch from 97f0e7b to 3c8c944 Compare March 26, 2024 01:00
```python
)
return TextualInversionCheckpointProbe(path).get_base_type()
super().__init__(files.pop())
```
Collaborator

What if the first model file we find isn't the correct one?

I've seen a few TI models hosted on HuggingFace in diffusers format that have multiple checkpoints in them, presumably representing different stages of training.

Collaborator

> What if the first model file we find isn't the correct one?

Nevermind that, I misread the length check. The other issue still stands.

Collaborator Author

I don't believe any of those checkpoints are actually used for inference, but I could be wrong. This only scans the root directory of the folder, so as long as there's only one model file there, it loads fine. It does move it out of the folder, though.

Collaborator

What I mean is that this logic would appear to fail probing TIs that have multiple model files in the directory.

Some look like this (struggling to find an example, filenames probably aren't accurate):

  • embedding.pt
  • checkpoint-001.pt
  • checkpoint-002.pt

This is a valid TI and the model loader knows how to load it, but I believe the logic in this PR would see the multiple model files and consider it an invalid model.
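
One way to keep such folders probeable (a sketch under assumptions, not code from this PR; the preference order and error type are mine) is to prefer the canonical filename and only fall back when exactly one candidate remains:

```python
from pathlib import Path

def pick_ti_weight_file(folder: Path) -> Path:
    """Prefer HuggingFace's canonical TI filename; otherwise accept a folder
    containing exactly one weight file. Hypothetical helper for illustration."""
    for canonical in ("learned_embeds.safetensors", "learned_embeds.bin"):
        candidate = folder / canonical
        if candidate.exists():
            return candidate
    weights = [f for f in folder.iterdir() if f.suffix in {".safetensors", ".bin", ".pt"}]
    if len(weights) != 1:
        raise ValueError(f"Cannot identify the weight file in {folder}")
    return weights[0]
```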

```python
class TextualInversionFolderProbe(TextualInversionCheckpointProbe):
    def get_base_type(self) -> BaseModelType:
        path = self.model_path / "learned_embeds.bin"
        if not path.exists():
```
Collaborator

I don't like how this results in a "folder" model having a path that points to a file...

Collaborator Author

Yeah, I'm not really sure what the value of storing these as a folder even is. I haven't found any LoRAs or embeddings that have other contextual files that affect generation. It just seems to put us in a position where we have to enforce a folder structure for no reason.

Collaborator

The folder layout was introduced by HuggingFace. TIs usually contain two files: one with the weights and another with the name of the concept.

For LoRAs, the folder is used to give the model a name.
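
For reference, a concepts-library TI folder on HuggingFace typically looks something like this (the extra files vary by repo; this layout is an assumption based on the sd-concepts-library convention):

```
001glitch-core/
├── learned_embeds.bin     # the weights
└── token_identifier.txt   # the concept's trigger token
```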

@RyanJDick (Collaborator) commented Mar 26, 2024

I've totally lost track of all the model probing pathways at this point, but I see that learned_embeds and pytorch_lora_weights are still hard-coded in several places. Do those not conflict with this change?

@brandonrising (Collaborator Author)

> I've totally lost track of all the model probing pathways at this point, but I see that learned_embeds and pytorch_lora_weights are still hard-coded in several places. Do those not conflict with this change?

Yeah, this PR loads all LoRAs and embeddings as single-file checkpoint models. That bypasses all of the post-install checks, since those are tied to folder models.

If you install a model while passing in the model type, that also bypasses the strict naming requirements. Currently there's no way to do that in the UI, but the import API call accepts it.

All that said, I'm not sure whether there are unexpected consequences to treating a "diffusers LoRA/TI" as a single checkpoint model. I couldn't find examples of models requiring a config.json on HuggingFace.

@brandonrising brandonrising force-pushed the allow-embeddings-and-loras-to-have-different-names branch from 3c8c944 to 653a8f6 Compare March 26, 2024 19:10
@lstein (Collaborator) commented Mar 26, 2024

Indeed, most of these are just single-file checkpoints, but HuggingFace gives them all the same file name. The enclosing folder name is used during probing to get a name for the model and to prevent path collisions.
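
A minimal sketch of that convention (a hypothetical helper, not the actual probe code): fall back to the enclosing folder's name whenever the file carries one of the generic HuggingFace names.

```python
from pathlib import Path

# Generic stems that HuggingFace assigns to many different models.
GENERIC_STEMS = {"learned_embeds", "pytorch_lora_weights"}

def display_name_for(weight_file: Path) -> str:
    """Name the model after its enclosing folder when the file name is a
    generic HuggingFace default, so two TIs never both end up named
    'learned_embeds'. Hypothetical helper for illustration."""
    if weight_file.stem in GENERIC_STEMS:
        return weight_file.parent.name
    return weight_file.stem
```

Under a rule like this, the 001glitch-core example shown below would be named 001glitch-core rather than learned_embeds.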

@lstein (Collaborator) left a comment

Here's what I get when I probe one of HuggingFace's concepts-library textual inversions:

```
% probe-model.py 001glitch-core
001glitch-core:{
    "key": "cfe3205d-b881-458f-8be7-99f875452039",
    "hash": "blake3:7e62bfe8ff7c27a2747c146fe86afeda1ba3bba3db75d3644a40fe418b9b3d7e",
    "path": "/home/lstein/invokeai-main/models/sd-1/embedding/001glitch-core/learned_embeds.bin",
    "name": "learned_embeds",
    "base": "sd-1",
    "description": "sd-1 embedding model learned_embeds",
    "source": "/home/lstein/invokeai-main/models/sd-1/embedding/001glitch-core/learned_embeds.bin",
    "source_type": "path",
    "source_api_response": null,
    "cover_image": null,
    "type": "embedding",
    "format": "embedding_file"
}
```

As noted by @psychedelicious, the format is now embedding_file, which is deceptive. But the big issue is that the model is now named "learned_embeds". I confirmed this by installing a HuggingFace concept using its repo_id, and sure enough the system installed a model named learned_embeds. This will make it hard to distinguish one TI from another!

@lstein (Collaborator) commented Apr 15, 2024

@brandonrising Are you still working on this?

@brandonrising (Collaborator Author)

> @brandonrising Are you still working on this?

Yeah, it seemed like an issue that could only really come up from manual manipulation of the models directory, so it's fallen down my priority list. I think we can safely close this PR unless you see value in it somewhere.

@brandonrising (Collaborator Author)

> > @brandonrising Are you still working on this?
>
> Yeah, it seemed like an issue that could only really come up from manual manipulation of the models directory, so it's fallen down my priority list. I think we can safely close this PR unless you see value in it somewhere.

As in, models can't even be installed with different names without manual intervention currently. We probably want to find some way to support different filenames, but it seems like this isn't the right way.
