Structured Generation Prompt Info Bug / Potential Security Issue #525

Open
drone540 opened this issue Feb 25, 2024 · 4 comments

Comments

drone540 commented Feb 25, 2024

I recently updated the package and noticed that a structured generation tab is now the default. I was not a fan of how it presents the data in such a weird way without showing the actual prompts.

I noticed the issue when viewing image information that contains LORAs: there is a lot of blank space in the prompts where the LORAs should be.

Screenshot_20240224_222712

Examining this in the browser inspector shows that the LORAs are being processed as HTML tags, which also means that any valid HTML code can be put into the prompt and will be rendered.

<br data-v-51c7854f="">
<h3 data-v-51c7854f="">Prompt</h3>
<code data-v-51c7854f="">
  <span class="">by Anatoly Metlan and Hikari Shimoda in the style of John Bauer</span>, <span class="">adorable happy woman holding two swords</span>, <span class="">maniacal laughter <lora:happy:0.5>
      <lora:maniacal_laughter:2.0>
        <lora:dual_pistols:0.50></lora:dual_pistols:0.50>
      </lora:maniacal_laughter:2.0>
    </lora:happy:0.5>
  </span>, <span class="">digital painting <lora:digital_painting_envy02:0.60>
      <lora:more_art:0.30>
        <lora:midjourney:-0.15>
          <lora:rmsdxl_enhance:0.50>
            <lora:rmsdxl_creative:0.50>
              <lora:rmsdxl_photo:0.50>
                <lora:rmsdxl_darkness_cinema:0.50>
                  <lora:clothing_slider:2.00>
                    <lora:great_lighting:1.50>
                      <lora:aesthetic:1.50>
                        <lora:photorealistic_portrait:1.50>
                          <lora:extremely_detailed:1.50>
                            <lora:dpo:0.25>
                              <lora:koreangirllora:1></lora:koreangirllora:1>
                            </lora:dpo:0.25>
                          </lora:extremely_detailed:1.50>
                        </lora:photorealistic_portrait:1.50>
                      </lora:aesthetic:1.50>
                    </lora:great_lighting:1.50>
                  </lora:clothing_slider:2.00>
                </lora:rmsdxl_darkness_cinema:0.50>
              </lora:rmsdxl_photo:0.50>
            </lora:rmsdxl_creative:0.50>
          </lora:rmsdxl_enhance:0.50>
        </lora:midjourney:-0.15>
      </lora:more_art:0.30>
    </lora:digital_painting_envy02:0.60>
  </span>
</code>
<br data-v-51c7854f="">
<h3 data-v-51c7854f="">Negative Prompt</h3>
<code data-v-51c7854f="">
  <span class="">poorly drawn</span>, <span class="">deviantart</span>, <span class="">mess</span>, <span class="">low quality</span>, <span class="">blurry</span>, <span class="">doll</span>, <span class="">painted face</span>
</code>

Some kind of basic HTML escaping needs to be implemented for at least these characters:

"<" : "&lt;",
">" : "&gt;",
"&" : "&amp;",
'"' : "&quot;",
"'" : "&#39;",
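
A minimal sketch of how the mapping above could be applied, written in TypeScript. The names `HTML_ESCAPES` and `escapeHtml` are hypothetical and not taken from the project's code; the actual fix may look different. If the view is rendered with Vue (the `data-v-…` attributes suggest so), plain text interpolation already escapes these characters, whereas `v-html` inserts the string as raw HTML, so a helper like this would only be needed where raw HTML is built by hand.

```typescript
// Hypothetical helper, not part of the project's code base.
// The entries mirror the character table listed in the report.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replace every special character in a single pass, so the "&"
// produced by one substitution is never escaped a second time.
function escapeHtml(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const prompt = "maniacal laughter <lora:happy:0.5>";
console.log(escapeHtml(prompt));
// prints: maniacal laughter &lt;lora:happy:0.5&gt;
```

With the angle brackets escaped, the browser shows the LORA tag as text instead of parsing it as an unknown element.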

Edit: Also, are the prompts even supposed to be displayed this way, with the comma separation (example, right, here)? This seemed strange to me, since it serves no purpose except changing colors on mouse-over.

I'd rather it be a normal text box (example, right, here) or, ideally, something similar to Civitai's generation data, maybe with a copy button on the prompt and negative prompt boxes for when copying the text is useful.

zanllp (Owner) commented Feb 26, 2024

You can temporarily switch to the "source text" tab. You only need to switch once, and it will remember the operation.
image

fg-uulm commented Feb 29, 2024

Author of this feature here. I just opened PR #527, which hopefully fixes the issue; thanks for reporting. I will have to keep this in mind for any other place where generation info is used.

As for your comments about the UX side of this feature, I'm open to discussion. The basic idea behind the "tokenized" view was that a lot of people prompt using mainly single words or very short phrases, but a lot of them (see zanllp's screenshot for a good example). So the idea was to subdivide the prompt by highlighting the comma-separated parts to give a better overview, instead of showing just a long string of often poorly formatted text.

If your style of prompts leans more towards longer full sentences, however, then I totally get how this view decreases readability. Maybe it would also be better to make the source text view the default and have the structured data view as the "second" tab.

drone540 (Author) commented Mar 1, 2024

Thanks for the PR addressing the issue.

As for the UX, I'm not really 100% sure how useful the prompt tokens are, unless maybe they did something when clicked on, but I figured the structured view was a chance to improve the readability of the generation information. I do notice that PromptHero has prompt tokens, but you don't really notice them unless you mouse over them, and they are links to search with. Not sure who else has them.

For the prompt and negative prompt, I figured it would look a bit cleaner with maybe a solid background, similar to how Civitai presents it:

Screenshot_20240229_231923

Kinda like this:
Screenshot_20240301_001935

I didn't fully remove the tokens on this example, just added the color of the token background to the code block to show what it might look like.

The commas do not appear as originally typed in the prompt; they are surrounded by extra spaces. If you want to keep the tokens, I'd like the prompt to preserve the original comma spacing if possible.
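
One way the original comma spacing could be preserved is to split the prompt so that each comma, together with whatever whitespace surrounds it, is kept as its own piece. The following TypeScript sketch assumes a hypothetical `splitPrompt` function and `PromptPart` type; neither exists in the project, and it ignores special cases such as commas inside parentheses.

```typescript
// Hypothetical shape for one piece of a split prompt.
interface PromptPart {
  kind: "token" | "separator";
  text: string;
}

// Split on commas but keep each comma plus its surrounding whitespace
// as a separate part. Because the regex has a capturing group,
// String.prototype.split returns the separators at the odd indices.
function splitPrompt(prompt: string): PromptPart[] {
  return prompt
    .split(/(\s*,\s*)/)
    .map((text, index): PromptPart => ({
      kind: index % 2 === 1 ? "separator" : "token",
      text,
    }))
    .filter((part) => part.text.length > 0);
}

const original = "poorly drawn,deviantart ,  mess";
const parts = splitPrompt(original);

// Joining all parts reproduces the prompt exactly as it was typed.
console.log(parts.map((part) => part.text).join("") === original);
// prints: true
```

Only the `token` parts would be wrapped in highlighted elements, while the `separator` parts are emitted unchanged, so copying the rendered text gives back the prompt as typed.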

fg-uulm commented Mar 1, 2024

Thanks for your input! I think we need to separate visual presentation and function here a bit more. I get that the very strong visual subdivision between the tokens seems to be more annoying than useful in certain situations, and the missing "frame" / background seems to make things worse. Also, as a dark mode user, I'm not yet happy with the contrast ratios overall.

I'll play around with it a bit more along the lines of the example in your screenshot, and will also do some more detailed research on the patterns other tools use (CivitAI, PromptHero, ...). I have to admit that I didn't do this before, and just tried to create something that works better for me than the unstructured string of generation data that we had before.

As for the functionality, your comment actually sparked some other ideas about this (just writing them out here for future reference):

  • Maybe a click on a prompt token can lead directly to the search tab, searching for images with the same token in the prompt
  • Maybe hovering the token could provide additional information, like total usage of this token (stats) or the possibility to just copy a single token
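
The "total usage" idea from the list above could be sketched roughly as follows in TypeScript. The function name `countTokenUsage` is hypothetical, and the choice to compare tokens after trimming and lower-casing is an assumption, not something decided in this thread.

```typescript
// Count how often each comma-separated token occurs across prompts.
// Assumption: tokens are compared after trimming and lower-casing.
function countTokenUsage(prompts: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const prompt of prompts) {
    for (const raw of prompt.split(",")) {
      const token = raw.trim().toLowerCase();
      if (token === "") continue; // skip empty pieces, e.g. from ",,"
      counts.set(token, (counts.get(token) ?? 0) + 1);
    }
  }
  return counts;
}

const usage = countTokenUsage([
  "poorly drawn, blurry, doll",
  "low quality, Blurry",
]);
console.log(usage.get("blurry"));
// prints: 2
```

A hover tooltip could then look up the hovered token in the resulting map, and the same normalized token could serve as the search term for the click-to-search idea.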

zanllp added a commit that referenced this issue Mar 2, 2024