
Create well-defined constraints for config NLP/update readme config #892

Open
Keyrxng opened this issue Nov 19, 2023 · 6 comments

Comments

Keyrxng (Contributor) commented Nov 19, 2023

Relates to #889

The /config command will benefit greatly from having a list of do's and don'ts for the various bot config settings. The current trouble is that the only thing it knows about a parameter is what it can infer from the key and what type the value should be, but not the implications of changing any one value.

This will also help hunters/partners better understand the configuration.

The current readme configuration section is slightly out of shape and doesn't meet the requirements I'm describing here.

Higher order items (imo):

  • keys
  • publicAccessControl
  • both multipliers
  • labels: both time and priority as it determines payout
  • fundExternalClosedIssue ?
  • incentives and each of its children

@rndquu what's the situation with NFT permits, I assume they'll have their own config section or are we piggybacking the current options like permitPrice = amount == 1 for 1 POAP?

Example
  1. Without an OpenAI key, all AI features are disabled (/ask, /config, /review, incentives, issue relevance, etc.). The key should be set only in a private org config repo, or in the current repo if it is private.
  2. Label naming must follow a strict schema of Time: <1 Hour & Priority: 1 (Normal) or it will break pricing functionality.
  3. Permit price is parsed at execution time so only USD values are accepted.
  4. Care should be taken with assistivePricing == true and arbitrary multipliers: pricing is automatic, based on label time & priority, and will update issues across the board once set.
  5. Incentive element types are restricted to {validHTMLElements[]}
  6. If taskFollowUpDuration > taskDisqualifyDuration then the functionality breaks.
  7. A good equilibrium for timers is: EG: 2 days, 5 days, 10 days, 20 days

$$ \text{reviewDelayTolerance} \leq \text{taskFollowUpDuration} < \text{taskDisqualifyDuration} < \text{taskStaleTimeoutDuration} $$
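A minimal check of that ordering (a hypothetical helper, assuming all durations are normalized to the same unit, e.g. days):

```python
def validate_timers(review_delay_tolerance, task_follow_up_duration,
                    task_disqualify_duration, task_stale_timeout_duration):
    """Check the suggested ordering constraint for the bot's timers.

    All durations are assumed to share one unit (e.g. days). Returns True when:
    reviewDelayTolerance <= taskFollowUpDuration < taskDisqualifyDuration < taskStaleTimeoutDuration
    """
    return (review_delay_tolerance <= task_follow_up_duration
            < task_disqualify_duration
            < task_stale_timeout_duration)

# The suggested equilibrium (2, 5, 10, 20) passes; swapping
# follow-up and disqualify durations violates the constraint.
assert validate_timers(2, 5, 10, 20)
assert not validate_timers(2, 10, 5, 20)
```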

These are just a few off the top of my head but I need more input.


I pulled molecula451's readme and tried to keep it mostly the same in regards to updating the actual readme; the list of constraints can be added there too, or used only for the /config command, whatever works.

keys contains optional keys for different services:

  • evmPrivateEncrypted is an optional encrypted private key for EVM-compatible networks used for payouts.
  • openAi is the OpenAI key; without it all AI features are disabled (/ask, /config, /review, incentives, issue relevance, etc.). It should be set only in a private org config repo, or in the current repo if it is private.

features are settings related to the bot's features:

  • assistivePricing can be true or false, to create a new pricing label if it doesn't exist.
  • defaultLabels is an array of default labels applied to issues. Empty by default.
  • newContributorGreeting settings for greeting new contributors, including:
    • enabled can be true or false.
    • header is the greeting message header.
    • displayHelpMenu can be true or false to display a help menu.
    • footer is the greeting message footer.
  • publicAccessControl settings for public access, including:
    • setLabel can be true or false for setting labels.
    • fundExternalClosedIssue can be true or false for funding externally closed issues.

timers are settings for various time-related aspects of tasks:

  • reviewDelayTolerance is the tolerance for delay in reviews targeting reviewers.
  • taskStaleTimeoutDuration is the duration after which a task is considered stale.
  • taskFollowUpDuration is the duration for the bot to follow-up on tasks with assignees.
  • taskDisqualifyDuration is the duration after which an assignee can be disqualified.

payments settings related to payments:

  • maxPermitPrice is the maximum price for automatic payouts. EG: 50 == $50
  • evmNetworkId is the ID of the EVM-compatible network for payouts.
  • basePriceMultiplier is the base multiplier for calculating task prices.
  • issueCreatorMultiplier is the multiplier for calculating rewards for the issue creator.

incentives defines incentive rewards:

  • comment for comment rewards, including:
    • elements with values for HTML elements like p, img, a.
    • totals defining rewards for characters, words, sentences, paragraphs, and comments.

labels are settings for task labels:

  • time is an array of time labels for tasks. EG: Time: <1 Hour or Time: <1 Week etc.
  • priority is an array of priority labels for tasks. EG: Priority: 1 (Normal) or Priority: 5 (Emergency) etc.

miscellaneous settings include:

  • maxConcurrentTasks is the maximum number of tasks assignable at once. This excludes tasks with delayed or approved pull request reviews.
  • promotionComment is a message appended to payment-related comments.
  • registerWalletWithVerification can be true or false for wallet verification requirement.
  • openAiTokenLimit is the token limit for OpenAI API usage.

disabledCommands is an array of commands that should be disabled. Empty means all commands are active.
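A few of the constraints above could be sketched as a validator over the parsed config (key names follow the readme sections; the validator itself and its messages are illustrative, not the bot's actual checks):

```python
def check_config(config):
    """Illustrative constraint checks over a parsed bot config dict."""
    errors = []
    timers = config.get("timers", {})
    # The follow-up timer must fire before the disqualify timer.
    if timers.get("taskFollowUpDuration", 0) > timers.get("taskDisqualifyDuration", 0):
        errors.append("taskFollowUpDuration must not exceed taskDisqualifyDuration")
    payments = config.get("payments", {})
    # Permit prices are parsed as USD amounts.
    if payments.get("maxPermitPrice", 0) < 0:
        errors.append("maxPermitPrice must be a non-negative USD amount")
    # Without an OpenAI key, all AI features are disabled.
    if "openAi" not in config.get("keys", {}):
        errors.append("warning: no OpenAI key, AI features (/ask, /config, /review) disabled")
    return errors

cfg = {
    "keys": {"evmPrivateEncrypted": "..."},
    "timers": {"taskFollowUpDuration": 5, "taskDisqualifyDuration": 10},
    "payments": {"maxPermitPrice": 50},
}
# Only the missing-OpenAI-key warning fires for this config.
```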

0x4007 (Member) commented Nov 20, 2023

assume they'll have their own config section

I think this makes sense. It just needs to be a boolean basically. Ideally the POAP should specify the GitHub organization name, repository name, issue number, and contribution role (i.e. issuer, assignee, commenter)

rndquu (Member) commented Nov 20, 2023

@rndquu what's the situation with NFT permits, I assume they'll have their own config section or are we piggybacking the current options like permitPrice = amount == 1 for 1 POAP?

Not sure what the final decision will be, but it seems like a new bot config param like use-nft-rewards solves the issue. So if use-nft-rewards = true then the bot generates an additional NFT permit alongside our standard cash reward permit.
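A minimal sketch of how such a toggle could gate reward generation (the param name comes from the comment above; the helper and permit shape are hypothetical):

```python
def generate_permits(config, cash_permit):
    """Hypothetical: gate an extra NFT permit on a use-nft-rewards flag."""
    permits = [cash_permit]  # the standard cash reward permit always applies
    if config.get("use-nft-rewards", False):
        # The NFT permit reuses the issue metadata of the cash permit.
        permits.append({"type": "nft", "issue": cash_permit["issue"]})
    return permits
```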

Ideally the POAP should specify the GitHub organization name, repository name, issue number, and contribution role

Those values (organization name, repository name, etc...) can be derived from the GitHub event context, so it doesn't make sense to move them into the bot's config unless a partner wants to overwrite them.

0x4007 (Member) commented Nov 20, 2023

  • Yes perhaps poap-rewards: true
  • That's what I meant. I was just trying to provide additional context, but I see how it can be confusing as part of my response. No need for an override; we can just get it from the Context as you suggest and make sure to pass it to our on-chain minting method, which should support encoding this metadata.

Regarding the POAP image, at some point in the future we can use https://github.com/transitive-bullshit/puppeteer-render-text

Keyrxng (Contributor, Author) commented Nov 23, 2023

I appreciate you taking the time with this for me cheers! ^

For the sake of development I'm going to assume that the partner has a solid understanding of the bot config, its parameters, and the implications of setting any one value (they likely will, as they'll have had to set things up: docs, readmes, etc.). This is a safe assumption.

Assuming anything else about the user I think is unsafe.

I struggle with the following:

  • Your prompt is specific and requires HTML knowledge and the experience you have, which I (and they) may lack.

  • Supplying the LLM with that specific knowledge: your relevance scoring is basically the backbone of everything, and by nature it is based on the subjective importance of each comment, which will vary from call to call but be relatively consistent.

  • Scoring is deterministic, and for me to help the LLM understand we need to 'tell' it a rough but comprehensive scoring system that is objective by nature. What I mean by that is, like you said, "comments written with lists generally are higher quality (i.e. more informative and expressive) than those without".

So taking for example your comment about <blockquote> being a bad thing: contextually it may be a positive. Thinking in terms of commands like /config, research comments should be priced higher than others, and in that scenario a blockquote would be a good thing.

A workaround for this could be to say that <cite> would replace the <blockquote> weight, but then this would mean formatting various 'template weights' for the sake of the LLM, and it may have to be highlighted (I expect research folks often use <cite>, whereas I have maybe 3 times ever).

So I can't decide whether the template approach would be best. It would help determinism with non-key-specific commands, but it would require that I inject a lot of bias into things, which may not be in line with your experience and/or what the user thinks/wants. (I presume that we won't be restricting command input, but for commands like this, which are non-deterministic, will a best-practices or do's-and-don'ts guide be written once we have live results? You mentioned results from incentive scoring being suggested, that's why I ask.)

Basically I'm trying to understand whether injecting my own personal bias into the mix would, while distinctly different, impact the intention of the original assessment, as I assume that in the 'black box' the model is likely applying its own sort of 'weight' to tags as a form of quantifying a comment's worth.

  • That train of thought brought me to the question: if the bot is committing straight to the main branch, and it's applying its own bias towards general/blanket statements (and we are not assuming the user will avoid overestimating like I did, rather than providing a specific change like you did), then again, is it safe to be committing straight to the main branch?

So obviously covering every tag is out of the question. I've tried to cover a few here, but imposter syndrome is kicking in so I'm rallying the troops 🤣 If any need added please do, but more importantly include the contextual reason why that tag should/shouldn't have a higher weight.

- `<ul>`, `<ol>`: Typically more informative than those without.
- `<cite>`: Referencing sources, indicating well-researched content.
- `<a>`: External links provide additional context or evidence.
- `<img>`: Visual aids for clearer explanations, they speak a thousand words.
- `<code>`, `<pre>`: Essential for technical discussions, indicating engagement with code.
- `<strong>`, `<em>`: Highlighting critical aspects of the discussion increasing engagement.
- `<table>`: Systematic data or comparison presentation, can be tiresome if extensive.
- `<kbd>`: Representing keyboard inputs in technical guides.
- `<samp>`: Demonstrating expected results or issues in code.
- `<blockquote>`: Typically refers to another user's comment (contextual?).
- `<hr>`: Thematically separating comment sections, indicating clear sections and thoughtfulness.
- `<dl>`, `<dt>`, `<dd>`: Structuring definitions or explanations.
- `<h1>`-`<h6>`: Organizing lengthy discussions with headings.
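The list above could be sketched as a weight table consumed by a simple scorer (the numbers are placeholders, not recommendations; the real weights would come from partner configuration):

```python
from html.parser import HTMLParser

# Placeholder weights per HTML tag, loosely following the list above.
TAG_WEIGHTS = {
    "ul": 1.0, "ol": 1.0, "cite": 2.0, "a": 1.5, "img": 1.5,
    "code": 2.0, "pre": 2.0, "strong": 0.5, "em": 0.5,
    "blockquote": 0.0,  # contextual: quoting others adds little new content
}

class CommentScorer(HTMLParser):
    """Sum the configured weight of every opening tag in a comment."""
    def __init__(self):
        super().__init__()
        self.score = 0.0

    def handle_starttag(self, tag, attrs):
        self.score += TAG_WEIGHTS.get(tag, 0.0)

def score_comment(html):
    scorer = CommentScorer()
    scorer.feed(html)
    return scorer.score

# A comment with a list and a code block scores higher than a bare blockquote.
```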

P.S: In writing these I realised that I never use HTML tags, only markdown. Are comments with markdown, as opposed to tags, included? When editing a comment it remains in markdown form, but when we parse the HTML does GitHub convert those markdown objects into their respective HTML tags?

P.P.S: For my purposes:

:octocat: <-- is actually :octocat: but displayed within an img tag, so GitHub emojis get tagged as an img
😊 <-- is actually the unicode char, which would be displayed as &#x1F60A;
gifs have to be uploaded (they can't be embedded) and are just links


0x4007 (Member) commented Nov 24, 2023

  1. Challenge with Prompt Specificity: I presume that ChatGPT understands the intent behind each HTML entity type, so I think that with something like "/config credit list items $1 each" it would know to target <li>.

  2. Relevance Scoring System: this is determined by the partner, that is why we designed this to be all configurable. However I personally will determine the "recommended default settings" based on my testing with our own repositories.

  3. Contextual Use of HTML Tags: this is handled by the configuration. Configurations are determined by admins of partner organizations. We don't need to complicate this further.

  4. Template Approach vs. Personal Bias: this point is not clear to me. I don't think we should make templates. I think instead we should experiment with providing a great prompt with sufficient context, in order to make this as ergonomic as possible to use. Unless it is fairly stable, we will need to rely on manually updating the yml.

  5. Concern about Direct Committing to Main Branch: it must be really stable before this is enabled.

  6. HTML Tags Weighting Discussion: the partners determine the scoring of the html entities.

  7. Markdown vs. HTML in Comments: under the hood we convert markdown to html because it is less ambiguous inside of the code to use html entities vs markdown symbols.
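Point 7 can be illustrated with a toy example of why HTML is less ambiguous than markdown: "**" serves as both bold-open and bold-close, while HTML has distinct <strong> and </strong> tags. A real markdown renderer handles the full grammar; this regex covers only simple, non-nested bold and is purely illustrative:

```python
import re

def bold_md_to_html(text):
    """Toy converter: markdown bold to explicit HTML tags (non-nested only)."""
    return re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)

# bold_md_to_html("a **b** c") -> "a <strong>b</strong> c"
```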

Keyrxng (Contributor, Author) commented Nov 24, 2023

100% agree with you there; I was coming from the perspective of not having your contextual understanding of why blockquotes are typically bad, li good, etc. I'll assume that when invoking the command users will be specific about which properties/entities to change and what value they should be, instead of the broad-strokes prompts I have been thinking of.

I've been thinking of the command as a helper to the user, a how-to/best-practice-in-setup sort of thing, but I realise now I've been wrong in thinking that. It's just a QOL utility for updating the config on the fly; being direct and knowing what it is you want to change is how it'll be used, not abstract statements.

Now that I fully understand the intention behind it, I know what needs to be done.

3, 4, 6. Again, my mindset has been wrong I see that now.

  1. Good to know

I'll assume that the use of this command will be variations of this style, where the user input is always direct, specifying exactly what should be changed. I did begin testing with input like this but thought that broad strokes were the intention behind the command.

`/config features.assistivePricing=true features.defaultLabels=["label1", "label2"]`

That's a clear way to set config. It's cool to also have that, but to make it more ergonomic to use.
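A sketch of how such direct key-path assignments could be parsed into a nested config update (the parser is illustrative, not the bot's actual implementation; values are decoded with JSON rules so true/false, numbers, and arrays work):

```python
import json

def parse_config_command(command):
    """Parse '/config key.path=value ...' into a nested dict (illustrative)."""
    updates = {}
    for pair in command.split()[1:]:  # drop the leading '/config'
        path, raw = pair.split("=", 1)
        try:
            value = json.loads(raw)  # handles true/false, numbers, arrays
        except json.JSONDecodeError:
            value = raw  # fall back to the bare string
        node = updates
        *parents, leaf = path.split(".")
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return updates
```

Note the whitespace split means array values must be written without spaces (e.g. `["label1","label2"]`); a real parser would tokenize more carefully.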

Generally I like the AI's idea.
