allow adding header and footer, autoconfirm, add js-tiktoken, add faster check for going to file-only mode #343

Open: wants to merge 10 commits into base: master

elee1766

lots of changes in this PR that i've been making for myself as i've been using the tool.

i don't really think it's ready to merge, but thought i would put it here.

it also fixes the punycode warning.

refactor: remove unused fs operations from esbuild.config.js
chore: update dependencies and add overrides in package.json
docs: update commit message guidelines in prompts.ts
refactor: switch to js-tiktoken for token encoding in tokenCount.ts
chore(config): add new config keys for prompt header and footer

style: reformat code for better readability and consistency across files

refactor(prompts.ts): restructure prompt generation for clarity and maintainability

chore: update dependencies and improve environment variable handling

style(server.ts): rename port variable to uppercase for consistency and clarity

chore(server.ts): add support for configurable port via process.env.PORT

style(engine.ts, tokenCount.ts, version.ts): standardize import quotes to single quotes for consistency
… directory

chore(package.json): add tiktoken dependency

style(config.ts, prompts.ts): fix formatting and indentation issues

refactor(tokenCount.ts): add native token counting using Tiktoken and conditional export based on environment variable
chore(commit.ts): add check for empty remotes array before selecting remote

perf(tokenCount.ts): move encoding initialization inside function to reduce memory usage
Introduce OCO_GIT_STAGE_ALWAYS config key to bypass user confirmation
for staging all files. This enhances automation and streamlines the
commit process when the config is set to true.
refactor(commit.ts): add auto-confirmation for commit and stage actions

refactor(config.ts): rename OCO_GIT_STAGE_ALWAYS to OCO_AUTOCONFIRM_STAGE and add OCO_AUTOCONFIRM_COMMIT

fix(openAi.ts): handle undefined message content in token count

style(prompts.ts): move OCO_PROMPT_FOOTER to the end of the prompt

style(mergeDiffs.ts): adjust import statement and spacing in token count check
@di-sukharev
Owner

if you want this to be merged at some point, please describe the changes; this will help me review the code

@elee1766
Author

elee1766 commented May 24, 2024

i have some more changes and cleanup i would want to make, as i added the features just to make them work for me, and i wouldn't want to merge half-baked hacks into your repo.

if i find some time to make this merge ready, i will probably redo much of it, because i am not happy with the hacky solutions i came up with. regardless, i'll still detail what i did in case it might be useful for your own development

FWIW, you can fix the punycode warning by adding this version override in package.json. this is probably the most annoying issue for a normal user, and it is easy to fix:

  "overrides": {
    "whatwg-url": "13.0.0"
  },

otherwise, these changes solve two issues with this app for my usage:

  1. tokenization is run a lot in order to get token counts, and that's a rather slow process. i regularly have pretty large diffs that i want it to try to summarize, and the process would lock up doing token counts while trying to figure out (a) whether the diff needs to be split up file by file, and (b) whether a single file is itself too big.

i monkey patch this by adding a function that estimates the token count from the character count, but i think this is a bad solution and it probably shouldn't be merged as it currently stands. it is good enough for my use cases. for instance, if the diff is 20,000 characters, it most likely will not fit in a 1000 token context.
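
A minimal sketch of the character-count shortcut described above, not the code from this PR. The names `estimateTokenCount` and `clearlyExceedsBudget`, the ratio of roughly 4 characters per token, and the safety factor of 2 are all assumptions chosen for illustration. The idea is to skip the slow exact tokenization only when the estimate is far over the budget, and fall back to the real tokenizer otherwise.

```typescript
// Hypothetical helpers; none of these names come from the PR.
// ASSUMPTION: ~4 characters per token is a rough rule of thumb for
// English-like text. Real ratios vary by tokenizer and content.
const CHARS_PER_TOKEN = 4;

// Cheap estimate: no tokenizer involved, just the string length.
function estimateTokenCount(
  text: string,
  charsPerToken: number = CHARS_PER_TOKEN
): number {
  return Math.ceil(text.length / charsPerToken);
}

// Returns true only when the estimate is well over the budget, so the
// caller can go straight to file-by-file mode without exact counting.
// ASSUMPTION: safetyFactor = 2 leaves margin for estimation error.
function clearlyExceedsBudget(
  diff: string,
  maxTokens: number,
  safetyFactor: number = 2
): boolean {
  return estimateTokenCount(diff) > maxTokens * safetyFactor;
}
```

With these assumed values, a 20,000 character diff is estimated at 5,000 tokens, which is well past a 1000 token budget, so the exact count can be skipped. Diffs near the limit would still need the real tokenizer.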

  2. i need to generate semantic tags that do not use the keywords fix or feat, as these trigger my automated CI, and i don't want the ai commits to decide when releases happen.

my hack was to add PREFIX/SUFFIX env vars to patch the prompt for the behavior i wanted, but i feel this is a bad solution, and really the entire prompt builder and the configuration around prompt generation need to be redone for me to feel happy about merging. i'm happy with my hack for personal use, but i really don't think it is something that people should use. the way the prompt strings are created in the existing library is a mess: the repeated code across the two different prompt types really bothers me, along with the lack of customization around the prompt in general.
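
A rough sketch of the header/footer idea, assuming the extra text is simply placed before and after the generated prompt. `OCO_PROMPT_FOOTER` appears in the commit list above; `OCO_PROMPT_HEADER`, the `PromptEnv` type, and the `wrapPrompt` function are hypothetical names used only for illustration and are not taken from the PR's code.

```typescript
// Hypothetical shape of the relevant environment variables.
// ASSUMPTION: OCO_PROMPT_HEADER is a guessed name; only the footer key
// is named in the commit messages.
type PromptEnv = {
  OCO_PROMPT_HEADER?: string;
  OCO_PROMPT_FOOTER?: string;
};

// Wraps a base prompt with optional header and footer text.
// Unset or blank values are skipped so the base prompt is unchanged
// when nothing is configured.
function wrapPrompt(basePrompt: string, env: PromptEnv): string {
  return [env.OCO_PROMPT_HEADER, basePrompt, env.OCO_PROMPT_FOOTER]
    .filter(
      (part): part is string =>
        typeof part === 'string' && part.trim().length > 0
    )
    .join('\n\n');
}
```

For the use case above, the footer could carry an instruction such as "Never use the fix or feat prefixes", and in Node the function could be called with `process.env` as its second argument.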
