About code explanation #15

Open
enmengyi opened this issue Aug 30, 2023 · 2 comments
Comments

@enmengyi

I found that the original data format of CommitPackFT looks like this:
[image]

I don't really understand how to use it to fine-tune my model on a code-explanation task, because there seems to be no information about what a given piece of code does.

@Muennighoff
Collaborator

You can fine-tune it to predict the commit subject, which usually explains what the change does, but not what the entire code does.

To get data that explains what the entire code does, you could filter for commits where old_contents is empty. You may then get commit subjects that explain the entire new_contents. We haven't tried this, though, and I'd love to know how well it works.
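
A minimal sketch of that filter, as an illustration only and not something we have run on the real data. It assumes each record is a dict with `old_contents` and `new_contents` as named above, and that the commit subject is stored under a key called `subject` (an assumption here; check the actual column names). The helper name `explanation_pairs` and the output keys `code` and `explanation` are hypothetical.

```python
from typing import Iterable, Iterator


def explanation_pairs(records: Iterable[dict]) -> Iterator[dict]:
    """Yield code/explanation pairs from commits that create a new file.

    A commit is kept only when old_contents is empty, so the commit
    subject is more likely to describe the whole of new_contents.
    """
    for rec in records:
        old = rec.get("old_contents") or ""
        new = rec.get("new_contents") or ""
        # "subject" is an assumed field name for the commit subject.
        subject = (rec.get("subject") or "").strip()
        if not old.strip() and new.strip() and subject:
            yield {"code": new, "explanation": subject}


# Tiny hand-written records, for illustration only (not real dataset rows).
sample = [
    {
        "old_contents": "",
        "new_contents": "def add(a, b):\n    return a + b\n",
        "subject": "Add helper that sums two numbers",
    },
    {
        "old_contents": "x = 1\n",
        "new_contents": "x = 2\n",
        "subject": "Bump x to 2",
    },
]

pairs = list(explanation_pairs(sample))
```

Only the first sample record survives the filter, because the second one modifies an existing file. Subjects of file-creating commits can still be uninformative (for example "Initial commit"), so some extra filtering on the subject text may be needed.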

@enmengyi
Author

Thank you so much!
