Fine-tuning process #43

Open
smruti241 opened this issue Sep 12, 2023 · 4 comments

Comments

@smruti241

Hi @Zhihan1996, thanks for providing the code for fine-tuning DNABERT-2. However, there is no mention of how to generate dev.csv, test.csv, and train.csv from our own dataset, or how to assign the labels 1 and 0 to the sequences. Can you please let me know how to do that?

@TheRainInSpain

I generated the CSV files using the Python csv library, writing each sequence and its label into one row. But when I ran the code, an error occurred (described in #42). I wonder whether this is the right way to generate the CSV files.
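
For reference, the approach described above (one sequence and one label per row, written with the csv library) could look roughly like the sketch below. This is only an illustration and not the maintainers' script. The `sequence,label` header row, the 80/10/10 split, and the helper names `write_split` and `make_splits` are assumptions, and the toy sequences and labels are made up. In a real dataset the label would come from a known property of each sequence, such as its annotated class, not from random assignment.

```python
import csv
import random
from pathlib import Path


def write_split(path, rows):
    """Write (sequence, label) pairs to a CSV file, one pair per row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # Assumed header; check it against the sample data shipped with the repo.
        writer.writerow(["sequence", "label"])
        writer.writerows(rows)


def make_splits(examples, out_dir, dev_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle the examples and write train.csv, dev.csv, and test.csv."""
    rows = list(examples)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split reproducible
    n_dev = int(len(rows) * dev_frac)
    n_test = int(len(rows) * test_frac)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    splits = {
        "dev.csv": rows[:n_dev],
        "test.csv": rows[n_dev:n_dev + n_test],
        "train.csv": rows[n_dev + n_test:],
    }
    for name, split_rows in splits.items():
        write_split(out / name, split_rows)
    # Return the number of rows per file as a quick sanity check.
    return {name: len(split_rows) for name, split_rows in splits.items()}


if __name__ == "__main__":
    # Toy data: label 1 for one known class, 0 for the other (illustrative only).
    examples = [("ACGTACGTAC", 1), ("TTGACCGTAA", 0)] * 5
    print(make_splits(examples, "my_dataset"))
```

Calling `make_splits` with a list of `(sequence, label)` tuples writes the three files into the given folder. For a multi-class task, the labels would be integer class ids instead of 0 and 1.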

@smruti241
Author

smruti241 commented Sep 13, 2023

But how did you label the sequences, and based on which parameter? Or did you randomly assign 1 or 0? Assigning labels randomly won't make any sense.

@smruti241
Author

#42

@smruti241
Author

I have COVID data, and DNABERT-2 already has COVID data for fine-tuning. I just want to know how you knew that each sequence should be given 1, 2, 3, 4, etc., up to 9 as a label. Please let me know, @Zhihan1996.
