problem in dataloader.py #13

Open
DimPenCHEN opened this issue Jan 3, 2024 · 9 comments

Comments

@DimPenCHEN

I have solved the data problems (I think), but I found that "dataloader.py" uses the "label" key in data.json, which does not exist in my data.json:
self.data_complete = self.data_complete[self.data_complete['label']!=2] # label: 0-real, 1-fake, 2-debunk

BTW, I haven't obtained the data directories "vids" and **"ptvgg19_frame_thumb"**, which aren't mentioned in the public dataset.
I wonder how to solve this problem in dataloader.py TwT
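One way to confirm which keys the downloaded file really contains is to count them across all records. The snippet below is a minimal sketch, assuming data.json is stored as one JSON object per line; the helper `count_keys` and the sample records are hypothetical and only stand in for the real file.

```python
import json
from collections import Counter


def count_keys(lines):
    """Count how often each key appears across JSON-lines records."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        counts.update(json.loads(line).keys())
    return counts


# Hypothetical records standing in for data.json (assumption, not real data)
sample = [
    '{"video_id": "a1", "annotation": "真"}',
    '{"video_id": "b2", "annotation": "假"}',
]

key_counts = count_keys(sample)
print("has 'label':", "label" in key_counts)
print("has 'annotation':", "annotation" in key_counts)
```

With the real file, the same helper can be fed an open file handle, for example `count_keys(open("data.json", encoding="utf-8"))`.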

@pp-jia

pp-jia commented Jan 3, 2024

How did you obtain the Dataset? Did you crawl it based on data.json? @DimPenCHEN

@DimPenCHEN
Author

I obtained the data from the public dataset on huggingface.co. I can send you the exact address later (I can't find it right now).
Maybe we can exchange contact information so that we can discuss this project further.

@DimPenCHEN
Author

@pp-jia https://huggingface.co/datasets/MischaQI/FakeSV/tree/main
You can get the data you need from this link.

@pp-jia

pp-jia commented Jan 4, 2024

Thanks, you can contact me via the email address on my homepage. @DimPenCHEN

@TODO-main

I have the same problem. Have you solved it?

@TODO-main

There is no "label" attribute in the supplied "data.json" file.

@andr2w

andr2w commented Apr 8, 2024

The data.json file does not contain the 'label' column.
My way to solve this problem is to modify the SVFENDDataset as follows:

replace_values = {'辟谣': 2, '假': 1, '真': 0}  # '辟谣' = debunk, '假' = fake, '真' = real
self.data_complete['annotation'] = self.data_complete['annotation'].replace(replace_values)
self.data_complete = self.data_complete[self.data_complete['annotation'] != 2]
# self.data_complete = self.data_complete[self.data_complete['label']!=2] # label: 0-real, 1-fake, 2-debunk
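The same idea can be tried outside the dataset class. This is a minimal sketch on a hypothetical three-row DataFrame, not the real data.json. It swaps in `Series.map` and writes the result to a new `label` column, which is a variation on the fix above rather than the code posted in this thread.

```python
import pandas as pd

# Hypothetical rows standing in for data.json (assumption, not real data)
df = pd.DataFrame({
    "video_id": ["a1", "b2", "c3"],
    "annotation": ["真", "假", "辟谣"],  # real, fake, debunk
})

# Same mapping as in the fix: 0-real, 1-fake, 2-debunk
label_map = {"真": 0, "假": 1, "辟谣": 2}

# Build a separate integer column and leave 'annotation' untouched
df["label"] = df["annotation"].map(label_map)

# Drop debunking videos, as the original filter on 'label' intended
filtered = df[df["label"] != 2].reset_index(drop=True)
print(filtered)
```

Because `annotation` keeps its original strings in this variant, any later code that compares `item['annotation']` with `'真'` keeps working unchanged.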

@andr2w

andr2w commented Apr 8, 2024

If you follow the above, be sure to change the following line from

label = 0 if item['annotation'] == '真' else 1

to

label = 0 if item['annotation'] == 0 else 1

Otherwise every sample gets label 1, because after the replacement 'annotation' holds integers and never equals '真'. This causes a significant error during the training phase.
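The effect of the stale comparison is easy to reproduce in isolation. The sketch below uses two hypothetical items whose `annotation` has already been replaced with integers; the function names are illustrative and do not come from the repository.

```python
def label_from_string(item):
    """Original check: compares against the Chinese string for 'real'."""
    return 0 if item["annotation"] == "真" else 1


def label_from_int(item):
    """Updated check: compares against the integer code for 'real'."""
    return 0 if item["annotation"] == 0 else 1


# Hypothetical items after the annotation column was mapped to integers
items = [{"annotation": 0}, {"annotation": 1}]

old_labels = [label_from_string(i) for i in items]
new_labels = [label_from_int(i) for i in items]

print("string comparison:", old_labels)  # every item falls into the else branch
print("integer comparison:", new_labels)
```

With the string comparison, the real video is silently labeled as fake, so the model would only ever see one class.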

@TODO-main

Thank you for your reply, which gave me a good solution to this problem. However, after fixing it, I ran into a new problem. Have you run into it as well?
[image: screenshot of the new problem]
