Dataset issues #1

Open
qiljj opened this issue Dec 10, 2022 · 11 comments

Comments

qiljj commented Dec 10, 2022

Hello, I have a question about the dataset. After opening the .pkl file, I found that the content in the middle is compressed, and I don't know how to deal with it. I also see that the dataset contains text; could you upload the dataset you have processed? Thank you very much.

Collaborator

easezyc commented Dec 10, 2022

The dataset as released is already processed; after unzipping it, you can run the program directly.

Author

qiljj commented Dec 11, 2022

数据集问题.pdf
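1. Then how do you open this .pkl file? I opened it in PyCharm, and both UTF-8 and GBK give garbled text.
2. Also, how is the text fed into the model? Shouldn't everything that runs through the model be numeric?
Figure 2 shows my attempt to convert the pkl file to a txt file; the content in the middle was elided.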

Collaborator

easezyc commented Dec 12, 2022

1. See the pickle library; the code also demonstrates how to open the file with pickle.
2. The code includes a part that converts tokens to indices.
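
The two pointers above can be sketched as follows. This is only an illustration, not the repository's code: the file name `sample.pkl`, the record layout (`content`, `label`), the whitespace tokenizer, and the `build_vocab` and `encode` helpers are all assumptions, and the repository's own tokenizer may differ. A .pkl file is a binary serialization, so it has to be read with `pickle.load` in binary mode rather than opened as text in an editor.

```python
import os
import pickle
import tempfile

# Hypothetical records standing in for the real dataset.
records = [
    {"content": "breaking news about the market", "label": 1},
    {"content": "the market is calm", "label": 0},
]

# Write and re-read a pickle; note the binary modes "wb" / "rb".
path = os.path.join(tempfile.mkdtemp(), "sample.pkl")
with open(path, "wb") as f:
    pickle.dump(records, f)
with open(path, "rb") as f:
    loaded = pickle.load(f)


def build_vocab(texts):
    """Map each distinct token to an integer index; 0 = padding, 1 = unknown."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for text in texts:
        for token in text.split():
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(text, vocab, max_len):
    """Convert text to a fixed-length list of integer indices."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in text.split()][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


vocab = build_vocab(r["content"] for r in loaded)
encoded = [encode(r["content"], vocab, max_len=6) for r in loaded]
print(encoded)  # [[2, 3, 4, 5, 6, 0], [5, 6, 7, 8, 0, 0]]
```

The model then only ever sees the integer lists; the text itself is used just to look up indices.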

Author

qiljj commented Jan 7, 2023

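(Attached images: Figure 2, Figure 1)
1. I followed your code for reading the pkl file. Figure 1 shows the code I extracted to convert the pkl file to a txt file, but much of the content in the resulting txt file is elided (Figure 2). I really don't know how to handle this.
2. Suppose I successfully convert the pkl file to a txt file: content and style_feature are then in the same document. Is further processing needed? I have seen datasets that contain only the labels obtained after processing. I did not see this kind of processing in your code, so do I need to process the data myself?
Many thanks.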

Author

qiljj commented Jan 7, 2023 via email

Author

qiljj commented Jan 7, 2023 via email

Collaborator

easezyc commented Jan 7, 2023

A simple approach: read the pkl, iterate over every sample, convert each one to JSON format, and save the JSON.
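
A minimal sketch of that suggestion, under stated assumptions: the pickle is taken to hold an iterable of dict-like samples, and the file names and the `pkl_to_json` helper are hypothetical. Serializing each sample with `json` writes every field in full, whereas writing the printed form of a large object to a txt file can abbreviate it with "...", which may be the elision described earlier in this thread.

```python
import json
import os
import pickle
import tempfile


def pkl_to_json(pkl_path, json_path):
    """Load a pickle, iterate over its samples, and save them as JSON."""
    with open(pkl_path, "rb") as f:
        data = pickle.load(f)
    # Assumption: `data` is an iterable of dict-like samples.
    samples = [dict(sample) for sample in data]
    with open(json_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Chinese text readable in the output file.
        json.dump(samples, f, ensure_ascii=False, indent=2)
    return len(samples)


# Demo with hypothetical data, so the sketch runs on its own.
workdir = tempfile.mkdtemp()
pkl_path = os.path.join(workdir, "demo.pkl")
json_path = os.path.join(workdir, "demo.json")
demo = [
    {"content": "示例文本", "label": 0},
    {"content": "another sample", "label": 1},
]
with open(pkl_path, "wb") as f:
    pickle.dump(demo, f)

count = pkl_to_json(pkl_path, json_path)
with open(json_path, encoding="utf-8") as f:
    restored = json.load(f)
```

If the unpickled object turns out to be a pandas DataFrame rather than a list, `df.to_dict(orient="records")` would produce a list of per-row dicts that can be passed to `json.dump` in the same way.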

Author

qiljj commented Jan 19, 2023

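1. I have converted the whole dataset to JSON format. Was the earlier pkl format an intermediate result (no need to answer)? Is the JSON format what is ultimately used to train the model?
2. The file m3fend_oneloss_param.txt under ./logs/param is empty, but that directory is used here: parser.add_argument('--param_log_dir', default = './logs/param')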

Author

qiljj commented Feb 2, 2023

Hello, you wrote that you used pytorch>1.0. Would you mind sharing the exact PyTorch version and GPU model you used? Thanks.

Collaborator

easezyc commented Feb 3, 2023

I think it was PyTorch 1.6, and the GPU was a V100.

@Hashirnihal

Hi sir,
I am a student from NITPY, India.
I am doing my micro project on this paper.
While running your code, I get an error. Could you please help me resolve it?
The error is shown below.

lr: 0.0001; model name: m3fend; batchsize: 64; epoch: 50; gpu: 1; domain_num: 3
Traceback (most recent call last):
File "main.py", line 112, in <module>
Run(config = config).main()
File "C:\Users\hashi\OneDrive\Desktop\Project\Fake news\grid_search.py", line 134, in main
trainer = M3FENDTrainer(emb_dim = self.emb_dim, mlp_dims = self.mlp_dims, use_cuda = self.use_cuda,
TypeError: __init__() got an unexpected keyword argument 'dataset'
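
The TypeError above means the trainer's constructor was called with a `dataset` keyword that its `__init__` does not declare. One way to confirm which keywords a class accepts is to inspect its signature. The sketch below uses a hypothetical stand-in class, `DemoTrainer`, and made-up argument values, since the real `M3FENDTrainer` signature is not shown in this thread; `unexpected_kwargs` is likewise an illustrative helper.

```python
import inspect


class DemoTrainer:
    """Hypothetical stand-in; not the repository's M3FENDTrainer."""

    def __init__(self, emb_dim, mlp_dims, use_cuda):
        self.emb_dim = emb_dim
        self.mlp_dims = mlp_dims
        self.use_cuda = use_cuda


def unexpected_kwargs(cls, kwargs):
    """Return the keyword names that cls.__init__ does not accept."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter would accept any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in kwargs if name not in params]


# Made-up values for illustration only.
call_kwargs = {"emb_dim": 768, "mlp_dims": [384], "use_cuda": True, "dataset": "ch"}
extra = unexpected_kwargs(DemoTrainer, call_kwargs)
print(extra)  # ['dataset']

# Calling the constructor with the extra keyword reproduces the same kind of error.
try:
    DemoTrainer(**call_kwargs)
except TypeError as err:
    message = str(err)
```

Running the same check against the actual trainer class would show whether the calling script passes arguments that the installed version of the class does not define.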
