Support for LIWC 2022's .dicx Format #2183

Open
jonkush opened this issue Sep 1, 2022 · 3 comments

jonkush commented Sep 1, 2022

Requested feature

Thank you for all your work on quanteda. LIWC released a new version of their software in May that uses a new format for its dictionary files, with a new extension (.dicx). The format is similar, so I hope it won't be too difficult to support it with the dictionary() command. I've copied an example of both formats below [first GitHub post, so apologies if anything is off with this post].

#Original format (.dic)
%
1	Correct
2	Incorrect
%
rope*	1
candlestick*	2
candle stick	2
pistol*	2
knife	2
knives	2
pipe*	2

#New format (.dicx)
Entry,Correct,Incorrect
rope*,1,
candle stick*,,1
candlestick*,,1
knife,,1
knives,,1
pipe*,,1
pistol*,,1

Use case

LIWC 2022 offers a very handy dictionary-building tool, but it only exports to .dicx.
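
Until .dicx is supported directly, one possible stopgap is to convert a .dicx file back to the older .dic layout. Below is a minimal sketch in Python (not R), based only on the small example above rather than on any official LIWC specification. It assumes that the first column holds the entry and that any non-empty cell in a category column marks membership; the function names `parse_dicx` and `to_dic` are made up for illustration.

```python
import csv
import io

# Sample copied from the .dicx example in this issue.
DICX_SAMPLE = """Entry,Correct,Incorrect
rope*,1,
candle stick*,,1
candlestick*,,1
knife,,1
knives,,1
pipe*,,1
pistol*,,1
"""


def parse_dicx(text):
    """Map each category name to the list of entries marked for it.

    Assumption: column 1 is the entry, and any non-empty cell in a
    category column marks membership (the sample above uses "1").
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    categories = header[1:]
    result = {name: [] for name in categories}
    for row in reader:
        if not row or not row[0].strip():
            continue  # skip blank lines
        entry = row[0].strip()
        for name, cell in zip(categories, row[1:]):
            if cell.strip():
                result[name].append(entry)
    return result


def to_dic(mapping):
    """Render the mapping in the older tab-separated .dic layout."""
    # Number the categories 1, 2, ... in header order.
    ids = {name: i for i, name in enumerate(mapping, start=1)}
    lines = ["%"]
    lines += [f"{i}\t{name}" for name, i in ids.items()]
    lines.append("%")
    # Collect the category ids for each entry, in first-seen order.
    entry_ids = {}
    for name, entries in mapping.items():
        for entry in entries:
            entry_ids.setdefault(entry, []).append(str(ids[name]))
    lines += [entry + "\t" + "\t".join(v) for entry, v in entry_ids.items()]
    return "\n".join(lines) + "\n"


parsed = parse_dicx(DICX_SAMPLE)
print(to_dic(parsed))
```

The printed text could be saved with a .dic extension and then read with quanteda's existing `dictionary(file = ..., format = "LIWC")`. Whether multi-word entries such as "candle stick*" survive that round trip is untested here.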

Collaborator

kbenoit commented Sep 2, 2022

Thanks - I will get on this asap. I hadn't realised they just issued a new release. Looks like I will need to purchase a license for the new version.

As for the dictionary format, it looks like they went from really bad to slightly bad. Why are we the only ones suggesting something sensible like a YAML format?

@lluaces1

Would love to know if you or someone else has a fix for reading "dicx" in R (terrible format, worse name...)

Collaborator

kbenoit commented Jan 17, 2023

Not yet but it's on the short list. Was hoping someone might take this issue on, but if not, I'll tackle it soon.

Yep, pretty terrible name!

@kbenoit kbenoit added this to the v4 release milestone Apr 12, 2023
@kbenoit kbenoit modified the milestones: v4 release, v4.1 Oct 18, 2023