
NameError: global name 'Corpus' is not defined #30

Open
nanocombi opened this issue May 12, 2016 · 14 comments
Comments

@nanocombi

nanocombi commented May 12, 2016

After I've installed the latest corpkit (2.1.1), I wanted to parse my corpus (which worked in the previous version - at least for approx. 40% of the texts) and received the message NameError: global name 'Corpus' is not defined. What did I do wrong?
Below you will find the log - hoping that it provides a clue...
Thx for your help!

log-00.txt

@interrogator
Owner

Hey! You didn't do anything wrong, I've forgotten an import statement. Give me an hour or two and I'll fix it up for you.

Let me know if you have any other comments on the tool, too!

@nanocombi
Author

Wow. Thank you for your fast reply! From what I could see so far, corpkit definitely has the potential to replace the established tools that I previously worked with. I am looking forward to challenging this ingenious application ;)

@interrogator
Owner

I've just uploaded version 2.1.2. It seems like everything is working again. Let me know!

Thanks for the positive feedback, too! Be in touch if you have any suggestions.

@nanocombi
Author

Hey. Sorry to bother you again. The app now opens the parser options but prints an UnboundLocalError: local variable 'possible_paths' referenced before assignment
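
For readers unfamiliar with this error: Python raises UnboundLocalError when a function reads a local variable on a code path where it was never assigned. The sketch below is a hypothetical illustration of that pattern, not corpkit's actual code. Only the variable name possible_paths comes from the error message; the function names, the search_dirs argument, and the "/corenlp" suffix are invented for the example.

```python
def find_parser_buggy(search_dirs):
    # Hypothetical: possible_paths is assigned only inside the branch,
    # so an empty search_dirs leaves it unbound.
    if search_dirs:
        possible_paths = [d + "/corenlp" for d in search_dirs]
    return possible_paths  # raises UnboundLocalError when search_dirs is empty


def find_parser_fixed(search_dirs):
    # Assign a default before branching so every code path sees a value.
    possible_paths = []
    if search_dirs:
        possible_paths = [d + "/corenlp" for d in search_dirs]
    return possible_paths
```

Giving the variable a default before the branch is usually the smallest fix for this class of bug.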

@interrogator
Owner

Whoops. OK, hopefully fixed in 2.1.3.

P.S. Are you using the auto-update feature, or redownloading the app? I'm curious if the automatic update is working properly on others' machines.

@nanocombi
Author

Cheers! I am forced to redownload. A reason for that might be that I am working on a non-locally administered university workstation.

@interrogator
Owner

Yeah, that makes it a bit more tricky. Oh well.

Let me know if you can now parse a corpus/interrogate it, so I can close the issue.

Also, I'm wondering about what you said before, that parsing worked "at least for approx. 40% of the texts". Can you elaborate?

@interrogator
Owner

@nanocombi A few updates in the last few days. Let me know how it goes!

@nanocombi
Author

nanocombi commented May 18, 2016

I am afraid it still doesn't work. You'll find the log attached. The error reads: OSError: [Errno 13] Permission denied: '/data'
log-01.txt

@interrogator
Owner

I haven’t looked closely at this in the code, but the problem may be that you aren’t in a project when you add/parse the corpus.

Does it still appear if you make a new project and add a corpus that way?

I’ll take a look at the code a bit later, anyway, and see what I find.
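
One way a missing project could produce a path like '/data' is sketched below. This is a guess at the mechanism, not corpkit's actual code: the function names and the project_path argument are hypothetical. If the project path is an empty string and the data directory is built by string concatenation, the result points at the filesystem root, where an ordinary user cannot create directories.

```python
import os


def data_dir_naive(project_path):
    # Hypothetical: plain string concatenation. With no project open,
    # project_path is empty and the result is the root-level '/data'.
    return project_path + "/data"


def data_dir_checked(project_path):
    # Refuse to build a path when no project is open.
    if not project_path:
        raise ValueError("no project open; create or open a project first")
    return os.path.join(project_path, "data")
```

With an empty project path, data_dir_naive("") returns '/data', which matches the path in the error message, while data_dir_checked("") fails early with a clearer message.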


@nanocombi
Author

I set up a new project and the error changed to: UnicodeDecodeError: 'utf8' codec can't decode byte 0xfc in position 3244: invalid start byte. I tried a selection of txts from different corpora to rule out formatting errors in my files. Thank you for your efforts!
log-00.txt

@interrogator
Owner

Hey. So, that error basically means that corpkit is expecting UTF-8 encoded data, but isn't getting it. I'm happy to add code that will try to detect encoding, but it might take me a little while. In the meantime, you could also try to convert your files to UTF-8 encoding yourself.
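
A minimal sketch of converting a file to UTF-8 by hand, assuming the source encoding is known. The byte 0xfc is never valid as the start of a UTF-8 sequence, but it is "ü" in both Latin-1 and Windows-1252, so the default of "latin-1" here is only a plausible guess for these files, not a confirmed fact. The function name and its parameters are made up for the example.

```python
def convert_to_utf8(src_path, dst_path, source_encoding="latin-1"):
    # Read raw bytes so line endings are preserved exactly.
    with open(src_path, "rb") as src:
        raw = src.read()
    # Decode using the assumed source encoding, then re-encode as UTF-8.
    text = raw.decode(source_encoding)
    with open(dst_path, "wb") as dst:
        dst.write(text.encode("utf-8"))
```

Writing to a separate destination path keeps the original file untouched in case the encoding guess turns out to be wrong.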

@interrogator
Owner

OK @nanocombi, I put up a fix. It tries to guess file encodings and convert to UTF-8. Getting your data into consistent UTF-8 beforehand is still probably a good idea, though.

Let me know if it works!
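
The general idea of guessing an encoding can be sketched as trying a short list of candidates in order. This is an illustration under stated assumptions, not the fix that was actually uploaded: the function name and the candidate list are hypothetical, and dedicated detection libraries such as chardet use statistical methods instead.

```python
def decode_with_fallback(raw, candidates=("utf-8", "cp1252", "latin-1")):
    # Try each candidate encoding in order and return the first that decodes.
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Only reachable if the candidate list omits a catch-all such as latin-1.
    raise ValueError("none of the candidate encodings could decode the data")
```

UTF-8 goes first because it rejects most non-UTF-8 input, whereas latin-1 accepts any byte sequence and so only makes sense as the last resort. A successful decode does not prove the guess is right, which is why converting the corpus to consistent UTF-8 beforehand remains the safer route.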
