Problem running run_nlpbook.py #4
Comments
I spent a good part of the morning trying to solve this problem on Windows 10, and I now have it working. Everything will work on Linux; on Windows the problem arises when BookNLP tries to open the directory for each model. The key to solving it is this portion of the error message: OSError: C:\Users\denis\booknlps\entities_google/ BookNLP does not create the booknlps folder or its subfolders, so you will need to create them manually. Go to Users/{username}/ and create 3 subdirectories there: entities_google, coref_google, and speaker_google. Next, download the appropriate models from Hugging Face. You can use git clone (here is a good tutorial on how to do it: https://stackoverflow.com/questions/67595500/how-to-download-model-from-huggingface). https://huggingface.co/google/bert_uncased_L-6_H-768_A-12 - entities I am about to make a video on this whole process, as I am preparing a YouTube series on using BookNLP, and resolving Windows issues was priority number 1 since most of my viewers use Windows.
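For anyone who would rather script the folder setup than click through Explorer, here is a minimal sketch. It assumes the three subfolders belong inside the booknlps folder named in the error message, under your user profile; the helper name make_model_dirs is made up for illustration and is not part of BookNLP. It only creates empty folders, so the models still have to be downloaded separately.

```python
from pathlib import Path
from typing import List

# Subfolder names come from the comment above. Placing them under
# "booknlps" is an assumption based on the path in the error message.
SUBDIRS = ("entities_google", "coref_google", "speaker_google")


def make_model_dirs(base: Path) -> List[Path]:
    """Create `base` and one subfolder per model; return the subfolder paths."""
    created = []
    for name in SUBDIRS:
        target = base / name
        # parents=True also creates `base`; exist_ok=True makes reruns harmless.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


if __name__ == "__main__":
    # Path.home() resolves to C:\Users\{username} on Windows.
    for path in make_model_dirs(Path.home() / "booknlps"):
        print(path)
```

Running it twice is safe because existing folders are left untouched.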
Thank you - I will try it out this weekend. Also glad to hear of the YouTube series.
No problem! Please do let me know if it works/does not work for you. I have only tested it on one machine. Best of luck!
Hey, I created a fix for this problem. It comes from the fact that So I updated all
After running:
booknlp=BookNLP("en", model_params)
I get the following:
(It seems to refer to my model location as booknlps, but the folder that actually gets created is booknlp_models, and tacking a local directory path onto the Hugging Face URL also seems like an issue. I'm glad to help and try things here, though my experience with big Python code bases is limited.)
404 Client Error: Repository Not Found for url: https://huggingface.co/C:%5CUsers%5Cdenis%5Cbooknlps%5Centities_google/bert_uncased_L-6_H-768_A-12/resolve/main/tokenizer_config.json
RepositoryNotFoundError Traceback (most recent call last)
c:\Users\denis\Anaconda3\envs\booknlp\lib\site-packages\transformers\file_utils.py in get_file_from_repo(path_or_repo, filename, cache_dir, force_download, resume_download, proxies, use_auth_token, revision, local_files_only)
2241 local_files_only=local_files_only,
-> 2242 use_auth_token=use_auth_token,
2243 )
c:\Users\denis\Anaconda3\envs\booknlp\lib\site-packages\transformers\file_utils.py in cached_path(url_or_filename, cache_dir, force_download, proxies, resume_download, user_agent, extract_compressed_file, force_extract, use_auth_token, local_files_only)
1853 use_auth_token=use_auth_token,
-> 1854 local_files_only=local_files_only,
1855 )
c:\Users\denis\Anaconda3\envs\booknlp\lib\site-packages\transformers\file_utils.py in get_from_cache(url, cache_dir, force_download, proxies, etag_timeout, resume_download, user_agent, use_auth_token, local_files_only)
2049 r = requests.head(url, headers=headers, allow_redirects=False, proxies=proxies, timeout=etag_timeout)
-> 2050 _raise_for_status(r)
2051 etag = r.headers.get("X-Linked-Etag") or r.headers.get("ETag")
c:\Users\denis\Anaconda3\envs\booknlp\lib\site-packages\transformers\file_utils.py in _raise_for_status(request)
1970 if error_code == "RepoNotFound":
-> 1971 raise RepositoryNotFoundError(f"404 Client Error: Repository Not Found for url: {request.url}")
1972 elif error_code == "EntryNotFound":
RepositoryNotFoundError: 404 Client Error: Repository Not Found for url: https://huggingface.co/C:%5CUsers%5Cdenis%5Cbooknlps%5Centities_google/bert_uncased_L-6_H-768_A-12/resolve/main/tokenizer_config.json
During handling of the above exception, another exception occurred:
OSError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_2352\2094341818.py in <module>
4 }
5
----> 6 booknlp=BookNLP("en", model_params)
c:\Users\denis\Anaconda3\envs\booknlp\lib\site-packages\booknlp\booknlp.py in __init__(self, language, model_params)
12
13 if language == "en":
---> 14 self.booknlp=EnglishBookNLP(model_params)
15
16 def process(self, inputFile, outputFolder, idd):
c:\Users\denis\Anaconda3\envs\booknlp\lib\site-packages\booknlp\english\english_booknlp.py in __init__(self, model_params)
146
147 if self.doEntities:
--> 148 self.entityTagger=LitBankEntityTagger(self.entityPath, tagsetPath)
149 aliasPath = pkg_resources.resource_filename(__name__, "data/aliases.txt")
150 self.name_resolver=NameCoref(aliasPath)
c:\Users\denis\Anaconda3\envs\booknlp\lib\site-packages\booknlp\english\entity_tagger.py in __init__(self, model_file, model_tagset)
17 base_model=re.sub(".model", "", base_model)
18
---> 19 self.model = Tagger(freeze_bert=False, base_model=base_model, tagset_flat={"EVENT":1, "O":1}, supersense_tagset=self.supersense_tagset, tagset=self.tagset, device=device)
20
21 self.model.to(device)
c:\Users\denis\Anaconda3\envs\booknlp\lib\site-packages\booknlp\english\tagger.py in __init__(self, freeze_bert, base_model, tagset, supersense_tagset, tagset_flat, hidden_dim, flat_hidden_dim, device)
56 self.num_labels_flat=len(tagset_flat)
57
---> 58 self.tokenizer = BertTokenizer.from_pretrained(modelName, do_lower_case=False, do_basic_tokenize=False)
59 self.bert = BertModel.from_pretrained(modelName)
60
c:\Users\denis\Anaconda3\envs\booknlp\lib\site-packages\transformers\tokenization_utils_base.py in from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
1662 use_auth_token=use_auth_token,
1663 revision=revision,
-> 1664 local_files_only=local_files_only,
1665 )
1666 if resolved_config_file is not None:
c:\Users\denis\Anaconda3\envs\booknlp\lib\site-packages\transformers\file_utils.py in get_file_from_repo(path_or_repo, filename, cache_dir, force_download, resume_download, proxies, use_auth_token, revision, local_files_only)
2246 logger.error(err)
2247 raise EnvironmentError(
-> 2248 f"{path_or_repo} is not a local folder and is not a valid model identifier "
2249 "listed on 'https://huggingface.co/models'\nIf this is a private repository, make sure to "
2250 "pass a token having permission to this repo with `use_auth_token` or log in with "
OSError: C:\Users\denis\booknlps\entities_google/bert_uncased_L-6_H-768_A-12 is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`
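The traceback says the path is "not a local folder", which is why the loader then tried it as a Hugging Face repository name and got the 404. A quick way to check which case you are in before constructing BookNLP is sketched below. The function name describe_model_path is hypothetical, and the example path simply mirrors the one in the error message; adjust it to wherever your model folder actually is.

```python
import os
from pathlib import Path


def describe_model_path(path: str) -> str:
    """Report whether `path` is an existing local folder."""
    if os.path.isdir(path):
        entries = sorted(os.listdir(path))
        return "local folder with %d entries: %s" % (len(entries), ", ".join(entries))
    # Per the traceback above, a path that is not a local folder gets
    # looked up as a model identifier on huggingface.co instead.
    return "not a local folder: %s" % path


if __name__ == "__main__":
    # Assumed location, copied from the OSError in this thread.
    candidate = Path.home() / "booknlps" / "entities_google" / "bert_uncased_L-6_H-768_A-12"
    print(describe_model_path(str(candidate)))
```

If this prints "not a local folder", the directory is missing or misnamed, which would match the booknlps versus booknlp_models mismatch mentioned earlier in this comment.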