Download speeds very slow on initial startup #5

Open
Ori-Pixel opened this issue Mar 23, 2022 · 0 comments

Hi, on first startup the download of the BERT .model files from the server seems to take 4 hours. Is there a way to fetch them into a directory with wget or curl instead? Also, if the program is terminated mid-download, the partially written files are left in place and cause an unzipping error in PyTorch on the next run. Is there a plan to mitigate this in the future, for example by downloading to a temporary file first?

Minimal example:

import booknlp
from booknlp.booknlp import BookNLP
import spacy
spacy.load('en_core_web_sm')
model_params = {
    "pipeline": "entity,quote,supersense,event,coref",
    "model": "big"
}

booknlp = BookNLP("en", model_params)

# Input file to process
input_file = "input_dir/bartleby.txt"

# Output directory to store resulting files in
output_directory = "output_dir/bartleby/"

# Files within this directory will be named ${book_id}.entities, ${book_id}.tokens, etc.
book_id = "bartleby"

booknlp.process(input_file, output_directory, book_id)

https://i.imgur.com/FZIqNsC.png
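
To make the temp-file idea concrete, here is a rough sketch of what I have in mind. This is not BookNLP's actual download code; the function name `download_atomically` and its parameters are made up for illustration. It only assumes the model is a plain file reachable by URL. The file is written to a temporary `.part` file in the destination directory and renamed into place only after the download completes, so the final path is either a complete file or absent.

```python
import os
import shutil
import tempfile
import urllib.request


def download_atomically(url, dest_path, chunk_size=1 << 20):
    """Hypothetical helper: fetch url so that dest_path is never left half-written."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    os.makedirs(dest_dir, exist_ok=True)

    # Create the temp file next to the destination so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp, urllib.request.urlopen(url) as response:
            # Stream in chunks instead of holding the whole model in memory.
            shutil.copyfileobj(response, tmp, chunk_size)
        # Only a fully written file is ever moved to the real name.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Also covers Ctrl-C: drop the partial file, then re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return dest_path
```

Usage would be something like `download_atomically(model_url, model_path)`, where both arguments stand in for whatever URL and cache location the library uses. If the process is killed outright, a stray `.part` file may remain, but the final model path would still not point at a truncated file, so the loader should not hit the unzipping error.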
