
Becomes slow with huge text #156

Open
deepaksinghtopwal opened this issue Jun 1, 2021 · 4 comments

Comments

@deepaksinghtopwal

It seems to work fine with small text data; however, when I tried to use it on larger documents (approx. 2000 lines each), it became way too slow and took around 20 minutes to summarize 50 documents.
Is there any parameter or specific algorithm that can be used to solve this issue?

@miso-belica
Owner

Hi, well, it's hard to say from the description. Can you provide the example text and the command/code you tried?

@miso-belica
Owner

ping @deepaksinghtopwal 🏓 🙂

@mrx23dot

mrx23dot commented Jul 8, 2022

Run this on it and upload prof.txt:
python -m cProfile -s tottime test.py > prof.txt
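
For reference, a minimal test.py to profile could look roughly like the sketch below. It assumes sumy's plain-text parser and LSA summarizer; the file name document.txt, the LANGUAGE and the SENTENCES_COUNT values are placeholders, not values taken from this issue.

# Minimal profiling target: read one plain-text document and summarize it with LSA.
# "document.txt", LANGUAGE and SENTENCES_COUNT are placeholder assumptions.
from sumy.nlp.stemmers import Stemmer
from sumy.nlp.tokenizers import Tokenizer
from sumy.parsers.plaintext import PlaintextParser
from sumy.summarizers.lsa import LsaSummarizer
from sumy.utils import get_stop_words

LANGUAGE = "english"
SENTENCES_COUNT = 10

if __name__ == "__main__":
    with open("document.txt", encoding="utf-8") as handle:
        text = handle.read()

    parser = PlaintextParser.from_string(text, Tokenizer(LANGUAGE))
    summarizer = LsaSummarizer(Stemmer(LANGUAGE))
    summarizer.stop_words = get_stop_words(LANGUAGE)

    for sentence in summarizer(parser.document, SENTENCES_COUNT):
        print(sentence)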

@mrx23dot

I can summarise books in 30 seconds by running LSA on segments of ~10 sentences.
The bottlenecks are:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   231014    3.884    0.000    7.534    0.000 snowball.py:1406(stem)
 12916193    2.776    0.000    2.776    0.000 {method 'endswith' of 'str' objects}
   367571    1.442    0.000    1.858    0.000 {method 'sub' of 're.Pattern' objects}
      947    0.823    0.001    0.883    0.001 lsa.py:89(_compute_term_frequency)
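
For what it's worth, a rough sketch of the segmented approach mentioned above is shown below. The helper name summarize_in_segments and the defaults segment_size=10 and sentences_per_segment=2 are my own assumptions, not tuned values; the idea is just to feed LSA small chunks instead of the whole book at once.

# Summarize a long text in chunks of roughly 10 sentences, as described above,
# instead of building one large LSA matrix for the whole document.
# summarize_in_segments, segment_size and sentences_per_segment are assumptions.
from sumy.nlp.stemmers import Stemmer
from sumy.nlp.tokenizers import Tokenizer
from sumy.parsers.plaintext import PlaintextParser
from sumy.summarizers.lsa import LsaSummarizer
from sumy.utils import get_stop_words

def summarize_in_segments(text, language="english", segment_size=10, sentences_per_segment=2):
    tokenizer = Tokenizer(language)
    summarizer = LsaSummarizer(Stemmer(language))
    summarizer.stop_words = get_stop_words(language)

    # Tokenize the whole text once, then summarize each ~10-sentence segment separately.
    sentences = PlaintextParser.from_string(text, tokenizer).document.sentences
    summary = []
    for start in range(0, len(sentences), segment_size):
        segment = " ".join(str(s) for s in sentences[start:start + segment_size])
        parser = PlaintextParser.from_string(segment, tokenizer)
        summary.extend(summarizer(parser.document, sentences_per_segment))
    return summary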
