
Becomes slow with huge text #156

Open
deepaksinghtopwal opened this issue Jun 1, 2021 · 4 comments

Comments

@deepaksinghtopwal

It seems to work fine with small text data; however, when I tried to use it on larger documents (approx. 2000 lines each), it became way too slow and took around 20 minutes to summarize 50 documents.
Is there any parameter or specific algorithm that can be used to solve this issue?

@miso-belica
Owner

Hi, well, it's hard to say from the description. Can you provide the example text and the command/code you tried?

@miso-belica
Owner

ping @deepaksinghtopwal 🏓 🙂

@mrx23dot

mrx23dot commented Jul 8, 2022

Run this on it and upload prof.txt:
python -m cProfile -s tottime test.py > prof.txt
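
For reference, a minimal test.py to profile could look roughly like the sketch below. It assumes sumy's plain-text parser and LSA summarizer; the file name document.txt, the LANGUAGE and the SENTENCES_COUNT values are placeholders, not values taken from this issue.

# Minimal profiling target: read one plain-text document and summarize it with LSA.
# "document.txt", LANGUAGE and SENTENCES_COUNT are placeholder assumptions.
from sumy.nlp.stemmers import Stemmer
from sumy.nlp.tokenizers import Tokenizer
from sumy.parsers.plaintext import PlaintextParser
from sumy.summarizers.lsa import LsaSummarizer
from sumy.utils import get_stop_words

LANGUAGE = "english"
SENTENCES_COUNT = 10

if __name__ == "__main__":
    with open("document.txt", encoding="utf-8") as handle:
        text = handle.read()

    parser = PlaintextParser.from_string(text, Tokenizer(LANGUAGE))
    summarizer = LsaSummarizer(Stemmer(LANGUAGE))
    summarizer.stop_words = get_stop_words(LANGUAGE)

    for sentence in summarizer(parser.document, SENTENCES_COUNT):
        print(sentence)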

@mrx23dot

I can summarise books in 30 seconds by running LSA on segments of ~10 sentences.
The bottlenecks are:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   231014    3.884    0.000    7.534    0.000 snowball.py:1406(stem)
 12916193    2.776    0.000    2.776    0.000 {method 'endswith' of 'str' objects}
   367571    1.442    0.000    1.858    0.000 {method 'sub' of 're.Pattern' objects}
      947    0.823    0.001    0.883    0.001 lsa.py:89(_compute_term_frequency)
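
For what it's worth, a rough sketch of the segmented approach mentioned above is shown below. The helper name summarize_in_segments and the defaults segment_size=10 and sentences_per_segment=2 are my own assumptions, not tuned values; the idea is just to feed LSA small chunks instead of the whole book at once.

# Summarize a long text in chunks of roughly 10 sentences, as described above,
# instead of building one large LSA matrix for the whole document.
# summarize_in_segments, segment_size and sentences_per_segment are assumptions.
from sumy.nlp.stemmers import Stemmer
from sumy.nlp.tokenizers import Tokenizer
from sumy.parsers.plaintext import PlaintextParser
from sumy.summarizers.lsa import LsaSummarizer
from sumy.utils import get_stop_words

def summarize_in_segments(text, language="english", segment_size=10, sentences_per_segment=2):
    tokenizer = Tokenizer(language)
    summarizer = LsaSummarizer(Stemmer(language))
    summarizer.stop_words = get_stop_words(language)

    # Tokenize the whole text once, then summarize each ~10-sentence segment separately.
    sentences = PlaintextParser.from_string(text, tokenizer).document.sentences
    summary = []
    for start in range(0, len(sentences), segment_size):
        segment = " ".join(str(s) for s in sentences[start:start + segment_size])
        parser = PlaintextParser.from_string(segment, tokenizer)
        summary.extend(summarizer(parser.document, sentences_per_segment))
    return summary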
