error while using biobert PubMed PMC #154

Open
aozorahime opened this issue Nov 3, 2022 · 0 comments

Comments

@aozorahime

Hi, I am very interested in this NLU BioBERT library; it is easy to use and understandable. However, I ran into difficulties while using the BioBERT model for my project. I want to run this code:

```python
import nlu

embeddings_df2 = nlu.load('en.embed.biobert.pubmed_pmc_base_cased', gpu=True).predict(df['text'], output_level='token')
embeddings_df2
```

I am using Google Colab with a GPU. After approximately 40 minutes, it suddenly stopped and produced the following error:

biobert_pubmed_pmc_base_cased download started this may take some time.
Approximate size to download 386.7 MB
[OK!]
sentence_detector_dl download started this may take some time.
Approximate size to download 354.6 KB
[OK!]


Exception happened during processing of request from ('127.0.0.1', 40522)
ERROR:root:Exception while sending command.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 1207, in send_command
raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 1033, in send_command
response = connection.send_command(command)
File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 1212, in send_command
"Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
Traceback (most recent call last):
File "/usr/lib/python3.7/socketserver.py", line 316, in _handle_request_noblock
self.process_request(request, client_address)
File "/usr/lib/python3.7/socketserver.py", line 347, in process_request
self.finish_request(request, client_address)
File "/usr/lib/python3.7/socketserver.py", line 360, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/usr/lib/python3.7/socketserver.py", line 720, in init
self.handle()
File "/usr/local/lib/python3.7/dist-packages/pyspark/accumulators.py", line 268, in handle
poll(accum_updates)
File "/usr/local/lib/python3.7/dist-packages/pyspark/accumulators.py", line 241, in poll
if func():
File "/usr/local/lib/python3.7/dist-packages/pyspark/accumulators.py", line 245, in accum_updates
num_updates = read_int(self.rfile)
File "/usr/local/lib/python3.7/dist-packages/pyspark/serializers.py", line 595, in read_int
raise EOFError
EOFError

ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:35473)
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/nlu/pipe/pipeline.py", line 438, in predict
self.configure_light_pipe_usage(data.count(), multithread)
File "/usr/local/lib/python3.7/dist-packages/pyspark/sql/dataframe.py", line 585, in count
return int(self._jdf.count())
File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 1305, in call
answer, self.gateway_client, self.target_id, self.name)
File "/usr/local/lib/python3.7/dist-packages/pyspark/sql/utils.py", line 128, in deco
return f(*a, **kw)
File "/usr/local/lib/python3.7/dist-packages/py4j/protocol.py", line 336, in get_return_value
format(target_id, ".", name))
py4j.protocol.Py4JError: An error occurred while calling o1231.count

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 977, in _get_connection
connection = self.deque.pop()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 1115, in start
self.socket.connect((self.address, self.port))
ConnectionRefusedError: [Errno 111] Connection refused
Exception occured
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/nlu/pipe/pipeline.py", line 438, in predict
self.configure_light_pipe_usage(data.count(), multithread)
File "/usr/local/lib/python3.7/dist-packages/pyspark/sql/dataframe.py", line 585, in count
return int(self._jdf.count())
File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 1305, in call
answer, self.gateway_client, self.target_id, self.name)
File "/usr/local/lib/python3.7/dist-packages/pyspark/sql/utils.py", line 128, in deco
return f(*a, **kw)
File "/usr/local/lib/python3.7/dist-packages/py4j/protocol.py", line 336, in get_return_value
format(target_id, ".", name))
py4j.protocol.Py4JError: An error occurred while calling o1231.count

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/nlu/pipe/pipeline.py", line 435, in predict
data, stranger_features, output_datatype = DataConversionUtils.to_spark_df(data, self.spark, self.raw_text_column)
TypeError: cannot unpack non-iterable NoneType object
(the two tracebacks above — the Py4JError on o1231.count followed by the TypeError in pipeline.py line 435 — repeat verbatim four more times)
ERROR:nlu:Exception occured
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/nlu/pipe/pipeline.py", line 438, in predict
self.configure_light_pipe_usage(data.count(), multithread)
File "/usr/local/lib/python3.7/dist-packages/pyspark/sql/dataframe.py", line 585, in count
return int(self._jdf.count())
File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 1305, in call
answer, self.gateway_client, self.target_id, self.name)
File "/usr/local/lib/python3.7/dist-packages/pyspark/sql/utils.py", line 128, in deco
return f(*a, **kw)
File "/usr/local/lib/python3.7/dist-packages/py4j/protocol.py", line 336, in get_return_value
format(target_id, ".", name))
py4j.protocol.Py4JError: An error occurred while calling o1231.count

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/nlu/pipe/pipeline.py", line 435, in predict
data, stranger_features, output_datatype = DataConversionUtils.to_spark_df(data, self.spark, self.raw_text_column)
TypeError: cannot unpack non-iterable NoneType object
No accepted Data type or usable columns found or applying the NLU models failed.
Make sure that the first column you pass to .predict() is the one that nlu should predict on OR rename the column you want to predict on to 'text'
On try to reset restart Jupyter session and run the setup script again, you might have used too much memory
Full Stacktrace was (<class 'TypeError'>, TypeError('cannot unpack non-iterable NoneType object'), <traceback object at 0x7f4ed5dd60f0>)
Additional info:
<class 'TypeError'> pipeline.py 435
cannot unpack non-iterable NoneType object
Stuck? Contact us on Slack! https://join.slack.com/t/spark-nlp/shared_invite/zt-lutct9gm-kuUazcyFKhuGY3_0AMkxqA

I have already tried this 2-3 times. In my opinion, it is probably caused by exceeding the available RAM, even though I have the GPU runtime enabled. Is there any solution for this? Thanks in advance.
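In case it is useful, here is a minimal sketch of a workaround I could try, assuming the root cause is the Colab Spark driver running out of memory while processing the whole DataFrame at once: run `predict()` on smaller chunks of the text column and concatenate the results. `df` is the same DataFrame as in the snippet above, and the chunk size of 1000 is an arbitrary guess, not a recommended value.

```python
import nlu
import pandas as pd

# Load the pipeline once; reloading it per chunk would re-download the model.
pipe = nlu.load('en.embed.biobert.pubmed_pmc_base_cased', gpu=True)

# Assumed workaround: predict in smaller chunks so the Spark driver holds
# fewer rows in memory at a time. chunk_size = 1000 is an arbitrary guess.
chunk_size = 1000
parts = []
for start in range(0, len(df), chunk_size):
    chunk = df['text'].iloc[start:start + chunk_size]
    parts.append(pipe.predict(chunk, output_level='token'))

embeddings_df2 = pd.concat(parts, ignore_index=True)
embeddings_df2
```

If the Spark session has already crashed, the runtime probably needs to be restarted and the setup script re-run before retrying, as the error message above suggests.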
