Tissue Detection Tutorial Broken? #1018

Open
hannah413 opened this issue Jul 18, 2023 · 3 comments

Comments

@hannah413

New histomicstk user here. I'm trying to add histomicstk modules to an existing image analysis pipeline and have hit a dead end in a Digital Slide Archive tutorial. Any solutions or advice would be much appreciated.

Use case:
I want to perform tissue extraction on my own images. The only tutorial I've found is here, but either it's broken or not clear enough for new users.

Background:
I'm on a Linux-based cluster, using a GPU partition.
I am using Jupyter.
I have a working installation of histomicstk.
I have installed girder-jupyter.
I have a Kitware account for use with a public Girder instance.

Reproducing the error:
If I run through the tutorial verbatim, it hangs for a few minutes and then fails at this line with a timeout error:
_ = gc.authenticate(apiKey='kri19nTIGOkWH01TbzRqfohaaDWb6kPecRqGmemb')

If I instead try to get through with the public API access info, replacing these three lines:

APIURL = 'http://candygram.neurology.emory.edu:8080/api/v1/'
gc = girder_client.GirderClient(apiUrl=APIURL)
_ = gc.authenticate(apiKey='kri19nTIGOkWH01TbzRqfohaaDWb6kPecRqGmemb')

with these ones:

APIURL = 'https://data.kitware.com/api/v1'
gc = girder_client.GirderClient(apiUrl=APIURL)
_ = gc.authenticate('username', 'password')

then I'm unable to use the tutorial, as the images can't be accessed at that URL (it throws an HttpError).

The Two Error Messages in full:


---------------------------------------------------------------------------
TimeoutError                              Traceback (most recent call last)
File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/urllib3/connection.py:174, in HTTPConnection._new_conn(self)
    173 try:
--> 174     conn = connection.create_connection(
    175         (self._dns_host, self.port), self.timeout, **extra_kw
    176     )
    178 except SocketTimeout:

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/urllib3/util/connection.py:95, in create_connection(address, timeout, source_address, socket_options)
     94 if err is not None:
---> 95     raise err
     97 raise socket.error("getaddrinfo returns an empty list")

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/urllib3/util/connection.py:85, in create_connection(address, timeout, source_address, socket_options)
     84     sock.bind(source_address)
---> 85 sock.connect(sa)
     86 return sock

TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

ConnectTimeoutError                       Traceback (most recent call last)
File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/urllib3/connectionpool.py:714, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    713 # Make the request on the httplib connection object.
--> 714 httplib_response = self._make_request(
    715     conn,
    716     method,
    717     url,
    718     timeout=timeout_obj,
    719     body=body,
    720     headers=headers,
    721     chunked=chunked,
    722 )
    724 # If we're going to release the connection in ``finally:``, then
    725 # the response doesn't need to know about the connection. Otherwise
    726 # it will also try to release it and we'll have a double-release
    727 # mess.

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/urllib3/connectionpool.py:415, in HTTPConnectionPool._make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    414     else:
--> 415         conn.request(method, url, **httplib_request_kw)
    417 # We are swallowing BrokenPipeError (errno.EPIPE) since the server is
    418 # legitimately able to close the connection after sending a valid response.
    419 # With this behaviour, the received response is still readable.

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/urllib3/connection.py:244, in HTTPConnection.request(self, method, url, body, headers)
    243     headers["User-Agent"] = _get_default_user_agent()
--> 244 super(HTTPConnection, self).request(method, url, body=body, headers=headers)

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/http/client.py:1283, in HTTPConnection.request(self, method, url, body, headers, encode_chunked)
   1282 """Send a complete request to the server."""
-> 1283 self._send_request(method, url, body, headers, encode_chunked)

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/http/client.py:1329, in HTTPConnection._send_request(self, method, url, body, headers, encode_chunked)
   1328     body = _encode(body, 'body')
-> 1329 self.endheaders(body, encode_chunked=encode_chunked)

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/http/client.py:1278, in HTTPConnection.endheaders(self, message_body, encode_chunked)
   1277     raise CannotSendHeader()
-> 1278 self._send_output(message_body, encode_chunked=encode_chunked)

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/http/client.py:1038, in HTTPConnection._send_output(self, message_body, encode_chunked)
   1037 del self._buffer[:]
-> 1038 self.send(msg)
   1040 if message_body is not None:
   1041 
   1042     # create a consistent interface to message_body

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/http/client.py:976, in HTTPConnection.send(self, data)
    975 if self.auto_open:
--> 976     self.connect()
    977 else:

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/urllib3/connection.py:205, in HTTPConnection.connect(self)
    204 def connect(self):
--> 205     conn = self._new_conn()
    206     self._prepare_conn(conn)

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/urllib3/connection.py:179, in HTTPConnection._new_conn(self)
    178 except SocketTimeout:
--> 179     raise ConnectTimeoutError(
    180         self,
    181         "Connection to %s timed out. (connect timeout=%s)"
    182         % (self.host, self.timeout),
    183     )
    185 except SocketError as e:

ConnectTimeoutError: (<urllib3.connection.HTTPConnection object at 0x7f310982c490>, 'Connection to candygram.neurology.emory.edu timed out. (connect timeout=None)')

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/requests/adapters.py:486, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    485 try:
--> 486     resp = conn.urlopen(
    487         method=request.method,
    488         url=url,
    489         body=request.body,
    490         headers=request.headers,
    491         redirect=False,
    492         assert_same_host=False,
    493         preload_content=False,
    494         decode_content=False,
    495         retries=self.max_retries,
    496         timeout=timeout,
    497         chunked=chunked,
    498     )
    500 except (ProtocolError, OSError) as err:

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/urllib3/connectionpool.py:798, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    796     e = ProtocolError("Connection aborted.", e)
--> 798 retries = retries.increment(
    799     method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
    800 )
    801 retries.sleep()

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/urllib3/util/retry.py:592, in Retry.increment(self, method, url, response, error, _pool, _stacktrace)
    591 if new_retry.is_exhausted():
--> 592     raise MaxRetryError(_pool, url, error or ResponseError(cause))
    594 log.debug("Incremented Retry for (url='%s'): %r", url, new_retry)

MaxRetryError: HTTPConnectionPool(host='candygram.neurology.emory.edu', port=8080): Max retries exceeded with url: /api/v1/api_key/token?key=kri19nTIGOkWH01TbzRqfohaaDWb6kPecRqGmemb (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f310982c490>, 'Connection to candygram.neurology.emory.edu timed out. (connect timeout=None)'))

During handling of the above exception, another exception occurred:

ConnectTimeout                            Traceback (most recent call last)
Cell In[20], line 1
----> 1 _ = gc.authenticate(apiKey='kri19nTIGOkWH01TbzRqfohaaDWb6kPecRqGmemb')

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/girder_client/__init__.py:288, in GirderClient.authenticate(self, username, password, interactive, apiKey)
    257 """
    258 Authenticate to Girder, storing the token that comes back to be used in
    259 future requests. This method can be used in two modes, either username
   (...)
    285 :type apiKey: str
    286 """
    287 if apiKey:
--> 288     resp = self.post('api_key/token', parameters={
    289         'key': apiKey
    290     })
    291     self.setToken(resp['authToken']['token'])
    292 else:

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/girder_client/__init__.py:478, in GirderClient.post(self, path, parameters, files, data, json, headers, jsonResp)
    473 def post(self, path, parameters=None, files=None, data=None, json=None, headers=None,
    474          jsonResp=True):
    475     """
    476     Convenience method to call :py:func:`sendRestRequest` with the 'POST' HTTP method.
    477     """
--> 478     return self.sendRestRequest('POST', path, parameters, files=files,
    479                                 data=data, json=json, headers=headers, jsonResp=jsonResp)

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/girder_client/__init__.py:452, in GirderClient.sendRestRequest(self, method, path, parameters, data, files, json, headers, jsonResp, **kwargs)
    449 if isinstance(headers, dict):
    450     _headers.update(headers)
--> 452 result = f(
    453     url, params=parameters, data=data, files=files, json=json, headers=_headers,
    454     **kwargs)
    456 # If success, return the json object. Otherwise throw an exception.
    457 if result.ok:

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/requests/api.py:115, in post(url, data, json, **kwargs)
    103 def post(url, data=None, json=None, **kwargs):
    104     r"""Sends a POST request.
    105 
    106     :param url: URL for the new :class:`Request` object.
   (...)
    112     :rtype: requests.Response
    113     """
--> 115     return request("post", url, data=data, json=json, **kwargs)

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/requests/api.py:59, in request(method, url, **kwargs)
     55 # By using the 'with' statement we are sure the session is closed, thus we
     56 # avoid leaving sockets open which can trigger a ResourceWarning in some
     57 # cases, and look like a memory leak in others.
     58 with sessions.Session() as session:
---> 59     return session.request(method=method, url=url, **kwargs)

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/requests/sessions.py:589, in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    584 send_kwargs = {
    585     "timeout": timeout,
    586     "allow_redirects": allow_redirects,
    587 }
    588 send_kwargs.update(settings)
--> 589 resp = self.send(prep, **send_kwargs)
    591 return resp

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/requests/sessions.py:703, in Session.send(self, request, **kwargs)
    700 start = preferred_clock()
    702 # Send the request
--> 703 r = adapter.send(request, **kwargs)
    705 # Total elapsed time of the request (approximately)
    706 elapsed = preferred_clock() - start

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/requests/adapters.py:507, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    504 if isinstance(e.reason, ConnectTimeoutError):
    505     # TODO: Remove this in 3.0.0: see #2811
    506     if not isinstance(e.reason, NewConnectionError):
--> 507         raise ConnectTimeout(e, request=request)
    509 if isinstance(e.reason, ResponseError):
    510     raise RetryError(e, request=request)

ConnectTimeout: HTTPConnectionPool(host='candygram.neurology.emory.edu', port=8080): Max retries exceeded with url: /api/v1/api_key/token?key=kri19nTIGOkWH01TbzRqfohaaDWb6kPecRqGmemb (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f310982c490>, 'Connection to candygram.neurology.emory.edu timed out. (connect timeout=None)'))
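
The last frames above report `connect timeout=None`, meaning no client-side timeout was set, which is consistent with the cell hanging for minutes before failing. A minimal sketch, using only the standard library, for checking whether the API host accepts connections before calling `gc.authenticate`. The helper name `girder_host_reachable` and the 5-second default are illustrative assumptions, not part of girder_client:

```python
import socket
from urllib.parse import urlsplit


def girder_host_reachable(api_url, timeout=5.0):
    """Return True if a TCP connection to the API host opens within `timeout` seconds.

    Hypothetical helper: it only checks that the host and port accept
    connections, not that a Girder API is actually served there.
    """
    parts = urlsplit(api_url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return False


# Usage sketch, with APIURL as defined in the tutorial cell:
# if not girder_host_reachable(APIURL):
#     print("Server unreachable; load the thumbnail some other way.")
```

If this returns False for the tutorial's server, the failure is a network or server availability problem rather than a problem with the histomicstk installation.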

Second error message starts here:


HttpError                                 Traceback (most recent call last)
Cell In[6], line 1
----> 1 thumbnail_rgb = get_slide_thumbnail(gc, SAMPLE_SLIDE_ID)

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/histomicstk/saliency/tissue_detection.py:38, in get_slide_thumbnail(gc, slide_id)
     22 """Get slide thumbnail using girder client.
     23 
     24 Parameters
   (...)
     35 
     36 """
     37 getStr = '/item/%s/tiles/thumbnail' % (slide_id)
---> 38 resp = gc.get(getStr, jsonResp=False)
     39 return get_image_from_htk_response(resp)

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/girder_client/__init__.py:471, in GirderClient.get(self, path, parameters, jsonResp)
    467 def get(self, path, parameters=None, jsonResp=True):
    468     """
    469     Convenience method to call :py:func:`sendRestRequest` with the 'GET' HTTP method.
    470     """
--> 471     return self.sendRestRequest('GET', path, parameters, jsonResp=jsonResp)

File /home/groups/precepts/holly/mamba/envs/nvidia/lib/python3.11/site-packages/girder_client/__init__.py:463, in GirderClient.sendRestRequest(self, method, path, parameters, data, files, json, headers, jsonResp, **kwargs)
    461         return result
    462 else:
--> 463     raise HttpError(
    464         status=result.status_code, url=result.url, method=method, text=result.text,
    465         response=result)

HttpError: HTTP error 400: GET https://data.kitware.com/api/v1//item/5d817f5abd4404c6b1f744bb/tiles/thumbnail
Response text: {"message": "No matching route for \"GET 5d817f5abd4404c6b1f744bb/tiles/thumbnail\"", "type": "rest"}

@cooperlab
Contributor

Hello Hannah, the example could admittedly be better.

This cell uses the girder client to get a thumbnail image from a server that hosts whole-slide images. It is not a necessary step: you can bypass it and provide thumbnail images by other means, such as tifftools, large_image, or openslide.

APIURL = 'http://candygram.neurology.emory.edu:8080/api/v1/'
# SAMPLE_SLIDE_ID = '5d586d57bd4404c6b1f28640'
SAMPLE_SLIDE_ID = "5d817f5abd4404c6b1f744bb"

gc = girder_client.GirderClient(apiUrl=APIURL)
# gc.authenticate(interactive=True)
_ = gc.authenticate(apiKey='kri19nTIGOkWH01TbzRqfohaaDWb6kPecRqGmemb')
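
One way to act on the suggestion above without any Girder server, as a sketch only: read the thumbnail from a local whole-slide file with openslide. The function name `thumbnail_from_local_slide`, the `max_size` default, and the file path are illustrative assumptions, not part of the tutorial, and the code assumes openslide-python and numpy are installed.

```python
def thumbnail_from_local_slide(slide_path, max_size=1024):
    """Return an RGB thumbnail of a local whole-slide image as a numpy array.

    Hypothetical helper (not part of HistomicsTK). The thumbnail fits inside a
    max_size x max_size box; openslide preserves the aspect ratio.
    """
    # Imported lazily so this module still loads where the optional
    # dependencies are missing.
    import numpy as np
    import openslide

    with openslide.OpenSlide(slide_path) as slide:
        thumbnail = slide.get_thumbnail((max_size, max_size))
    return np.asarray(thumbnail.convert("RGB"))


# Usage sketch (the path is a placeholder):
# thumbnail_rgb = thumbnail_from_local_slide("/path/to/slide.svs")
# from histomicstk.saliency.tissue_detection import get_tissue_mask
# labeled, mask = get_tissue_mask(thumbnail_rgb)
```

The resulting array should be usable in place of the output of `get_slide_thumbnail(gc, SAMPLE_SLIDE_ID)` in the tutorial, so the `girder_client` cell can be skipped entirely.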

@hannah413
Author

Thanks for the quick response. I ended up using histolab, as it was more intuitive and quicker to get up and running. I may still try to work with histomicstk for things like Reinhard stain normalization, as histolab hasn't made that available in their most recent conda installation. Still, I talked to some colleagues with more image analysis experience, and confusion around girder is what kept them from using histomicstk. Hoping I can bypass it like you suggested.

@cooperlab
Contributor

Hannah, Girder and DSA are intended as an enterprise data management solution for large digital pathology datasets. The only connection to HistomicsTK is that you can use Girder as a source for reading remote data (as this example illustrates poorly). There is also a container that deploys HistomicsTK algorithms through the platform user interface.

HistomicsTK is a stand-alone Python library that can use any method of loading images that you like. You can use it completely independently of Girder if you are working with local data or another hosting solution.

These things need to be illustrated better. It's difficult to get students to do it, and the people who have the requisite knowledge never seem to have the time.
