
Apollo 2.6.5 Upgrade Issue #2630

Open
cross12tamu opened this issue Aug 25, 2021 · 2 comments
cross12tamu commented Aug 25, 2021

Howdy,

I'm in the midst of trying to uncover various issues with our Galaxy<-->Apollo stack and bridge.

Specifically, we were recently plagued by the findAllOrganisms call (see #2626), which was bricking some of the Galaxy<-->Apollo API tools, causing workflows (and sometimes individual tools) in Galaxy to time out. Some of the tools take > 5 minutes to load.

We are currently running the following Apollo version to have our system work:

Version: 2.6.2-SNAPSHOT
Grails version: 2.5.5
Groovy version: 2.4.4
JVM version: 1.8.0_265
Servlet Container Version: Apache Tomcat/9.0.16 (Ubuntu)
JBrowse config: 1.16.10-release
JBrowse url: https://github.com/GMOD/jbrowse

Note: with the timeout increase I made in Apache, we can keep the workflows working for now.

However, the issue persists (I have the timeout set to 7 minutes) when using the newest Apollo version, 2.6.5. Additionally, I'm seeing huge server load spikes on the machine running the 2.6.5 Apollo image.
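For reference, the kind of timeout increase I mean looks roughly like the sketch below, assuming Apache is reverse-proxying to Tomcat via mod_proxy. The paths, port, and 420-second (7-minute) value are illustrative, not our exact config:

```apache
# Hypothetical sketch of raising reverse-proxy timeouts in an Apache vhost
# fronting Tomcat (mod_proxy assumed). Values are examples only.
ProxyPass        /apollo http://localhost:8080/apollo timeout=420
ProxyPassReverse /apollo http://localhost:8080/apollo
# Fallback timeout (in seconds) for all proxied requests in this vhost
ProxyTimeout 420
```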

So our stack currently works with 2.6.2. However, we still have a separate problem: we have been trying to address #2607, which we believe is fixed in the new version of Apollo, but we are unable to upgrade due to the issues listed below.

Here is the high load, which I thought was okay when I took this screenshot:

[Screenshot: Apollo server load on compute1 (compute1_apollo_load)]

but after roughly 18 hours on that image, we ran out of memory:

8/19/2021 7:58:36 AM2021-08-19 12:58:36,154 [http-nio-8080-exec-48] ERROR errors.GrailsExceptionResolver  - OutOfMemoryError occurred when processing request: [POST] /apollo/organism/findAllOrganisms
8/19/2021 7:58:36 AMJava heap space. Stacktrace follows:
8/19/2021 7:58:36 AMorg.codehaus.groovy.grails.web.servlet.mvc.exceptions.ControllerExecutionException: Executing action [findAllOrganisms] of controller [org.bbop.apollo.OrganismController]  caused exception: Runtime error executing action
8/19/2021 7:58:36 AM	at grails.plugin.cache.web.filter.PageFragmentCachingFilter.doFilter(PageFragmentCachingFilter.java:198)
8/19/2021 7:58:36 AM	at grails.plugin.cache.web.filter.AbstractFilter.doFilter(AbstractFilter.java:63)
8/19/2021 7:58:36 AM	at org.apache.shiro.web.servlet.AbstractShiroFilter.executeChain(AbstractShiroFilter.java:449)
8/19/2021 7:58:36 AM	at org.apache.shiro.web.servlet.AbstractShiroFilter$1.call(AbstractShiroFilter.java:365)
8/19/2021 7:58:36 AM	at org.apache.shiro.subject.support.SubjectCallable.doCall(SubjectCallable.java:90)
8/19/2021 7:58:36 AM	at org.apache.shiro.subject.support.SubjectCallable.call(SubjectCallable.java:83)
8/19/2021 7:58:36 AM	at org.apache.shiro.subject.support.DelegatingSubject.execute(DelegatingSubject.java:383)
8/19/2021 7:58:36 AM	at org.apache.shiro.web.servlet.AbstractShiroFilter.doFilterInternal(AbstractShiroFilter.java:362)
8/19/2021 7:58:36 AM	at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
8/19/2021 7:58:36 AM	at com.brandseye.cors.CorsFilter.doFilter(CorsFilter.java:82)
8/19/2021 7:58:36 AM	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
8/19/2021 7:58:36 AM	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
8/19/2021 7:58:36 AM	at java.lang.Thread.run(Thread.java:748)
8/19/2021 7:58:36 AMCaused by: org.codehaus.groovy.grails.web.servlet.mvc.exceptions.ControllerExecutionException: Runtime error executing action
8/19/2021 7:58:36 AM	... 13 more
8/19/2021 7:58:36 AMCaused by: java.lang.reflect.InvocationTargetException
8/19/2021 7:58:36 AM	... 13 more
8/19/2021 7:58:36 AMCaused by: java.lang.OutOfMemoryError: Java heap space
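One thing worth checking alongside this trace: whether the Tomcat JVM heap is simply too small for whatever findAllOrganisms materializes. A hypothetical sketch of raising it, e.g. in Tomcat's bin/setenv.sh (the 4g value and the heap-dump flag are examples, not a recommendation tuned for this deployment):

```shell
# Hypothetical sketch only; adjust -Xmx to the host's actual memory budget.
# -XX:+HeapDumpOnOutOfMemoryError captures a dump for later analysis.
export CATALINA_OPTS="$CATALINA_OPTS -Xms1g -Xmx4g -XX:+HeapDumpOnOutOfMemoryError"
```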

Our API remapper error from the findAllOrganisms call:

2021/08/17 15:10:44 [error] 17#17: *68845 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.42.0.1, server: , request: "POST /apollo_api/organism/findAllOrganisms HTTP/1.1", upstream: "http://192.168.0.4:8999/apollo/organism/findAllOrganisms", host: "192.168.0.1:9990"
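Since that error comes from nginx rather than Apache, the upstream timeouts may need raising there too. A hypothetical sketch, with the location and upstream address mirroring the log line above and the 420s value matching the 7-minute timeout mentioned earlier (exact values are deployment-specific):

```nginx
# Hypothetical sketch: raising nginx upstream timeouts for the Apollo proxy.
location /apollo_api/ {
    proxy_pass            http://192.168.0.4:8999/apollo/;
    proxy_connect_timeout 60s;
    proxy_send_timeout    420s;
    proxy_read_timeout    420s;
}
```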

Various Galaxy logs/errors from trying to run the workflow (although some are from individual tools too, IIRC):

galaxy.tools.parameters.basic DEBUG 2021-08-17 09:39:25,504 [p:3081815,w:1,m:0] [uWSGIWorker1Core3] Error determining dy
namic options for parameter 'org_select' in tool 'export':
Traceback (most recent call last):
  File "lib/galaxy/tools/parameters/basic.py", line 820, in get_options
    return eval(self.dynamic_options, self.tool.code_namespace, call_other_values)
  File "<string>", line 1, in <module>
  File "/galaxy/tools/cpt2/galaxy-tools/tools/webapollo/gga-apollo/webapollo.py", line 675, in galaxy_list_orgs
    data = _galaxy_list_orgs(wa, gx_user, *args, **kwargs)
  File "/galaxy/tools/cpt2/galaxy-tools/tools/webapollo/gga-apollo/webapollo.py", line 689, in _galaxy_list_orgs
    all_orgs = wa.organisms.findAllOrganisms()
  File "/galaxy/tools/cpt2/galaxy-tools/tools/webapollo/gga-apollo/webapollo.py", line 581, in findAllOrganisms
    orgs = self.request('findAllOrganisms', {})
  File "/galaxy/tools/cpt2/galaxy-tools/tools/webapollo/gga-apollo/webapollo.py", line 545, in request
    (r.status_code, r.text))
Exception: Unexpected response from apollo 504: <html>
<head><title>504 Gateway Time-out</title></head>
<body>
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx/1.19.5</center>
</body>
</html>
galaxy.tools.parameters.basic DEBUG 2021-08-17 09:40:54,238 [p:3081818,w:2,m:0] [uWSGIWorker2Core0] Error determining dy
namic options for parameter 'org_select' in tool 'fetch_jbrowse':
Traceback (most recent call last):
  File "lib/galaxy/tools/parameters/basic.py", line 820, in get_options
    return eval(self.dynamic_options, self.tool.code_namespace, call_other_values)
  File "<string>", line 1, in <module>
  File "/galaxy/tools/cpt2/galaxy-tools/tools/webapollo/gga-apollo/webapollo.py", line 675, in galaxy_list_orgs
    data = _galaxy_list_orgs(wa, gx_user, *args, **kwargs)
  File "/galaxy/tools/cpt2/galaxy-tools/tools/webapollo/gga-apollo/webapollo.py", line 689, in _galaxy_list_orgs
    all_orgs = wa.organisms.findAllOrganisms()
  File "/galaxy/tools/cpt2/galaxy-tools/tools/webapollo/gga-apollo/webapollo.py", line 581, in findAllOrganisms
    orgs = self.request('findAllOrganisms', {})
  File "/galaxy/tools/cpt2/galaxy-tools/tools/webapollo/gga-apollo/webapollo.py", line 545, in request
    (r.status_code, r.text))
Exception: Unexpected response from apollo 504: <html>
<head><title>504 Gateway Time-out</title></head>
<body>
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx/1.19.5</center>
</body>
</html>
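As the tracebacks show, the Galaxy tool gives up on the first 504 from the request() call in webapollo.py. As a stopgap while the underlying slowness is investigated, a small retry-with-backoff wrapper could paper over transient timeouts. This is a hypothetical sketch, not the actual webapollo.py API; the helper name and the wa/findAllOrganisms usage in the comment are illustrative:

```python
import time


def retry_with_backoff(fn, attempts=3, base_delay=2.0, retriable=(Exception,)):
    """Call fn(), retrying with exponential backoff on retriable exceptions.

    Hypothetical helper; webapollo.py's request() does not do this today.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retriable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))


# Usage sketch (wa being a hypothetical Apollo client instance):
# orgs = retry_with_backoff(lambda: wa.organisms.findAllOrganisms())
```

This only helps if the 504s are intermittent; if findAllOrganisms consistently takes longer than the proxy timeout, retries will just multiply the load.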

Let me know what y'all think I should do to get this Apollo updated and working. Let me know also if you need anything else from me.

@garrettjstevens (Contributor) commented:

I'm not sure what would have happened in the latest releases to make #2626 worse. Have you tried any intermediate versions between 2.6.2 and 2.6.5?

@cross12tamu (Author) commented:

I'll try 2.6.4 and get back to y'all. Best case, I'll give it a whirl this weekend.
