occasional error: SSL SYSCALL error: EOF detected #150

Open
miguelcleon opened this issue Jun 23, 2017 · 18 comments
@miguelcleon
Member

miguelcleon commented Jun 23, 2017

Seemingly at random, I get the error below. WOFpy then stops working, and I need to reload Apache to get it working again. I've been trying to find a reproducible way to trigger this error but haven't found one yet; if I do, I'll update this issue. I was also going to post the second error that appears after this one, but because I can't reproduce the problem, I'm not getting it now. I'll add the second error when it happens again.


<ns0:Fault xmlns:ns0="http://schemas.xmlsoap.org/soap/envelope/">
<faultcode>soap11env:Server</faultcode>
<faultstring>
(psycopg2.OperationalError) SSL SYSCALL error: EOF detected [SQL: 'SELECT DISTINCT odm2.sites.samplingfeatureid AS odm2_sites_samplingfeatureid, odm2.samplingfeatures.samplingfeatureid AS odm2_samplingfeatures_samplingfeatureid, odm2.sites.spatialreferenceid AS odm2_sites_spatialreferenceid, odm2.sites.sitetypecv AS odm2_sites_sitetypecv, odm2.sites.latitude AS odm2_sites_latitude, odm2.sites.longitude AS odm2_sites_longitude, odm2.samplingfeatures.samplingfeatureuuid AS odm2_samplingfeatures_samplingfeatureuuid, odm2.samplingfeatures.samplingfeaturetypecv AS odm2_samplingfeatures_samplingfeaturetypecv, odm2.samplingfeatures.samplingfeaturecode AS odm2_samplingfeatures_samplingfeaturecode, odm2.samplingfeatures.samplingfeaturename AS odm2_samplingfeatures_samplingfeaturename, odm2.samplingfeatures.samplingfeaturedescription AS odm2_samplingfeatures_samplingfeaturedescription, odm2.samplingfeatures.samplingfeaturegeotypecv AS odm2_samplingfeatures_samplingfeaturegeotypecv, odm2.samplingfeatures.elevation_m AS odm2_samplingfeatures_elevation_m, odm2.samplingfeatures.elevationdatumcv AS odm2_samplingfeatures_elevationdatumcv, odm2.samplingfeatures.featuregeometrywkt AS odm2_samplingfeatures_featuregeometrywkt, CASE WHEN (odm2.samplingfeatures.samplingfeaturetypecv = %(samplingfeaturetypecv_1)s) THEN %(param_1)s WHEN (odm2.samplingfeatures.samplingfeaturetypecv = %(samplingfeaturetypecv_2)s) THEN %(param_2)s ELSE %(param_3)s END AS _sa_polymorphic_on \nFROM odm2.samplingfeatures JOIN odm2.sites ON odm2.samplingfeatures.samplingfeatureid = odm2.sites.samplingfeatureid JOIN odm2.featureactions ON odm2.samplingfeatures.samplingfeatureid = odm2.featureactions.samplingfeatureid JOIN (odm2.results JOIN odm2.timeseriesresults ON odm2.results.resultid = odm2.timeseriesresults.resultid) ON odm2.featureactions.featureactionid = odm2.results.featureactionid \nWHERE odm2.featureactions.samplingfeatureid = odm2.sites.samplingfeatureid AND odm2.results.featureactionid = 
odm2.featureactions.featureactionid AND odm2.sites.latitude >= %(latitude_1)s AND odm2.sites.latitude <= %(latitude_2)s AND odm2.sites.longitude >= %(longitude_1)s AND odm2.sites.longitude <= %(longitude_2)s'] [parameters: {'longitude_1': -114.0, 'longitude_2': -110.0, 'param_1': 'Specimen', 'param_3': 'samplingfeatures', 'param_2': 'Site', 'latitude_2': 42.0, 'latitude_1': 40.0, 'samplingfeaturetypecv_2': 'Site', 'samplingfeaturetypecv_1': 'Specimen'}]
</faultstring>
<faultactor/>
</ns0:Fault>
@miguelcleon
Member Author

I'm thinking this may well be the system running out of RAM to hold the data in local memory while it's pulling SQL records into Django querysets. I've run into that problem during data ingestion and had to write some server-side scripts to break files into smaller pieces. I don't get this same error there, but I suspect the problem is just manifesting differently in this setting.

@lsetiawan
Member

@miguelcleon It seems like you have encountered this problem before and solved it? #73 (comment)

@miguelcleon
Member Author

@lsetiawan It's an issue that appears intermittently. I initially thought it was the file system running out of space, but that doesn't appear to be the case. When you reload Apache, the error goes away and might not appear again for a while.

@lsetiawan
Member

Hmm... okay.

> system running out of ram to store the data in local memory while it's pulling sql records into Django querysets.

How does Django come into play with WOFpy?

@emiliom
Member

emiliom commented Jun 26, 2017

The "lazy-apps" setting @lsetiawan has mentioned before, which fixed this issue with a PostgreSQL backend, is probably specific to nginx, right Don? Assuming it is, maybe there's an equivalent setting/flag in Apache?

@lsetiawan
Member

lazy-apps is a uWSGI setting, not an nginx one. Apache is usually paired with mod_wsgi, at least according to the Flask documentation (http://flask.pocoo.org/docs/0.12/deploying/mod_wsgi/).

@miguelcleon
Member Author

With the DAO, I would think: loading querysets with lots of SQL records. I'm kind of guessing about the RAM; it would need more testing to figure out whether that is really happening. Actually, all I'd need to do is pull a huge time series and watch top.
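
One way to test the RAM theory is to compare pulling a whole result set at once against reading it in fixed-size batches, watching `top` during each run. The sketch below shows only the batched side and is an illustration, not WOFpy's actual DAO code. It uses an in-memory `sqlite3` database as a stand-in for PostgreSQL; `iter_batches`, `demo`, and the two-column table are hypothetical. With psycopg2, a default (client-side) cursor buffers the full result set, so a named (server-side) cursor would be needed for batching to actually limit memory.

```python
import sqlite3


def iter_batches(cursor, batch_size=1000):
    """Yield lists of rows from a DB-API cursor, at most batch_size rows each."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def demo(n_rows=10000, batch_size=1000):
    """Load n_rows into a stand-in table, then read them back in batches.

    Returns (total rows read, size of the largest batch held at once).
    """
    conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL database
    conn.execute(
        "CREATE TABLE timeseriesresultvalues (valueid INTEGER, datavalue REAL)"
    )
    conn.executemany(
        "INSERT INTO timeseriesresultvalues VALUES (?, ?)",
        ((i, i * 0.5) for i in range(n_rows)),
    )
    cursor = conn.execute(
        "SELECT valueid, datavalue FROM timeseriesresultvalues"
    )
    total = 0
    largest = 0
    for batch in iter_batches(cursor, batch_size):
        total += len(batch)
        largest = max(largest, len(batch))
    conn.close()
    return total, largest
```

If memory stays flat with batches but climbs steadily with a single `fetchall()`, that would support the RAM explanation; if both behave the same, the cause is more likely elsewhere.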

@miguelcleon
Member Author

miguelcleon commented Jul 18, 2017

After you get the EOF error, the following requests fail:

From here http://dev-odm2admin.cuahsi.org/wofpy/odm2timeseries/rest/1_1/GetVariableInfo?variable=odm2timeseries:DO%20Concentration you get:

<ns0:Fault xmlns:ns0="http://schemas.xmlsoap.org/soap/envelope/">
<faultcode>soap11env:Server</faultcode>
<faultstring>
(sqlalchemy.exc.InvalidRequestError) Can't reconnect until invalid transaction is rolled back [SQL: u'SELECT DISTINCT ON (odm2.variables.variableid) odm2.timeseriesresultvalues.valueid AS odm2_timeseriesresultvalues_valueid, odm2.timeseriesresultvalues.resultid AS odm2_timeseriesresultvalues_resultid, odm2.timeseriesresultvalues.datavalue AS odm2_timeseriesresultvalues_datavalue, odm2.timeseriesresultvalues.valuedatetime AS odm2_timeseriesresultvalues_valuedatetime, odm2.timeseriesresultvalues.valuedatetimeutcoffset AS odm2_timeseriesresultvalues_valuedatetimeutcoffset, odm2.timeseriesresultvalues.censorcodecv AS odm2_timeseriesresultvalues_censorcodecv, odm2.timeseriesresultvalues.qualitycodecv AS odm2_timeseriesresultvalues_qualitycodecv, odm2.timeseriesresultvalues.timeaggregationinterval AS odm2_timeseriesresultvalues_timeaggregationinterval, odm2.timeseriesresultvalues.timeaggregationintervalunitsid AS odm2_timeseriesresultvalues_timeaggregationintervalunitsi_1 \nFROM odm2.timeseriesresultvalues JOIN (odm2.results JOIN odm2.timeseriesresults ON odm2.results.resultid = odm2.timeseriesresults.resultid) ON odm2.timeseriesresults.resultid = odm2.timeseriesresultvalues.resultid JOIN odm2.variables ON odm2.variables.variableid = odm2.results.variableid \nWHERE odm2.variables.variableid = odm2.results.variableid AND odm2.variables.variablecode = %(variablecode_1)s'] [parameters: [{}]]
</faultstring>
<faultactor/>
</ns0:Fault>

From here http://dev-odm2admin.cuahsi.org/wofpy/odm2timeseries/rest/1_1/GetSites?site=odm2timeseries:Rio%20Icacos%20Trib-IO you get:

<ns0:Fault xmlns:ns0="http://schemas.xmlsoap.org/soap/envelope/">
<faultcode>soap11env:Server</faultcode>
<faultstring>Site odm2timeseries:Rio Icacos Trib-IO Not Found</faultstring>
<faultactor/>
</ns0:Fault>

From here http://dev-odm2admin.cuahsi.org/wofpy/odm2timeseries/rest/1_1/GetValues?location=odm2timeseries:Rio%20Icacos%20Trib-IO&variable=odm2timeseries:DO%20Concentration you get:


<ns0:Fault xmlns:ns0="http://schemas.xmlsoap.org/soap/envelope/">
<faultcode>soap11env:Server</faultcode>
<faultstring>
Values Not Found for Rio Icacos Trib-IO:DO Concentration for dates None - None
</faultstring>
<faultactor/>
</ns0:Fault>
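
The "Can't reconnect until invalid transaction is rolled back" fault is what SQLAlchemy raises when a session whose transaction failed (here, presumably because of the dropped connection) is reused without calling `rollback()`. A minimal sketch of the usual remedy follows, assuming each request can be wrapped in its own scope. `session_scope`, `RecordingSession`, and `run_query` are hypothetical names; in a real deployment `session_factory` would be a SQLAlchemy `sessionmaker`, and this is not necessarily how WOFpy manages its sessions.

```python
from contextlib import contextmanager


@contextmanager
def session_scope(session_factory):
    """Commit on success, roll back on any error, and always close."""
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        # Clear the failed transaction so the next request starts clean.
        session.rollback()
        raise
    finally:
        session.close()


class RecordingSession:
    """Hypothetical stand-in for a SQLAlchemy Session; records calls made on it."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


def run_query(fail):
    """Simulate one request, optionally failing the way the EOF error does."""
    session = RecordingSession()
    try:
        with session_scope(lambda: session):
            if fail:
                raise RuntimeError("SSL SYSCALL error: EOF detected")
    except RuntimeError:
        pass  # a real service would turn this into a SOAP fault
    return session.calls
```

With this pattern a failed request still returns an error, but the session is rolled back and closed, so later requests should not inherit the invalid transaction.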

@lsetiawan
Member

@miguelcleon if you restart WOFpy, what happens then?

@miguelcleon
Member Author

It will work again.

@lsetiawan
Member

I am currently using your ODM2LCZO Database for testing. I am not seeing the problem there so far.

@miguelcleon
Member Author

miguelcleon commented Jul 18, 2017

Prior to the EOF error, I got the timeout error below. Unfortunately, the problem doesn't seem to be reproducible.

Fault: Fault(Server: "(psycopg2.DatabaseError) SSL SYSCALL error: Connection timed out\\n [SQL: 'SELECT DISTINCT odm2.sites.samplingfeatureid AS odm2_sites_samplingfeatureid, odm2.samplingfeatures.samplingfeatureid AS odm2_samplingfeatures_samplingfeatureid, odm2.sites.spatialreferenceid AS odm2_sites_spatialreferenceid, odm2.sites.sitetypecv AS odm2_sites_sitetypecv, odm2.sites.latitude AS odm2_sites_latitude, odm2.sites.longitude AS odm2_sites_longitude, odm2.samplingfeatures.samplingfeatureuuid AS odm2_samplingfeatures_samplingfeatureuuid, odm2.samplingfeatures.samplingfeaturetypecv AS odm2_samplingfeatures_samplingfeaturetypecv, odm2.samplingfeatures.samplingfeaturecode AS odm2_samplingfeatures_samplingfeaturecode, odm2.samplingfeatures.samplingfeaturename AS odm2_samplingfeatures_samplingfeaturename, odm2.samplingfeatures.samplingfeaturedescription AS odm2_samplingfeatures_samplingfeaturedescription, odm2.samplingfeatures.samplingfeaturegeotypecv AS odm2_samplingfeatures_samplingfeaturegeotypecv, odm2.samplingfeatures.elevation_m AS odm2_samplingfeatures_elevation_m, odm2.samplingfeatures.elevationdatumcv AS odm2_samplingfeatures_elevationdatumcv, odm2.samplingfeatures.featuregeometrywkt AS odm2_samplingfeatures_featuregeometrywkt, CASE WHEN (odm2.samplingfeatures.samplingfeaturetypecv = %(samplingfeaturetypecv_1)s) THEN %(param_1)s WHEN (odm2.samplingfeatures.samplingfeaturetypecv = %(samplingfeaturetypecv_2)s) THEN %(param_2)s ELSE %(param_3)s END AS _sa_polymorphic_on \\\\nFROM odm2.samplingfeatures JOIN odm2.sites ON odm2.samplingfeatures.samplingfeatureid = odm2.sites.samplingfeatureid JOIN odm2.featureactions ON odm2.samplingfeatures.samplingfeatureid = odm2.featureactions.samplingfeatureid JOIN (odm2.results JOIN odm2.timeseriesresults ON odm2.results.resultid = odm2.timeseriesresults.resultid) ON odm2.featureactions.featureactionid = odm2.results.featureactionid \\\\nWHERE odm2.featureactions.samplingfeatureid = odm2.sites.samplingfeatureid AND 
odm2.results.featureactionid = odm2.featureactions.featureactionid'] [parameters: {'param_1': 'Specimen', 'param_2': 'Site', 'samplingfeaturetypecv_2': 'Site', 'param_3': 'samplingfeatures', 'samplingfeaturetypecv_1': 'Specimen'}]")

@miguelcleon
Member Author

I pulled the timeout error from the Apache error log.
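
A timeout followed by "EOF detected" is consistent with a pooled connection that went stale or was dropped by the network while idle, although nothing in this thread confirms that cause. The sketch below collects settings that are commonly used against stale connections: SQLAlchemy's `pool_pre_ping` (SQLAlchemy 1.2 and later) and `pool_recycle`, plus libpq's TCP keepalive parameters passed through `connect_args`. The helper names and all numeric values are illustrative assumptions, not tested recommendations for this server.

```python
def resilient_engine_kwargs(recycle_seconds=1800, idle_seconds=60):
    """Keyword arguments for sqlalchemy.create_engine (values are illustrative)."""
    return {
        # Ping each pooled connection before use and replace it if dead
        # (SQLAlchemy 1.2 and later).
        "pool_pre_ping": True,
        # Discard pooled connections older than this many seconds.
        "pool_recycle": recycle_seconds,
        # Passed to psycopg2.connect(), which hands them on to libpq.
        "connect_args": {
            "keepalives": 1,
            "keepalives_idle": idle_seconds,
            "keepalives_interval": 10,
            "keepalives_count": 5,
        },
    }


def make_engine(url):
    """Build an engine with the options above."""
    # Imported lazily so the helper above can be inspected without SQLAlchemy.
    from sqlalchemy import create_engine

    return create_engine(url, **resilient_engine_kwargs())
```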

@lsetiawan
Member

Are you using mod_wsgi or uWSGI?

@miguelcleon
Member Author

mod_wsgi

@lsetiawan
Member

I think you don't have "graceful" reloading in place. Every time you reload the browser, it's not killing the session, so eventually your database gets overwhelmed. lazy-apps fixes that with a uWSGI/NGINX setup. I am not sure what the equivalent is for a mod_wsgi/Apache setup, unless you already figured it out and it's still not working.
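
For context, uWSGI's `lazy-apps` option loads the application in each worker instead of once in the master before forking, so every worker opens its own database connections rather than inheriting the master's. A rough way to get a similar guarantee in application code, whatever the server, is to key the engine on the process ID. This is a sketch under that assumption: `get_engine` and the `factory` parameter are hypothetical, and whether connection sharing across processes is actually the problem under mod_wsgi is not established here.

```python
import os

_engines = {}  # (pid, url) -> engine created inside that process


def get_engine(url, factory=None):
    """Return an engine that was created by the current process.

    A forked child has a different PID, so it builds a fresh engine
    instead of reusing connections opened by its parent.
    """
    if factory is None:
        # Imported lazily; assumes SQLAlchemy is installed in the deployment.
        from sqlalchemy import create_engine as factory
    key = (os.getpid(), url)
    if key not in _engines:
        _engines[key] = factory(url)
    return _engines[key]
```

The inherited parent entry stays in the child's dictionary but is never looked up again, which keeps the sketch short at the cost of not disposing of it explicitly.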

@miguelcleon
Member Author

ok, I'll look into that.

@lsetiawan
Member

lsetiawan commented Aug 29, 2017

@miguelcleon I have hopefully provided a fix for your EOF problem. Please try deploying your WOFpy server again with the latest copy of master, and let me know if you still encounter the error. Thanks.
