Segfault when Rails enters parallel testing mode #39

Open
buffpojken opened this issue Mar 27, 2023 · 24 comments

Comments

@buffpojken

buffpojken commented Mar 27, 2023

Steps to reproduce

Include rgeo in a Rails 7 project. Add enough tests to trigger parallel test runs (the default threshold is 50 tests). RGeo will segfault with the attached stack trace.

Expected behavior

Tests to run as usual

Actual behavior

Segfault in proj4-communication.

System configuration

Ruby version:
3.1.0

OS:
MacOS

Stack trace

/Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-proj4-3.1.1/lib/rgeo/coord_sys/proj4.rb:191: [BUG] Segmentation fault at 0x0000000104c38a70 ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [arm64-darwin21]

c:0049 p:---- s:0264 e:000263 CFUNC  :_create
c:0048 p:0052 s:0258 e:000257 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-proj4-3.1.1/lib/rgeo/coord_sys/proj4.rb:191
c:0047 p:0392 s:0251 e:000250 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-2.4.0/lib/rgeo/geos/capi_factory.rb:66
c:0046 p:0097 s:0234 e:000233 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-2.4.0/lib/rgeo/geos/interface.rb:200
c:0045 p:0060 s:0228 e:000227 BLOCK  /Users/pekkaakerstrom/repos/pattern/app/models/local_plan.rb:113 [FINISH]
c:0044 p:---- s:0217 e:000216 IFUNC 
c:0043 p:---- s:0214 e:000213 CFUNC  :each
c:0042 p:---- s:0211 e:000210 CFUNC  :each_with_index
c:0041 p:0006 s:0207 e:000206 BLOCK  /Users/pekkaakerstrom/repos/pattern/app/models/local_plan.rb:107
c:0040 p:0012 s:0204 e:000203 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/transaction [FINISH]
c:0039 p:---- s:0197 e:000196 CFUNC  :handle_interrupt
c:0038 p:0029 s:0192 e:000191 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/concurrency/load_interlock_aware_monit [FINISH]
c:0037 p:---- s:0189 e:000188 CFUNC  :handle_interrupt
c:0036 p:0021 s:0184 e:000183 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/concurrency/load_interlock_aware_monit
c:0035 p:0008 s:0179 e:000178 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/transaction
c:0034 p:0053 s:0172 e:000171 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/database_st
c:0033 p:0011 s:0163 e:000162 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/transactions.rb:209
c:0032 p:0016 s:0157 e:000156 METHOD /Users/pekkaakerstrom/repos/pattern/app/models/local_plan.rb:106
c:0031 p:0032 s:0147 e:000146 BLOCK  /Users/pekkaakerstrom/repos/pattern/test/models/local_plan_test.rb:35 [FINISH]
c:0030 p:0018 s:0142 e:000141 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest/test.rb:98
c:0029 p:0002 s:0139 e:000138 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest/test.rb:195
c:0028 p:0004 s:0134 e:000133 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest/test.rb:95
c:0027 p:0015 s:0131 e:000130 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:296
c:0026 p:0004 s:0126 e:000125 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest/test.rb:94
c:0025 p:0029 s:0123 E:001ac0 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:391
c:0024 p:0044 s:0115 E:001f78 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest/test.rb:243
c:0023 p:0004 s:0108 E:0014f0 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest/test.rb:93
c:0022 p:0008 s:0104 e:000103 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/executor/test_helper.rb:5
c:0021 p:0012 s:0101 e:000100 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/execution_wrapper.rb:105
c:0020 p:0016 s:0096 e:000095 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/executor/test_helper.rb:5
c:0019 p:0008 s:0090 e:000089 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:1059
c:0018 p:0015 s:0083 e:000082 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization/worker.rb:50
c:0017 p:0029 s:0080 E:001440 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:391
c:0016 p:0029 s:0072 E:001238 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:378
c:0015 p:0049 s:0065 E:001338 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization/worker.rb:49
c:0014 p:0011 s:0056 e:000055 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization/worker.rb:38
c:0013 p:0053 s:0051 e:000050 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization/worker.rb:27 [FINISH]
c:0012 p:---- s:0048 e:000047 CFUNC  :fork
c:0011 p:0004 s:0044 e:000043 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization/worker.rb:15
c:0010 p:0018 s:0040 e:000039 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization.rb:37 [FINISH]
c:0009 p:---- s:0036 e:000035 IFUNC 
c:0008 p:---- s:0033 e:000032 CFUNC  :times
c:0007 p:---- s:0030 e:000029 CFUNC  :each
c:0006 p:---- s:0027 e:000026 CFUNC  :map
c:0005 p:0008 s:0023 e:000022 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization.rb:36
c:0004 p:0023 s:0019 e:000018 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelize_executor.rb:18
c:0003 p:0162 s:0015 e:000014 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:156
c:0002 p:0073 s:0008 E:001c50 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:83 [FINISH]
c:0001 p:0000 s:0003 E:001960 (none) [FINISH]
@keithdoggett
Member

@buffpojken thanks for the report and the stack trace. This issue is failing in the proj4 create method, so I'm going to transfer it to the RGeo-Proj4 repo.

I noticed you're using version 3.1.1 of rgeo-proj4. We recently released version 4 which includes a few fixes to the C extension and fixed a bug where segfaults would occasionally happen. Are you able to upgrade to test if it's still occurring in the new version?

@keithdoggett keithdoggett transferred this issue from rgeo/rgeo Mar 29, 2023
@Pe-co

Pe-co commented Apr 4, 2023

@keithdoggett, I'm taking the ball from @buffpojken since his plate is fuller than mine atm.
I tested again after upgrading and got the same result:

-- Control frame information -----------------------------------------------
c:0066 p:---- s:0353 e:000352 CFUNC  :_create
c:0065 p:0079 s:0347 e:000346 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-proj4-4.0.0/lib/rgeo/coord_sys/proj4.rb:249
c:0064 p:0099 s:0340 e:000339 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-3.0.0/lib/rgeo/impl_helper/utils.rb:28
c:0063 p:0250 s:0333 e:000332 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-3.0.0/lib/rgeo/geos/capi_factory.rb:56
c:0062 p:0099 s:0318 e:000317 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-3.0.0/lib/rgeo/geos/interface.rb:174
c:0061 p:0038 s:0312 e:000311 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-3.0.0/lib/rgeo/cartesian/interface.rb:31
c:0060 p:0045 s:0307 e:000306 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-activerecord-7.0.1/lib/rgeo/active_record/spatial_factory_store.rb:42
c:0059 p:0016 s:0302 e:000301 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-activerecord-7.0.1/lib/rgeo/active_record/spatial_factory_store.rb:21
c:0058 p:0014 s:0297 e:000296 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-activerecord-7.0.1/lib/rgeo/active_record/spatial_factory_store.rb:29
c:0057 p:0031 s:0292 e:000291 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-postgis-adapter-8.0.1/lib/active_record/connection_adapters/postgi
c:0056 p:0055 s:0288 e:000286 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-postgis-adapter-8.0.1/lib/active_record/connection_adapters/postgi
c:0055 p:0005 s:0282 e:000281 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-postgis-adapter-8.0.1/lib/active_record/connection_adapters/postgi
c:0054 p:0028 s:0277 e:000276 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-postgis-adapter-8.0.1/lib/active_record/connection_adapters/postgi
c:0053 p:0013 s:0272 e:000271 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-postgis-adapter-8.0.1/lib/active_record/connection_adapters/postgi
c:0052 p:0027 s:0266 e:000264 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/database_st [FINISH]
c:0051 p:---- s:0259 e:000258 IFUNC 
c:0050 p:---- s:0256 e:000255 CFUNC  :each
c:0049 p:---- s:0253 e:000252 CFUNC  :map
c:0048 p:0076 s:0249 e:000248 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/database_st [FINISH]
c:0047 p:---- s:0244 e:000243 CFUNC  :map
c:0046 p:0017 s:0240 e:000239 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/database_st
c:0045 p:0016 s:0228 e:000227 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/database_st [FINISH]
c:0044 p:---- s:0223 e:000222 IFUNC 
c:0043 p:---- s:0220 e:000219 CFUNC  :each
c:0042 p:---- s:0217 e:000216 CFUNC  :filter_map
c:0041 p:0005 s:0213 e:000212 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/database_st
c:0040 p:0009 s:0208 e:000207 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/database_st
c:0039 p:0031 s:0199 E:001c68 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/fixtures.rb:630 [FINISH]
c:0038 p:---- s:0193 e:000192 CFUNC  :each
c:0037 p:0012 s:0189 E:0026c8 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/fixtures.rb:621
c:0036 p:0024 s:0182 e:000181 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/fixtures.rb:607
c:0035 p:0088 s:0172 E:000850 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/fixtures.rb:567
c:0034 p:0026 s:0160 e:000159 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/test_fixtures.rb:271
c:0033 p:0140 s:0155 E:000058 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/test_fixtures.rb:125
c:0032 p:0003 s:0150 e:000149 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activerecord-7.0.3.1/lib/active_record/test_fixtures.rb:10
c:0031 p:0004 s:0146 e:000145 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/setup_and_teardown.rb:40
c:0030 p:0003 s:0142 e:000141 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest/test.rb:96
c:0029 p:0002 s:0139 e:000138 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest/test.rb:195
c:0028 p:0004 s:0134 e:000133 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest/test.rb:95
c:0027 p:0015 s:0131 e:000130 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:296
c:0026 p:0004 s:0126 e:000125 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest/test.rb:94
c:0025 p:0029 s:0123 E:001840 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:391
c:0024 p:0044 s:0115 E:002428 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest/test.rb:243
c:0023 p:0004 s:0108 E:000240 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest/test.rb:93
c:0022 p:0008 s:0104 e:000103 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/executor/test_helper.rb:5
c:0021 p:0012 s:0101 e:000100 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/execution_wrapper.rb:105
c:0020 p:0016 s:0096 e:000095 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/executor/test_helper.rb:5
c:0019 p:0008 s:0090 e:000089 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:1059
c:0018 p:0015 s:0083 e:000082 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization/worker.rb:50
c:0017 p:0029 s:0080 E:000ef0 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:391
c:0016 p:0029 s:0072 E:001398 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:378
c:0015 p:0049 s:0065 E:001b98 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization/worker.rb:49
c:0014 p:0011 s:0056 e:000055 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization/worker.rb:38
c:0013 p:0053 s:0051 e:000050 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization/worker.rb:27 [FINISH]
c:0012 p:---- s:0048 e:000047 CFUNC  :fork
c:0011 p:0004 s:0044 e:000043 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization/worker.rb:15
c:0010 p:0018 s:0040 e:000039 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization.rb:37 [FINISH]
c:0009 p:---- s:0036 e:000035 IFUNC 
c:0008 p:---- s:0033 e:000032 CFUNC  :times
c:0007 p:---- s:0030 e:000029 CFUNC  :each
c:0006 p:---- s:0027 e:000026 CFUNC  :map
c:0005 p:0008 s:0023 e:000022 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelization.rb:36
c:0004 p:0023 s:0019 e:000018 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/activesupport-7.0.3.1/lib/active_support/testing/parallelize_executor.rb:18
c:0003 p:0162 s:0015 e:000014 METHOD /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:156
c:0002 p:0073 s:0008 E:0023b0 BLOCK  /Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/minitest-5.16.3/lib/minitest.rb:83 [FINISH]
c:0001 p:0000 s:0003 E:001d50 (none) [FINISH]

@keithdoggett
Member

@Pe-co thanks for trying that.

I tried to replicate in proj4 with a test like this

  def test_create_parallel
    epsg_codes = (2000..2049).to_a
    Parallel.each(epsg_codes, in_parallel: 8) do |epsg_code|
      proj = RGeo::CoordSys::Proj4.create(epsg_code)
      assert_equal("EPSG:#{epsg_code}", proj.auth_name)
    end
  end

but couldn't reproduce the issue.

I also created a rails app from scratch and added enough tests to trigger parallel tests, but still couldn't get a seg fault.

A few questions:

  • What version of proj are you using?
  • The segfault is happening when minitest is spawning child processes not during actual testing?
  • If possible can you share your SpatialFactoryStore configuration?

@Pe-co

Pe-co commented Apr 5, 2023

This is the rgeo initializer we use

RGeo::ActiveRecord::SpatialFactoryStore.instance.tap do |store|
  store.register(RGeo::Geographic.spherical_factory(srid: 4326, uses_lenient_assertions: true), {sql_type: 'geography'})
end

Proj version is - Rel. 9.0.1, June 15th, 2022

The segfault is happening when minitest is spawning child processes not during actual testing?
Seems like it, judging from the second call stack; I did not notice that difference. In the first call stack it was triggered by a test case, but in the second I cannot see that it is related to a specific test. We have a couple of fixtures that have geometry specified. I assume that is what causes it to trigger at that stage. If I nuke them locally, I get a call stack referring to a test again, which hits the same error in _create.

With some more local testing, I have found that the store.register call in the initializer seems to be the trigger. If I skip that line, I get a clean run of my tests. I assume a couple of tests or fixtures that use the SpatialFactoryStore are needed as well.

@keithdoggett
Member

keithdoggett commented Apr 17, 2023

Thanks for that extra info @Pe-co and apologies about the delay, I was away the last few weeks.

I'll try to replicate with that version of Proj.

I've been thinking about this some and I am a little confused why this issue is happening with multiprocessing. Issues like this typically happen when trying multi-threading (ex. rgeo/rgeo#307), but I'm not sure why this issue would only be happening when we have multiple processes. Maybe there's some kind of shared resource that Proj writes to that could be causing the issue. Maybe the way minitest spawns additional processes is causing issues.
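One way to picture how multiprocessing alone could matter: a forked child starts with a copy of whatever the parent created before the fork, including state set up by an initializer. The sketch below only illustrates that inheritance. `SharedState` and `inspect_state_in_child` are hypothetical names, nothing here calls Proj, and it assumes a platform where `fork` is available.

```ruby
# Hypothetical stand-in for process-wide state owned by a native library.
# Nothing here touches Proj; it only shows what a forked child inherits.
module SharedState
  @handles = []

  class << self
    attr_reader :handles

    # Record a "handle" together with the PID that created it.
    def open_handle(name)
      @handles << { name: name, owner_pid: Process.pid }
    end
  end
end

# Forks one child and returns [handles_seen_by_child, child_pid].
def inspect_state_in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(Marshal.dump(SharedState.handles))
    writer.close
    exit!(0) # skip at_exit handlers in the child
  end
  writer.close
  data = Marshal.load(reader.read)
  reader.close
  Process.wait(pid)
  [data, pid]
end

# Created before forking, the way an initializer would run before workers spawn.
SharedState.open_handle("EPSG:4326")

handles, child_pid = inspect_state_in_child
handles.each do |h|
  puts "child #{child_pid} inherited #{h[:name]}, opened by pid #{h[:owner_pid]}"
end
```

The child reports a handle whose owner is the parent's PID. If a native library keeps similar state in a default, process-wide object, every forked worker would begin with a copy of it; whether that is what happens here is still a guess.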

@keithdoggett
Member

I also noticed that you only define a spherical factory in the spatial factory store which doesn't use proj. Do you have a lot of other SRIDs being used in your app or is it just 4326 for geographical data and one other projection?

@Pe-co

Pe-co commented Apr 17, 2023

Input can come in various formats and gets converted to a few that we persist. I don't think the tests use more than 4326 and 3006. We do not declare an RGeo::Geographic.spherical_factory for any SRID other than 4326, but we have an open RGeo::Geos.factory that can be fed other SRIDs. I think the error triggers even with that part disabled.
Most other conversions are done in PostgreSQL, so they should not affect this, I guess?

@Pe-co

Pe-co commented Apr 17, 2023

Maybe the way minitest spawns additional processes is causing issues.
There seems to be some child process relation from the prints, so perhaps you are right.

Running 108 tests in parallel using 10 processes
DEBUGGER: Attaching after process 30727 fork to child process 30732
DEBUGGER[bin/rails#30733]: Attaching after process 30727 fork to child process 30733
DEBUGGER[bin/rails#30734]: Attaching after process 30727 fork to child process 30734
DEBUGGER[bin/rails#30735]: Attaching after process 30727 fork to child process 30735
DEBUGGER[bin/rails#30736]: Attaching after process 30727 fork to child process 30736
DEBUGGER[bin/rails#30737]: Attaching after process 30727 fork to child process 30737
DEBUGGER[bin/rails#30738]: Attaching after process 30727 fork to child process 30738
DEBUGGER[bin/rails#30739]: Attaching after process 30727 fork to child process 30739
DEBUGGER[bin/rails#30740]: Attaching after process 30727 fork to child process 30740
DEBUGGER[bin/rails#30741]: Attaching after process 30727 fork to child process 30741
Run options: --seed 36391

# Running:

/Users/pekkaakerstrom/.rvm/gems/ruby-3.1.2/gems/rgeo-proj4-4.0.0/lib/rgeo/coord_sys/proj4.rb:249: [BUG] Segmentation fault at 0x0000000108bf4ade
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [arm64-darwin21]

@keithdoggett
Member

I tried to replicate the issue in a fresh rails app with the following setup. I'm also using proj 9.0.1.

# db/schema.rb

ActiveRecord::Schema[7.0].define(version: 2023_04_17_124002) do
  # These are extensions that must be enabled in order to support this database
  enable_extension "plpgsql"
  enable_extension "postgis"

  create_table "buildings", force: :cascade do |t|
    t.string "name"
    t.geometry "boundary", limit: {:srid=>3006, :type=>"st_polygon"}
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
  end

  create_table "geo_buildings", force: :cascade do |t|
    t.string "name"
    t.geography "boundary", limit: {:srid=>4326, :type=>"st_polygon", :geographic=>true}
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
  end

end

# config/initializers/spatial_factory_store.rb
# frozen_string_literal: true

RGeo::ActiveRecord::SpatialFactoryStore.instance.tap do |config|
  config.register(RGeo::Geographic.spherical_factory(srid: 4326, uses_lenient_assertions: true),
                  { sql_type: 'geography' })
end

# test/models/building_test.rb

require 'test_helper'

class BuildingTest < ActiveSupport::TestCase
  1000.times do |i|
    define_method "test_number_#{i}" do
      building = Building.new
      polygon = "POLYGON ((0 0, 0 #{i}, #{i} #{i}, #{i} 0, 0 0))"
      building.boundary = polygon
      building.name = i.to_s

      assert_equal(building.boundary.as_text, polygon)
      building.save

      building = Building.last
      assert_equal(building.boundary.as_text, polygon)

      geo_fac = RGeo::Geographic.spherical_factory(srid: 4326)
      geo_building = GeoBuilding.new

      polygon = geo_fac.parse_wkt("POLYGON ((0.0 0.0, 0.0 #{(i % 5) + 1}.0, #{(i % 5) + 1}.0 #{(i % 5) + 1}.0, #{(i % 5) + 1}.0 0.0, 0.0 0.0))")

      geo_building.boundary = polygon
      geo_building.name = i.to_s

      assert_equal(geo_building.boundary, polygon)
      geo_building.save

      geo_building = GeoBuilding.last
      assert_equal(geo_building.boundary, polygon)
    end
  end
end

# Gemfile
source 'https://rubygems.org'
git_source(:github) { |repo| "https://github.com/#{repo}.git" }

ruby '3.1.2'

# Bundle edge Rails instead: gem "rails", github: "rails/rails", branch: "main"
gem 'rails', '~> 7.0.4', '>= 7.0.4.3'

# Use postgresql as the database for Active Record
gem 'pg', '~> 1.1'

# Use the Puma web server [https://github.com/puma/puma]
gem 'puma', '~> 5.0'

# Build JSON APIs with ease [https://github.com/rails/jbuilder]
# gem "jbuilder"

# Use Redis adapter to run Action Cable in production
# gem "redis", "~> 4.0"

# Use Kredis to get higher-level data types in Redis [https://github.com/rails/kredis]
# gem "kredis"

# Use Active Model has_secure_password [https://guides.rubyonrails.org/active_model_basics.html#securepassword]
# gem "bcrypt", "~> 3.1.7"

# Windows does not include zoneinfo files, so bundle the tzinfo-data gem
gem 'tzinfo-data', platforms: %i[mingw mswin x64_mingw jruby]

# Reduces boot times through caching; required in config/boot.rb
gem 'bootsnap', require: false

# Use Active Storage variants [https://guides.rubyonrails.org/active_storage_overview.html#transforming-images]
# gem "image_processing", "~> 1.2"

# Use Rack CORS for handling Cross-Origin Resource Sharing (CORS), making cross-origin AJAX possible
# gem "rack-cors"

gem 'activerecord-postgis-adapter'
gem 'rgeo'
gem 'rgeo-proj4'

group :development, :test do
  # See https://guides.rubyonrails.org/debugging_rails_applications.html#debugging-with-the-debug-gem
  gem 'debug', platforms: %i[mri mingw x64_mingw]
end

group :development do
  # Speed up commands on slow machines / big apps [https://github.com/rails/spring]
  # gem "spring"
end

The tests are mainly nonsense but I wanted to test writing and reading spatial data.

Unfortunately I'm still not able to replicate the issue with this. Are you able to reproduce the issue in a minimal Rails app? Alternatively, if you could modify your local proj4 gem to print the data being passed into the _create C method, that would be helpful, because I'm having trouble reproducing this.
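For anyone wanting to capture those arguments without editing the installed gem files, one option is to wrap the class method with a prepended module. This is only a sketch: `FakeProj4` is a hypothetical stand-in, and the `_create(defn, radians)` signature is an assumption for illustration, not a statement about the real extension.

```ruby
# Hypothetical stand-in for the class whose native constructor is being traced.
# The method name and signature are assumptions made for this example only.
class FakeProj4
  def self._create(defn, radians)
    new
  end
end

# Prepending a module to the singleton class wraps the class method
# without touching the original source file.
module CreateTracer
  CALLS = []

  def _create(*args)
    # Record and print the PID so each call can be tied to a forked worker.
    CALLS << { pid: Process.pid, args: args }
    warn "[pid #{Process.pid}] _create called with #{args.inspect}"
    super
  end
end

FakeProj4.singleton_class.prepend(CreateTracer)

FakeProj4._create("EPSG:4326", nil)
```

In a real app the prepend would go in the test helper or an initializer, targeting whichever class actually defines the method.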

@Pe-co

Pe-co commented Apr 27, 2023

Sorry for the late reply. I have not found time to try to reproduce the issue in a clean project, but here are two prints with the inputs to the _create function.

I get to the function 2 times before the segfault
First time:
Variable - defn_
value - "EPSG:4055"
Variable - opts_[:radians]
value - nil

Second time:
Variable - defn_
value - "EPSG:4326"
Variable - opts_[:radians]
value - nil

I did not expect the 4055 value, and I don't know what causes it to be passed to proj-4.

@Pe-co

Pe-co commented Apr 27, 2023

If I run the tests with the registered instance commented out of the initializer, I get a bunch of prints with 4326, 3006, and 4055.

@BuonOmo
Member

BuonOmo commented May 3, 2023

@Pe-co do you think you could share a link to an example app that reproduces the error for you? That would help us a lot to fix the error.

Here is an example app guide to set you up quickly!

@Pe-co

Pe-co commented May 17, 2023

I think that creating an app that reproduces the issue could be time-consuming, and it does not look like I will be able to spend that time in the near future. I will let you know if I make some progress. Sorry for not being more helpful; I do understand that fixing problems you cannot reproduce is painful.

@Pe-co

Pe-co commented May 17, 2023

I sank a few hours into it today and managed to get a repro in a separate project, but it's not reproducing reliably. I still need to spend some more time minimizing it to see what is actually needed.

@BuonOmo
Member

BuonOmo commented May 17, 2023

Thank you very much for using this time. As we are not paid for maintaining RGeo either, these hours you are spending are really precious to us maintainers, and the whole community using the Gem.

If you want to share your ongoing progress, or are stuck at some point, we are here.

The more stable the reproduction is, the better, of course. But if you have a reproduction that fails one time out of two, that would already be enough IMHO.

@Pe-co

Pe-co commented May 30, 2023

Running the tests in this project gives a stable reproduction for me: https://github.com/Pe-co/rgeo_test_app.
The trick seems to be to have enough models with geometry; the error seems to occur during test setup, not execution. Having some tests that did some work did, however, make the reproduction more stable.

@BuonOmo
Member

BuonOmo commented May 31, 2023

I am reproducing locally 🎉

I'll be offline the next few days, but I (hopefully) downloaded everything to try and find this segfault!

@keithdoggett
Member

Thanks for following up on this, @Pe-co. This is very helpful.

@BuonOmo
Member

BuonOmo commented Jun 8, 2023

@Pe-co @keithdoggett Here is a smaller set of the reproduction: https://github.com/rgeo/rgeo-proj4-segv-issue-39. Basically we should use the thread-safe proj api. I won't have time to open the PR myself soon, so if there is any emergency feel free to take it from here.

A few more notes on a potential implementation:

  • we only need one context actually, at least it worked for me. But a reproduction in the repo would be mandatory I think, to be sure we've covered it!
  • there is no need to run the destroy functions, as we only need to generate one context and ruby anyway ends its process without freeing most of the memory (yagni I guess)

@keithdoggett
Member

keithdoggett commented Jun 9, 2023

@BuonOmo I should be able to look into it now that you've found a more reproducible version of it. I was still having issues reproducing it consistently locally, but your repo looks good. Thanks for the work digging into that.

BuonOmo added a commit that referenced this issue Jun 9, 2023
We would segv when using proj and then fork a process. This is fixed
by using proj's thread-safe context.

Fixes #39
@TamerShlash

Hello 👋 Has this been fixed? I'm still getting a segmentation fault here, running on Ruby 3.3.0 and rgeo-proj4-4.0.0.

/path/to/ruby/3.3.0/lib/ruby/gems/3.3.0/gems/rgeo-proj4-4.0.0/lib/rgeo/coord_sys/proj4.rb:249: [BUG] Segmentation fault at 0x0000000105670ad8
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin23]

@keithdoggett
Member

Hey @TamerShlash, unfortunately this issue has not been fixed. We fixed a different but related issue, and @BuonOmo found a way to reproduce/fix the issue, but the change is fairly major and neither of us has had much time to commit to working on it recently. If you want to take a shot at making the changes, we'd be more than happy to facilitate and help with the PR though.

@TamerShlash

Hi @keithdoggett !

I saw the attempted fix PR as well and have a rough understanding of the issue (the Proj context is being accessed by forked threads, if I'm not mistaken). I also had an attempt at fixing that locally, but without much success.

I don't have much experience with C extensions but am happy to help, can you explain what is the scope of the change? Would it help taking a look at the way proj4rb handles contexts? I know they have a different use case.

@BuonOmo
Member

BuonOmo commented Apr 14, 2024

Hi @TamerShlash,

Thanks for digging up the topic. I indeed found a fix, but it was only for another issue I found along the way... I don't have a clear memory of this issue, nor the time right now to handle it, but I'll try to help as much as I can.

About the scope of the change: it might be related to the Proj context. I changed the code not to use the default one, and I tried forking (threads and processes) to see if it was still failing, and it worked for my use case. So my guess would be that there is a specific, unusual case of forking that requires creating a new context. I think it would be important to first find that case, to be sure we're doing it right, and to have a test to prove it! The scope might then be a bit large, but it shouldn't be too scary, as it would mostly be copy-paste, initialization changes, and fork hooks (see https://github.com/rails/rails/blob/main/activesupport/lib/active_support/fork_tracker.rb for a reference). I think the hardest part is the reproduction. If you have it and still do not feel confident about the fix, we could pair on this, as I think it is at most 3 hours of work.
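To make the "new context after fork" idea concrete, here is a rough Ruby-level sketch. It deliberately swaps the fork hook for a simpler lazy PID comparison: the context is rebuilt the first time it is requested in a process other than the one that created it. `FakeContext`, `ContextHolder`, and `owner_pid_in_child` are hypothetical names, and a real fix would have to manage the native Proj context inside the C extension rather than a plain Ruby object.

```ruby
# Hypothetical context object; a real binding would wrap a native handle.
class FakeContext
  attr_reader :owner_pid

  def initialize
    @owner_pid = Process.pid
  end
end

# Hands out one context per process, rebuilding it when the PID changes,
# i.e. on first use inside a forked child.
module ContextHolder
  @mutex = Mutex.new
  @context = nil

  def self.current
    @mutex.synchronize do
      if @context.nil? || @context.owner_pid != Process.pid
        @context = FakeContext.new
      end
      @context
    end
  end
end

# Forks a child and returns [child_pid, owner_pid_of_the_child_context].
def owner_pid_in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(ContextHolder.current.owner_pid.to_s)
    writer.close
    exit!(0) # skip at_exit handlers in the child
  end
  writer.close
  owner = reader.read.to_i
  reader.close
  Process.wait(pid)
  [pid, owner]
end

parent_ctx = ContextHolder.current
child_pid, child_owner = owner_pid_in_child
puts "parent context owned by #{parent_ctx.owner_pid}"
puts "child #{child_pid} context owned by #{child_owner}"
```

A fork hook does the same reset eagerly at fork time. The PID check defers it to first use, which keeps the sketch short but adds a comparison to every access.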

However, it may also not be related to context, we really need a reproduction.

About the proj4rb gem, I think you had a good intuition. The context.rb file is the one we are interested in, I think. Their solution is basically to have one context per thread, and an object finalizer that ensures the Proj contexts created do not leak.
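As a rough illustration of that "one context per thread, plus a finalizer" pattern (this is not proj4rb's actual code), a sketch might look like the following. `ThreadContext` and its `handle` string are hypothetical; in a real binding the finalizer would free the native context instead of appending to an array.

```ruby
# Hypothetical per-thread context wrapper. RELEASED stands in for the
# native cleanup that a real finalizer would perform.
class ThreadContext
  RELEASED = []
  KEY = :fake_proj_context

  # Returns the calling thread's context, creating it on first use.
  def self.current
    thread = Thread.current
    context = thread.thread_variable_get(KEY)
    unless context
      context = new
      thread.thread_variable_set(KEY, context)
    end
    context
  end

  # Built in a class method so the proc does not capture the instance,
  # which would keep it alive and prevent the finalizer from ever running.
  def self.finalizer_for(handle)
    proc { RELEASED << handle }
  end

  attr_reader :handle

  def initialize
    @handle = "ctx-#{Thread.current.object_id}"
    ObjectSpace.define_finalizer(self, self.class.finalizer_for(@handle))
  end
end

main_ctx = ThreadContext.current
other_ctx = Thread.new { ThreadContext.current }.value

puts "same thread reuses its context: #{main_ctx.equal?(ThreadContext.current)}"
puts "another thread gets its own:    #{!main_ctx.equal?(other_ctx)}"
```

The sketch uses `thread_variable_get`/`thread_variable_set` rather than `Thread.current[]`, because the latter is fiber-local and would hand a different context to each fiber in the same thread.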
