gxensemble: fails to start when registry socket file exists #1106

Open
chiefnoah opened this issue Jan 21, 2024 · 3 comments

@chiefnoah
Collaborator

When running the ensemble registry with the default Unix socket, stopping the process and then restarting it fails with the following:

^^>>> gxensemble registry                                                                                13:57:33 

2024-01-21T19:57:36Z INFO ensemble starting registry ...
^C------------- REPL is now in #<thread #46 ticker> -------------
*** INTERRUPTED IN std/actor-v18/proto#ticker__%
^^>>> gxensemble registry                                                                           (70) 13:57:37 
*** ERROR IN std/actor-v18/server#start-actor-server!__% -- 
*** ERROR IN "os/socket.ss"@196.6-196.17 [OSError]: Unknown error -98
--- irritants: socket-bind #<input-port #46 (socket 3)> #<sockaddr* #47 0x5830a6ba3b20> 
--- continuation backtrace:
[0] raise                                                                              
[1] std/io/socket/socket#listen                                                                                                                                                (std/os/socket#socket-bind _sock333968_ _sockaddr333966_)
[2] std/io/socket/api#unix-listen__%                                                                                                                                           (std/io/socket/socket#listen _path349834_ _backlog349836_ _sockopts349838_)
[3] std/actor-v18/server#actor-server-listen!                                                                                                                                  (with-catch values __tmp43054)
[4] std/actor-v18/server#start-actor-server!__%                                                                                                                                (std/actor-v18/server#actor-server-listen! _addrs654541_ _tls-context654533_)
[5] std/actor-v18/api#call-with-ensemble-server__%                                                                                                                             (std/actor-v18/server#start-actor-server!__% '#f _server-id656029_ _tls-conte...
^^>>>                 

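This is likely because socket-bind expects the socket file not to exist and fails at the OS level when it does: the -98 in the error corresponds to errno 98, which is EADDRINUSE on Linux, the error bind returns when a Unix socket path already exists.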
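The OS-level behavior can be reproduced outside Gerbil. The following is a minimal Python sketch, not the gxensemble code path: `bind_unix` and `second_bind_errno` are hypothetical helper names, and the socket path is whatever the caller passes in. It binds a Unix socket, closes it without removing the file, and reports the errno of a second bind on the same path.

```python
import errno
import os
import socket


def bind_unix(path):
    """Bind and listen on a Unix domain socket at `path` (hypothetical helper)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError:
        sock.close()
        raise
    sock.listen(1)
    return sock


def second_bind_errno(path):
    """Bind, close without unlinking, bind again.

    Returns the errno raised by the second bind, or None if it succeeded.
    """
    first = bind_unix(path)
    first.close()  # closing the socket does not remove the socket file
    try:
        bind_unix(path).close()
    except OSError as exc:
        return exc.errno
    return None
```

On Linux the second bind is expected to fail with `errno.EADDRINUSE`, and the socket file is still on disk afterwards, which matches the restart failure reported above.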

There are a couple of ways we can fix this:

  • Clean up the socket on process exit
  • Reuse existing socket files

If we go with the second option, we will need some other way to ensure that multiple registry processes are not listening on the same socket.
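One common way to handle the second option is to probe the existing socket file before binding: if a connection succeeds, another process is live and startup should fail; if the connection is refused, the file is stale and can be removed. A minimal Python sketch of that idea follows. The name `listen_or_fail` is hypothetical, this is not how gxensemble currently behaves, and the probe-then-bind sequence is not atomic, so two processes starting at the same instant could still race.

```python
import os
import socket


def listen_or_fail(path):
    """Listen on `path`, removing a stale socket file but never displacing a live listener."""
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
    except FileNotFoundError:
        pass  # no socket file yet: nothing to clean up
    except ConnectionRefusedError:
        os.unlink(path)  # file exists but nobody is listening: treat it as stale
    else:
        raise RuntimeError(f"another process is already listening on {path}")
    finally:
        probe.close()

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(path)
    except OSError:
        server.close()
        raise
    server.listen(8)
    return server
```

With this approach a crashed registry leaves a file that the next start cleans up, while a second registry started alongside a running one fails with an explicit error.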

@chiefnoah
Collaborator Author

I think it would be best to clean up the socket on exit. @vyzo is there a good way to go about handling that?

@chiefnoah
Collaborator Author

On further investigation, it seems the issue is in signal handling. What's the general philosophy for handling signals in Gerbil?

@vyzo
Collaborator

vyzo commented Jan 21, 2024

We can install a signal handler, but I think the right way is to introduce some sort of exit handler, some form of at-exit.
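To illustrate the at-exit idea, here is a minimal sketch in Python rather than Gerbil, since the thread does not settle what the Gerbil API would look like. `remove_socket_file` and `install_socket_cleanup` are hypothetical names. The cleanup is registered with Python's `atexit`, and SIGTERM is converted into a normal exit so the hook also runs on termination (Python already turns SIGINT into `KeyboardInterrupt`, which unwinds to a normal exit by default).

```python
import atexit
import os
import signal
import sys


def remove_socket_file(path):
    """Delete the socket file if it is still there; safe to call more than once."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass


def install_socket_cleanup(path):
    """Arrange for `path` to be removed when the process exits (hypothetical helper)."""
    atexit.register(remove_socket_file, path)
    # SIGTERM's default action ends the process without running atexit hooks,
    # so turn it into a regular interpreter exit instead.
    signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(128 + signum))
```

A server would call `install_socket_cleanup(path)` once, right after binding. An exit hook does not cover SIGKILL or a hard crash, so a stale file can still be left behind in those cases.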
