context deadline exceeded & bad file descriptor for network calls in worker #100

Open
NorseGaud opened this issue Apr 7, 2024 · 4 comments

NorseGaud commented Apr 7, 2024

Using signal-handler.go, if I put this HTTP call right after the "daemon started" log line:

	log.Println("- - - - - - - - - - - - - - -")
	log.Println("daemon started")

	httpClient := &http.Client{
		Timeout: 5 * time.Second,
	}
	req, err := http.NewRequest("GET", "https://google.com", nil)
	if err != nil {
		log.Println("error creating request", "err", err)
		return
	}
	resp, err := httpClient.Do(req)
	if err != nil {
		log.Println("error making GET request to google.com", "err", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("GET request to https://google.com status:", resp.Status)

I see:

2024/04/07 09:31:20 - - - - - - - - - - - - - - -
2024/04/07 09:31:20 daemon started
2024/04/07 09:31:25 error making GET request to google.com err Get "https://google.com": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

If I put the same call into the worker, I get the same problem:

	go func() {
	LOOP:
		for {
			fmt.Println("in worker")
			time.Sleep(time.Second) // this is work to be done by worker.
			select {
			case <-stop:
				break LOOP
			default:
				httpClient := &http.Client{
					Timeout: 5 * time.Second,
				}
				req, err := http.NewRequest("GET", "https://google.com", nil)
				if err != nil {
					log.Println("error creating request", "err", err)
					break
				}
				resp, err := httpClient.Do(req)
				if err != nil {
					log.Println("error making GET request to google.com", "err", err)
					break
				}
				fmt.Println("GET request to https://google.com status:", resp.Status)
				resp.Body.Close() // close explicitly: a defer inside this loop would not run until the goroutine returns
			}
		}
		done <- struct{}{}
	}()

I see:

2024/04/07 09:32:43 - - - - - - - - - - - - - - -
2024/04/07 09:32:43 daemon started
in worker
2024/04/07 09:32:49 error making GET request to google.com err Get "https://google.com": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
in worker
2024/04/07 09:32:50 terminating...
2024/04/07 09:32:55 error making GET request to google.com err Get "https://google.com": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
in worker
2024/04/07 09:32:56 daemon terminated
NorseGaud reopened this Apr 7, 2024

NorseGaud commented

If I use an IP address instead of a hostname, I see a different error:

2024/04/07 09:37:14 - - - - - - - - - - - - - - -
2024/04/07 09:37:14 daemon started
2024/04/07 09:37:14 error making GET request to google.com err Get "http://1.1.1.1": dial tcp 1.1.1.1:80: connect: bad file descriptor

NorseGaud commented Apr 8, 2024

OK, figured it out. It looks like the cause is the deferred closeFiles call in parent():

func (d *Context) parent() (child *os.Process, err error) {
	if err = d.prepareEnv(); err != nil {
		return
	}

	defer d.closeFiles()

If I comment the defer out, it works.

2024/04/08 10:14:10 daemon started
GET request status: 200 OK
in worker

UPDATE: It worked once and fails on every subsequent run... :(

NorseGaud commented

AHA! It's a premature close / race condition! If I add the HTTP call at the top of closeFiles, which delays the close, the child processes can spawn properly with the FDs.

func (d *Context) closeFiles() (err error) {
	httpClient := &http.Client{
		Timeout: 5 * time.Second,
	}
	req, err := http.NewRequest("GET", "http://1.1.1.1", nil)
	if err != nil {
		log.Println("error creating request", "err", err)
		return
	}
	resp, err := httpClient.Do(req)
	if err != nil {
		log.Println("error making CLOSE GET request", "err", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("CLOSE GET request status:", resp.Status)
	cl := func(file **os.File) {
		if *file != nil {
			(*file).Close()
			*file = nil
		}
	}

Adding a 1-second sleep before closing is a decent workaround for now.

NorseGaud added a commit to NorseGaud/go-daemon that referenced this issue Apr 8, 2024
NorseGaud changed the title from "error making GET request to google.com err Get "https://google.com": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" to "context deadline exceeded & bad file descriptor for network calls in worker" on Apr 8, 2024

NorseGaud commented

The best thing I can find is to wait 2 seconds for the child and worker to fully initialize before the parent exits. It's a bit of a change for existing users, but it seems to work and unblocks me.
