Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logstream truncates output from WITH DOCKER #4063

Open
rrjjvv opened this issue Apr 26, 2024 · 5 comments
Open

logstream truncates output from WITH DOCKER #4063

rrjjvv opened this issue Apr 26, 2024 · 5 comments
Labels
type:bug Something isn't working

Comments

@rrjjvv
Copy link
Contributor

rrjjvv commented Apr 26, 2024

What went wrong?

Some time ago I noticed that the last line of output from a WITH DOCKER block was missing. I'm not sure if this was 100% of the time since usually I only care about the output on a failure. In some projects where this was bothersome I started adding a dummy echo "hi" to the RUN command, which would then allow me to see the last line of output (as the dummy echo would then get truncated).

I basically did a 'bisect' with all the earthly versions I have locally; the issue was reproduced 100% of the time with every version I tried between 0.7.19 and 0.8.9 and was absent in every version I tried between 0.7.0 and 0.7.17. Anecdotally, the regression occurs between 0.7.17 and 0.7.19.

A workaround (discovered shortly after my initial post) is to disable logstream, which may or may not be acceptable to typical Earthly Cloud users: --logstream=false --logstream-upload=false (both are necessary).

This Earthfile was at VERSION 0.8 for all the 0.8.x client versions I tried, then changed to VERSION 0.7 to test against the 0.7.x client versions:

VERSION 0.7

test-plain:
    FROM earthly/dind:alpine  
    WITH DOCKER
        RUN echo "plain"
    END

test-pull:
    FROM earthly/dind:alpine  
    WITH DOCKER --pull busybox
        RUN echo "pull"
    END

test-load:
    FROM earthly/dind:alpine  
    WITH DOCKER --load bb=+load
        RUN echo "load"
    END

load:
    FROM busybox

In all cases, the +test-load and +test-pull had the same behavior. The +test-plain does not have this behavior, which implies the issue isn't with WITH DOCKER per-se, but with output related to image transfers.

Here, in the older version, you see "pull" in the output:

$ earthly-0.7.17 -P --no-cache +test-pull
 Init 🚀
————————————————————————————————————————————————————————————————————————————————

           buildkitd | Found buildkit daemon as docker container (earthly-buildkitd)

 Build 🔧
————————————————————————————————————————————————————————————————————————————————

 earthly/dind:alpine | --> Load metadata earthly/dind:alpine linux/amd64
             busybox | --> Load metadata busybox linux/amd64
             busybox | --> DOCKER PULL busybox
             busybox | [----------] 100% resolve docker.io/library/busybox:latest@sha256:c3839dd800b9eb7603340509769c43e146a74c63dca3045a8e7dc8ee07e53966
              output | --> exporting outputs
              output | [----------] 100% exporting layers
              output | [----------] 100% exporting manifest sha256:80483c6b3ec0abb539fea285084e057b6d68c7ef0aef3a20dbff6ea5f6ec30d7
              output | [----------] 100% exporting config sha256:0f0a87a74fc2ee5f24fd31d6ef1dfa916329837c914bacc38872a75aac52a7ab
              output | [          ]   0% transferring docker.io/library/busybox:latest
          +test-pull | --> FROM earthly/dind:alpine
          +test-pull | [----------] 100% resolve docker.io/earthly/dind:alpine@sha256:8de4d6a322a9cdde70daa58c7c9c4eba7f2fb79190c4396789bb8ba057366950
          +test-pull | --> WITH DOCKER (install deps)
          +test-pull | --> WITH DOCKER RUN --privileged echo "pull"
          +test-pull | Starting dockerd with data root /var/earthly/dind/3127d94d515854ad966734095aa7f8b2e1e3711fbbe6b1af06ff3d69dbfdb80a/tmp.GkpjlC
          +test-pull | Loading images from BuildKit via embedded registry...
          +test-pull | Pulling 172.30.0.1:8371/sess-je119ngf22x4cfm6o4zb3bt9m:img-0 and retagging as busybox:latest
          +test-pull | img-0: Pulling from sess-je119ngf22x4cfm6o4zb3bt9m
          +test-pull | 7b2699543f22: Pulling fs layer
          +test-pull | 7b2699543f22: Verifying Checksum
          +test-pull | 7b2699543f22: Download complete
          +test-pull | 7b2699543f22: Pull complete
          +test-pull | Digest: sha256:80483c6b3ec0abb539fea285084e057b6d68c7ef0aef3a20dbff6ea5f6ec30d7
          +test-pull | Status: Downloaded newer image for 172.30.0.1:8371/sess-je119ngf22x4cfm6o4zb3bt9m:img-0
          +test-pull | 172.30.0.1:8371/sess-je119ngf22x4cfm6o4zb3bt9m:img-0
          +test-pull | Untagged: 172.30.0.1:8371/sess-je119ngf22x4cfm6o4zb3bt9m:img-0
          +test-pull | Untagged: 172.30.0.1:8371/sess-je119ngf22x4cfm6o4zb3bt9m@sha256:80483c6b3ec0abb539fea285084e057b6d68c7ef0aef3a20dbff6ea5f6ec30d7
          +test-pull | ...done
          +test-pull | pull
             ongoing | output (11 seconds ago), +test-pull (11 seconds ago)
              output | [----------] 100% transferring docker.io/library/busybox:latest
              output | --> exporting outputs
<snip>

In the next version, that output is missing:

————————————————————————————————————————————————————————————————————————————————

           buildkitd | Found buildkit daemon as docker container (earthly-buildkitd)

 Build 🔧
————————————————————————————————————————————————————————————————————————————————

 earthly/dind:alpine | --> Load metadata earthly/dind:alpine linux/amd64
             busybox | --> Load metadata busybox linux/amd64
             busybox | --> DOCKER PULL busybox
             busybox | [----------] 100% resolve docker.io/library/busybox:latest@sha256:c3839dd800b9eb7603340509769c43e146a74c63dca3045a8e7dc8ee07e53966
              output | --> exporting outputs
              output | [----------] 100% exporting layers
              output | [----------] 100% exporting manifest sha256:80483c6b3ec0abb539fea285084e057b6d68c7ef0aef3a20dbff6ea5f6ec30d7
              output | [----------] 100% exporting config sha256:0f0a87a74fc2ee5f24fd31d6ef1dfa916329837c914bacc38872a75aac52a7ab
              output | [          ]   0% transferring docker.io/library/busybox:latest
          +test-pull | --> FROM earthly/dind:alpine
          +test-pull | [----------] 100% FROM earthly/dind:alpine
          +test-pull | --> WITH DOCKER (install deps)
          +test-pull | --> WITH DOCKER RUN --privileged echo "pull"
          +test-pull | Starting dockerd with data root /var/earthly/dind/3127d94d515854ad966734095aa7f8b2e1e3711fbbe6b1af06ff3d69dbfdb80a/tmp.jcdPkN
          +test-pull | Loading images from BuildKit via embedded registry...
          +test-pull | Pulling 172.30.0.1:8371/sess-mjop80mxwkxa3hp922fkjgtd8:img-0 and retagging as busybox:latest
          +test-pull | img-0: Pulling from sess-mjop80mxwkxa3hp922fkjgtd8
          +test-pull | 7b2699543f22: Pulling fs layer
          +test-pull | 7b2699543f22: Verifying Checksum
          +test-pull | 7b2699543f22: Download complete
          +test-pull | 7b2699543f22: Pull complete
          +test-pull | Digest: sha256:80483c6b3ec0abb539fea285084e057b6d68c7ef0aef3a20dbff6ea5f6ec30d7
          +test-pull | Status: Downloaded newer image for 172.30.0.1:8371/sess-mjop80mxwkxa3hp922fkjgtd8:img-0
          +test-pull | 172.30.0.1:8371/sess-mjop80mxwkxa3hp922fkjgtd8:img-0
          +test-pull | Untagged: 172.30.0.1:8371/sess-mjop80mxwkxa3hp922fkjgtd8:img-0
          +test-pull | Untagged: 172.30.0.1:xxxxx@sha256:80483c6b3ec0abb539fea285084e057b6d68c7ef0aef3a20dbff6ea5f6ec30d7
          +test-pull | ...done
###############    there should be output here  ###############
          +test-pull | [----------] 100% transferring docker.io/library/busybox:latest
              output | --> exporting outputs
<snip>

If you watch the build carefully for the whole duration, you'll see that the output actually does appear in the terminal for a short time, then disappears. After noticing that, it was clear the issue was the fancy terminal magic. Relevant snippet of the 'raw' output (relatively speaking) from the 0.7.17:

         +test-pull | Status: Downloaded newer image for 172.30.0.1:8371/sess-oyksslr8wqaif4hhhh8tgnyei:img-0
          +test-pull | 172.30.0.1:8371/sess-oyksslr8wqaif4hhhh8tgnyei:img-0
          +test-pull | Untagged: 172.30.0.1:8371/sess-oyksslr8wqaif4hhhh8tgnyei:img-0
          +test-pull | Untagged: 172.30.0.1:8371/sess-oyksslr8wqaif4hhhh8tgnyei@sha256:80483c6b3ec0abb539fea285084e057b6d68c7ef0aef3a20dbff6ea5f6ec30d7
          +test-pull | ...done
          +test-pull | pull
              output | [----------] 100% transferring docker.io/library/busybox:latestESC[K
              output | --> exporting outputs

My understanding is that the ESC [ K sequence means "delete here to the end of the line", which is an effective no-op since it's already at the end of the line.

In 0.7.19:

          +test-pull | Status: Downloaded newer image for 172.30.0.1:8371/sess-8yvg01mxvc1je4toryzb3okic:img-0
          +test-pull | 172.30.0.1:8371/sess-8yvg01mxvc1je4toryzb3okic:img-0
          +test-pull | Untagged: 172.30.0.1:8371/sess-8yvg01mxvc1je4toryzb3okic:img-0
          +test-pull | Untagged: 172.30.0.1:xxxxx@sha256:80483c6b3ec0abb539fea285084e057b6d68c7ef0aef3a20dbff6ea5f6ec30d7
          +test-pull | ...done
          +test-pull | pull
              output | ESC[A[----------] 100% transferring docker.io/library/busybox:latestESC[K
              output | --> exporting outputs

My understanding is that ESC [ A means "go up one line", which would put the cursor to the last line of user-generated and 'overwriting' it. That's somewhat supported when looking at the target/stage for the "transferring" line from the working version:

              output | [----------] 100% transferring docker.io/library/busybox:latest

compared to the broken version:

          +test-pull | [----------] 100% transferring docker.io/library/busybox:latest

What should have happened?

The expectation is that stuff that the user outputs is displayed. Sometimes one lines doesn't matter, but when one takes care to generate 'clean' output, it matters. In my real use-case, I was confused because I was certain I had written X tests, but was only seeing X-1. In order to see every test:

               +test | 1..8
               +test | ok 1 postgres healthy in 16ms
               +test | ok 2 can retrieve app DB creds in 52ms
               +test | ok 3 app DB creds have limited access in 190ms
               +test | ok 4 vault server healthy in 96ms
               +test | ok 5 vault agent healthy in 60ms
               +test | ok 6 token is via approle in 61ms
               +test | ok 7 list app secrets in 99ms
               +test | ok 8 read app secrets in 276ms
               +test | [----------] 100% transferring docker.io/library/liquibase:latest
              output | [----------] 100% transferring docker.io/library/postgres:latest
              output | [----------] 100% transferring docker.io/library/vault:latest
              output | [----------] 100% transferring docker.io/library/bats:latest
              output | --> exporting outputs

I had to insert that dummy echo:

               +test | ok 1 postgres healthy in 16ms
               +test | ok 2 can retrieve app DB creds in 52ms
               +test | ok 3 app DB creds have limited access in 190ms
               +test | ok 4 vault server healthy in 96ms
               +test | ok 5 vault agent healthy in 60ms
               +test | ok 6 token is via approle in 61ms
               +test | ok 7 list app secrets in 99ms
               +test | ok 8 read app secrets in 276ms
               +test | FIN 
              output | ESC[A[----------] 100% transferring docker.io/library/liquibase:latestESC[K
              output | [----------] 100% transferring docker.io/library/postgres:latestESC[K
              output | [----------] 100% transferring docker.io/library/vault:latestESC[K
              output | [----------] 100% transferring docker.io/library/bats:latestESC[K
              output | --> exporting outputs

What earthly version?

As mentioned above, I can consistently reproduce with many versions starting at 0.7.19 (including the latest 0.8.9). Versions I explicitly tested:

  • 0.8.9 (broken)
  • 0.8.2 (broken)
  • 0.7.22 (broken)
  • 0.7.20 (broken)
  • 0.7.19 (broken)
  • 0.7.17 (ok)
  • 0.7.15 (ok)
  • 0.7.14 (ok)
  • 0.7.9 (ok)
  • 0.7.0 (ok)

Buildkit Logs

No response

Other Helpful Information

The changelog for 0.7.19 doesn't give any obvious clues, though it does mention logging changes. The diff shows a substantial number of changes to logging (more than I'd guess from the changelog), but nothing jumped out immediately.

This could be red herring, but I saw code related to "de-duplicating" log output (like in the old days, when you'd see "ongoing" messages repeated. In 0.7.17 version, when running in verbose mode, you see

             ongoing | output (11 seconds ago), +test-pull (11 seconds ago)

which then disappears, and with verbose mode in 0.7.19 you see

             ongoing | ongoing TODO
             ongoing | ongoing TODO

which then collapses to just one.

My hunch is that the issue isn't related to changes to logstream/logbus per-se, but maybe it's an existing issue that appeared since it's the default (though I did try disabling it to no avail).

EDIT

I mentioned previously that I had tried disabling logstream but that it had no effect. I had assumed that --logstream=false was a global kill-switch, but it must be more nuanced. This release had a second related flag that had a new default, and disabling both has the desired effect.

$ $ earthly-0.7.19 -P --logstream=false --logstream-upload=false --no-cache +test-pull
 Init 🚀
————————————————————————————————————————————————————————————————————————————————

           buildkitd | Found buildkit daemon as docker container (earthly-buildkitd)

 Build 🔧
————————————————————————————————————————————————————————————————————————————————

 earthly/dind:alpine | --> Load metadata earthly/dind:alpine linux/amd64
             busybox | --> Load metadata busybox linux/amd64
             busybox | --> DOCKER PULL busybox
             busybox | [----------] 100% resolve docker.io/library/busybox:latest@sha256:c3839dd800b9eb7603340509769c43e146a74c63dca3045a8e7dc8ee07e53966
              output | --> exporting outputs
              output | [----------] 100% exporting layers
              output | [----------] 100% exporting manifest sha256:80483c6b3ec0abb539fea285084e057b6d68c7ef0aef3a20dbff6ea5f6ec30d7
              output | [----------] 100% exporting config sha256:0f0a87a74fc2ee5f24fd31d6ef1dfa916329837c914bacc38872a75aac52a7ab
              output | [          ]   0% transferring docker.io/library/busybox:latest
          +test-pull | --> FROM earthly/dind:alpine
          +test-pull | [----------] 100% resolve docker.io/earthly/dind:alpine@sha256:8de4d6a322a9cdde70daa58c7c9c4eba7f2fb79190c4396789bb8ba057366950
          +test-pull | --> WITH DOCKER (install deps)
          +test-pull | --> WITH DOCKER RUN --privileged echo "pull"
          +test-pull | Starting dockerd with data root /var/earthly/dind/3127d94d515854ad966734095aa7f8b2e1e3711fbbe6b1af06ff3d69dbfdb80a/tmp.HCBAId
          +test-pull | Loading images from BuildKit via embedded registry...
          +test-pull | Pulling 172.30.0.1:8371/sess-i6hasay9rc770q105p9oil0w9:img-0 and retagging as busybox:latest
          +test-pull | img-0: Pulling from sess-i6hasay9rc770q105p9oil0w9
          +test-pull | 7b2699543f22: Pulling fs layer
          +test-pull | 7b2699543f22: Download complete
          +test-pull | 7b2699543f22: Pull complete
          +test-pull | Digest: sha256:80483c6b3ec0abb539fea285084e057b6d68c7ef0aef3a20dbff6ea5f6ec30d7
          +test-pull | Status: Downloaded newer image for 172.30.0.1:8371/sess-i6hasay9rc770q105p9oil0w9:img-0
          +test-pull | 172.30.0.1:8371/sess-i6hasay9rc770q105p9oil0w9:img-0
          +test-pull | Untagged: 172.30.0.1:8371/sess-i6hasay9rc770q105p9oil0w9:img-0
          +test-pull | Untagged: 172.30.0.1:8371/sess-i6hasay9rc770q105p9oil0w9@sha256:80483c6b3ec0abb539fea285084e057b6d68c7ef0aef3a20dbff6ea5f6ec30d7
          +test-pull | ...done
          +test-pull | pull
             ongoing | output (11 seconds ago), +test-pull (11 seconds ago)
              output | [----------] 100% transferring docker.io/library/busybox:latest
              output | --> exporting outputs
<snip>

And my real build:

$ earthly -version
earthly version v0.8.9 a2fc61ee6e00ed5b8ad09bd06c59451534e2541e linux/amd64; Ubuntu 22.04.4 LTS (Jammy Jellyfish)

$ earthly -P --logstream=false --logstream-upload=false --no-cache +test
<snip>
               +test | 1..8
               +test | ok 1 postgres healthy in 15ms
               +test | ok 2 can retrieve app DB creds in 55ms
               +test | ok 3 app DB creds have limited access in 191ms
               +test | ok 4 vault server healthy in 90ms
               +test | ok 5 vault agent healthy in 62ms
               +test | ok 6 token is via approle in 62ms
               +test | ok 7 list app secrets in 98ms
               +test | ok 8 read app secrets in 286ms
               +test | FIN 
             ongoing | output (41 seconds ago), +test (36 seconds ago)
              output | [----------] 100% transferring docker.io/library/postgres:latest
              output | [----------] 100% transferring docker.io/library/vault:latest
              output | [----------] 100% transferring docker.io/library/bats:latest
              output | [----------] 100% transferring docker.io/library/liquibase:latest
             context | --> local context .
             context | --> local context .
             context | --> local context .
              output | --> exporting outputs
<snip>

In the current earthly version, setting both --logstream=false and --logstream-upload=false result in my dummy line being shown (thus no longer needed).

@rrjjvv rrjjvv added the type:bug Something isn't working label Apr 26, 2024
@rrjjvv rrjjvv changed the title WITH DOCKER output truncates last line of output WITH DOCKER + logstream output truncates last line of output Apr 26, 2024
@rrjjvv rrjjvv changed the title WITH DOCKER + logstream output truncates last line of output logstream truncates output from WITH DOCKER Apr 26, 2024
@alexcb
Copy link
Collaborator

alexcb commented Apr 26, 2024

I experienced the same bug today while working on #4064 (I actually ended up adding an extra echo "the check has passed" output in order as a work-around). It's a rather bad user experience.

@alexcb
Copy link
Collaborator

alexcb commented Apr 26, 2024

the output is still technically there:

earthly -P +test |& tee /tmp/earthly.log
$ xxd /tmp/earthly.log | grep -C 4 llo # hello spans a line, so you may have to adjust what you're searching for
00000e70: 2d64 6f63 6b65 722b 7465 7374 207c 2068  -docker+test | h
00000e80: 656c 6c6f 0a20 2020 2020 2020 2020 2020  ello.           
00000e90: 2020 206f 7574 7075 7420 7c20 1b5b 415b     output | .[A[
00000ea0: 2d2d 2d2d 2d2d 2d2d 2d2d 5d20 3130 3025  ----------] 100%
00000eb0: 2074 7261 6e73 6665 7272 696e 6720 646f   transferring do
00000ec0: 636b 6572 2e69 6f2f 6c69 6272 6172 792f  cker.io/library/
00000ed0: 6d79 696d 673a 6c61 7465 7374 1b5b 4b0a  myimg:latest.[K.

we can see that the "ello\n" shows up:

656c 6c6f 0a

followed by a bunch of spaces:

20 2020 2020 2020 2020 2020

followed by a "output....":

6f 7574 7075 7420

but then we get to:

1b5b 4b0a

which man 4 console_codes describes:

       ESC (0x1B, ^[)
              starts an escape sequence;

at this point I suspect is causing the cursor to reset to the beginning of the line which causes the text to disappear.

Maybe there's a way we can prevent docker from outputting these characters?

@rrjjvv
Copy link
Contributor Author

rrjjvv commented Apr 26, 2024

Yeah, I had uncovered that same control sequence in my report (I know my tendency to be thorough can result in stuff getting overlooked and have the opposite effect...). However, since disabling logbus results in the correct output, my assumption is that it's not docker's console magic emitting it. (Or maybe it is, but legacy and logbus handle it differently.) That control character is actually present in the earthly codebase:

ansiUp = []byte(fmt.Sprintf("%c[A", esc))

That was actually the first thing I found and started seeing if I could backtrack, but then got lost in the weeds with the logstream stuff.

@rrjjvv
Copy link
Contributor Author

rrjjvv commented Apr 27, 2024

I decided to take up the quest again for a little bit. It turns out there's one other place in the codebase where that sequence is defined. And out of the three places it's used, one of them is the key to this issue. If I comment out

earthly/outmon/vertexmon.go

Lines 177 to 180 in 23000c0

if sameAsLast {
// Overwrite previous line if this update is for the same thing as the previous one.
builder = append(builder, string(ansiUp))
}
I get my last line of output intact, 100% of the time, with no other noticeable adverse effects. (Key being noticeable... I'm sure that code serves a valuable purpose in other contexts.) Obviously the true problem is in the caller (and that code has been in place for at least two years).

@rrjjvv
Copy link
Contributor Author

rrjjvv commented Apr 28, 2024

Okay, I've figured this out... it took a long time to realize it wasn't a logic/algorithm error, and even more frustrating since my previous two comments had 90% of the answer, but I didn't recognize it as such at the time.

The root cause is that in a standard build (post-0.7.17) there are two SolverMonitors in use and they don't collaborate.

Current (with logbus):

              logbus | Setting organization "" and project ""
          +test-pull | --> FROM +base
 earthly/dind:alpine | --> Load metadata earthly/dind:alpine linux/amd64
          +test-pull | --> WITH DOCKER RUN
             busybox | --> Load metadata busybox linux/amd64

*****  everything in this block is handled/output by the "outmon" solver  ******
             busybox | --> DOCKER PULL busybox
             busybox | [          ]   0% resolve docker.io/library/busybox:latest@sha256:c3839dd800b9eb7603340509769c43e146a74c63dca3045a8e7dc8ee07e53966
                 RJV | If I wasn't hacked, I'd be deleting the line above me
             busybox | [----------] 100% resolve docker.io/library/busybox:latest@sha256:c3839dd800b9eb7603340509769c43e146a74c63dca3045a8e7dc8ee07e53966
              output | --> exporting outputs
              output | [          ]   0% exporting layers
                 RJV | If I wasn't hacked, I'd be deleting the line above me
              output | [----------] 100% exporting layers
              output | [          ]   0% exporting manifest sha256:2dcaf02004cc21a645c79976d06c5077773429bfdfe42274ecaa40368a38831d
                 RJV | If I wasn't hacked, I'd be deleting the line above me
              output | [----------] 100% exporting manifest sha256:2dcaf02004cc21a645c79976d06c5077773429bfdfe42274ecaa40368a38831d
              output | [          ]   0% exporting config sha256:5a76b58d4cda708886eed5c0a18b198064da1127156be2a878c9948aa0914350
                 RJV | If I wasn't hacked, I'd be deleting the line above me
              output | [----------] 100% exporting config sha256:5a76b58d4cda708886eed5c0a18b198064da1127156be2a878c9948aa0914350
              output | [          ]   0% transferring docker.io/library/busybox:latest
*****  the previous line is still from outmon.. it has no idea something is going to be writing  ******

***** these are written/tracked by the 'solvermon' solver and doesn't know outmon has a pending operation *****
          +test-pull | --> FROM earthly/dind:alpine
          +test-pull | [----------] 100% FROM earthly/dind:alpine
          +test-pull | --> WITH DOCKER (install deps)
          +test-pull | --> WITH DOCKER RUN --privileged echo "pull"
          +test-pull | Starting dockerd with data root /var/earthly/dind/3127d94d515854ad966734095aa7f8b2e1e3711fbbe6b1af06ff3d69dbfdb80a/tmp.eDaDMn
          +test-pull | Loading images from BuildKit via embedded registry...
          +test-pull | Pulling 172.30.0.1:8371/sess-xo6l6sumx9af0zh2wtilsxdmk:img-0 and retagging as busybox:latest
          +test-pull | 172.30.0.1:8371/sess-xo6l6sumx9af0zh2wtilsxdmk:img-0
          +test-pull | Untagged: 172.30.0.1:8371/sess-xo6l6sumx9af0zh2wtilsxdmk:img-0
          +test-pull | Untagged: 172.30.0.1:xxxxx@sha256:2dcaf02004cc21a645c79976d06c5077773429bfdfe42274ecaa40368a38831d
          +test-pull | Loading images done in 0 ms
          +test-pull | pull
*****  solvermon stops  *****

***** outmon starts again; last it knew, there was a status at 0%, but now is at 100% thus should be deleted  ******
                 RJV | If I wasn't hacked, I'd be deleting the line above me
              output | [----------] 100% transferring docker.io/library/busybox:latest

I added a line indicating when the code wants to delete the line, but doesn't actually do so. From the earlier output, it's clear that the purpose is to delete a "0% blah" if/when a matching "100% blah" comes up. The problem is that 'outmon' is not aware something else was written between the 0% and 100% progress that it is tracking.

With logstream disabled:

              logbus | Setting organization "" and project ""
 earthly/dind:alpine | --> Load metadata earthly/dind:alpine linux/amd64
             busybox | --> Load metadata busybox linux/amd64
             busybox | --> DOCKER PULL busybox
             busybox | [          ]   0% resolve docker.io/library/busybox:latest@sha256:c3839dd800b9eb7603340509769c43e146a74c63dca3045a8e7dc8ee07e53966
                 RJV | If I wasn't hacked, I'd be deleting the line above me
             busybox | [----------] 100% resolve docker.io/library/busybox:latest@sha256:c3839dd800b9eb7603340509769c43e146a74c63dca3045a8e7dc8ee07e53966
              output | --> exporting outputs
              output | [          ]   0% exporting layers
                 RJV | If I wasn't hacked, I'd be deleting the line above me
              output | [----------] 100% exporting layers
              output | [          ]   0% exporting manifest sha256:2dcaf02004cc21a645c79976d06c5077773429bfdfe42274ecaa40368a38831d
                 RJV | If I wasn't hacked, I'd be deleting the line above me
              output | [----------] 100% exporting manifest sha256:2dcaf02004cc21a645c79976d06c5077773429bfdfe42274ecaa40368a38831d
              output | [          ]   0% exporting config sha256:5a76b58d4cda708886eed5c0a18b198064da1127156be2a878c9948aa0914350
                 RJV | If I wasn't hacked, I'd be deleting the line above me
              output | [----------] 100% exporting config sha256:5a76b58d4cda708886eed5c0a18b198064da1127156be2a878c9948aa0914350
              output | [          ]   0% transferring docker.io/library/busybox:latest

***** the next operation 'overwrites' the preceding one... the monitor doesn't care that it didn't finish ****

          +test-pull | --> FROM earthly/dind:alpine
          +test-pull | [          ]   0% resolve docker.io/earthly/dind:alpine@sha256:8de4d6a322a9cdde70daa58c7c9c4eba7f2fb79190c4396789bb8ba057366950
                 RJV | If I wasn't hacked, I'd be deleting the line above me
          +test-pull | [----------] 100% resolve docker.io/earthly/dind:alpine@sha256:8de4d6a322a9cdde70daa58c7c9c4eba7f2fb79190c4396789bb8ba057366950

**** at this point the monitor sees nothing pending  *****

          +test-pull | --> WITH DOCKER (install deps)
          +test-pull | --> WITH DOCKER RUN --privileged echo "pull"
          +test-pull | Starting dockerd with data root /var/earthly/dind/3127d94d515854ad966734095aa7f8b2e1e3711fbbe6b1af06ff3d69dbfdb80a/tmp.CadCFH
          +test-pull | Loading images from BuildKit via embedded registry...
          +test-pull | Pulling 172.30.0.1:8371/sess-h9nvpmqso47ti0617pkdy56to:img-0 and retagging as busybox:latest
          +test-pull | 172.30.0.1:8371/sess-h9nvpmqso47ti0617pkdy56to:img-0
          +test-pull | Untagged: 172.30.0.1:8371/sess-h9nvpmqso47ti0617pkdy56to:img-0
          +test-pull | Untagged: 172.30.0.1:8371/sess-h9nvpmqso47ti0617pkdy56to@sha256:2dcaf02004cc21a645c79976d06c5077773429bfdfe42274ecaa40368a38831d
          +test-pull | Loading images done in 1 ms
          +test-pull | pull
             ongoing | output (11 seconds ago), +test-pull (11 seconds ago)

****  there is no preceding 0%, so no need to delete a previous line  ****

              output | [----------] 100% transferring docker.io/library/busybox:latest

Though there are two monitor 'operations', involved, it is the same instance (sharing the same state).

There's basically three different places where monitors are used; the main/parent one is the only one where the 'solvermon' flavor is used

if s.useLogstream {
eg.Go(func() error {
return s.logbusSM.MonitorProgress(ctx, ch)
})
} else {
eg.Go(func() error {
var err error
vertexFailureOutput, err = s.sm.MonitorProgress(ctx, ch, phaseText, false, s.bkClient)
return err
})
}
where as the other two places
vertexFailureOutput, err := m.sm.MonitorProgress(ctx, statusChan, "", true, m.bkClient)
and
vertexFailureOutput, err = s.sm.MonitorProgress(ctx, ch, "", true, s.bkClient)
only use the other variant.

That may have been simple oversight, but I'm guessing that the solvermon one needs to be refactored to be a 'child' monitor (i.e., the sideRun=true). If the intent is truly to use both monitors at the same time, they probably need to communicate to avoid this issue. There are a number of other "lastX" variables (tracking the last thing that was written), which means there may be other similar bugs lurking as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:bug Something isn't working
Projects
Status: Todo
Development

No branches or pull requests

2 participants