Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable SubmitTaskStateChange for early digest reporting #4169

Merged
merged 4 commits into from May 13, 2024

Conversation

amogh09
Copy link
Contributor

@amogh09 amogh09 commented May 8, 2024

Summary

This PR enables SubmitTaskStateChange (STSC) calls from Agent to ECS backend for early reporting of container image manifest digests. Containers that are in-scope of digest resolution (non-internal containers and containers whose image reference does not already have a digest) will have their image manifest digests reported to ECS backend with an additional STSC call that Agent will make after the task has reached MANIFEST_PULLED state. The container states in this additional STSC call will be "PENDING" which is a backend-recognized state for pending containers.

This additional STSC call will be made only if there is at least one container for which digest was successfully resolved during transition to MANIFEST_PULLED state.

An example of this new STSC call is shown below.

    {
      "clusterReference": {
        "accountId": "<account-id>",
        "name": "test"
      },
      "containers": [
        {
          "containerName": "second",
          "imageDigest": "sha256:5eef5ed34e1e1ff0a4ae850395cbf665c4de6b4b83a32a0bc7bcb998e24e7bbb",
          "status": "PENDING"
        },
        {
          "containerName": "first",
          "imageDigest": "sha256:32e76d4f34f80e479964a0fbd4c5b4f6967b5322c8d004e9cf0cb81c93510766",
          "status": "PENDING"
        }
      ],
      "identity": <redacted>,
      "reason": "",
      "requestDate": 1715292586,
      "status": "PENDING",
      "taskId": <redacted>
    }

Implementation details

  • Add a new *Container.DigestResolutionRequired method that checks if a container requires digest resolution. This method is now called in container state transition function for MANIFEST_PULLED state to perform digest resolution only for containers that need it. The method returns true for non-internal containers whose Image field does not already contain a digest.
  • Add a new *Container.DigestResolved method that checks if the container has had its image manifest digest resolved. The method is used to prevent emitting a container change event for containers ineligible for digest reporting (due to it being internal or having digest available in its Image field or due to a failure in digest resolution).
  • Add a new *Task.HasAContainerWithResolvedDigest method that checks if the task has at least one container with a successfully resolved digest. The method is used to prevent emitting a task change event for tasks that don't have any resolved digests to report.
  • Existing NewTaskStateChangeEvent method is updated to generate a task state change event for transition to MANIFEST_PULLED state if at least one of task's containers has a resolved digest.
  • Existing NewContainerStateChangeEvent method is updated to generate a container state change event for transition to MANIFEST_PULLED state if the container's digest was resolved. A new helper function containerStatusChangeStatus for mapping container's known status to a status eligible for STSC reporting is added which replaces *ContainerStatus.BackendStatus method from ecs-agent module. The reason to move the function to agent module is that the function has implementation details of ECS Agent which are unrelated to the more general nature of ecs-agent module. In addition to the existing logic of *ContainerStatus.BackendStatus method, the new function also maps ContainerManifestPulled state to ContainerManifestPulled state making the state eligible for STSC reporting.
  • api.buildContainerStateChangePayload method, that builds a container state change payload for STSC calls, is updated to allow ContainerManifestPulled state changes to go through and use the recently added (Add BackendStatusString method to ContainerStatus #4167) ContainerStatus.BackendStatusString method to map the state to "PENDING" container state which is backend-recognized.
  • *DockerTaskEngine.pullContainerManifest method, which is the container state transition function for MANIFEST_PULLED state, is updated to not resolve digest for containers that have digest populated in their Image field. This is because digest resolution is not needed for such containers and not populating ImageDigest field during transition to MANIFEST_PULLED state helps detect that the container's digest does not need to be reported early. It will be reported as usual after the container's image is pulled.
  • *managedTask.emitTaskEvent method, that is used to generate and submit task state change events, is updated to allow transitions to MANIFEST_PULLED states to be reported. This gating is probably redundant given that it's performed in NewTaskStateChangeEvent method also but this PR is keeping this additional gating as removing it is considered out-of-scope.

Testing

Given that the change affects all tasks, a LOT of existing unit and integration tests have been updated to check that state change events for MANIFEST_PULLED container and task states are emitted or not emitted depending on the specific test cases.

Thorough unit tests are added for all new functions introduced in this PR.

Manual testing was performed and it was observed that -

  • An additional STSC call is made with "PENDING" container states with "ImageDigest" fields populated for each container for which digest was resolved successfully.
  • Containers that had digest already populated in their "Image" field in the task definition did not have their digest resolved or reported early by Agent. These containers did have their digest reported as usual when the task transitioned to "RUNNING".
  • Containers that had a failure during digest resolution (due to intentional usage of an invalid image) did not have a container change event generated.

New tests cover the changes: yes

Description for the changelog

Feature: Early reporting of image manifest digests for eligible containers

Does this PR include breaking model changes? If so, Have you added transformation functions?

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@amogh09 amogh09 changed the title Digest stsc Enable SubmitTaskStateChange for early digest reporting May 9, 2024
@amogh09 amogh09 marked this pull request as ready for review May 9, 2024 23:01
@amogh09 amogh09 requested a review from a team as a code owner May 9, 2024 23:01
Yiyuanzzz
Yiyuanzzz previously approved these changes May 10, 2024
Copy link
Contributor

@Yiyuanzzz Yiyuanzzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed implementation description!

@amogh09 amogh09 merged commit 6f30d38 into aws:feature/digest-resolution May 13, 2024
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants