Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: on demand docker environments #1796

Open
wants to merge 38 commits into
base: master
Choose a base branch
from

Conversation

saikonen
Copy link
Collaborator

@saikonen saikonen commented Apr 10, 2024

Adds support for on demand Docker images. Open for discussion

open todo's

  • add fallback to conda environment when executing locally with --environment docker
  • add support for imagebakery endpoint auth
  • cache bakery image tags locally to avoid environment drift with consecutive deployments when using loosely pinned versions

@saikonen saikonen marked this pull request as ready for review April 29, 2024 11:19
@savingoyal
Copy link
Collaborator

python for.py --environment=docker run without @kubernetes immediately returns an error that Image Bakery is not configured. Perhaps it shouldn't throw the error?

get_pinned_conda_libs,
)

BAKERY_METAFILE = ".imagebakery-cache"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the imagebakery-cache can also be made an optimization detail within docker_environment.py

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, can you help me understand the contents of this cache?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved caching to the environment side. what is in the cache at the moment is

  • a unique hash derived from the requested environment (python, packages, image-type etc.)
  • the image tag that was returned from the bakery endpoint for the request
  • we also persist the request payload that was sent to the bakery in order to be able to list details of cached images via docker cache list

import hashlib
import json
import requests

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would something of this sorts work here -

Suggested change
import json
import requests
class FastBakery(object):
def __init__(self, url):
self.url = url
def bake(
self,
python_ver=None, # TODO - Ensure the keys are same in the API spec and client
pypi_packages=None,
conda_packages=None,
base_image=None,
image_kind="esgz-zstd",
):
deep_clean = (lambda f: lambda d: f(f, d))(
lambda self, item: {
k: v
for k, v in (
(k, self(self, v)) if isinstance(v, dict) else (k, v)
for k, v in item.items()
)
if v is not None and v != {}
}
)
payload = deep_clean(
{
"pythonVersion": python_ver,
"imageKind": image_kind,
"pipRequirements": pypi_packages,
"condaMatchspecs": conda_packages,
"baseImage": {"imageReference": base_image},
}
)
return self.aws_iam_invoker(payload)
def aws_iam_invoker(self, payload):
try:
# AWS_IAM requires a signed request to be made
headers = {"Content-Type": "application/json"}
payload = json.dumps(payload)
import boto3
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest
from botocore.exceptions import BotoCoreError, NoCredentialsError
import urllib
session = boto3.Session()
credentials = session.get_credentials().get_frozen_credentials()
# credits to https://github.com/boto/botocore/issues/1784#issuecomment-659132830,
# We need to jump through some hoops when calling the endpoint with IAM auth
# as botocore does not offer a direct utility for signing arbitrary requests
req = AWSRequest("POST", self.url, headers, payload)
SigV4Auth(
credentials, service_name="lambda", region_name=session.region_name
).add_auth(req)
response = requests.post(req.url, data=req.body, headers=req.headers)
# Manually check the status code and raise an exception if it's an error
if response.status_code >= 400:
raise requests.exceptions.HTTPError(
f"HTTP {response.status_code} Error: {response.reason}\nResponse Body: {response.text}",
response=response
)
return response
except requests.exceptions.HTTPError as e:
raise e
except (requests.exceptions.RequestException, BotoCoreError, NoCredentialsError) as e:
# Handle specific errors related to requests and AWS services
raise Exception(f"Failed to invoke AWS service due to network or AWS error: {e}")
except Exception as e:
# Handle other possible exceptions
raise Exception(f"An unexpected error occurred: {e}")

self.bake_image_for_step(step)
echo("Environments are ready!")
if self.steps_to_delegate:
# TODO: add debug echo to output steps that required a conda environment.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these comments too

self.delegate.validate_environment(echo, self.datastore_type)
self.delegate.init_environment(echo, self.steps_to_delegate)

def bake_image_for_step(self, step):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

multiple steps might share the same image - maybe we can check if that is the case and only bake unique images?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the images are baked in sequence so the consecutive steps with identical packages should be hitting the cache at this point, skipping the baking. previously the caching was happening inside the fast_bakery, but moved to the environment side now so it should be more clear.

@@ -23,6 +24,9 @@
DEFAULT_SECRETS_BACKEND_TYPE,
AWS_SECRETS_MANAGER_DEFAULT_REGION,
S3_SERVER_SIDE_ENCRYPTION,
FAST_BAKERY_URL,
FAST_BAKERY_AUTH,
FAST_BAKERY_TYPE,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can avoid introducing these as env vars here and in @kubernetes with the expectation that using fast bakery may involve using @secrets or @environment - to configure the URL and auth. The client can try to read from env vars unless explicitly configured with the URL and auth?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants