Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

airbyte-ci: introduce a SecretStore abstraction #38322

Conversation

alafanechere
Copy link
Contributor

@alafanechere alafanechere commented May 17, 2024

What

Closes https://github.com/airbytehq/airbyte-internal-issues/issues/7551

This PR introduces a SecretStore abstraction to streamline the management of secrets in airbyte-ci.
In practice it allows to:

  • Fetch connector secrets according to pointers declared in the connectorTestSuitesOptions field of metadata.yaml
  • Encapsulate user defined secrets (manual CLI options / Env vars / Local file) in a specific Secret and SecretStore
    classes.

Review guide

User Impact and manual testing

The main impact is that the source of truth for which secret is being used in CI for connectors is now officially metadata.yaml

Copy link

vercel bot commented May 17, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview May 23, 2024 10:54am

Copy link
Contributor Author

alafanechere commented May 17, 2024

@alafanechere alafanechere force-pushed the augustin/05-16-airbyte-ci_fetch_connector_secrets_from_connectorTestSuitesOptions branch from bbf7834 to 6ee7d9e Compare May 17, 2024 18:15
@alafanechere alafanechere force-pushed the augustin/05-15-airbyte-ci_dynamic_test_step_tree_according_to_metadata branch 2 times, most recently from 73e1c2c to 4832aa4 Compare May 21, 2024 06:45
@alafanechere alafanechere force-pushed the augustin/05-16-airbyte-ci_fetch_connector_secrets_from_connectorTestSuitesOptions branch from 6ee7d9e to 4f33ce9 Compare May 21, 2024 06:47
@alafanechere alafanechere force-pushed the augustin/05-15-airbyte-ci_dynamic_test_step_tree_according_to_metadata branch from 4832aa4 to c980901 Compare May 21, 2024 07:08
Base automatically changed from augustin/05-15-airbyte-ci_dynamic_test_step_tree_according_to_metadata to master May 21, 2024 07:25
@alafanechere alafanechere changed the title airbyte-ci: fetch connector secrets from connectorTestSuitesOptions airbyte-ci: introduce a SecretStore abstraction May 21, 2024
@alafanechere alafanechere force-pushed the augustin/05-16-airbyte-ci_fetch_connector_secrets_from_connectorTestSuitesOptions branch 4 times, most recently from 8cd06fb to 73ac8c5 Compare May 21, 2024 13:34
@alafanechere alafanechere marked this pull request as ready for review May 21, 2024 13:35
@alafanechere alafanechere requested a review from a team as a code owner May 21, 2024 13:35
@alafanechere alafanechere force-pushed the augustin/05-16-airbyte-ci_fetch_connector_secrets_from_connectorTestSuitesOptions branch 8 times, most recently from 1c4e4e1 to 38a9f76 Compare May 22, 2024 13:17
Copy link
Contributor

@bnchrch bnchrch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple minor non structural changes I think we should do

But,

You were right.

I did love the abstraction.

This is REALLY good work. 💎

def _fetch_secret(self, name: str) -> str:
raise NotImplementedError("SecretStore subclasses must implement a _fetch_secret method")

def fetch_secret(self, name: str) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💎 Nice implementation

service_account_info = json.loads(gcp_credentials.value)
credentials = service_account.Credentials.from_service_account_info(service_account_info)
self.gsm_client = secretmanager_v1.SecretManagerServiceClient.from_service_account_info(service_account_info)
# This assumes the service account can only read a single project: the one it was created on.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💎 good warning

)
response = self.gsm_client.access_secret_version(request=request)
return response.payload.data.decode()
raise SecretNotFoundError(f"No enabled secret version in GSM found for secret {name}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💅 Could use white space above and below the for loop to improve reading


def _fetch_secret(self, name: str) -> str:
secret_path = self.local_directory_path / name
if not secret_path.exists():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💎 Good error handling!

return md5_hash.hexdigest()[:20]

def as_dagger_secret(self, dagger_client: DaggerClient) -> DaggerSecret:
return dagger_client.set_secret(self.value_hash, self.value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ Why do we have to transform it to a md5 hash?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bnchrch :

  • The Dagger client is shared between connector pipelines
  • Dagger keeps an internal in memory store of secrets, by name
  • If two connector use the same secret name they would step on each others toe and we'd have strange errors in CI to debug.
  • I could have used a namespaced secret name like connector-name_secret_name but I believe these new model should be available for non connector use cases.
  • MD5 sounded like the simplest and straightforward way of getting unique secret names

@@ -125,6 +126,31 @@ def is_current_process_wrapped_by_dagger_run() -> bool:
return called_with_dagger_run


def wrap_in_secret(ctx: click.Context, param: click.Option, value: Any) -> Optional[Secret]: # noqa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💅 We have a secrets.py file for dagger actions. Should we do the same for cli?

def wrap_in_secret(ctx: click.Context, param: click.Option, value: Any) -> Optional[Secret]: # noqa
if value is None:
return None
assert param.name is not None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💅 Could use some white space to separate the logic sections.

@@ -24,6 +26,7 @@
type=click.STRING,
required=True,
envvar="SPEC_CACHE_GCS_CREDENTIALS",
callback=wrap_gcp_credentials_in_secret,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💎 Cool use of callback!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docker_hub_username=ctx.obj["docker_hub_username"],
docker_hub_password=ctx.obj["docker_hub_password"],
docker_hub_username=Secret("docker_hub_username", ctx.obj["secret_stores"]["in_memory"]),
docker_hub_password=Secret("docker_hub_password", ctx.obj["secret_stores"]["in_memory"]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ Why do we treat dockerhub secrets differently than the python registry token or gcs creds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the PublishContext requires these two parameters to not be None.
ctx.obj["python_registry_token"] can be set to None and the PublishContext does handle that.

I believe we would should rework the typing for consistency, but I did not want to fall too deep in a refactoring rabbit hole.

@@ -114,8 +116,8 @@ async def publish(
spec_cache_bucket_name=spec_cache_bucket_name,
metadata_service_gcs_credentials=metadata_service_gcs_credentials,
metadata_bucket_name=metadata_service_bucket_name,
docker_hub_username=ctx.obj["docker_hub_username"],
docker_hub_password=ctx.obj["docker_hub_password"],
docker_hub_username=Secret("docker_hub_username", ctx.obj["secret_stores"]["in_memory"]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💅 nit: I dont think username should be a secret. Its useful for debugging

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's considered a secret on GHA 😄

Copy link
Contributor

@bnchrch bnchrch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preapproving for timezone reasons :)

Copy link
Contributor

@clnoll clnoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice @alafanechere! Just a few comments & questions, nothing major.

if raise_on_missing_secret_store:
raise SecretNotFoundError(message)
if logger is not None:
logger.warn(message)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what circumstances would it be okay to declare a secret store but have it not be available? This will just error downstream otherwise, right? (In a possibly unpredictable way that might not have a nice error message.)

It feels like we should always raise if the secret was declared but not available (and then we can get rid of the raise_on_missing_secret_store code paths).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@clnoll when the users does not have access to our GSM secret store it won't be available.
get_connector_secrets_for_test_suite does the following:

  • Fetch secrets from remote secret store by calling get_secrets_from_connector_test_suites_option
  • Fetch secrets from the local secret store (in the secrets connector directory)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in this case the secret store is the local secret store instead of the GSM one, right? In which case there will still be a secret store available?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in this case the secret store is the local secret store instead of the GSM one, right?

Secrets can come from various secret store.
The current logic concatenates secrets from GSM and local ones.
Most of our internal airbyte-ci user have access to GSM locally so I think it's the right logic.

Comment on lines 33 to 101
def get_secrets_from_connector_test_suites_option(
connector_test_suites_options: List[Dict[str, str | Dict[str, List[Dict[str, str | Dict[str, str]]]]]],
suite_name: str,
secret_stores: Dict[str, SecretStore],
raise_on_missing_secret_store: bool = True,
logger: logging.Logger | None = None,
) -> List[Secret]:
"""Get secrets declared in metadata connectorTestSuitesOptions for a test suite name.
It will use the secret store alias declared in connectorTestSuitesOptions.
If the secret store is not available a warning or and error could be raised according to the raise_on_missing_secret_store parameter value.
We usually want to raise an error when running in CI context and log a warning when running locally, as locally we can fallback on local secrets.

Args:
connector_test_suites_options (List[Dict[str, str | Dict]]): The connector under test test suite options
suite_name (str): The test suite name
secret_stores (Dict[str, SecretStore]): The available secrets stores
raise_on_missing_secret_store (bool, optional): Raise an error if the secret store declared in the connectorTestSuitesOptions is not available. Defaults to True.
logger (logging.Logger | None, optional): Logger to log a warning if the secret store declared in the connectorTestSuitesOptions is not available. Defaults to None.

Raises:
SecretNotFoundError: Raised if the secret store declared in the connectorTestSuitesOptions is not available and raise_on_missing_secret_store is truthy.

Returns:
List[Secret]: List of secrets declared in the connectorTestSuitesOptions for a test suite name.
"""
secrets: List[Secret] = []

for enabled_test_suite in connector_test_suites_options:
if enabled_test_suite["suite"] == suite_name:
if enabled_test_suite.get("testSecrets"):
assert isinstance(enabled_test_suite["testSecrets"], list)
suite_secrets: List[Dict[str, str | Dict[str, str]]] = enabled_test_suite["testSecrets"]
for s in suite_secrets:
if s["secretStore"]["alias"] not in secret_stores:
message = f"Secret {s['name']} can't be retrieved as {s['secretStore']['alias']} is not available"
if raise_on_missing_secret_store:
raise SecretNotFoundError(message)
if logger is not None:
logger.warn(message)
continue
secret_store = secret_stores[s["secretStore"]["alias"]]
secret = Secret(s["name"], secret_store, file_name=s.get("fileName"))
secrets.append(secret)

return secrets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh I actually think I prefer the original version 😄 . In particular, I'm wary of functions that return an Optional[thing] like _process_secret since it adds extra work to handle the None case.

This version also confirms my theory from my previous comment that getting rid of raise_on_missing_secret_store will make the code much nicer to read.

if self._connector_secrets is None:
self._connector_secrets = await secrets.get_connector_secrets(self)
return self._connector_secrets
def local_secret_store(self) -> Optional[LocalDirectorySecretStore]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels a little random to stash the local secrets info here, when we're getting it from helper functions in python_connectors.py and java_connectors.py. Would it be possible to follow the same pattern you used for the other secret stores?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tied the local_secret_store to the connector context because:

  • We have one context instance per connector under test
  • We want a local secret store per connector (the content of the secrets folder)

@@ -24,6 +26,7 @@
type=click.STRING,
required=True,
envvar="SPEC_CACHE_GCS_CREDENTIALS",
callback=wrap_gcp_credentials_in_secret,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raise SecretNotFoundError(f"No secret found in GSM for secret {name}")
for version in page_result:
# 1 means enabled version
if version.state == 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use State.ENABLED?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not figure how to import this enum class from the google library 🫤

request = secretmanager_v1.ListSecretVersionsRequest(
parent=f"projects/{self.project_id}/secrets/{name}",
)
page_result = self.gsm_client.list_secret_versions(request=request)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the results in a particular order (I didn't see evidence of that here but didn't look too hard)? Or do we just expect one? Since we only return the first one, feels like we could use a comment about why it's correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GSM only allows one enabled secret version.
According to the code snippet list_secret_versions returns all the versions.
My variable name is misleading here as it implicitly declares some sort of pagination. Will rename it.

@alafanechere alafanechere force-pushed the augustin/05-16-airbyte-ci_fetch_connector_secrets_from_connectorTestSuitesOptions branch from 38a9f76 to 100b522 Compare May 23, 2024 10:32
@alafanechere alafanechere force-pushed the augustin/05-16-airbyte-ci_fetch_connector_secrets_from_connectorTestSuitesOptions branch from 100b522 to 8c5ab8a Compare May 23, 2024 10:54
@alafanechere alafanechere enabled auto-merge (squash) May 23, 2024 11:04
@alafanechere alafanechere merged commit 62c6820 into master May 23, 2024
30 checks passed
@alafanechere alafanechere deleted the augustin/05-16-airbyte-ci_fetch_connector_secrets_from_connectorTestSuitesOptions branch May 23, 2024 11:11
Copy link

sentry-io bot commented May 23, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ SecretNotFoundError: Secret SECRET_DESTINATION-ASTRA__CREDS can't be retrieved as airbyte-connector-testing-secret-sto... pipelines.airbyte_ci.connectors.test.steps.comm... View Issue
  • ‼️ SecretNotFoundError: Secret SECRET_SOURCE_GONG_CREDS_OAUTH can't be retrieved as airbyte-connector-testing-secret-stor... pipelines.airbyte_ci.connectors.test.steps.comm... View Issue
  • ‼️ UnboundLocalError: local variable 'repo_dir' referenced before assignment pipelines.airbyte_ci.connectors.bump_version.pi... View Issue

Did you find this useful? React with a 👍 or 👎

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants