Documentation for custom artifact in Vertex AI #6102

Open
jeongukjae opened this issue Jul 26, 2023 · 1 comment

Comments

@jeongukjae
Contributor

URL(s) with the issue:

https://www.tensorflow.org/tfx/api_docs/python/tfx/v1/dsl/Artifact

Description of issue (what needs changing):

Clear description:

Currently, the TYPE_NAME of a custom artifact must match the schema title regex (^[a-z][a-z0-9-_]{2,20}[.][A-Z][a-zA-Z0-9-_]{2,49}$); otherwise get_artifact_schema raises a ValueError:

def get_artifact_schema(artifact_type: Type[artifact.Artifact]) -> str:
  """Gets the YAML schema string associated with the artifact type.
  Args:
    artifact_type: the artifact type that the schema is generated for.
  Returns:
    the encoded yaml schema definition for the artifact.
  Raises:
    ValueError if custom artifact type name does not adhere to KFP schema title.
  """
  if artifact_type in _SUPPORTED_STANDARD_ARTIFACT_TYPES:
    # For supported first-party artifact types, get the built-in schema yaml per
    # its type name.
    schema_path = os.path.join(
        os.path.dirname(__file__), 'artifact_types',
        '{}.yaml'.format(artifact_type.TYPE_NAME))
    return fileio.open(schema_path, 'rb').read()
  else:
    # Otherwise, fall back to the generic `Artifact` type schema.
    # To recover the Python type object at runtime, the artifact TYPE_NAME will
    # be encoded as the schema title.
    # Read the generic artifact schema template.
    if not _SCHEMA_TITLE_RE.fullmatch(artifact_type.TYPE_NAME):
      raise ValueError(
          f'Invalid custom artifact type name: {artifact_type.TYPE_NAME}')
    schema_path = os.path.join(
        os.path.dirname(__file__), 'artifact_types', 'Artifact.yaml')
    data = yaml.safe_load(fileio.open(schema_path, 'rb').read())
    # Encode artifact TYPE_NAME.
    data['title'] = artifact_type.TYPE_NAME
    return yaml.dump(data, sort_keys=False)
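
To illustrate the naming rule, here is a minimal sketch that checks candidate type names against the pattern quoted above. It assumes that `_SCHEMA_TITLE_RE` in TFX is compiled from exactly that pattern; the helper `is_valid_custom_type_name` and the sample names are hypothetical and are not part of TFX.

```python
import re

# Pattern copied from the issue text. Assumption: TFX's _SCHEMA_TITLE_RE
# is compiled from this same string.
SCHEMA_TITLE_RE = re.compile(
    r'^[a-z][a-z0-9-_]{2,20}[.][A-Z][a-zA-Z0-9-_]{2,49}$')


def is_valid_custom_type_name(type_name: str) -> bool:
  """Hypothetical helper: True if type_name fully matches the pattern."""
  return SCHEMA_TITLE_RE.fullmatch(type_name) is not None


# Hypothetical names, chosen only to exercise each part of the pattern.
candidates = [
    'mypackage.MyArtifact',   # lowercase module, one dot, capitalized class
    'MyArtifact',             # no module prefix
    'my_pkg.sub.MyArtifact',  # more than one dot
    'ab.MyArtifact',          # module part shorter than three characters
]

for name in candidates:
  print(name, is_valid_custom_type_name(name))
```

Under this pattern only names of the form `module.ClassName`, with exactly one dot, are accepted, which is why the class has to be reachable from a top-level module.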

In addition, the custom artifact class must be importable using its schema title as the class path (that is, from a top-level module), so that the Kubeflow V2 container entry point can resolve it:

def _retrieve_class_path(type_schema: pipeline_pb2.ArtifactTypeSchema) -> str:
  """Gets the class path from an artifact type schema."""
  if type_schema.WhichOneof('kind') == 'schema_title':
    title = type_schema.schema_title
  if type_schema.WhichOneof('kind') == 'instance_schema':
    data = yaml.safe_load(type_schema.instance_schema)
    title = data.get('title', 'tfx.Artifact')
  if title in compiler_utils.TITLE_TO_CLASS_PATH:
    # For first party types, the actual import path is maintained in
    # TITLE_TO_CLASS_PATH map.
    return compiler_utils.TITLE_TO_CLASS_PATH[title]
  else:
    # For custom types, the import path is encoded as the schema title.
    return title
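
Because the title of a custom type is returned unchanged as the class path, it has to be importable as written. The sketch below shows one way such a path could be resolved using only the standard library. It assumes the entry point splits the path on its last dot and imports the module part; `resolve_class` is a hypothetical stand-in, not the TFX implementation, and `collections.OrderedDict` is used only because it is a stdlib class whose path has the `module.ClassName` shape.

```python
import importlib


def resolve_class(class_path: str) -> type:
  """Hypothetical resolver: imports 'module.ClassName' and returns the class."""
  module_path, _, class_name = class_path.rpartition('.')
  if not module_path:
    raise ValueError(f'Class path has no module part: {class_path}')
  # Raises ModuleNotFoundError if the module is not importable in this
  # environment, for example if it is not installed in the container image.
  module = importlib.import_module(module_path)
  # Raises AttributeError if the module does not expose the class.
  return getattr(module, class_name)


# Stand-in for a custom artifact title such as 'mypackage.MyArtifact'.
cls = resolve_class('collections.OrderedDict')
print(cls.__name__)
```

If resolution works this way, a custom artifact whose TYPE_NAME does not correspond to an importable `module.ClassName` in the container would fail at this step rather than at compile time.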

However, neither requirement is documented for the Artifact class. Documenting them would help developers who want to extend TFX with custom artifacts on Vertex AI.

singhniraj08 self-assigned this Jul 26, 2023
@singhniraj08
Contributor

@jeongukjae,

Thank you for bringing this up so we can improve our documentation. Let me raise this internally and update this thread once there is progress. Thank you.
