marshmallow or pydantic models from json-schema #1879

Open
dazza-codes opened this issue Mar 11, 2021 · 4 comments

dazza-codes commented Mar 11, 2021

Apologies if this has already been discussed somewhere; I am new to this project. To get started, I used the cfn2py script and tested some round-trip serializations back to JSON and YAML using deepdiff and cfn-lint checks.

One concern is that boolean values are serialized as strings rather than JSON booleans. Why do t.to_dict() and t.to_json() contain strings instead of JSON booleans? It seems like encode_to_dict(obj) could be replaced with just json.loads(json.dumps(obj)), letting the json library take care of all the necessary Python/JSON compatibility and encoding.
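To illustrate the concern, here is a minimal sketch that uses only the standard json module, not troposphere itself; the property dicts are made up for the example. It also shows a caveat: a json.loads(json.dumps(obj)) round trip only yields JSON booleans if the values are real Python bools to begin with, because a string such as "true" stays a string.

```python
import json

# Hypothetical property dicts, not taken from troposphere.
as_strings = {"BlockPublicAcls": "true"}  # boolean stored as a string
as_bools = {"BlockPublicAcls": True}      # boolean stored as a Python bool

# The string value is emitted as a quoted JSON string.
print(json.dumps(as_strings))

# The Python bool is emitted as a bare JSON boolean.
print(json.dumps(as_bools))

# A round trip through json does not convert the string into a boolean.
round_tripped = json.loads(json.dumps(as_strings))
print(type(round_tripped["BlockPublicAcls"]).__name__)
```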

Alternatively, marshmallow or pydantic models should take care of the schema mappings and serialization in general. It might also be easier to use botocore service descriptions or other AWS JSON payloads to auto-generate JSON Schema and models. These are not quite the same thing as CFN templates, but botocore has service API descriptions in e.g. lib/python3.7/site-packages/botocore/data/cloudformation/2010-05-15/service-2.json; see also

The resource-type-schemas might be amenable to auto-generation of service models using marshmallow or pydantic schema parsers and code generators. If something like this were to work well, it might eliminate most, if not all, of the issues about supporting new features in CFN services.

In this example, the cfn_schemas is a directory with unzipped data from a regional download .zip file.

pip install datamodel-code-generator[http]

wget https://schema.cloudformation.us-east-1.amazonaws.com/CloudformationSchema.zip
mkdir cfn_schemas
mv CloudformationSchema.zip cfn_schemas/
cd cfn_schemas/
unzip CloudformationSchema.zip 
cd ..
datamodel-codegen  --input cfn_schemas/aws-s3-bucket.json --input-file-type jsonschema --output aws_s3_bucket.py
cat aws_s3_bucket.py 

The pydantic models have built-in serialization.

The resulting aws_s3_bucket.py contains:

# generated by datamodel-codegen:
#   filename:  aws-s3-bucket.json
#   timestamp: 2021-03-11T02:07:54+00:00

from __future__ import annotations

from typing import List, Optional

from pydantic import BaseModel


class DefaultRetention(BaseModel):
    Years: Optional[int] = None
    Days: Optional[int] = None
    Mode: Optional[str] = None


class ReplicationTimeValue(BaseModel):
    Minutes: int


class FilterRule(BaseModel):
    Value: str
    Name: str


class AccelerateConfiguration(BaseModel):
    AccelerationStatus: str


class Metrics(BaseModel):
    Status: str
    EventThreshold: Optional[ReplicationTimeValue] = None


class RoutingRuleCondition(BaseModel):
    KeyPrefixEquals: Optional[str] = None
    HttpErrorCodeReturnedEquals: Optional[str] = None


class DeleteMarkerReplication(BaseModel):
    Status: Optional[str] = None


class OwnershipControlsRule(BaseModel):
    ObjectOwnership: Optional[str] = None


class CorsRule(BaseModel):
    ExposedHeaders: Optional[List[str]] = None
    AllowedMethods: List[str]
    AllowedOrigins: List[str]
    AllowedHeaders: Optional[List[str]] = None
    MaxAge: Optional[int] = None
    Id: Optional[str] = None


class AccessControlTranslation(BaseModel):
    Owner: str


class ObjectLockRule(BaseModel):
    DefaultRetention: Optional[DefaultRetention] = None


class S3KeyFilter(BaseModel):
    Rules: List[FilterRule]


class Destination(BaseModel):
    BucketArn: str
    BucketAccountId: Optional[str] = None
    Format: str
    Prefix: Optional[str] = None


class RedirectAllRequestsTo(BaseModel):
    Protocol: Optional[str] = None
    HostName: str


class TagFilter(BaseModel):
    Value: str
    Key: str


class PublicAccessBlockConfiguration(BaseModel):
    RestrictPublicBuckets: Optional[bool] = None
    IgnorePublicAcls: Optional[bool] = None
    BlockPublicPolicy: Optional[bool] = None
    BlockPublicAcls: Optional[bool] = None


class NoncurrentVersionTransition(BaseModel):
    StorageClass: str
    TransitionInDays: int


class ServerSideEncryptionByDefault(BaseModel):
    SSEAlgorithm: str
    KMSMasterKeyID: Optional[str] = None


class MetricsConfiguration(BaseModel):
    TagFilters: Optional[List[TagFilter]] = None
    Id: str
    Prefix: Optional[str] = None


class ObjectLockConfiguration(BaseModel):
    ObjectLockEnabled: Optional[str] = None
    Rule: Optional[ObjectLockRule] = None


class LoggingConfiguration(BaseModel):
    DestinationBucketName: Optional[str] = None
    LogFilePrefix: Optional[str] = None


class Tiering(BaseModel):
    AccessTier: str
    Days: int


class DataExport(BaseModel):
    Destination: Destination
    OutputSchemaVersion: str


class ReplicationTime(BaseModel):
    Status: str
    Time: ReplicationTimeValue


class RedirectRule(BaseModel):
    ReplaceKeyWith: Optional[str] = None
    HttpRedirectCode: Optional[str] = None
    Protocol: Optional[str] = None
    HostName: Optional[str] = None
    ReplaceKeyPrefixWith: Optional[str] = None


class EncryptionConfiguration(BaseModel):
    ReplicaKmsKeyID: str


class InventoryConfiguration(BaseModel):
    Destination: Destination
    OptionalFields: Optional[List[str]] = None
    IncludedObjectVersions: str
    Enabled: bool
    Id: str
    Prefix: Optional[str] = None
    ScheduleFrequency: str


class ReplicationRuleAndOperator(BaseModel):
    TagFilters: Optional[List[TagFilter]] = None
    Prefix: Optional[str] = None


class VersioningConfiguration(BaseModel):
    Status: str


class CorsConfiguration(BaseModel):
    CorsRules: List[CorsRule]


class ReplicaModifications(BaseModel):
    Status: str


class Transition(BaseModel):
    TransitionDate: Optional[str] = None
    TransitionInDays: Optional[int] = None
    StorageClass: str


class SseKmsEncryptedObjects(BaseModel):
    Status: str


class Tag(BaseModel):
    Value: str
    Key: str


class AbortIncompleteMultipartUpload(BaseModel):
    DaysAfterInitiation: int


class SourceSelectionCriteria(BaseModel):
    ReplicaModifications: Optional[ReplicaModifications] = None
    SseKmsEncryptedObjects: Optional[SseKmsEncryptedObjects] = None


class OwnershipControls(BaseModel):
    Rules: List[OwnershipControlsRule]


class RoutingRule(BaseModel):
    RedirectRule: RedirectRule
    RoutingRuleCondition: Optional[RoutingRuleCondition] = None


class NotificationFilter(BaseModel):
    S3Key: S3KeyFilter


class ServerSideEncryptionRule(BaseModel):
    BucketKeyEnabled: Optional[bool] = None
    ServerSideEncryptionByDefault: Optional[ServerSideEncryptionByDefault] = None


class ReplicationDestination(BaseModel):
    AccessControlTranslation: Optional[AccessControlTranslation] = None
    Account: Optional[str] = None
    Metrics: Optional[Metrics] = None
    Bucket: str
    EncryptionConfiguration: Optional[EncryptionConfiguration] = None
    StorageClass: Optional[str] = None
    ReplicationTime: Optional[ReplicationTime] = None


class Rule(BaseModel):
    Status: str
    NoncurrentVersionExpirationInDays: Optional[int] = None
    Transitions: Optional[List[Transition]] = None
    TagFilters: Optional[List[TagFilter]] = None
    NoncurrentVersionTransitions: Optional[List[NoncurrentVersionTransition]] = None
    Prefix: Optional[str] = None
    NoncurrentVersionTransition: Optional[NoncurrentVersionTransition] = None
    ExpirationDate: Optional[str] = None
    ExpirationInDays: Optional[int] = None
    Transition: Optional[Transition] = None
    Id: Optional[str] = None
    AbortIncompleteMultipartUpload: Optional[AbortIncompleteMultipartUpload] = None


class WebsiteConfiguration(BaseModel):
    RoutingRules: Optional[List[RoutingRule]] = None
    IndexDocument: Optional[str] = None
    RedirectAllRequestsTo: Optional[RedirectAllRequestsTo] = None
    ErrorDocument: Optional[str] = None


class TopicConfiguration(BaseModel):
    Event: str
    Topic: str
    Filter: Optional[NotificationFilter] = None


class IntelligentTieringConfiguration(BaseModel):
    Status: str
    TagFilters: Optional[List[TagFilter]] = None
    Tierings: List[Tiering]
    Id: str
    Prefix: Optional[str] = None


class StorageClassAnalysis(BaseModel):
    DataExport: Optional[DataExport] = None


class LambdaConfiguration(BaseModel):
    Function: str
    Event: str
    Filter: Optional[NotificationFilter] = None


class ReplicationRuleFilter(BaseModel):
    Prefix: Optional[str] = None
    And: Optional[ReplicationRuleAndOperator] = None
    TagFilter: Optional[TagFilter] = None


class BucketEncryption(BaseModel):
    ServerSideEncryptionConfiguration: List[ServerSideEncryptionRule]


class LifecycleConfiguration(BaseModel):
    Rules: List[Rule]


class QueueConfiguration(BaseModel):
    Event: str
    Filter: Optional[NotificationFilter] = None
    Queue: str


class ReplicationRule(BaseModel):
    Status: str
    Destination: ReplicationDestination
    Filter: Optional[ReplicationRuleFilter] = None
    Priority: Optional[int] = None
    SourceSelectionCriteria: Optional[SourceSelectionCriteria] = None
    Id: Optional[str] = None
    Prefix: Optional[str] = None
    DeleteMarkerReplication: Optional[DeleteMarkerReplication] = None


class ReplicationConfiguration(BaseModel):
    Role: str
    Rules: List[ReplicationRule]


class AnalyticsConfiguration(BaseModel):
    TagFilters: Optional[List[TagFilter]] = None
    StorageClassAnalysis: StorageClassAnalysis
    Id: str
    Prefix: Optional[str] = None


class NotificationConfiguration(BaseModel):
    QueueConfigurations: Optional[List[QueueConfiguration]] = None
    LambdaConfigurations: Optional[List[LambdaConfiguration]] = None
    TopicConfigurations: Optional[List[TopicConfiguration]] = None


class Model(BaseModel):
    InventoryConfigurations: Optional[List[InventoryConfiguration]] = None
    WebsiteConfiguration: Optional[WebsiteConfiguration] = None
    DualStackDomainName: Optional[str] = None
    AccessControl: Optional[str] = None
    AnalyticsConfigurations: Optional[List[AnalyticsConfiguration]] = None
    AccelerateConfiguration: Optional[AccelerateConfiguration] = None
    PublicAccessBlockConfiguration: Optional[PublicAccessBlockConfiguration] = None
    BucketName: Optional[str] = None
    RegionalDomainName: Optional[str] = None
    OwnershipControls: Optional[OwnershipControls] = None
    ObjectLockConfiguration: Optional[ObjectLockConfiguration] = None
    ObjectLockEnabled: Optional[bool] = None
    LoggingConfiguration: Optional[LoggingConfiguration] = None
    ReplicationConfiguration: Optional[ReplicationConfiguration] = None
    Tags: Optional[List[Tag]] = None
    DomainName: Optional[str] = None
    BucketEncryption: Optional[BucketEncryption] = None
    WebsiteURL: Optional[str] = None
    NotificationConfiguration: Optional[NotificationConfiguration] = None
    LifecycleConfiguration: Optional[LifecycleConfiguration] = None
    VersioningConfiguration: Optional[VersioningConfiguration] = None
    MetricsConfigurations: Optional[List[MetricsConfiguration]] = None
    IntelligentTieringConfigurations: Optional[
        List[IntelligentTieringConfiguration]
    ] = None
    CorsConfiguration: Optional[CorsConfiguration] = None
    Id: Optional[str] = None
    Arn: Optional[str] = None

@markpeek (Member)

I'll have to come back to read your additional comments. But when running your tests, did you set the TROPO_REAL_BOOL environment variable? The mapping is done here. This was added for backwards compatibility and will be the default in the next major revision.
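As a rough sketch of what an environment-variable switch like this can look like (this is not troposphere's actual mapping code; the helper name `encode_bool` and the set of accepted values are assumptions made for illustration):

```python
import os


def encode_bool(value, env=None):
    """Hypothetical helper: emit a real bool only when TROPO_REAL_BOOL is set.

    The accepted values ("1", "true", "yes") are an assumption, not
    necessarily what troposphere checks for.
    """
    env = os.environ if env is None else env
    flag = env.get("TROPO_REAL_BOOL", "").strip().lower()
    if flag in ("1", "true", "yes"):
        return bool(value)
    # Backwards-compatible default: booleans are rendered as strings.
    return "true" if value else "false"


# Passing an explicit mapping avoids mutating the real environment.
print(encode_bool(True, env={}))
print(encode_bool(True, env={"TROPO_REAL_BOOL": "1"}))
```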

@dazza-codes (Author)

TROPO_REAL_BOOL was not set.

lautjy commented Apr 6, 2021

I would like to second that Pydantic is really sweet to use: type hints, serialization, Literals, etc. It has been so agile to work with.
But I'm not sure how big an overhaul it would be for this repo.

Looking at this PR for example: https://github.com/cloudtools/troposphere/pull/1858/files
It seems like all of the definitions could be Pydantic BaseModels.
But there is likely lots of machinery that relies on the current form 🤷

@angoraking

@dazza-codes @lautjy I found a Python library, https://github.com/MacHu-GWU/cottonformation-project#welcome-to-cottonformation-documentation, that seems to do exactly what you described: type hints, parameter suggestions, and validation.

It seems the author takes the CloudFormation schema JSON files from AWS and uses jinja2 to generate all of that code automatically. I think we could borrow this approach here.
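As a rough illustration of schema-driven generation, here is a small sketch. It is not cottonformation's actual code: plain string formatting stands in for jinja2, standard-library dataclasses stand in for the generated classes, and the schema fragment, `TYPE_MAP`, and `render_class` are all made up for the example.

```python
import json

# Made-up fragment in the style of a JSON Schema "definitions" entry.
SCHEMA = json.loads("""
{
  "definitions": {
    "Tag": {
      "type": "object",
      "properties": {
        "Key": {"type": "string"},
        "Value": {"type": "string"}
      },
      "required": ["Key", "Value"]
    }
  }
}
""")

# Minimal mapping from JSON Schema types to Python type names.
TYPE_MAP = {"string": "str", "integer": "int", "boolean": "bool"}


def render_class(name, definition):
    """Render one dataclass definition as Python source text."""
    required = set(definition.get("required", []))
    props = definition.get("properties", {})
    # Required fields go first: dataclasses reject a field without a
    # default after a field with one.
    ordered = sorted(props, key=lambda p: (p not in required, p))
    lines = ["@dataclass", f"class {name}:"]
    if not ordered:
        lines.append("    pass")
    for prop in ordered:
        py_type = TYPE_MAP.get(props[prop].get("type"), "object")
        if prop in required:
            lines.append(f"    {prop}: {py_type}")
        else:
            lines.append(f"    {prop}: Optional[{py_type}] = None")
    return "\n".join(lines) + "\n"


source = render_class("Tag", SCHEMA["definitions"]["Tag"])
print(source)
```

The generated text still needs `from dataclasses import dataclass` and `from typing import Optional` at the top of the output module; a real generator would also have to handle nested references, lists, and enums, which this sketch ignores.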
