Updating Schema reading procedures and refactoring #78

Open
wants to merge 21 commits into
base: dev
from

Conversation

@Hsankesara Hsankesara marked this pull request as ready for review January 24, 2024 14:12

@afolarin afolarin left a comment

LGTM

@@ -4,7 +4,7 @@ project:
version: mock_version

input:
- data_type: local # couldbe mock, local, sftp, s3
+ data_type: mock # couldbe mock, local, sftp, s3

Would this be better as data_source or source_type? "Data type" is more specific to the data itself; I think this setting relates more to the source of the data.

While SFTP (though rsync might be useful for a restart function) and S3 (implemented?) probably cover a lot of cases, I don't really want to support every method here, as we can't support the long tail of the distribution. It should probably be the user's responsibility to expose the remote data through network mounts, local copies, etc.
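
A minimal sketch of that policy, assuming the input section of the config is loaded as a plain dict and keeps the data_type key shown in the diff above. The names resolve_source and SUPPORTED_SOURCES are hypothetical and not part of this PR; the set of values is taken from the config comment only.

```python
# Hypothetical allow-list; values copied from the config comment
# ("mock, local, sftp, s3"), not from the actual implementation.
SUPPORTED_SOURCES = {"mock", "local", "sftp", "s3"}


def resolve_source(input_config: dict) -> str:
    """Return the validated source type from the `input` config section."""
    source = input_config.get("data_type")
    if source not in SUPPORTED_SOURCES:
        # Anything outside the short list is left to the user to expose locally.
        raise ValueError(
            f"Unsupported data source {source!r}; expose the data as a local "
            "path (network mount or local copy) and use 'local' instead."
        )
    return source
```

With something like this, an unsupported value fails early with a message that points the user to the local route rather than implying more transports will be added.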

logger = logging.getLogger(__name__)


class CustomDataReader():

What is the distinction here between ingestion and reader?

2 participants