Replies: 3 comments 2 replies
-
XCom is not for data, it's for metadata (a custom XCom backend is just a workaround; in any case, Airflow should not see your data). Airflow is a scheduling tool, not an ETL tool. Most Airflow operators are simple Python helpers, not efficient data-transfer tools.
-
[2024-05-06, 15:31:02 KST] {s3_to_gcs.py:262} INFO - All done, uploaded 1330 files to Google Cloud Storage
... (Background on this error at: https://sqlalche.me/e/14/9h9h)
-
I think I explained it wrong. When using MySQL as the metadata database and transferring thousands of S3 data files to GCS, the files are moved successfully, but when the paths of these files are recorded in the metadata (XCom), the length of the list of paths exceeds the MySQL BLOB length limit. To resolve this issue, it would be helpful to have a feature that compresses this list when the XCom data is not being used directly. What do you think?
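As a rough sketch of what "compress this list" could mean, here is a minimal, self-contained example using only the standard library. The helper names (`compress_paths`, `decompress_paths`) are illustrative, not part of Airflow's actual XCom API; a real implementation would likely live in a custom XCom backend's serialize/deserialize hooks.

```python
import json
import zlib

# Hypothetical helpers: compress a large list of object paths before it is
# written to the XCom table, so the stored value stays well under MySQL's
# default BLOB limit (64 KB). These names are illustrative only.
def compress_paths(paths: list) -> bytes:
    """JSON-encode the path list and deflate it with zlib."""
    return zlib.compress(json.dumps(paths).encode("utf-8"))

def decompress_paths(blob: bytes) -> list:
    """Inverse of compress_paths: inflate and JSON-decode."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))

# Example: 1330 similar S3 keys compress very well because the
# JSON payload is highly repetitive.
paths = [f"s3://bucket/prefix/file_{i:05d}.parquet" for i in range(1330)]
raw = json.dumps(paths).encode("utf-8")
packed = compress_paths(paths)
```

Since the keys share a common prefix, the deflated blob is a fraction of the raw JSON size; whether that alone is enough to stay under the column limit depends on how many paths are pushed.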
-
Description
If a large amount of S3 data is moved to GCS, a "Data too long for column" error occurs when the related values are stored in the final XCom.
Use case/motivation
I think we can prevent this error by adding logic that compresses the list so that only some of the data is shown.
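The "show only some of the data" idea could be sketched as follows: cap the value pushed to XCom at a fixed number of entries and record the total count so the metadata row stays small. The function name and the limit of 20 are assumptions for illustration, not anything Airflow provides.

```python
# Hypothetical sketch: instead of pushing all uploaded paths to XCom,
# keep a small sample plus a total count. The limit is illustrative.
MAX_XCOM_PATHS = 20

def summarize_uploaded(paths: list, limit: int = MAX_XCOM_PATHS) -> dict:
    """Return a compact summary: first `limit` paths plus the total count."""
    return {
        "total_files": len(paths),
        "sample_paths": paths[:limit],
        "truncated": len(paths) > limit,
    }

# 1330 uploaded files collapse to a dict with 20 sample paths.
paths = [f"gs://bucket/file_{i}.csv" for i in range(1330)]
summary = summarize_uploaded(paths)
```

The trade-off versus compression is that downstream tasks lose access to the full path list via XCom, so this fits cases where the list is informational rather than consumed by later tasks.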
Related issues
No response
Are you willing to submit a PR?
Code of Conduct