
Continuous Adaptation for Machine Learning System to Data Changes (#TFCommunitySpotlight Awarded)

By Chansung Park and Sayak Paul

An MLOps system has to evolve as the world changes, and that change usually shows up as data or concept drift. This project shows how to combine two separate pipelines, one for batch prediction and one for training, so that the system adapts to data changes. We worked with the TFX team to author a blog post detailing our approach. The blog post is available here: https://blog.tensorflow.org/2021/12/continuous-adaptation-for-machine.html.

We assume the reader is familiar with basic MLOps concepts (pipelines, data drift, batch prediction, etc.), TensorFlow, TensorFlow Extended (TFX), and Vertex AI.

An MLOps system can also evolve when a much better algorithm (i.e., a state-of-the-art model) becomes available. In that case, the system should adopt the better algorithm to model the existing data more accurately. We have demonstrated such workflows in other projects.

Workflow

  1. Run the initial training pipeline to train an image classifier and deploy it using TensorFlow, TFX, and Vertex AI (02_TFX_Training_Pipeline.ipynb).
  2. Download and prepare images from Bing search to simulate data drift (97_Prepare_Test_Images.ipynb).
  3. Generate the batch prediction pipeline specification (JSON) (03_Batch_Prediction_Pipeline.ipynb).
  4. Deploy a Cloud Function that checks whether enough sample data has accumulated and, if so, triggers the batch prediction pipeline (04_Cloud_Scheduler_Trigger.ipynb).
  5. Schedule a periodic job that runs the deployed Cloud Function (04_Cloud_Scheduler_Trigger.ipynb).
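
The check performed in steps 4 and 5 can be sketched roughly as follows. This is a minimal illustration, not the code from 04_Cloud_Scheduler_Trigger.ipynb: the function names, the `min_samples` parameter, and the two injected callables (`list_samples`, `submit_pipeline`) are hypothetical. In a real deployment they would wrap a Cloud Storage listing and a Vertex AI pipeline submission.

```python
from typing import Callable, Sequence


def should_run_batch_prediction(num_samples: int, min_samples: int) -> bool:
    """Return True once at least `min_samples` new samples are available."""
    return num_samples >= min_samples


def check_and_trigger(
    list_samples: Callable[[], Sequence[str]],
    submit_pipeline: Callable[[Sequence[str]], None],
    min_samples: int = 100,  # hypothetical default, not taken from the project
) -> bool:
    """Body of a periodically invoked function (assumed design).

    `list_samples` is assumed to return the paths of newly collected samples.
    `submit_pipeline` is assumed to start the batch prediction pipeline.
    Returns True if the pipeline was triggered, False otherwise.
    """
    samples = list_samples()
    if not should_run_batch_prediction(len(samples), min_samples):
        return False  # not enough data yet; wait for the next scheduled run
    submit_pipeline(samples)
    return True
```

Passing the storage listing and the pipeline submission in as callables keeps the threshold logic testable without any cloud credentials.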

Custom components

We developed several custom components in TFX for this project. You can find them under the custom_components directory.

Checklist

  • Initial Data Preparation (CIFAR10)
  • Build Training Pipeline
  • Build Batch Prediction Pipeline
    • FileListGen component
    • BatchPredictionGen component
    • PerformanceEvaluator component
    • SpanPreparator component
    • PipelineTrigger component
  • Data Preparation for Data/Concept Drift Simulation (from Bing)
  • Deploy Cloud Function, Schedule a Job to Trigger the Cloud Function
  • End to End Test
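
To make the role of the batch prediction components more concrete, here is a small sketch of the kind of decision a performance-evaluation step could make before a retraining pipeline is triggered. This README does not describe the internals of PerformanceEvaluator or PipelineTrigger, so everything below is an assumption made for illustration: the function names, the accuracy metric, and the threshold value.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one labeled sample is required")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def needs_retraining(
    predictions: Sequence[str],
    labels: Sequence[str],
    threshold: float = 0.9,  # hypothetical threshold, not taken from the project
) -> bool:
    """Assumed rule: retrain when batch accuracy drops below `threshold`."""
    return accuracy(predictions, labels) < threshold


# Example: 3 of 4 predictions are correct, so accuracy is 0.75.
preds = ["cat", "dog", "cat", "dog"]
truth = ["cat", "dog", "dog", "dog"]
print(needs_retraining(preds, truth, threshold=0.8))  # prints True
```

Under this assumed design, a result of True would be the signal to prepare the new data and start the training pipeline again.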

Feedback

We welcome feedback. Please create an issue to let us know what you think.

Acknowledgements

  • ML-GDE program for providing GCP credits.
  • Robert Crowe and Jiayi Zhao of Google for helping us with our technical doubts.