Material Library #6

tonyshumlh · 2024-04-30T04:54:44Z

This issue serves as storage for all related and useful material for creating the Checklist and Prompt.
Please write a short summary of each item you add, to save other readers the effort.

tonyshumlh · 2024-04-30T04:58:53Z

Microsoft Industry Solutions Engineering Team 2024 https://microsoft.github.io/code-with-engineering-playbook/machine-learning/

ml-fundamentals-checklist: A complete checklist. The Data Quality and Governance section could be useful.
ml-fundamentals-checklist.md
ml-testing: Provides ideas and examples of what code should be tested, mainly around data.
ml-testing.md
ml-model-checklist: A checklist for ML models in production. May not be useful.
ml-model-checklist.md

tonyshumlh · 2024-04-30T05:15:04Z

Jeremy Jordan - Effective testing for machine learning systems https://www.jeremyjordan.me/testing-ml/

Proposes a workflow for including tests (mainly ML pipeline tests) in ML development.
Introduces the ideas of pre-train tests and post-train tests:
Pre-train tests run before the model is trained. They aim to catch bugs early and save time by avoiding wasted training jobs.
Post-train tests use the trained model artifact to inspect its behavior in scenarios defined by the tester. They aim to reveal the logic learned during training and produce a behavioral report of model performance.
- Invariance Tests: Check that deliberate changes to the input which should not matter leave the model's output unchanged.
- Directional Expectation Tests: Apply deliberate changes to the input that should shift the model's output in a predictable direction.
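The three kinds of test above can be sketched in a few lines of Python. This is only an illustration, not code from the article: `predict_sentiment` is a hypothetical toy stand-in for a trained model, `find_split_leakage` is a hypothetical helper, and the word lists and example sentences are made up. A real post-train test would load the trained model artifact instead.

```python
# Hypothetical toy "model": scores text by counting sentiment words.
# Assumption for illustration only; a real test would load the trained artifact.
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"terrible", "bad", "hate"}


def predict_sentiment(text: str) -> float:
    """Return a score in [-1, 1]; 0.0 when no sentiment word is found."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def find_split_leakage(train_ids, val_ids) -> set:
    """Pre-train check (hypothetical helper): IDs present in both splits."""
    return set(train_ids) & set(val_ids)


def test_no_split_leakage():
    # Pre-train test: runs on the data only, before any training job starts.
    assert find_split_leakage([1, 2, 3], [4, 5]) == set()


def test_invariance_to_name_swap():
    # Invariance test: swapping a person's name should not change the score.
    a = predict_sentiment("Mark was great in the lead role.")
    b = predict_sentiment("Samantha was great in the lead role.")
    assert abs(a - b) < 1e-9


def test_directional_expectation():
    # Directional expectation test: a negative word should lower the score.
    base = predict_sentiment("The service was good.")
    worse = predict_sentiment("The service was bad.")
    assert worse < base
```

Functions named `test_*` like these would be collected by a runner such as pytest, so they could sit alongside ordinary unit tests in the pipeline.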

tonyshumlh · 2024-04-30T16:56:49Z

Studying the Practices of Testing Machine Learning Software in the Wild https://arxiv.org/pdf/2312.12604

A study of the testing practices of 10 machine learning benchmark projects. The paper includes test examples.

Testing Strategies: Four major categories were identified: Grey-box, White-box, Black-box, and Heuristic-based techniques. Grey-box and White-box techniques were the most commonly used.

ML Properties Tested: 16 ML properties were identified, with functional correctness, consistency, robustness, data validity, and efficiency being the most frequently tested.

Testing Methods: Thirteen different testing methods were identified, with only seven previously included in the Test Pyramid of ML.

tonyshumlh · 2024-05-06T21:58:26Z

Retrieval-Augmented Generation

A technique for improving the accuracy and reliability of generative AI models by supplying facts fetched from external sources, e.g. a developer-defined database.
For us, the external source could be our ML test checklist and other background information.
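As a rough sketch of the idea, the snippet below retrieves the most relevant checklist entries for a question and places them in the prompt. Everything here is an assumption for illustration: the `CHECKLIST` entries are invented, `retrieve` and `build_prompt` are hypothetical helpers, and the word-overlap scoring is a simple stand-in for the embedding-based similarity search that RAG systems typically use. The call to the generative model itself is left out.

```python
import re

# Hypothetical checklist entries standing in for the external knowledge source.
CHECKLIST = [
    "Check for label leakage between training and validation data.",
    "Verify that model output shape matches the label shape.",
    "Test that input perturbations such as name changes do not alter predictions.",
]


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, documents: list, k: int = 2) -> list:
    """Return the k documents sharing the most words with the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list, k: int = 2) -> str:
    """Prepend the retrieved entries to the question as reference material."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Use the following reference material to answer.\n"
        f"{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("Does the code test for label leakage in validation data?", CHECKLIST))
```

The resulting prompt string would then be sent to whichever model is used, so the answer is grounded in the checklist rather than only in the model's training data.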

JohnShiuMK · 2024-05-16T18:30:14Z

https://arxiv.org/pdf/2310.01402

Evaluating the Decency and Consistency of Data Validation Tests Generated by LLMs
An application to Canadian political donations data

By a professor from the University of Toronto

tonyshumlh assigned tonyshumlh, SoloSynth1, JohnShiuMK and jinyz8888 Apr 30, 2024

JohnShiuMK added the admin meeting related label May 6, 2024
