Before I Sleep: How to be assertive about not testing your data science pipeline #21

utterances-bot · 2023-10-21T01:32:15Z

Before I Sleep: How to be assertive about not testing your data science pipeline

https://milesmcbain.com/posts/assertive-programming-for-pipelines/

anthonynorth · 2023-10-21T02:36:54Z

Great post!

I've never considered using {testthat} for assertive programming. What I like most about this idea is that you're not using a new tool or package to write assertions, so there's no new API to learn. Bonus: {testthat} is very well documented.

I'm a big fan of assertive programming, particularly in {targets} pipelines. I've found this to be a timesaver, not only in avoiding wasted compute, but also (and more importantly) in debugging.

I find it useful to validate the inputs, and sometimes the outputs, of targets. With input validation/assertions, you're making your assumptions about your inputs explicit. Re-running pipelines with new (external) data can result in explicit errors or, worse, silently incorrect results or a failure further down the pipeline that is difficult to diagnose.
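As a rough sketch of what input assertions can look like with {testthat} expectations called directly inside a function: the function `summarise_sales()`, the column names, and the specific checks are hypothetical and only for illustration, not taken from the post.

```r
library(testthat)

# Hypothetical function that a pipeline target might call.
# Column names and checks are illustrative assumptions.
summarise_sales <- function(sales) {
  # Input assertions: outside of test_that(), a failed expectation
  # raises an error, so the function stops before doing any work.
  expect_s3_class(sales, "data.frame")
  expect_true(all(c("region", "amount") %in% names(sales)))
  expect_false(anyNA(sales$amount))
  expect_true(all(sales$amount >= 0))

  aggregate(amount ~ region, data = sales, FUN = sum)
}

# Data that satisfies the assumptions passes straight through.
good <- data.frame(
  region = c("north", "south", "north"),
  amount = c(10, 5, 2.5)
)
result <- summarise_sales(good)

# New data with a missing amount fails immediately and explicitly.
bad <- data.frame(region = c("north", "south"), amount = c(10, NA))
bad_outcome <- tryCatch(
  summarise_sales(bad),
  error = function(e) "assertion failed"
)
```

Without the `anyNA()` check, `aggregate()` with a formula would drop the incomplete row by default and return a result without complaint, which is the silently incorrect case.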

Output assertions, at least as I've used them, are much more like a unit test: you're explicitly checking that your code does what you think it does, which allows for specific failure messages.
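An output assertion might look something like the sketch below, which checks the result of a join before returning it. The function `add_customer_names()` and the data are hypothetical, chosen only to show the pattern.

```r
library(testthat)

# Hypothetical function that a pipeline target might call.
add_customer_names <- function(orders, customers) {
  joined <- merge(orders, customers, by = "customer_id", all.x = TRUE)

  # Output assertions: the join must not add or drop rows,
  # and every order must have matched a customer.
  expect_equal(nrow(joined), nrow(orders))
  expect_false(anyNA(joined$name), info = "every order should match a customer")

  joined
}

orders <- data.frame(customer_id = c(1, 2, 2), total = c(20, 35, 12))
customers <- data.frame(customer_id = c(1, 2), name = c("Ana", "Bo"))
ok <- add_customer_names(orders, customers)

# A duplicated customer record would silently multiply rows in the join;
# the row-count assertion turns that into an explicit failure.
dup_customers <- rbind(customers, data.frame(customer_id = 2, name = "Bo"))
dup_outcome <- tryCatch(
  add_customer_names(orders, dup_customers),
  error = function(e) "assertion failed"
)
```

Because the check sits next to the code that produces the output, the failure points at the join itself rather than at whichever downstream step first trips over the extra rows.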

MilesMcBain added the comment_thread label Oct 21, 2023 — with utterances
