Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding NumericCalculationStep #93

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

Suvarna-Thejas-CE
Copy link
Contributor

Overview

This creates a NumericCalculationStep that takes in the calculation string from the Designer and calculates the output.

In order to test the functions, I implemented the jest testing framework and added some unit test cases for the logic of the calculation.

This should follow a similar model as the celebration/YouTube step in the sense that it will be implemented as a step type in MDHD.

I did not include a story because this step type only shows a loading screen. It also retrieves data mainly from MDH.js.

Security

REMINDER: All file contents are public.

  • I have ensured no secure credentials or sensitive information remain in code, metadata, comments, etc.
    • Please verify that you double checked that .storybook/preview.js does not contain your participant access key details.
    • There are no temporary testing changes committed such as API base URLs, access tokens, print/log statements, etc.
  • These changes do not introduce any security risks, or any such risks have been properly mitigated.

Describe briefly what security risks you considered, why they don't apply, or how they've been mitigated.

Checklist

Testing

Documentation

  • I have added relevant Storybook updates to this PR as well.
  • If this feature requires a developer doc update, tag @CareEvolution/api-docs.

Consider "Squash and merge" as needed to keep the commit history reasonable on main.

Reviewers

Assign to the appropriate reviewer(s). Minimally, a second set of eyes is needed ensure no non-public information is published. Consider also including:

  • Subject-matter experts
  • Style/editing reviewers
  • Others requested by the content owner

@Suvarna-Thejas-CE
Copy link
Contributor Author

Function has been unit tested, but would love some thoughts on how to test the actual react component. Promise chaining is done correctly, submission happens at the right time, etc.

.then((() => MyDataHelps.getCurrentSurveyAnswers()))
.then((surveyAnswers) => {
setCalculationResult(calculateValue(calculationString, surveyAnswers).toString());
console.log(calculationResult);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove before merging

@@ -0,0 +1,54 @@
import { create, all } from 'mathjs'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh neat find with mathjs - I like it better than using eval.

So actually...all this string parsing/replacing logic should be unnecessary? Because mathjs supports passing variables? Like here's one of the examples:

let scope = {
    a: 3,
    b: 4
}
math.evaluate('a * b', scope)           // 12
math.evaluate('c = 2.3 + 4.5', scope)   // 6.8
scope.c                                 // 6.8

So you could imagine setting up your scope as:

let scope = {
    '[Q1]': 3,
    '[Q2]': 4,
    '[F1.Q1]': 5
}

from your survey answers. For form variables (where step id/result id are different) one could just use a dot separator like [F1.Q1].

Or you can set up the scope to call a function to get the answer it looks like: https://mathjs.org/examples/advanced/custom_scope_objects.js.html

The main question/issue is whether mathjs is kind/permissive enough to allow numbers in it's symbols; seems like there is some flexibility there but not sure how much.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. I've been playing with this and have a few thoughts on feasibility.

  1. the . is the object access operator. So this seems to prevent us from being able to do calculations with stepIds/resultIds that have a . in the name (ie. Form1.Q1.2). I think we can probably add a little validation message that says when you have calculation step, the calculation string cannot reference a stepIdentifier/resultIdentifier combo that has a . in it.
    ^How does that sound?

  2. When assembling the scope dictionary from the survey answers, does this rule work? if stepIdentifier only appears once in the survey answer list, then use just {stepIdentifier: value}, otherwise if it appears more than once use stepIdentifier: {resultIdentifier1: value, resultIdentifier2: value}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Suvarna-Thejas-CE maybe dig into REDCap a bit here; what kind of validation / restrictions do they have.

Even might be worth it if we follow their syntax for the vars; not sure if math.js will allow [] though, e.g. [clab_lipidldl_u2] * [clab_lipidldlab]

For 2...after thinking about it...I think we should just key it by result identifier. It makes it so much simpler...and I would add validation so that your result identifiers have to be unique (e.g. you can't have Q1 on FormStep1 and Q1 on FormStep2) if you have a numeric calculation step.

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my preliminary testing, the [] do work in mathJS :) so that should work great. Will look into REDCap a bit here.

I like not having restrictions on the step/resultIdentifiers. The '_' is a cool idea to me and feels like a great one from a consumer perspective.

For me, the usability concern for using only resultIdentifiers is that the question step UI doesn't mention "resultIdentifier". Could potentially be fixed with documentation, but would choose not to force customers to rely on reading user guide articles.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah hmm I wasn't suggesting we do anything with underscores actually...was not a separator I was proposing, sorry. Those were just some redcap vars I copy/pasted.

Basically just thinking that we have [resultIdentifier] be the syntax; and we enforce unique result identifiers when using this step because in practice they are anyways, and it will be a pain to construct a step identifier / result identifier pair (e.g. [formstepidentifier_resultidentifier] or whatever you were thinking)

In terms of usability, I think that might just be how we describe it. It is either the step identifier OR the form item identifier if it is a form step basically; so we can avoid the use of "resultIdentifier" if we want (or we can clarify in the survey builder as well). But it again feels easier to me than the user having to write out [formstepidentifier_resultidentifier] if they can just always do [resultidentifier].

We don't necessarily have to enforce it via a validation rule. We can just have [resultidentifier] be the way this works and if they happen to have multiple overlapping resultidentifiers, it just takes the latest survey answer with that resultidentifier.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth; using only the result identifier is what I did on my version of this since I knew all my questions were form questions, and the result identifiers were unique. What stinks about that is that there's not a good way to find the right question that the result identifier is on (there's lots of repeated looping). I debated, in my version, making a dictionary to do lookups so that I only did the looping once, but didn't take it that far.

That said, I'm also somewhat in favor of only the result identifier being in the eventual UI because, just as an example, the survey I was working on required doing a sum total of 91 variables for its final calculation. Listing 91 variable names is pretty gnarly in a text input; it's even worse if they're longer because they require the question identifier as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool - yeah. Y'all are convincing me that there is value to using ResultIdentifier. The functional name seems to be "QuestionIdentifier" which, like you said Chris, is defined as the StepIdentifier for Question Steps or FormItemIdentifier for Form Steps. I think this is workable.

It does bring up some questions about taxonomy for me (that I want to document, and maybe get reactions too, but is a bit of scope creep for this PR specifically).

In the case, where I have repeated questions (i.e, For all the people in your family, specify the height and weight). I would do something that looks like:

Form Step: Person1
 -- Form Item: Height
 -- Form Item: Weight
Form Step: Person2
 -- Form Item: Height
 -- Form Item: Weight
Form Step: Person3
 -- Form Item: Height
 -- Form Item: Weight

Currently, it's not obvious that there would be a relationship between a form item identifier and a question step identifier.

I think it makes sense to make that relationship explicit, just through naming changes.

  • Renaming Form Item to Question or Question Item
  • Potentially renaming Form Step to multi-Question step

It would change the taxonomy from:

- Question Step
   -- Question
- Form Step
  -- Form Item
     --- Question
  -- Form Item
     --- Question

To this model that makes the relationship between multi-question and question steps clear. Functionally, I think it also describes better what a form step is, just a collection of questions.

- Question Step
  -- Question Identifier
- Multi-Question Step Identifier
  -- Question Identifier
  -- Question Identifier

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well...we cannot go renaming things like crazy unfortunately...have to consider the downstream implications given that creates inconsistencies with exported data and the survey answer API. Have to think of the scope of the change we are making here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can give this another quick review after the latest push

Comment on lines 44 to 49
parseFloat(surveyAnswers.find((answer) => {
return(answer.stepIdentifier == stepIdentifier && answer.resultIdentifier == resultIdentifier);
})!.answers[0])
: parseFloat(surveyAnswers.find((answer) => {
return (answer.stepIdentifier == stepIdentifier);
})!.answers[0]));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really interesting to see you tackle this because I've just recently been making a generic Calculation WebViewStepTemplate for another project. It's interesting to compare/contrast how we went about this. In my case, I knew specifically how the questions were going to be formatted (and that all of them were Form Steps), so I didn't try and make my version extremely generic.

Which brings me to my question here, wondering about constraints on how this step could actually be used. Am I correct in that if the step was a Question Step with the Text Choice Answer Format, and a Text Choice Style of Multiple Choice that you could have multiple answers in the array here? What's the expectation on how to calculate that type of step? I could see an argument that it should be the sum of the parseFloat of the answers?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder what REDCap does for multiple choice. If there is an easy way to match them maybe worth it. Otherwise feels like we should ignore and not try to handle it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From REDCap Docs:
image

"If possible, avoid" haha.

Maybe in our docs we say something similar, but explain the behavior if they do use it. Or should we say if length of the results for a step > 1 then throw an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah for an MVP the latter might make the most sense...just don't allow results > 1

@@ -0,0 +1,38 @@
import { calculateValue } from "./numericCalculationFunctions";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great if you could add a note to the project wiki or readme about adding the test framework and how to run the tests.


useEffect(() => {
MyDataHelps.getStepConfiguration().then((config: StepConfiguration) => {
if (!config) return; // allows test mode to work
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need the config in order to get the calculation string.

let answerDict: {[id: string] : number} = {};
surveyAnswers.map((answer) => {
if (answer.answers.length > 1) {
throw new Error(`Calculation Error - Calculation not available for multiple-select text choice questions (${answer.resultIdentifier})`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we would not want to throw errors for any of these? We should just skip it and include the variables we can...otherwise any survey with a multi-select text choice or a "." in the resultIdentifier will not work


test('Calculation Passes', () => {
expect(calculateValue(
"(([Q1]*[Q1_2])+[Q2])-[Q3]",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I am missing it but it seems like we did not do anything special for the [] syntax this to work, no? So that is all built into Math.js and actually you do not even need the [] in most cases. Q1 * Q3 works fine as an expression it seems.

However, in my quick test, seems like Q1_2 does not work and you need the brackets for some reason. Curious if you know why or have figured out what the [] actually does

I think there are other characters that do not work as intended; for instance spaces - [Q1 2] doesn't work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The [] is the array operator in mathjs. In this case, it doesn't really do anything. I used it because I liked having a "syntax" to designate which is a variable.

When I was testing the Q1_2 without the [] worked correctly and gave the correct response, but you're right. The [Q1 2] and the Q1 2, will actually calculate but result in an correct calculation.

Here is a list of operators , so many of these won't work. Doesn't really make sense to error check for all of them. But what is our list of basic ones? dash, space, period? what about parentheses, brackets?

return(math.evaluate(calcString, answerDict)._data[0].toString());
}
catch {
return("Calculation Error - Unmatched Variable")
Copy link
Contributor

@chrisnowak chrisnowak Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not be an unmatched variable but could be any other error...probably just want to surface the error from Math.js

We should probably figure out the right behavior for the numeric step if there is an error. Looks like you have it just proceeding...bit conflicted if that is the right call...it does maximize participant experience I suppose because it is on the study team to figure it out / fix it and possibly redeliver

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, if subsequent steps depend on the calculation for branching logic etc...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like showing more helpful error messages can be a real boon for people that are trying to learn how to use the tool. Giving them an error message about why something failed and a place to look can help them solve problems and know what to look for in the future.

I have modified how I do the error checking. At a high level:

  1. When building the results dictionary from the survey answers, create a list of Identifiers that are flagged for some error.

  2. Parse the calculation string for the variables that are used (this is where having the [] was very very helpful), and compare to the flagged identifiers from step 1.

  3. Create an error message that shows these identifiers.

I had to employ this model to catch any errors for Identifiers that had a space followed by a number (i.e. in the case of [Q1] + [Q1 2], Q1 2 was getting replaced by the value of Q1 * 2, but the calculation would succeed successfully, so I wasn't able to catch the error of an identifier with a space if i didn't cross check with the variables used in the calculation string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants