Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

difference sampling (multistock / oldreagents) broken #96

Open
ipendlet opened this issue Nov 27, 2019 · 3 comments
Open

difference sampling (multistock / oldreagents) broken #96

ipendlet opened this issue Nov 27, 2019 · 3 comments
Assignees
Labels
bad_first_issue Seasoned Devs Only bug Something isn't working

Comments

@ipendlet
Copy link
Member

ipendlet commented Nov 27, 2019

I think this is going to be a rather complex debug. First, whoever is programming this function should really wrap their head around what the goal of this feature actually is: we are attempting to use old inputs (a previous xlsx) as the "old" reagent, or the "old stock solution" and remove the regions that these old reagents were able to access from the total region of the available concentration space.

The difference sampling is clearly not working either:
search space

For example, in the case of a simple iodide perovskite experiment we can target the grey region as the full space. The blue region is what has already been sampled (old sampling, poor). The yellow region is the space that is accessible for all of the PbI2 containing compounds, while the yellow space highlighted by the black triangle is the region accessible by only 2 of the stock solutions.

For examples of the failings of the current multistock sampling see issue 96 folder in ESCALATE_development (here
A demonstration of a run accessing a related "yellow" region. LBL_Multistock_Edited.xlsx
A demonstration of a difference sampling run that should work but errors out: LBL_Multistock_Edited_subregion.xlsx

The fix for this is not straight forward. There really are two options: figure out what is actually needed to pull off the desired operation, or remove the "old_reagents" (terrible name) sections of the code entirely. Either works, but I think there is a benefit for being able to easily test / sample new regions with additional reagent definitions.

The best result would be to sample using new reagents for the spaces that were inaccessible using only the "old reagents". The generated runs will use the MINIMUM number of reagents possible to sample the new space and correctly prepare the run files.

Far above and beyond extension: Use the existing dataset of associated workflows (wf1.1 , or 3.0) and bound the area of explore space with a convex hull, figure out what regions haven't been explored and develop runs using the fewest number of reagents possible.

A few challenges

@ipendlet ipendlet added bad_first_issue Seasoned Devs Only bug Something isn't working labels Nov 27, 2019
@jschrier
Copy link

Yeah, I don't think this is actually a true bug, but rather an input problem. [Although I can't really make sense of how to read the spreadsheets in issue 96]

As commented to IP on Slack DM 26 Nov 2019—this appears to be a misunderstanding of how the difference sampler works. The "new" region is defined by a set of reagents (which may or many not be the same), and the "old/excluded" region should also be defined with all the relevant reagents (this includes neat GBL and neat FAH).

I'll remind you of the usage demo from back on 31 Aug 2019... https://www.wolframcloud.com/env/jschrier0/Published/2019.08.31_new_region_sampler.nb

[FWIW, the updated code for sampling experiments containing >3 solute species also speeds this up by 10x as a "side effect"]

@ipendlet
Copy link
Member Author

If X=pbi2 conc, and y=amine, the new region explores the region of space in low y, under the original pbi2 solution line. If we think of the new region as being everything under the old region (in terms of y), than in order to access that new space we must include the old pbi2 solution and omit the amine solution -- that's it. We don't actually need any extra (difference) sampling tools, just select the stock solutions appropriately.

Explanation/demonstration of correctness: https://www.wolframcloud.com/obj/jschrier0/Published/2019.11.27_do_we_actually_need_difference_sampling.nb

ifference sampler: Consider the following case of 1,4-DAB. It would be useful to sample the entire space, less the convex hull of experiments that have already been done...
[The wrinkle, of course, is that it is not the difference defined by solution points, but by a more arbitrary geometric object]
No specific action suggested, but something to keep in mind.

@ipendlet
Copy link
Member Author

Screen Shot 2019-12-10 at 11 53 31 AM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bad_first_issue Seasoned Devs Only bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants