Replies: 2 comments 1 reply
-
I can see where this might be applicable -- say a staged, or cascaded, or convolutional set of transformations where the previous stage is fed into the next stage. That may not be exactly what you are trying to do, but the effect is the same: the overall complexity does not reduce unless the intermediate result has some plausible meaning.

I am working on a similar problem where the intermediate would be model1 (which can be checked for plausibility) and the final model2 can incorporate model1. This is usually decomposed as a data-flow model, where the model1 output is piped into the model2 stage. So what I do is run model1 in a mode where it minimizes the error against what the intermediate result is expected to be (a plausible result). Then, for model2, I start with the model1 results, which are further modified: symbolic transformations (mainly sin/cos trig functions) are applied to model1 until the error to the target is minimized. When that is done, I can see how much model1 deviated from the expected intermediate result. The important point is that the complexity of model1 doesn't matter as much as the complexity of model2, since model1 serves more as an emulator to get close to the intermediate result, which is then parsimoniously transformed to the target. It's almost as if you need a parametric representation of the intermediate results so that slight phase shifts and amplitude adjustments can be made.

This is one of those scenarios that happen all the time -- you may have an intermediate result, but you want it emulated because you don't know if it's the exact input the next stage is operating on. Or it comes up any time only a short calibration of the intermediate stage is available and you need it extrapolated over a much wider range, so you may want to backtrack in history or make future predictions; in medicine, it may be inverse tomography. I tried asking ChatGPT-4 if it could figure out any clever way to do this in a continuously iterative fashion using PySR.
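For concreteness, here is a minimal sketch of that data-flow arrangement, assuming PySR's scikit-learn-style `PySRRegressor`. The arrays and the intermediate target are placeholders, not my actual setup:

```python
import numpy as np
from pysr import PySRRegressor

# Placeholder data: raw inputs, a plausible intermediate target, and the final target.
X = np.random.randn(200, 5)
z_intermediate = np.random.randn(200)
y_final = np.random.randn(200)

# Stage 1: the "emulator", fit so its output stays close to the plausible intermediate.
model1 = PySRRegressor(niterations=40, binary_operators=["+", "-", "*", "/"])
model1.fit(X, z_intermediate)
z_hat = model1.predict(X)

# Stage 2: a parsimonious symbolic transformation (mainly trig) of the stage-1 output.
model2 = PySRRegressor(
    niterations=40,
    binary_operators=["+", "-", "*"],
    unary_operators=["sin", "cos"],
)
model2.fit(z_hat.reshape(-1, 1), y_final)

# Afterwards, check how far the emulator drifted from the expected intermediate result.
print("stage-1 deviation (MSE):", np.mean((z_hat - z_intermediate) ** 2))
```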
-
Hi @gm89uk,

Thanks very much for your post -- I am delighted to hear you are trying things out! Very exciting to hear about your use-case; I'm happy to help more.
You raise some interesting ideas! PySR does some of these: https://arxiv.org/abs/2305.01582. In particular, the algorithm actually re-introduces the current hall of fame back into the populations at a regular interval. The "migration" parameters control this behavior: https://astroautomata.com/PySR/api/#migration-between-populations. However, these migrations happen at a regular interval rather than based on some heuristic; I think that's an interesting idea. Other things to note that you might try tuning: https://astroautomata.com/PySR/api/#working-with-complexities
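As a rough sketch, these knobs live directly on `PySRRegressor`. The parameter names follow the linked API pages but may differ slightly between PySR versions, and the values below are illustrative rather than recommendations:

```python
from pysr import PySRRegressor

model = PySRRegressor(
    # Migration between populations:
    populations=30,               # number of separate populations
    migration=True,               # allow members to migrate between populations
    hof_migration=True,           # re-introduce hall-of-fame members into populations
    fraction_replaced=0.0005,     # fraction of a population replaced by migrants
    fraction_replaced_hof=0.035,  # fraction replaced by hall-of-fame members
    # Working with complexities:
    maxsize=30,                                    # maximum allowed equation complexity
    parsimony=0.001,                               # penalty per unit of complexity
    warmup_maxsize_by=0.5,                         # grow the allowed size over the first half of the run
    complexity_of_operators={"sin": 2, "cos": 2},  # make particular operators "cost" more
)
```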
The defaults are a reasonable starting point, and there are a few other parameters there you could try tuning as well. Hope this helps a bit. Please follow up with any other questions; happy to help! Best, Miles
-
Thank you Miles for sharing this amazing package.
I am an ophthalmologist who is very interested in using this to improve the accuracy of our outcomes in various types of refractive surgery. I am not a programmer and know just enough Python to get by and run this. So far I have had excellent results, surpassing those of optimised XGBoost models on test databases. As we are dealing with outcomes, accuracy is more important than simplicity (while keeping it generalisable).
I wanted to share some observations and a potential way to improve the PySR search strategy to get more rapid reductions in loss with simpler equations.
In my example I have eight features, so I increased maxsize to 60. After much experimentation, and after running various algorithms on different databases for a couple of days each (6-core Ryzen 7 laptop CPU), I found that there are often diminishing returns after a certain maxsize; the loss plotted against maxsize often appears to follow an exponential decay:
The loss drops very slowly beyond this point (obviously problem-specific, and I understand there is no convergence as such).
My strategy is to take the equation at maxsize / 2 (so in the above example, complexity 30) and use it to generate a new feature, x8.
I then generate a new y variable, y1 = y - x8, which represents the error left over from the equation at size 30. I run PySR again with maxsize = previous_maxsize / 2 (so 60 / 2 = 30), so the total maxsize remains the same as before. The new y variable is y1, with the same x features as the original run.
What happens is that there is a rapid reduction in loss at lower complexity, and new expressions are found quickly. I ran both the original code in IPython locally and the new PySR run on y1 in Google Colab simultaneously. Despite being significantly slower, Google Colab managed to find new expressions very quickly at lower complexity that improve the original equation. Inputting the error as y, rather than as an additional x feature, prevents PySR from using the previous expression as a variable, which would lead to horribly increased complexity and a lack of interpretability. This happens reliably every time I've tried it.
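For reference, here is a rough sketch of the two-run procedure. The dataset is a placeholder, and I'm relying on the `equations_` DataFrame and the `index` argument of `predict` as I understand them from the PySR docs, so treat the details as approximate:

```python
import numpy as np
from pysr import PySRRegressor

X = np.random.randn(500, 8)   # placeholder for the real 8-feature dataset
y = np.random.randn(500)      # placeholder for the real target

# Run 1: the full-size search.
model1 = PySRRegressor(maxsize=60, niterations=100)
model1.fit(X, y)

# Pick the hall-of-fame equation at (roughly) half the maxsize.
eqs = model1.equations_                    # Pareto front as a pandas DataFrame
row = eqs[eqs.complexity <= 30].iloc[-1]   # most complex expression with complexity <= 30
y_hat = model1.predict(X, index=row.name)  # predict with that specific equation

# Run 2: fit the leftover error with the remaining complexity budget.
y1 = y - y_hat
model2 = PySRRegressor(maxsize=30, niterations=100)
model2.fit(X, y1)
```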
I have attached some example screenshots:
This is my IPython run; it has been running for approximately 3 days.
You can see that at complexity = 30, the loss (MSE) is stuck at 1.160e-01 and hasn't budged in the last day, while at complexity = 60 it has very slowly reduced.
The loss units are identical and comparable between the first and second PySR runs, as the new y is the residual error of the first run.
Here is a new run in a Jupyter notebook using the process above.
At the start of model2, at complexity 1, the loss is basically the same as what we had with model1 at complexity maxsize / 2 (i.e. 30).
However, after running model2 for just 5 minutes, we have a rapid reduction in loss with low complexity:
You can see that in model2 at complexity = 4 (i.e. a total true complexity of 30 + 4 = 34), we now have a loss of 1.086e-01. If we compare this to complexity 34 in the first run, 1.125e-01, which had remained stagnant for over a day, we achieved a lower loss in a few minutes.
Therefore, my new equation would be model1 (complexity 30) + model2 (complexity 4), with a reduced loss and a new expression the model could explore.
I have manually calculated the losses of model1 and of model1 + model2 myself to confirm they are comparable.
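Continuing the variables from the sketch above, the manual check is just a few lines of numpy (note that `predict()` without an index uses model2's best-scoring equation; an `index` could be passed, as above, to pin it to complexity 4):

```python
pred2 = model2.predict(X)          # model2's correction term
combined = y_hat + pred2           # model1 (complexity 30) + model2
print("model1 alone    MSE:", np.mean((y - y_hat) ** 2))
print("model1 + model2 MSE:", np.mean((y - combined) ** 2))
```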
Unfortunately, I don't have a means of feeding the new equations found in model2 back to model1; I cannot add them to the hall of fame or let PySR know of the new expressions so that it can play with, mutate, and cross over them to improve the existing model1.
My suggestion would be to somehow add an option that permits PySR to perform this process iteratively above a certain threshold of 'stagnation', feed the new low-complexity terms that reduce the loss back into the original model's equations, and let the mutations and crossovers help from there; a new term of complexity 34 may help at complexity 60, for example. Model2 only needs to run for a few minutes to find new expressions and 'mix things up' for model1.
I thought of trying to do this myself in Python -- generating the new variables and setting up the new model automatically -- but if you know of a means to feed expressions back to model1 and resume, it would be much appreciated.
What would be ideal is to be able to add simple expressions from model2 to model1 (for example at maxsize / 2, complexity 30) and still permit the equation to be modified and improved further within the model1 run.
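In case it's useful, this is roughly what I had in mind as a do-it-yourself loop. It is only a sketch: each stage restarts an independent PySR run on the current residual, rather than feeding expressions back into model1's populations, which is the part I don't know how to do:

```python
import numpy as np
from pysr import PySRRegressor

def staged_fit(X, y, total_maxsize=60, stages=3, niterations=100):
    """Repeatedly fit the leftover error with a shrinking complexity budget.

    A stand-in for the feature request above: nothing is injected back into the
    first model's hall of fame; each stage is an independent run on the residual.
    """
    residual = y.copy()
    budget = total_maxsize // 2
    models = []
    for stage in range(stages):
        model = PySRRegressor(maxsize=budget, niterations=niterations)
        model.fit(X, residual)
        pred = model.predict(X)
        print(f"stage {stage}: residual MSE = {np.mean((residual - pred) ** 2):.4e}")
        residual = residual - pred
        models.append(model)
        budget = max(budget // 2, 5)   # give later stages a smaller budget
    return models
```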
I apologise for the long-winded explanation; I hope it made sense!
Here is my code for reference.