Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variables #700

Open
wants to merge 29 commits into
base: master
Choose a base branch
from
Open

Variables #700

wants to merge 29 commits into from

Conversation

lfarv
Copy link
Contributor

@lfarv lfarv commented Nov 15, 2023

This replaces #603 and #696. Upon request, #696 was split in 4 parts, here is the first one.

We introduce standard ways of parametrising a lattice and varying any quantity.

Variables

Variables are references to any scalar quantity. AT includes two predefined
variable classes referring to scalar attributes of lattice elements:

  • an ElementVariable is associated to an element object, and acts on all occurences of this object. But it will not affect any copy, neither shallow nor deep, of the original object,
  • a RefptsVariable is not associated to an element object, but to an element location in a Lattice. It acts on any copy of the initial lattice. A ring argument must be provided to the set and get methods to identify the lattice.

Variable referring to other quantities may be created by:

  • Deriving the Variable base class. Usually this consist in overloading the abstract methods _setfun and _getfun,
  • Using the CustomVariable class, accepting user-defined get and set functions.

The documentation for this branch is available here.

@lfarv lfarv added enhancement WIP work in progress Python For python AT code labels Nov 15, 2023
@lfarv
Copy link
Contributor Author

lfarv commented Nov 15, 2023

The new classes introduce name conflicts with existing Variable, ElementVariable classes. Keeping two slightly different objects is not desirable, so the idea is to smoothly replace the old classes with the new ones. The proposed strategy is:

  1. Starting with this PR, the access to the new classes and functions is:

    from at.future import ElementVariable, ...

    Nothing changes for the old classes, still available with:

    from at import ElementVariable, ...

    or imply using at.ElementVariable(...).

  2. At some point (version 1.0?), we will switch the default to the new scheme, and the old one will need something like:

    from at.deprecated import ElementVariable, ...

@lfarv
Copy link
Contributor Author

lfarv commented Nov 16, 2023

Still trying to answer @simoneliuzzo's questions, concerning parameters:

I think that this is the natural user expectation

p5 = Param(-0.3, name='QD strength')
qd1.PolynomB[1] = p5

This is a major point. I couldn't find any solution such that a numpy array could turn itself into a ParamArray. But since I agree with Simone's that it is really frustrating, I am considering another solution: each array attribute of an Element would be initially set as a ParamArray at element creation. The performance penalty should be very small, I'll prepare this modification.

@lfarv
Copy link
Contributor Author

lfarv commented Nov 18, 2023

Directly assigning a parameter to an array item is now allowed:

p5 = Param(-0.3, name='QD strength')
qd1.PolynomB[1] = p5

@lfarv
Copy link
Contributor Author

lfarv commented Nov 21, 2023

Documentation largely improved, please check again…

@T-Nicholls
Copy link
Contributor

T-Nicholls commented Nov 21, 2023

The performance penalty should be very small, I'll prepare this modification.

Hello @lfarv, I would encourage you to do your own timing tests; however, when I ran my standard linopt timing script, I found this branch to be roughly between 2 to 3.85 times slower than the master. To me that is much too large of a slowdown, especially when I was already already planning to try and make future speed improvements to linopt due to its current speed.

Full timing data for reference:

master branch:
Mean time per linopt call over 1000 iterations (get_chrom = False):
    No refpts:
        Uncoupled: 0.046473862171173
        Coupled:   0.046404954195022
    All refpts:
        Uncoupled: 0.060675595998764
        Coupled:   0.143170256614685
Mean time per linopt call over 1000 iterations (get_chrom = True):
    No refpts:
        Uncoupled: 0.069423148155212
        Coupled:   0.069539334297180
    All refpts:
        Uncoupled: 0.084518674612045
        Coupled:   0.167194108247756

variables_and_parameters branch:
Mean time per linopt call over 1000 iterations (get_chrom = False):
    No refpts:
        Uncoupled: 0.174603536367416
        Coupled:   0.176802119970321
    All refpts:
        Uncoupled: 0.208549186229705
        Coupled:   0.290254921436309
Mean time per linopt call over 1000 iterations (get_chrom = True):
    No refpts:
        Uncoupled: 0.267525310277938
        Coupled:   0.269076839447021
    All refpts:
        Uncoupled: 0.301143751859664
        Coupled:   0.383905014276504

@lfarv
Copy link
Contributor Author

lfarv commented Nov 22, 2023

@T-Nicholls : you raise a very important point. I tried to go more into the details of timings, down to the root of the problem which is the modification of the access to the attributes of elements.

There are 3 different test cases:

  1. the master branch
  2. the variables_observables branch which was the topic of the defunct Variables, parameters and observables #696 PR, and had the inconvenience pointed by @simoneliuzzo that parametrising a single item of an array attribute needed a special construct, set_parameter, instead of simple assignment,
  3. the present variables_and_parameters branch which removes this problem by replacing all numpy arrays by ParamArrays, whether they contain parameters or not.

All the tests are run on a lattice without parameters. I display all the results in terms of ratio compared to the master branch.

linopt6

I first tried to reproduce @T-Nicholls's results by measuring linopt6 on a test lattice:

master variables_observables variables_and_parameters
refpts=None, get_chrom=False 1 2.51 3.01
refpts=None, get_chrom=True 1 2.37 3.06
refpts=All, get_chrom=False 1 1.54 1.71
refpts=All, get_chrom=True 1 1.88 2.14

Ratio between 1.54 and 3.06, similar to @T-Nicholls's results. The last modification (array attributes) is indeed worse, but not so much. Remark: refpts=All is better because it's just additional python computations, without accessing lattice elements.

Tracking

The I tried a pure tracking: 10000 turns with a non-zero test particle:

master variables_observables variables_and_parameters
tracking 10000 turns 1 0.97 1.01

As expected, no effect on tracking because the attribute values are initially stored in C structures, so the overhead with 10000 turns is negligible.

Attribute access

And finally I just measured a simple access to an attribute, scalar or array:

master variables_observables variables_and_parameters
a = qp.Length 1 19.5 18.35
a = qp.PolynomB[1] 1 7.84 52.9

On the 1st version, array access is better probably because indexing PolynomB itself takes time both in master and variables_observables. On the 2nd step, array access is very much degraded, that's expected.

Conclusion

  • concerning the last modification: as expected, it increases a lot the access time to array attributes. But on the average (linopt6), it's a 20% increase with respect to the previous version, we can discuss whether the comfort of assigning directly an item of an array rather that using a set_parameter construct is worth this penalty
  • concerning the basic problem of a parametrising a lattice, the problem is in these 6 lines:
      def __getattribute__(self, key):
          attr = super(Element, self).__getattribute__(key)
          if isinstance(attr, (ParamBase, ParamArray)):
              return attr.value
          else:
              return attr
    To have a fully transparent access to parameters, both in python and C, we are adding for each access (parameter or not) a test (that should be fast), but a insistance call. Is there any better way to do that?

@simoneliuzzo
Copy link
Contributor

Dear @lfarv @T-Nicholls @swhite2401,

To get more user friendly matching we lose generally a factor 1.5-3 in speed (for an already slow matching).
This makes it globally less user friendly than before.

May be when GPU tracking is available we can think about it again, but as it is I would not want to use it.

Could we add the timing tests to the global tests of AT? if slower than before --> give a warning/error?

best regards
Simone

@lfarv
Copy link
Contributor Author

lfarv commented Nov 23, 2023

@simoneliuzzo: I'm not sure I understand you comment:
The slowness problem comes from the introduction of Parameters, whether you use them or not. Variables are not concerned.

I understand that you propose to postpone the introduction of Parameters (until we find a better way to implement them, if is is possible), and keep only Variables? That's a good solution to go on without affecting the performance! If there is a global agreement for this, I can prepare it.

Adding timing tests in the test sequence ? Good idea. How could it be done?

  • comparing the times with absolute reference values? Impossible: we don't control the "runner" machines executing the tests. Timings may vary a lot for the same code without notice,
  • comparing 2 branches? That's what we do manually. GitHub tests run on the branch where the commits are pushed. I don't know if we can have access to other branches in the same test job. Anyone to look at that?

@lfarv
Copy link
Contributor Author

lfarv commented Nov 23, 2023

For info: apparently the problem is for a large part in python. The simple fact of adding a __getattribute__ magic function doing nothing:

def __getattribute__(self, key):
   return super(Element, self).__getattribute__(key)

slows down significantly the access to the attributes:

master with getattribute
a = qp.Length 1 8.8
a = qp.PolynomB[1] 1 3.85

Tested with python 3.9 on macOS.
With that, there is little hope that we can find a "transparent" solution for introducing parameters…

@T-Nicholls
Copy link
Contributor

@lfarv, apologies for the delay in getting back to you, I was off most of last week.

Re: timing tests, I don't see why it wouldn't be possible to create a new GitHub workflow to run tests on the current branch and master, comparing how long they take. I can add this to my todo list if no-one else is keen to do it, but it'd be low priority, i.e., it'd be several months before I got around to it.

As for our slowdown, it is to be expected that overriding a default Python method with a new custom method would be slower, after all, most of the default operations are optimised "under the hood"; but I must admit that I was surprised by the magnitude of the slowdown though. As you note, this fact combined with the sheer number of getattr calls by linopt makes it inevitable that making anything that linopt needs to access a Parameter will cause a significant slowdown.

What are the primary use cases for Parameters, are they intended to be used mainly when optimising a lattice or elsewhere too? If they are only for optimisation, then I could see us having separate Parameterised element classes as a solution.

@swhite2401
Copy link
Contributor

Dear all, this slow-down is clearly not acceptable and this feature cannot be merged as is.
I would then suggest to separate the Parameters from Variables so we can proceed with the review of the valid part and move to Observable, matching, etc...

Concerning Parameters specifically, I think it is necessary to evaluate it in a separate branch. To honest I am also surprised by this huge slow-down but it seems we cannot do much about it with the present implementation.

My feeling is that a workaround can be found using __getattr()__ magic function instead of __get_attribute()__. __getattr()__ is a fallback solution in case the normal method do not work so in principle we could implement something that would be called only when needed, for example:

1- When parametrizing and attribute, using setattr() or other methods change its name to AttrName_Parametrized (AttrName is copied to AttrName_Parametrized and deleted)
2- In this case normal attribute access to AttrName will fail and code will fall back to __getattr()__
3- implement __getattr()__ such that AttrName_Parametrized is also searched for
4- if AttrName_Parametrized is not found raise AttributeError

In this case for standard attributes the efficient python attribute access will be used and only in the case where a parametrized attribute is declared the slow access will be used. This should considerably speed-up methods like linopt and not affect lattices without parametrized attributes (99% of the usage)

What do you think?

@lfarv
Copy link
Contributor Author

lfarv commented Nov 28, 2023

Hello all,

To honest I am also surprised by this huge slow-down

Yes, and it suggests that there may be some potential improvement in linopt6. @T-Nicholls already mentioned that.

For now:

  1. As @swhite2401 suggest, I'll take the Parameters out of this PR. It will allow to go on. It will take some time to do that, but at least the Variable base class is prepared to accept Parameters if/when they are better implemented.
  2. The idea of using __getattr__ should be tried. It will further slow down access to parameters, but on the other hand it will completely remove the penalty for "normal" attributes. In the end, it should be beneficial. For info, I tried replacing super().__getattribute(key) with object.__getattribute__(self, key). This eliminates the search across all the base classes:
    def __getattribute__(self, key):
       return object.__getattribute__(self, key)
    The single attribute access is more than twice faster ! But globally on linopt6, it's still too penalising.

@T-Nicholls
Copy link
Contributor

@swhite2401's __getattr__ idea is clever and should solve the core issue if carefully implemented; we need to be very careful when creating/deleting Parameterised attributes (Parameterised and normal), that both types never exist at the same time on the same element for the same root name, e.g, elem.PolynomB and elem.PolynomB_Parameterised. Hopefully, @lfarv's nice __getattribute__ call method speed improvement can help offset the slowdown for those using Parameters.

@lfarv lfarv changed the title Variables and parameters Variables Nov 28, 2023
@lfarv
Copy link
Contributor Author

lfarv commented Nov 28, 2023

This PR now concerns only Variables, and Parameters have been removed. The title and description have been updated.

The documentation is here.

I recall that to use the new classes, one must specify:

from at.future import ElementVariable, RefptsVariable, CustomVariable, VariableList

to avoid conflicts with existing classes.

@swhite2401
Copy link
Contributor

Hello @lfarv thanks for this! I have been quite busy with other stuff but this looks very nice at first sight and the documentation seems also very good.
I have a small comment to start with, it concerns the class naming. Shouldn't we set Variable as BaseVariable and CustomVariable as Variable. No strong opinion just a question of taste for me.

No I have to find time for testing...

Other question/comment: did you set a branch for the Parameters? Is there some implementation we could use to test ideas to solve this slow down issue?

@lfarv
Copy link
Contributor Author

lfarv commented Dec 3, 2023

did you set a branch for the Parameters?

The parameters are still available in the base_parameters branch. The corresponding documentation is here.

@swhite2401
Copy link
Contributor

I have tested these features and overall it works very well.

Here a first round of comments:

  • It would nice to have a method Element/RefptsVariables.refpts_uint32 and Element/RefptsVariables.refpts_bool that would return the refpts/mask of the elements concerned.
  • How deep is the history buffer, it seems to be that it can grow infinitely -> this could be an issue for large match jobs. Can we instead set an attribut initial_value and a history_buffer with a given length. This would seem safer to me
  • it would be useful to verify that the bounds are valid at initialization: that lb < value < hb is fulfilled, for now it seems to be done only at the first call of set which is ambiguous
  • the example for custom variables requires that the ring is defined, this is certainly not a requirement and should not appear as such
  • passing kwargs to the set/get function should be allowed

I might have more with further testing

@swhite2401
Copy link
Contributor

More comments:

  • I am not really keen on the syntax ElementVariable(ring[refpts], ....). I would very much prefer ElementVariable(ring, refpts, ....)
  • I am not convinced at all by the relevance of ElementVariable. From personal experience I don't see much practical applications where I would favor it w.r.t to RefptsVariable. ElementVariable may even be considered unsafe providing so many functions in AT can copy elements
  • Didn't we agree that these developments would go in a separate sub-package latticetools?
  • set_initial would be better called reset()

@lfarv
Copy link
Contributor Author

lfarv commented May 23, 2024

@swhite2401:

Being able to save these variable with the lattice is essential for the lattice designer, why not make the development now?

Ok, I can start looking at that as soon as this PR is merged, which will unlock other branches: maintaining several active branches starts being difficult since elements.py and lattice_objects.py are involved in several places…

@swhite2401
Copy link
Contributor

Ok then we can merge, this is anyway in the "future" imports so we can make some modification even after the merge right?

@lfarv
Copy link
Contributor Author

lfarv commented May 23, 2024

this is anyway in the "future" imports so we can make some modification even after the merge right?

You are right. There may modifications resulting from reviewing the upcoming "new matching" PR which is closely linked to the "Variables" and "Observables" branches.

Can you have a look at #738, while I'm preparing the "matching" one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Python For python AT code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants