Allow passing of user data for callback procedure #30

awvwgk · 2022-02-05T15:48:14Z

Proposal to allow passing user data through minpack as unlimited polymorphic object. This approach has the advantage of mostly keeping the API unchanged, however users have to adjust their objective functions and add another argument.

~~This or an alternative solution is required to create a proper export of the C bindings.~~ Opening this for discussion.

Previously discussed in

Pass data as pointer #4

jacobwilliams · 2022-02-05T16:18:50Z

I'm not wild about making the user declare the udata argument in all their functions, even if they aren't going to use it. It's similar to how old codes had rwork, iwork array arguments that you also had to declare even if you weren't going to use them.

I haven't look at the c callback issues. What are the alternatives?

awvwgk · 2022-02-05T16:51:35Z

At least for the C API it makes sense to have a data pointer, since there is no straight-forward alternative available. The issue is that we can't do the trick with via an internal procedure to avoid global variables there, however I think I found a solution to archive the required behavior in C without affecting the Fortran side following your proposal in #4.

certik · 2022-02-06T05:05:06Z

The cleanest is still probably the internal procedure. Why cannot we use that? The lowest level code just uses a callback. Then in the upper level API we simply create an internal subroutine, that accesses the "udata", as passed through in the parent procedure. Then in the C level API we simply expose udata as a pointer. That should work, without polluting the low level API with udata.

awvwgk · 2022-02-06T09:32:41Z

The cleanest is still probably the internal procedure. Why cannot we use that?

The only practical issue might be compiler support, see #14 (comment). But maybe we shouldn't worry about this now and just use internal procedures.

Nicholaswogan · 2022-02-06T16:39:33Z

I really support this pull request because it helps scientific programers write better code. The nested function approach works just fine, but it isn't the obvious solution that scientific programers will make. The obvious solution is to pass data via a global module variable, or even a common block. These solutions are obvious to scientific programers because legacy code is littered with these sub-optimal solutions, and scientific programers learn fortran by struggling with their advisors legacy code (speaking from experience here!). Of course, passing data via globals like this is not ideal for a bunch of reasons.

Alternatively, if fortran-lang/minpack has root-solving functions that look like the following, as @awvwgk suggests,

subroutine hybrd1(fcn, n, x, Fvec, Tol, Info, Wa, Lwa, udata)
! ...
        class(*), intent(inout), optional :: udata !! user data
! ...
end subroutine hybrd1

then it's extremely clear to the novice scientific programer that udata is how they should pass data to their objective function. I would almost prefer the variable was named user_data instead of udata to make it even more obvious. The novice might not know how to use an unlimited polymorphic object, but this is quickly remedied when they google it and land on fortran-lang documentation.

This is the strength of Fortran over many other languages. The right way to do something is often obvious. We should play to this strength!

jacobwilliams · 2022-02-06T17:32:07Z

I think one of the goals here is to show scientific programmers how they should be doing things, not let them keep using the old bad patterns. :)

Rarely will it be convenient anyway to have to stuff everything in some unlimited polymorphic entity (which would require the dreaded and annoying select type construct). Using contained routines or object oriented classes (see #31) are better, and don't have the effect of cluttering up the low level api.

C is different...since C is a low-level language. I think it's OK to have something like that in the C API, but there should be a way to avoid having it clutter up the fortran code so much?

ivan-pi · 2022-02-06T18:17:50Z

Rarely will it be convenient anyway to have to stuff everything in some unlimited polymorphic entity (which would require the dreaded and annoying select type construct). Using contained routines or object oriented classes (see #31) are better, and don't have the effect of cluttering up the low level api.

This might be just a matter of taste, but having to write an internal procedure to match an interface, or having to wrap the callback for each solver class feels just as tedious to me. The clutter has simply shifted from the data dummy variable to the specialization needed for the solver in question.

I'll take the SLSQP optimizer as an example, the callback needed is:

    subroutine rosenbrock_func(me,x,f,c)
        class(slsqp_solver),intent(inout) :: me
        real(wp),dimension(:),intent(in)  :: x      !! optimization variable vector
        real(wp),intent(out)              :: f      !! value of the objective function
        real(wp),dimension(:),intent(out) :: c      !! the constraint vector `dimension(m)`,
                                                    !! equality constraints (if any) first.

        f = 100.0_wp*(x(2) - x(1)**2)**2 + (1.0_wp - x(1))**2  !objective function
        c(1) = 1.0_wp - x(1)**2 - x(2)**2  !equality constraint (>=0)
    end subroutine rosenbrock_func

I cannot reuse this callback with other optimization libraries unless 1) the solver instance is part of an abstract solver hierarchy, or 2) I put the core of the callback into a second low-level procedure (or god forbid, include it using the preprocessor). Moreover, if I may borrow some words

I'm not wild about making the user declare the me argument in all their functions, even if they aren't going to use it.

However, if the callback was changed to:

abstract interface
  subroutine calcfc(x,f,c,data)
    real(wp), intent(in) :: x(:)   ! optimization variables
    real(wp), intent(out) :: f     ! value of the objective function
    real(wp), intent(out) :: c(:)  ! the constraint vector
    class(*), intent(in), optional :: data
  end subroutine
end interface

both the callback function, and any necessary data can be reused throughout any solver library which matches the abstract interface. Caveat, we need a select type if we actually want to access the data.

In both cases the callback takes four arguments. In the first case, the procedure is constrained to a single solver. In the second case it is not. Arguably, it's easier to remember how to use a class(*) variable, than it is to write a callback for each solver class.

certik · 2022-02-06T22:04:30Z

And if the callback is just:

abstract interface
  subroutine calcfc(x,f,c,data)
    real(wp), intent(in) :: x(:)   ! optimization variables
    real(wp), intent(out) :: f     ! value of the objective function
    real(wp), intent(out) :: c(:)  ! the constraint vector
  end subroutine
end interface

Then it seems it's even easier to reuse...

SciPy's minimize has an args argument:

args : tuple, optional

Extra arguments passed to the objective function and its derivatives (Jacobian, Hessian).

But I think it's easier to handle in Python than in Fortran from the user's perspective.

awvwgk · 2022-02-06T22:34:02Z

SciPy's minimize has an args argument:

args : tuple, optional
Extra arguments passed to the objective function and its derivatives (Jacobian, Hessian).

But I think it's easier to handle in Python than in Fortran from the user's perspective.

This is probably true, unless we want to wrap directly from Fortran to Python (via f2py or similar). Preferably we can define our Python intercompatibility via a C API, which will be beneficial for more than one language in the long run.

Fortunately, I don't think the design of the Fortran callback is actually relevant for building the C API. For any C intercompatibility we have to involve at least two layers of callback unless the callback interface is C intercompatible, so we have to have some glue in between, which will account for any design choice of the Fortran callback. Therefore, I don't see this as a blocker for defining our C API.

certik · 2022-02-06T22:35:33Z

For the C API, it seems a void* is the most natural to pass as an argument for the user defined data?

awvwgk · 2022-02-06T22:44:38Z

For the C API, it seems a void* is the most natural to pass as an argument for the user defined data?

Yes, the callback for the C API proposal in #29 defines it as

minpack/include/minpack.h

Lines 15 to 20 in cd7bb38

    
           typedef void (*minpack_func)( 
        
               int /* n */, 
        
               const double* /* x */, 
        
               double* /* fvec */, 
        
               int* /* iflag */, 
        
               void* /* udata */);

Depends a bit on taste whether the void* should be the first or last argument, I followed NLopt's convention to put it last. It should provide all languages interfacing via C to minpack with at least one way to pass custom data.

certik · 2022-02-06T23:26:06Z

Depends a bit on taste whether the void* should be the first or last argument

I think last is better from a performance perspective: the first arguments get passed in registers, and the user data can be empty, or not always used, so it would be a waste to pass the nullptr in a register. On the other hand, if the user data is used (say it contains a pointer to some kind of an array), then I currently don't know which way is faster (possibly it depends on the problem).

ivan-pi · 2022-02-08T10:59:47Z

I've seen some C codes use an attribute like

void*           cookie __attribute__ ((unused))

to mark an unused data argument in the callback.

codecov · 2022-04-23T20:39:48Z

Codecov Report

Merging #30 (18a7f67) into main (281e9b9) will decrease coverage by 3.97%.
The diff coverage is 60.31%.

@@            Coverage Diff             @@
##             main      #30      +/-   ##
==========================================
- Coverage   88.77%   84.80%   -3.98%     
==========================================
  Files           2        3       +1     
  Lines        1221     1336     +115     
  Branches      456      468      +12     
==========================================
+ Hits         1084     1133      +49     
- Misses         40       94      +54     
- Partials       97      109      +12

Impacted Files	Coverage Δ
src/minpack_legacy.f90	`46.42% <46.42%> (ø)`
src/minpack_capi.f90	`76.34% <66.66%> (-22.05%)`	⬇️
src/minpack.f90	`88.26% <77.77%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 281e9b9...18a7f67. Read the comment docs.

awvwgk · 2022-04-23T20:44:35Z

Updated this branch to show how we can retain the old interface, add support for user data and remove the nested function usage from the C export. It also raises the bar regarding Fortran features a compiler has to support, which can be a disadvantage.

certik · 2022-04-23T21:54:26Z

It also raises the bar regarding Fortran features a compiler has to support

It looks like it "only" requires a support for class(*) procedure arguments.

awvwgk · 2022-04-23T21:59:41Z

~~One notable required feature which breaks compiler support for Intel LLVM is the usage of procedure pointers.~~ Looks like procedure pointers are now implemented with ifx 2022.0.0, so we are good.

awvwgk requested a review from a team February 5, 2022 15:48

awvwgk mentioned this pull request Feb 6, 2022

Callback function design grimme-lab/nlopt-f#16

Open

awvwgk mentioned this pull request Mar 5, 2022

Provide Python bindings for Minpack #49

Merged

11 tasks

awvwgk force-pushed the user-data branch from 62e156c to bd15ecb Compare March 5, 2022 15:24

awvwgk mentioned this pull request Mar 12, 2022

C bindings broken on MacOS/Arm64 with GFortran 11 #54

Closed

awvwgk force-pushed the user-data branch from bd15ecb to 5b00f5f Compare April 23, 2022 20:38

Allow passing of user data for callback procedure

18a7f67

awvwgk force-pushed the user-data branch from 5b00f5f to 18a7f67 Compare April 23, 2022 21:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow passing of user data for callback procedure #30

Allow passing of user data for callback procedure #30

awvwgk commented Feb 5, 2022 •

edited

jacobwilliams commented Feb 5, 2022

awvwgk commented Feb 5, 2022

certik commented Feb 6, 2022

awvwgk commented Feb 6, 2022

Nicholaswogan commented Feb 6, 2022

jacobwilliams commented Feb 6, 2022 •

edited

ivan-pi commented Feb 6, 2022 •

edited

certik commented Feb 6, 2022 •

edited

awvwgk commented Feb 6, 2022

certik commented Feb 6, 2022

awvwgk commented Feb 6, 2022 •

edited

certik commented Feb 6, 2022 •

edited

ivan-pi commented Feb 8, 2022

codecov bot commented Apr 23, 2022 •

edited

awvwgk commented Apr 23, 2022

certik commented Apr 23, 2022

awvwgk commented Apr 23, 2022 •

edited

Allow passing of user data for callback procedure #30

Are you sure you want to change the base?

Allow passing of user data for callback procedure #30

Conversation

awvwgk commented Feb 5, 2022 • edited

jacobwilliams commented Feb 5, 2022

awvwgk commented Feb 5, 2022

certik commented Feb 6, 2022

awvwgk commented Feb 6, 2022

Nicholaswogan commented Feb 6, 2022

jacobwilliams commented Feb 6, 2022 • edited

ivan-pi commented Feb 6, 2022 • edited

certik commented Feb 6, 2022 • edited

awvwgk commented Feb 6, 2022

certik commented Feb 6, 2022

awvwgk commented Feb 6, 2022 • edited

certik commented Feb 6, 2022 • edited

ivan-pi commented Feb 8, 2022

codecov bot commented Apr 23, 2022 • edited

Codecov Report

awvwgk commented Apr 23, 2022

certik commented Apr 23, 2022

awvwgk commented Apr 23, 2022 • edited

awvwgk commented Feb 5, 2022 •

edited

jacobwilliams commented Feb 6, 2022 •

edited

ivan-pi commented Feb 6, 2022 •

edited

certik commented Feb 6, 2022 •

edited

awvwgk commented Feb 6, 2022 •

edited

certik commented Feb 6, 2022 •

edited

codecov bot commented Apr 23, 2022 •

edited

awvwgk commented Apr 23, 2022 •

edited