
Better handling of huge tableau #18

Open
szaghi opened this issue Jan 15, 2017 · 9 comments

@szaghi
Member

szaghi commented Jan 15, 2017

Currently, all tableaux of coefficients are hard-coded in the sources. This has at least two cons:

  1. it is really error-prone, with very bad readability due to the 132-character line limit;
  2. it is not flexible: adding or modifying a tableau requires touching the sources.

I think it is much better to read the tableaux from a separate file at run-time. I would like to encode them in JSON by means of json-fortran. However, the big question is:

where should the default tableau files be placed?

@zbeekman (@rouson @cmacmackin @jacobwilliams and all having system knowledge): maybe I have already asked your opinion about this, but I do not remember the answer.

Do you know if there is some standard (or almost standard) place where Unix-like libraries search for their auxiliary files read at run-time?

In case a user does not want to perform a full installation, but wants to use the sources directly (as often happens in our Fortran ecosystem), where should we search for such files? Maybe we need an include directory in the project root...
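A minimal sketch of such a run-time lookup; the environment variable name and the fallback paths below are assumptions for illustration, not an established convention of the library:

```fortran
! hypothetical lookup order for the run-time tableau files; the
! environment variable name and the fallback paths are assumptions
subroutine find_tableau_dir(dir)
  character(len=:), allocatable, intent(out) :: dir
  character(len=256)                         :: env
  integer                                    :: length, status
  logical                                    :: found

  call get_environment_variable('WENOOF_TABLEAU_DIR', value=env, length=length, status=status)
  if (status == 0 .and. length > 0) then
    dir = trim(env)                     ! 1. explicit user override
    return
  endif
  inquire(file='/usr/local/share/wenoof/tableau.json', exist=found)
  if (found) then
    dir = '/usr/local/share/wenoof'     ! 2. system-wide installation
  else
    dir = './include'                   ! 3. in-tree sources, no installation
  endif
endsubroutine find_tableau_dir
```

The override-then-fallback order lets a sources-only user point the library at the in-tree files without installing anything.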

Note

Some coefficients are well-defined integer fractions; it could be very useful to add FortranParser as a third-party library for parsing tableaux with such coefficient definitions.
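For illustration, the JSON encoding mentioned above might look something like the following; the table name, shape, and values are placeholders, not the real coefficients (the well-defined integer fractions could instead be stored as strings such as "1/4" and evaluated with FortranParser):

```json
{
  "beta_S2": [[ 0.25, -0.5],
              [-0.5,   1.0]]
}
```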

@cmacmackin

cmacmackin commented Jan 15, 2017 via email

@zbeekman
Member

I agree with @cmacmackin about where one would place such tables. However, for performance reasons, I would recommend exploring using the tables to generate, at compile or configure/cmake time, source files that have the coefficients hard-coded as parameters (compile-time constants), which you then either #include or put in a module of compile-time constant coefficients, unless you can verify that reading them in has no adverse performance impact.

(I know that I am violating the "premature optimization is the root of all evil" principle; however, it is VERY likely that your flux stencil coefficients and/or your smoothness coefficients are going to be in the innermost kernel, and fairly computationally expensive... so I would recommend doing some experiments to check whether reading these coefficients from a file, rather than making them compile-time constants, has an adverse performance impact.)

@szaghi
Member Author

szaghi commented Jan 15, 2017

@cmacmackin

Chris, thank you very much: the right dirs are what I was searching for.

@zbeekman

Zaak, I do not understand the performance issue: the coefficients should be loaded only once, during the integrator creation, not during the actual usage of the integrators. Moreover, I do not understand how to generate the tables without hard-coding them in some way, whether in Fortran, configure/make, Python, etc. To me, having JSON tables is very handy. Can you elaborate a bit more?

Thank you very much guys!

Cheers

@zbeekman
Member

zbeekman commented Jan 15, 2017

Zaak, I do not understand the performance issue: the coefficients should be loaded only once, during the integrator creation, not during the actual usage of the integrators.

Yes, it is read from disk once, but then it is placed in a variable which is subject to the tyranny of the memory hierarchy (moved around between RAM, L3, L2, L1 and registers). The CPU may not have any guarantees that the value hasn't changed, so it may end up fetching it from further away than it needs to. Compile time constants can be embedded in the instructions themselves, if I understand correctly---which I may not, I am a poor Fortran guy---which means that they may not take up registers that need to be used by other data, and may be fetched along with the instruction. As I said, my understanding here is pretty limited, but I do know that I've heard people who know more about the hardware layer than I do, discussing the merits of compile time constants.

Moreover, I do not understand how to generate the tables without hard-coding them in some way, whether in Fortran, configure/make, Python, etc. To me, having JSON tables is very handy. Can you elaborate a bit more?

Yes my idea is simple: use generated source code. If you don't wish to write coefficients in hardcoded tables (either because you have a formula that can generate them, or due to readability issues, etc.) then you have another program write the Fortran source code for you before compiling the main library/program. You could put the tables of coefficients into a JSON file and then you could have a python, or Fortran, or some other program that reads the JSON file and writes a Fortran module that has the same tables but as compile time constants like:

module coefficients
  implicit none
  real, parameter :: ISk(4,4) = reshape( [3.0/12.0, 4.5/12.0 ... ! don't remember what the dimensions should be or the coefficients and am too lazy to look it up right now
...
end module

The timings of this implementation could be compared to the version of the code that directly reads the coefficients from the JSON file into memory, without the effort/complication of creating generated sources.

CMake has capabilities to handle generated sources. It would be a bit more complicated, perhaps, to roll your own, but you can do it with a makefile or another means.
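The generation step described above could be sketched in Python; the JSON schema (one nested list of rows per table), the module name, and the `_R_P` kind suffix are assumptions for illustration, not the library's actual conventions:

```python
import json

def json_to_fortran_module(json_text, module_name="wenoof_coefficients"):
    """Read tables of coefficients from JSON text and emit a Fortran
    module that declares them as compile-time parameters."""
    tables = json.loads(json_text)  # {"name": [[row0...], [row1...]], ...}
    lines = [f"module {module_name}", "  implicit none"]
    for name, rows in tables.items():
        nrows, ncols = len(rows), len(rows[0])
        # reshape fills column-major, so flatten the table column by column
        flat = ", ".join(f"{v}_R_P" for col in zip(*rows) for v in col)
        lines.append(f"  real(R_P), parameter :: {name}({nrows},{ncols}) = "
                     f"reshape([{flat}], [{nrows},{ncols}])")
    lines.append(f"end module {module_name}")
    return "\n".join(lines)
```

A CMake `add_custom_command` (or a makefile rule) can run such a script before compilation and list the emitted file among the generated sources.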

I hope I have been clearer.

@giacrossi
Collaborator

giacrossi commented Jan 15, 2017 via email

@zbeekman
Member

Dear @zbeekman, thank you for your idea: I agree with you that using
parameters could be better for performance reasons...

You won't know for sure until you can compare the techniques... but I just thought it was worth mentioning since it is likely that the Smoothness computation is an expensive, inner-most kernel.

@szaghi
Member Author

szaghi commented Jan 16, 2017

@zbeekman

Zaak, thank you for your insight.

Yes, it is read from disk once, but then it is placed in a variable which is subject to the tyranny of the memory hierarchy (moved around between RAM, L3, L2, L1 and registers). The CPU may not have any guarantees that the value hasn't changed, so it may end up fetching it from further away than it needs to. Compile time constants can be embedded in the instructions themselves, if I understand correctly---which I may not, I am a poor Fortran guy---which means that they may not take up registers that need to be used by other data, and may be fetched along with the instruction. As I said, my understanding here is pretty limited, but I do know that I've heard people who know more about the hardware layer than I do, discussing the merits of compile time constants.

Oh, sorry, I did not realize that you were referring to parameters, my bad. Sure, parameters are always handled better (I hope) than other memory, but in this specific case I did not consider them for some practical issues (see below).

Yes my idea is simple: use generated source code. If you don't wish to write coefficients in hardcoded tables (either because you have a formula that can generate them, or due to readability issues, etc.) then you have another program write the Fortran source code for you before compiling the main library/program. You could put the tables of coefficients into a JSON file and then you could have a python, or Fortran, or some other program that reads the JSON file and writes a Fortran module that has the same tables but as compile time constants like:

module coefficients
  implicit none
  real, parameter :: ISk(4,4) = reshape( [3.0/12.0, 4.5/12.0 ... ! don't remember what the dimensions should be or the coefficients and am too lazy to look it up right now
...
end module

OK, this is an option, but it has its own cons.

Currently, we have 8 different sets of polynomial coefficients and linear (optimal) coefficients and 3 different WENO variants (JS, JS-Z, JS-M), resulting in 24 different integrators: from its very birth, it was clear to me that, to preserve easy maintenance/improvement and allow many different schemes, I need a flexible OOP pattern. The strategy pattern is very attractive in this scenario, and allocatable variables are ubiquitous here. This was the main reason why I never considered parameters. I have full trust in your experience, thus if you think it is worth trying, I do too.

Your coefficient modules would become something like:

module wenoof_coefficients
  implicit none
  private
  public :: beta_S2, beta_S3, ..., beta_S8
  public :: gamma_S2, gamma_S3, ..., gamma_S8

  real(R_P), parameter, target :: beta_S2(...:...,...:...) = reshape([....], ...)
  real(R_P), parameter, target :: beta_S3(...:...,...:...) = reshape([....], ...)
  ...
  real(R_P), parameter, target :: gamma_S8(...:...,...:...) = reshape([....], ...)
endmodule wenoof_coefficients

I used the target attribute for the following reason: when a user instantiates an interpolator, (s)he must select the accuracy, namely the stencils number/dimension. Thus, when performing the interpolation, there are essentially 2 possible logics:

  1. for each interpolate call, check the stencil number S (by means of an if-elseif or select case construct) and then access the right beta_S# and gamma_S#;
  2. avoid the check by building the interpolator with the proper set of coefficients inside it.

Currently, we adopt the second approach: by a strategy pattern, the interpolator is constructed with the proper set of coefficients, which are stored into allocatable members of the interpolator.
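A minimal sketch of this second, allocatable-member approach; the type, component, and binding names are illustrative, mirroring the sketches in this thread:

```fortran
! the constructor copies the proper coefficient set into allocatable
! members, so interpolate needs no S-check at each call
type :: interpolator
  real(R_P), allocatable :: beta(:,:)
  real(R_P), allocatable :: gamma(:,:)
  contains
    procedure :: create      ! allocate and fill beta/gamma for the chosen S
    procedure :: interpolate ! use the local beta/gamma directly
endtype interpolator
```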

Now, if we want to have parameter coefficients while avoiding the S-check for each interpolate call, we have a few options:

  1. provide a set of concrete interpolators with hard-coded references to the proper parameter-coefficient sets;
  2. make the generic interpolator coefficients pointers to the proper parameter-coefficient set.

Namely:

! concrete approach

type :: interpolator_S2
  contains
    procedure :: interpolate ! here beta_S2 and gamma_S2 are directly accessed
endtype interpolator_S2

type :: interpolator_S3
  contains
    procedure :: interpolate ! here beta_S3 and gamma_S3 are directly accessed
endtype interpolator_S3

! and so on...

! pointer approach

type :: interpolator
  real(R_P), pointer :: beta(:,:)
  real(R_P), pointer :: gamma(:,:)
  contains
    procedure :: init        ! here beta and gamma are associated with the correct beta_S#, gamma_S#
    procedure :: interpolate ! here the local beta, gamma members are accessed
endtype interpolator

It is possible to associate a pointer with a parameter, right? If so, is the memory handling still good?
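As an aside on this question: standard Fortran does not allow the TARGET attribute on a named constant (PARAMETER and TARGET are mutually exclusive attributes), so a pointer cannot be associated with a true parameter. A hedged alternative, with placeholder values and an assumed R_P kind, is a module variable with TARGET that is filled once at start-up:

```fortran
module wenoof_coefficients
  implicit none
  private
  public :: init_coefficients, beta_S2
  integer, parameter :: R_P = selected_real_kind(15) ! assumed real kind
  real(R_P), target, allocatable :: beta_S2(:,:)
contains
  subroutine init_coefficients()
    ! fill once at start-up (placeholder values), from hard-coded data or JSON
    if (.not.allocated(beta_S2)) beta_S2 = reshape([1._R_P, 0._R_P, &
                                                    0._R_P, 1._R_P], [2, 2])
  endsubroutine init_coefficients
endmodule wenoof_coefficients
```

Note that pointers into an allocatable target are invalidated if it is ever re-allocated, so the table should be filled exactly once.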

In the end, I am really in doubt about which approach is better and, overall, whether the performance will increase. As a matter of fact, while the coefficients are surely constants, the smoothness indicators are not and must be stored in dynamic memory: the tyranny of the memory hierarchy cannot be completely avoided.

My fear is mostly about code simplicity/conciseness/clearness: Damian (@rouson) taught me how important it is to be KISS, and handling coefficients by parameters looks very complex...

You won't know for sure until you can compare the techniques... but I just thought it was worth mentioning since it is likely that the Smoothness computation is an expensive, inner-most kernel.

I'll try to verify the performance difference with one case if I find the time.

Zaak, thank you again, your help is priceless.

Cheers

@giacrossi
Collaborator

giacrossi commented Jan 16, 2017 via email

@szaghi
Member Author

szaghi commented Jan 16, 2017

@giacombum

I'm not the Fortran expert here, but if we use a JSON file to read the coefficients, is it possible to store them into parameters?

No: parameters are compile-time constants, so you cannot set them at run-time in the library. If you want JSON-formatted coefficients you must follow Zaak's suggestion: a pre-processor that reads the JSON before you compile WenOOF.
