Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smooth Package Overhaul #684

Open
timothyas opened this issue Dec 13, 2022 · 1 comment
Open

Smooth Package Overhaul #684

timothyas opened this issue Dec 13, 2022 · 1 comment

Comments

@timothyas
Copy link
Member

I worked with the smooth package quite a bit during my PhD. During that time I've noticed some things that I would like to update. At the end of the day, my goal is to add a new correlation model that I implemented as part of my PhD in the MITgcm, but I think it would be a good idea to clean up the current implementation first.

I have implemented all fixes/changes relevant to each item discussed below (except for good documentation and the proposed deprecated file removal), in my branch: https://github.com/timothyas/mitgcm/tree/rewrite-smooth
See also a verification setup for the smooth package here:
https://github.com/timothyas/verification_other/tree/feature/smooth_verification

I'm making this issue so that anyone else can add changes they would like to see, and so we can discuss what the best path forward is. For instance should I make one PR that completely overhauls all of the smooth package in one go? Or should we make it more fragmented? Note that a lot of the changes discussed below are breaking changes, because changing runtime parameters in data.smooth requires users to have new data.smooth files.

Documentation

This one is obvious, and I'm happy to add documentation for the smooth package including:

  • Brief, general description of what it does
  • Citations to relevant literature
  • Runtime/compile time options

Note: see my SMOOTH.h which documents the proposed (and existing) smooth parameters in a "bare minimum" way: https://github.com/timothyas/MITgcm/blob/rewrite-smooth/pkg/smooth/SMOOTH.h

Bugs

  • Right now, when smooth_filtervar[2/3]d are run, the random number generator on each processor is initialized with the same seed, so each process generates the same sequence of random numbers. As a result, there can be a repeating spatial pattern imprinted on the random numbers. For example:

    rng_problem

    Note that 1. I made this exaggerated by taking the tutorial_global_ocean_latlon and splitting it across 9 processes and 2) we really only clearly see this behavior in the first random sample in domains with any sort of interesting topography. That is because the port_rand_norm function is only called when mask>0, so after the first sample each processor is at a different RNG seed simply because it has been called a different number of times due to differing topography in each processor... this would be a huge problem for idealized models though!

    One possible fix is to use each process's PID to initialize random seeds on each process. This is what I did for a purely MPI based setup, and we could probably extend this to threads (if necessary?) easily. I implemented this, and here we can see that things look random, as they should...

    rng_fixed

    See code: https://github.com/timothyas/MITgcm/blob/dfa1ad4a0bd8bbdb1816a7194196399c8b191789/pkg/smooth/smooth_filtervar2d.F#L103-L104

  • Right now there is a code block that limits the vertical correlation length scale to be at maximum the vertical spacing at that vertical level:

    C avoid excessive vertical smoothing:
    IF ( smooth3DsizeZ(smoothOpNb).NE.3 ) THEN
    DO bj=myByLo(myThid),myByHi(myThid)
    DO bi=myBxLo(myThid),myBxHi(myThid)
    DO k=1,Nr
    DO j=1-OLy,sNy+OLy
    DO i=1-OLx,sNx+OLx
    smooth3D_KzMax=drC(k)
    smooth3D_KzMax = smooth3D_KzMax*smooth3D_KzMax
    & /smooth3DtotTime/2
    IF ( smooth3D_kappaR(i,j,k,bi,bj).GT.smooth3D_KzMax ) THEN
    smooth3D_kappaR(i,j,k,bi,bj) = smooth3D_KzMax
    ENDIF
    ENDDO
    ENDDO
    ENDDO
    ENDDO
    ENDDO
    ENDIF

    A comment above suggests that this is to limit excessive smoothing, but I am almost positive that many people do not know that this limit is being imposed. This may not be problematic for systems like ECCOv4, but as a specific example, this would be a problem for anyone modeling ice shelves and using a very small, constant, vertical grid spacing.

    Proposed solution: Add a boolean flag smooth3DUseDRCMaxLz which is default set to FALSE. User can change this flag at runtime.

Confusing runtime parameters

In each of these I use ND to mean both 2D and 3D

  • smoothNDtype: integer value either 0, 1, 2. If 0, then don't compute the filter normalization. If 1, use a randomized method to compute filter normalization. If 2, use a hand coded "exact" approach to compute the normalization, except that (a) this method does not scale to large systems well and (b) this method requires an adjoint of smooth_diff2d. See how it is currently handled (or not!):

    c adjoint:
    WRITE(errorMessageUnit,'(A,/,A)' )
    & "you need to have adsmooth_diff2D compiled and then:",
    & "uncomment the line below and comment the stop"
    CALL ALL_PROC_DIE( myThid )
    STOP 'ABNORMAL END: S/R smooth_filtervar2D'
    c call adsmooth_diff2D(smoothTmpFld,smooth2dmask,nbt_in,myThid)

    Proposed change: I propose to change this to a LOGICAL smoothNDCalcNormFactor, where if TRUE then use the randomized method to calculate the normalization factor, otherwise don't (for instance, if it's already been computed and we are providing it as an input file). And, just get rid of the hand coded "exact", which fails right now anyway.

  • smoothNDsize: integer value that determines how the correlation length scale is set. If 2, then look for a file named smoothNDscales which has the length scales in each direction, as records. (For 3D it looks for files scalesH and scalesZ for horizontal and vertical ...). If not 2, then set length based on fixed values Lx/y/z_0 which can be set in data.smooth

    Proposed changes: Use logicals smooth2DConstHorizontal , smooth3DConstHorizontal, smooth3DConstVertical. If True, use Lx_0 (or similar for y, z) to set a constant length scale, otherwise look for spatially varying field

  • Expected files: In the case of a spatially varying length scale, pkg/smooth expects, for 2D: a 2 record file with first record Lx and second record Ly. For 3D, it expects one file smooth3DscalesH like the 2D, and a separate smooth3DscalesZ for the vertical length scale. Then, it writes the elements of the diffusion operator with records K11 and K22 for 2D case, using the notation Kij referring to the location in a 2D tensor. Then for the 3D case it writes out a single file with records (in this order): K31, K32, K33 ,K11, K22, K13, K23, K12, K21 ... this order only makes sense if we start to think about nonzero entries in a GMRedi-like tensor, but I'd say it's still ... unique. In any case, based on my own experience, this is pretty error prone.

    Proposed changes: Read and write separate, clearly named files in all of these cases.

  • Overlap with pkg/ctrl: When a user wants to apply a smoothing operator to a control field, they specify in data.ctrl whether or not they want to use the full operator from Weaver and Courtier, 2001 or just the diffusion operator part of it (i.e., apply normalization or not). Right now these options are passed to xx_gen*_preproc as wc01 or smooth for the full operator or just diffusion, and xx_gen*_preproc_i takes the corresponding operator number.

    Proposed changes: Deprecate / remove the xx_gen*preproc specification of wc01 etc, but still provide the operator number to xx_gen*_preproc_i. Then, make a new runtime parameter in the smooth package smoothNdApplyNorm for whether or not to apply the normalization (True => WC01, False => smooth).

Proposed file deprecation/removal

  • smooth2d.f and smooth3d.f:

    • These are essentially calling smooth_diffnd
    • They are only called from
      1. ctrl_map_ini_gen[2,3]d, called from ctrl_map_ini_ecco, which is deprecated
      2. ctrl_get_gen, but I think this code block (and lines 157-165) should be removed too for the following reasons. First, the smoothing is now done at initialization time, called from ctrl_map_ini_gentim2d. If smooth2d is called again from this routine, I am pretty sure the user will get smoothing/correlation that they are not expecting (and requires the lesser known CPP Macro ALLOW_SMOOTH_CTRL2D). Secondly, this is not general, and does not allow the user to use any other correlation model other than running a simple diffusion operator (i.e., not even WC01).
  • smooth_basic2d and smooth_hetero2d: these provide really small tweaks to the more general smooth_diff2d. Specifically, they specify constant length scale and number of iterations (basic) or length scales based on file (hetero), which should instead be handled by making different operators in data.smooth. These two subroutines are used in pkg/ecco/cost_gencost_bpv4 and pkg/ecco/cost_gencost_sshv4. In the second one, there is a similar overlap between ecco and smooth packages in a similar way to the ctrl package as described above.

Other small things

  • It would be nice to have a different number of 2D and 3D operators, whereas right now we have the same maximum number for both: smoothNbMax.
  • It would be nice to specify the number of random samples taken to form the filter normalization smoothOpNbRand
  • In Shelfice controls, generic masking, and lognormal control #445 I added the smoothMaskName variable to allow for different masks other than maskC. This is a 5 character variable where we only check for the last character (e.g. C, W, S,...). It would probably be more straightforward to make this a single character variable.
  • It might be nice to write out the samples used to compute the randomized filter normalization factor. This would not be the default, because it would usually involve writing out thousands of random fields, but my proposed change is to add a LOGICAL smoothNDWriteSamples which the user can toggle (per operator) to write out the samples.
@gaelforget
Copy link
Member

Thanks for working on this. Will try to read through asap

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants