Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Fix remaining memory leaks in HEMCO #255

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

yantosca
Copy link
Contributor

@yantosca yantosca commented Jan 5, 2024

Name and Institution (Required)

Name: Bob Yantosca
Institution: Harvard + GCST

Confirm you have reviewed the following documentation

Describe the update

This is a placeholder PR and should not yet be merged. I have made the base branch main but will change it later.

This is a companion PR to geoschem/geos-chem#2102. Configuring GEOS-Chem with -DSANITIZE=y has revealed the following memory leaks in GEOS-Chem and HEMCO:

=================================================================
==1495473==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 107464 byte(s) in 133 object(s) allocated from:
    #0 0x14815712993f in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x1bd566d in __hco_filedata_mod_MOD_filedata_init /n/holyscratch01/jacob_lab/ryantosca/tests/cloudj/test_memory/CodeDir/src/HEMCO/src/Core/hco_filedata_mod.F90:174
    #2 0x202020202020201f  (<unknown module>)

Direct leak of 29248 byte(s) in 8 object(s) allocated from:
    #0 0x14815712993f in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x10c438e in __histcontainer_mod_MOD_histcontainer_create /n/holyscratch01/jacob_lab/ryantosca/tests/cloudj/test_memory/CodeDir/src/GEOS-Chem/History/histcontainer_mod.F90:319
    #2 0x202020202020201f  (<unknown module>)

Direct leak of 3720 byte(s) in 1 object(s) allocated from:
    #0 0x14815712993f in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x18f6ae1 in __state_chm_mod_MOD_init_state_chm /n/holyscratch01/jacob_lab/ryantosca/tests/cloudj/test_memory/CodeDir/src/GEOS-Chem/Headers/state_chm_mod.F90:803

Direct leak of 3160 byte(s) in 1 object(s) allocated from:
    #0 0x14815712993f in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x18fe162 in __state_chm_mod_MOD_init_state_chm /n/holyscratch01/jacob_lab/ryantosca/tests/cloudj/test_memory/CodeDir/src/GEOS-Chem/Headers/state_chm_mod.F90:1027

SUMMARY: AddressSanitizer: 143592 byte(s) leaked in 143 allocation(s).

Leak 1 is in HEMCO. We are still investigating.

The other leaks are being addressed in geoschem/geos-chem#2104.

Tagging @msulprizio @lizziel

Expected changes

This will be a zero-diff update that will remove memory leaks. It will not change results.

Reference(s)

N/A

Related Github Issue(s)

BettyCroft and others added 17 commits February 21, 2022 07:19
…piling

The MSG variable was defined twice in HCOX_TOMAS_Jeagle_Run. The second
occurrence has been removed, along with an unused ERR variable.

Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
There were an excessive amount of trailing whitespaces in hcox_tomas_jeagle_mod.F90
that affected readability of the file. These have now been removed.

Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
Resolved conflicts in:
	CHANGELOG.md
	src/Extensions/hcox_tomas_jeagle_mod.F90

Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
…MCO 3.7.x in vertical regridding updates.

Signed-off-by: Haipeng Lin <hplin@seas.harvard.edu>
Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
TOMAS species SF6 has conflicts with sulfur hexafluoride (SF6) used in
the TransportTracers simulation. To avoid conflicts, we rename all of the
TOMAS species for bins 1-9 to use 01-09. This is done for all TOMAS species
(NK*, SF*, SS*, ECOB*, ECIL*, OCOB*, OCIL*, AW*, and DUST*).

Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
…v_cesm' of github.com:jimmielin/HEMCO-1 into dev/3.8.0

Signed-off-by: Melissa Sulprizio <mpayer@seas.harvard.edu>
src/Core/hco_arr_mod.F90
- Set Arr and ArrVec to null before exiting cleanup routines


src/Core/hco_tidx_mod.F90
- Set individual fields of AlltIDx to NULL after deallocating
- Update comments

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
@yantosca yantosca added memory category: Bug Fix Fixes a bug that was previously reported labels Jan 5, 2024
@yantosca yantosca self-assigned this Jan 5, 2024
@yantosca
Copy link
Contributor Author

yantosca commented Jan 9, 2024

@msulprizio @lizziel @christophkeller: I now know where the memory leak is being incurred. In HEMCO/src/Core/hco_config_mod.F90 there is a FileData object (Dta) that is a local variable that gets allocated, and pointed to by the linked list.

IF ( TRIM( srcFile ) == '-' ) THEN
! The current entry of the configuration file specifies that
! we will get data from the file listed immediately above it.
! Thus we have to reuse a previously-defined FileData object
! (aka Dta). Stop if Dta is not initialized.
IF ( .not. ASSOCIATED( Dta ) ) THEN
MSG = 'Cannot use previous data container: '//TRIM(tagcName)
CALL HCO_Error( msg, RC, thisLoc=loc )
RETURN
ENDIF
! Reuse the file metadata specified in PrevDta for
! this entry of the HEMCO configuration file.
Lct%Dct%DtaHome = Lct%Dct%DtaHome - 1
ELSE
! The current entry of the configuration file specifies that
! we will read data from a file. We thus need to initialize
! a new FileData object to keep track of the file metadata.
!
! >>> NOTE: This seems to cause a memory leak! <<<
! >>> We will look into this at a later date. <<<
CALL FileData_Init( Dta )
! Set source file name. Check if the read file name starts
! with the configuration file token '$CFDIR', in which case
! we replace this value with the passed CFDIR value.
STRLEN = LEN( srcFile )
IF ( STRLEN > 6 ) THEN
IF ( srcFile(1:6) == '$CFDIR' ) THEN
srcFile = TRIM( CFDIR ) // TRIM( srcFile(7:STRLEN) )
ENDIF
ENDIF
Dta%ncFile = srcFile
! Set source variable and original data unit.
Dta%ncPara = ADJUSTL( srcVar )
Dta%OrigUnit = ADJUSTL( srcUnit )
! If the parameter ncPara is not defined, attempt to read data
! directly from configuration file instead of netCDF.
! These data are always assumed to be in local time. Gridded
! data read from netCDF is always in UTC, except for weekdaily
! data that is treated in local time. The corresponding
! IsLocTime flag is updated when reading the data (see
! hcoio_dataread_mod.F90).
IF ( TRIM( Dta%ncPara ) == '-' ) THEN
Dta%ncRead = .FALSE.
Dta%IsLocTime = .TRUE.
ENDIF
! Extract information from time stamp character and pass values
! to the corresponding container variables. If no time string is
! defined, keep default values (-1 for all of them)
IF ( TRIM(srcTime) /= '-' ) THEN
CALL HCO_ExtractTime( HcoConfig, srcTime, Dta, RC )
IF ( RC /= HCO_SUCCESS ) THEN
msg = 'Could not extract time cycle information!'
CALL HCO_Error( msg, RC, thisLoc=loc )
RETURN
ENDIF
ENDIF
#if defined(ESMF_)
! In an ESMF environment, the source data will be imported
! through ExtData by name, hence need to set ncFile equal to
! container name!
IF ( Dta%ncRead ) THEN
Dta%ncFile = ADJUSTL( tagcName )
ENDIF
#endif
! Update the properties of the data container
! NOTE: This routine abstracts the big IF/ELSE block that
! processes the time cycle information (bmy, 07 Mar 2022)
CALL UpdateDtaProperties( &
dctType = dctType, &
int3 = int3, &
char1 = TRIM( char1 ), &
char2 = char2(1:1), &
tagCName = TRIM( tagCName ), &
tmCycle = TRIM( tmCycle ), &
separator = TRIM( separator ), &
srcDim = TRIM( srcDim ), &
wildCard = TRIM( wildCard ), &
HcoConfig = HcoConfig, &
Lct = Lct, &
Dta = Dta, &
RC = RC )
! Trap potential errors
IF ( RC /= HCO_Success ) THEN
MSG = 'Error encountered in routine "UpdateDtaProperties"'
CALL HCO_Error( MSG, RC, thisLoc=loc )
RETURN
ENDIF
ENDIF
! Connect this FileData object to the HcoState%HcoConfigList.
Lct%Dct%Dta => Dta
! Free list container for next cycle
Lct => NULL()

The Dta object is also needed further down in the loop for those entries in HEMCO_Config.rc that use - as the file name (i.e. using the file that is immediately above it).

I tried to replace Dta with Lct%Dct%Dta, which would get rid of the memory leak, but then the Dta object would not be present for those entries that use -.

I just wanted to note this. It might be going deep into the rabbit hole to try to undo this memory leak, so I will probably put this off for the time being. If any of you have suggestions then let me know. Thanks!

Copy link

stale bot commented Mar 13, 2024

This issue has been automatically marked as stale because it has not had recent activity. If there are no updates within 7 days it will be closed. You can add the "never stale" tag to prevent the Stale bot from closing this issue.

@stale stale bot added the stale No recent activity on this issue label Mar 13, 2024
@yantosca yantosca added never stale Never label this issue as stale and removed stale No recent activity on this issue labels Mar 13, 2024
@yantosca yantosca added topic: Performance Related to HEMCO performance, parallelization, or memory issues and removed memory labels Mar 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: Bug Fix Fixes a bug that was previously reported never stale Never label this issue as stale topic: Performance Related to HEMCO performance, parallelization, or memory issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory leaks in 14.3.0 development code
5 participants