Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial MPI_T-like profiling support in Mercury #350

Open
wants to merge 40 commits into
base: master
Choose a base branch
from

Conversation

srini009
Copy link

@srini009 srini009 commented Mar 4, 2020

Add an MPI_T-like profiling interface in Mercury.
Profiling interface is active by default.
Currently, only one PVAR is exported called "hg_pvar_hg_forward_count" that counts the number of times the HG_Forward call is invoked on this instance.

Refer to: https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node372.htm#Node372 for the corresponding MPI 3.1 PVAR interface specification that this implementation is largely based off of.

Notable differences with MPI (implementation, not interface spec):

  1. PVAR sessions in Mercury are associated with the hg_class object to avoid the use of global variables.
  2. Similarly, PVARs themselves are not global variables and aren't declared globally and used as such. Instead, each module interested in exporting PVARs must register these PVARs and refer to them by the use of handles (that internally fetch the address of the PVAR) within the local function where these PVARs are updated.
  3. We used atomics for updating counter PVARs where MPI does not.

*/
HG_PUBLIC hg_return_t
HG_Prof_init();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should probably take an hg_class as a parameter. We usually try to avoid using globals.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous comment, you should return an hg_prof_class_t * here

/* PVAR profiling support */
#include "mercury_prof_pvar_impl.h"
HG_PROF_PVAR_UINT_COUNTER_DECL_EXTERN(hg_pvar_hg_forward_count);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general comment I think it would be good to see if/how we can avoid declaring global variables.

@carns
Copy link
Contributor

carns commented Mar 12, 2020

I would also suggest that the comments for this interface reference a URL (I'm not sure which one off hand) for the MPI_T interface with a brief comment along the lines of "The following interface is patterned off of the MPI_T interface, except that X, Y, Z", where X, Y, an Z are notable philosophical differences (like being oriented around HG classes rather than globals) if there are any.

That way we can lean on the MPI documentation to some degree and also help rationalize the API conventions, since there is some value to reusing terminology for this stuff.

@srini009
Copy link
Author

Hey guys @carns, @soumagne: Thank you for your useful comments. I generally agree with all the points that have been put forward. I shall address most if not all comments in the next 2 days. I hope that works!

@srini009
Copy link
Author

srini009 commented Apr 1, 2020

@soumagne, @carns: I have addressed all your comments. Kindly check if this is okay. I apologize for the delay, it took longer than I expected.

@carns
Copy link
Contributor

carns commented Apr 6, 2020

@soumagne, @carns: I have addressed all your comments. Kindly check if this is okay. I apologize for the delay, it took longer than I expected.

Thanks @srini009 !

I just checked two things that I had commented on:

a) did you find a URL or some other reference to put in the comments for the general model the API is using? It might be there and I'm missing it.

b) there has got to be a better way to implement COUNTER_INC() than looping over an incr() right? Maybe do a get() and then a cas() instead (with a retry loop if needed)? Jerome may have another idea. I'm thinking of the case where the val could be relatively large, like if you are accumulating bytes transmitted or something. The current code might be slow, and also isn't technically atomic across the macro.

@srini009
Copy link
Author

srini009 commented Apr 6, 2020

Hi @carns,
As for (a), I have addressed the comment by modifying the commit description (see top). Oops, I think you probably meant the code! :) I shall add the same comments in the interface header file as well.

As for (b), indeed, I did give this a thought. As far as atomicity is concerned, hopefully, I'm not mistaken, but I think we don't need to enforce atomicity at the macro level since individual updates are atomic and addition is commutative. I don't think we shall be encountering race conditions here. But you're right, it is incredibly inefficient for large updates related to bytes transferred, etc. Perhaps I can take @soumagne's help here.

I am excited about this becoming a part of mercury. After that, it is a relatively quick step to introduce a bunch of new PVARs that are insightful.

Regards,

@carns
Copy link
Contributor

carns commented Apr 7, 2020

Hi @carns,
As for (a), I have addressed the comment by modifying the commit description (see top). Oops, I think you probably meant the code! :) I shall add the same comments in the interface header file as well.

As for (b), indeed, I did give this a thought. As far as atomicity is concerned, hopefully, I'm not mistaken, but I think we don't need to enforce atomicity at the macro level since individual updates are atomic and addition is commutative. I don't think we shall be encountering race conditions here. But you're right, it is incredibly inefficient for large updates related to bytes transferred, etc. Perhaps I can take @soumagne's help here.

I am excited about this becoming a part of mercury. After that, it is a relatively quick step to introduce a bunch of new PVARs that are insightful.

Regards,

Ah! Yes, that link and explanation in the PR comment is perfect, just drop it in the code. My thinking is that in the future someone might look at the code and want to change things without realizing that it's intentionally matching some broader conventions.

@carns
Copy link
Contributor

carns commented Apr 10, 2020

@soumagne what's the best what to do the performance counter macro? Do a get(), calculate new value, then cas() (in a loop to retry if the cas fails)?

Copy link
Member

@soumagne soumagne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @srini009 for all the changes I added some more comments, suggestions for fixes.

int hg_prof_is_initialized; /* Is profiling initialized */
int num_pvars; /* No of PVARs currently supported */
hg_prof_pvar_session_t session; /* Is profiling initialized on the session */
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I am missing something but you should have a struct hg_prof_class here, not hg_private_class.

hg_prof_set_is_initialized(struct hg_private_class * hg_private_class)
{
hg_private_class->hg_prof_is_initialized = 1;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a detail but instead of using 1 or 0, there are HG_TRUE and HG_FALSE macros

static int
hg_prof_get_session_is_initialized(struct hg_prof_pvar_session * session) {
return session->is_initialized;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, you could return an hg_bool_t instead.

hg_prof_set_is_initialized(hg_private_class);
hg_private_class->num_pvars = NUM_PVARS;
hg_private_class->session = NULL;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you are allowed to do that here :) HG_Prof_init() should return a new hg_prof_class_t * so I would expect you to malloc a new struct hg_prof_class in that routine, fill it and return it.

/*---------------------------------------------------------------------------*/
hg_return_t
HG_Prof_finalize(hg_class_t *hg_class) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here instead it should take an hg_prof_class_t *


/*---------------------------------------------------------------------------*/
hg_return_t
HG_Prof_pvar_get_info(hg_class_t *hg_class, int pvar_index, char *name, int *name_len, hg_prof_class_t *var_class, hg_prof_datatype_t *datatype, char *desc, int *desc_len, hg_prof_bind_t *bind, int *continuous) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that means hg_prof_class_t becomes hg_prof_var_class_t ?


struct hg_prof_pvar_session s = *session;
unsigned int key = pvar_index;
hg_prof_pvar_data_t * val;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep declarations at beginning of blocks

HG_Prof_pvar_handle_free(hg_prof_pvar_session_t session, int pvar_index, hg_prof_pvar_handle_t *handle) {

if(!hg_prof_get_session_is_initialized(session))
return HG_NA_ERROR;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that seems like the wrong error code :) HG_NA_ERROR means it's an NA layer error so maybe just HG_INVALID_ARG?

*/
HG_PUBLIC hg_return_t
HG_Prof_init();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous comment, you should return an hg_prof_class_t * here

HG_PVAR_CLASS_SIZE, /* PVAR that represents the size of a given resource at any given point in time */
HG_PVAR_CLASS_HIGHWATERMARK, /* PVAR that represents a high watermark value */
HG_PVAR_CLASS_LOWWATERMARK /* PVAR that represents a low watermark value */
} hg_prof_class_t;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes you should rename that one to hg_prof_var_class_t, given the changes that I suggested earlier.

@soumagne
Copy link
Member

soumagne commented Apr 10, 2020

@carns @srini009 yes for the performance counter instead of:

#define HG_PROF_PVAR_COUNTER_INC(name, val)
    addr_##name = (addr_##name == NULL ? hg_prof_get_pvar_addr_from_name(#name): addr_##name);
    for(int i=0; i < val; i++)
        hg_atomic_incr32(addr_##name);

you should do something like:

#define HG_PROF_PVAR_COUNTER_INC(name, val) do {
    hg_util_int32_t tmp;

    addr_##name = (addr_##name == NULL ? hg_prof_get_pvar_addr_from_name(#name): addr_##name);
    do {
        tmp = hg_atomic_get32(addr_##name);
    } while (!hg_atomic_cas32(addr_##name, tmp, (tmp + val)));
} while (0)

@srini009 srini009 force-pushed the mercury_profiling_interface branch from b1fe49b to c174d8d Compare July 30, 2020 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants