$V("PROBECRIT","DEFAULT") CPT time resolution is low #333

Open
shabiel opened this issue Aug 2, 2018 · 4 comments

@shabiel
Contributor

shabiel commented Aug 2, 2018

A better timer should be able to give us values between 0 and 1000.

YDB>W $V("PROBECRIT","DEFAULT")
CPT:1000,CFN:0,CQN:0,CYN:0,CQF:0,CQE:1024,CAT:3648
YDB>W $V("PROBECRIT","DEFAULT")
CPT:0,CFN:0,CQN:0,CYN:0,CQF:0,CQE:1024,CAT:3649
@ksbhaskar
Member

What sort of environment are you running your test on?

@shabiel
Contributor Author

shabiel commented Aug 7, 2018 via email

chathaway-codes self-assigned this Aug 9, 2018
chathaway-codes added a commit to chathaway-codes/YottaDB that referenced this issue Aug 9, 2018
ABS_TIME, a structure used throughout the codebase, only allowed
microsecond granularity. This presented problems when
reporting durations that were much shorter than that, as reported by
customers in issue YottaDB#333. This commit removes the ABS_TIME structure, and
instead typedefs it to the structures used by the C library.
@shabiel
Contributor Author

shabiel commented Aug 15, 2018

Thank you Charles!

@chathaway-codes
Contributor

I'm still working on getting this past our internal testing; that commit had a few bugs in it that I fixed. I just pushed the latest to that branch.

chathaway-codes added a commit that referenced this issue Sep 26, 2018
… microsecond) resolution

The issue is that in ONE_MUTEX_TRY macro in mutex.c, as part of computing probecrit_rec.t_get_crit,
we make a call to sys_get_curr_time() which fetches a nanosecond resolution time ("struct timespec")
using the clock_gettime() system call but then converts it to microsecond resolution (the ABS_TIME structure)
before returning and storing in CPT. This meant CPT is always in microseconds even though the description
of that statistic in sr_unix/tab_probecrit_rec.h says "nanoseconds for the probe to get crit".
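
To illustrate why CPT only ever showed values such as 0 or 1000, here is a minimal sketch (not the YottaDB code) that compares an interval computed from full nanosecond readings with the same interval computed after each reading is first truncated to whole microseconds. The helper names elapsed_ns and elapsed_ns_via_usec are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: interval in nanoseconds, using the full
 * nanosecond resolution of both "struct timespec" readings. */
static int64_t elapsed_ns(const struct timespec *start, const struct timespec *end)
{
    assert(start->tv_nsec >= 0 && start->tv_nsec < 1000000000L);
    assert(end->tv_nsec >= 0 && end->tv_nsec < 1000000000L);
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (int64_t)(end->tv_nsec - start->tv_nsec);
}

/* Hypothetical helper: the same interval, but each reading is first
 * truncated to whole microseconds (as a microsecond-resolution time
 * structure would force), then the difference is scaled back to
 * nanoseconds. The result is always a multiple of 1000. */
static int64_t elapsed_ns_via_usec(const struct timespec *start, const struct timespec *end)
{
    int64_t start_us = (int64_t)start->tv_sec * 1000000LL + start->tv_nsec / 1000;
    int64_t end_us = (int64_t)end->tv_sec * 1000000LL + end->tv_nsec / 1000;

    return (end_us - start_us) * 1000;
}
```

With readings taken a few hundred nanoseconds apart, the truncating version reports either 0 or 1000 depending on whether a microsecond boundary falls between the two readings, which matches the pattern in the output at the top of this issue.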

To fix this, sys_get_curr_time() has to be changed to return a time with nanosecond resolution.
This means changing the ABS_TIME structure (which is used by the entire codebase to do time
related activities) to hold nanosecond resolution.

This commit removes the ABS_TIME structure, and instead typedef's it to "struct timespec", the
time structure used by the clock_gettime() call. This meant changing all usages of "at_sec" and
"at_usec" members of the former ABS_TIME structure to instead be "tv_sec" and "tv_nsec" with the
latter change also requiring a granularity change (microsecond to nanosecond conversion).
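
The typedef and the member rename described above might look roughly like the following sketch. Only ABS_TIME, "at_sec", "at_usec", "tv_sec" and "tv_nsec" come from the commit message; the old_abs_time stand-in (including its member types), the NANOSECS_IN_USEC constant and both helper functions are assumptions made for illustration and are not taken from the YottaDB sources.

```c
#include <assert.h>
#include <time.h>

/* Assumed constant name; the real codebase may spell this differently. */
#define NANOSECS_IN_USEC 1000L

/* After the change: ABS_TIME is just the C library's nanosecond structure. */
typedef struct timespec ABS_TIME;

/* Stand-in for the former layout. The member names follow the commit
 * message; the member types are an assumption. */
typedef struct
{
    long at_sec;
    long at_usec;
} old_abs_time;

/* Hypothetical helper: code that used to fill at_sec/at_usec now fills
 * tv_sec/tv_nsec, scaling microseconds up to nanoseconds. */
static ABS_TIME abs_time_from_old(old_abs_time old)
{
    ABS_TIME t;

    assert(old.at_usec >= 0 && old.at_usec < 1000000L);
    t.tv_sec = (time_t)old.at_sec;
    t.tv_nsec = old.at_usec * NANOSECS_IN_USEC;
    return t;
}

/* Hypothetical helper: callers that still need microseconds divide the
 * nanosecond member instead of reading a microsecond member directly. */
static long abs_time_usec_part(const ABS_TIME *t)
{
    return t->tv_nsec / NANOSECS_IN_USEC;
}
```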

Since the ABS_TIME structure and various other dependent structure layouts are changing, the
GTMDefinedTypesInit*.m files for sr_x86_64 and sr_armv7l had to be regenerated.

Additionally, the following changes were done

* sr_unix/gt_timers.c : The code now assumes clock_gettime() is available on all supported platforms
  i.e. #ifdef BSD_TIMER is assumed to be TRUE, and therefore a compile-time error is issued if
  that is not the case and all #ifndef BSD_TIMER code has been removed to simplify the file.

* sr_unix/gt_timers.c : We use clock_gettime(CLOCK_MONOTONIC) instead of CLOCK_REALTIME. This is because
  CLOCK_MONOTONIC is immune to wall-clock system time adjustments from adjtime() and NTP.

* sr_unix/gt_timers.c : The SYS_SETTIMER macro was used in only one place so it was inlined there
  to make the caller code more clear.

* sr_unix/gt_timers.c : When converting time from nanoseconds to microseconds before calling "setitimer"
  we now check if the time becomes 0 microseconds (i.e. the input was < 1000 nanoseconds). In that case,
  we treat this as a timer for 1 microsecond. This way we may wait slightly longer than requested
  (by less than 1 microsecond) but we never return prematurely (which is
  a no-no since, for example, HANG 1 should hang for >= 1 second but never < 1 second).

* Various files : Noticed that USE_POLL was defined in mdef.h for all UNIX platforms so removed
  all #ifndef USE_POLL code and replaced all POLL_ONLY(X) usages with X. Since USE_POLL and USE_SELECT
  are mutually exclusive, this also meant removing all #ifdef USE_SELECT (or SELECT_ONLY) code.

* sr_unix/mutex.c : Move 8-byte quantity typecast (gtm_uint64_t) to BEFORE the multiplication just in
  case there can be an overflow now that the multiplication factor is 1 billion (nanoseconds).

* Various files : In various places where a "struct timeval" was used and the "tv_usec" member was assigned
  from the tv_nsec member of a "struct timespec" structure, a typecast to "gtm_tv_usec_t" was done on
  the result rather than typecasting the nanosecond operand before the division.
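
Two of the items in the list above, the sub-microsecond timer handling and the cast placed before the multiplication, can be sketched as follows. This is an illustration under stated assumptions, not the code from gt_timers.c or mutex.c: the function names are hypothetical, and uint64_t stands in for the codebase's gtm_uint64_t.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper: convert a nanosecond-resolution interval to the
 * microsecond-resolution "struct timeval" that setitimer() expects.
 * If the whole interval truncates to 0 microseconds, use 1 microsecond
 * so the timer cannot fire before the requested time has passed.
 * Only the becomes-zero case from the commit message is handled here;
 * sub-microsecond remainders of longer intervals are still truncated. */
static struct timeval timer_interval_from_timespec(const struct timespec *ts)
{
    struct timeval tv;

    assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
    tv.tv_sec = ts->tv_sec;
    tv.tv_usec = ts->tv_nsec / 1000;
    if (tv.tv_sec == 0 && tv.tv_usec == 0)
        tv.tv_usec = 1;
    return tv;
}

/* Hypothetical helper: widen to 64 bits BEFORE multiplying by one
 * billion. Writing (uint64_t)(seconds * 1000000000u) instead would do
 * the multiplication in 32 bits and wrap for any value above 4. */
static uint64_t seconds_to_ns(uint32_t seconds)
{
    return (uint64_t)seconds * 1000000000u;
}
```

The cast placement matters only once the factor is large enough to overflow the narrower type, which is why it became relevant when the multiplier grew from one million (microseconds) to one billion (nanoseconds).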