Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RealCAN implementation #521

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

GGZ8
Copy link
Contributor

@GGZ8 GGZ8 commented May 14, 2024

Dear maintainers,

we are a research team at the University of Modena and Reggio Emilia, and we would like to contribute upstream our recent development on top of the canplayer tool.

This pull request introduces a new extension that we dubbed RealCAN, which enables replaying an existing CAN logfile respecting the original relative timestamps. Our use case is to replay the given CAN trace in a more realistic way, so as to exercise our test network with the correct inter-arrival times that the ECUs expect.

Changes:

  • Modified the frame sending mechanism to incorporate a timing check that aligns frame transmission with their scheduled inter-arrival times;
  • Added a new command-line option, -r, to enable real-time dispatching of CAN frames;
  • Updated the usage message to include instructions for the new soft real-time features.

We hope you'll find this contribution useful, and we are open to any suggestions on how to further refined this feature.

Wish you a great day!

Copy link
Member

@marckleinebudde marckleinebudde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello,

thanks for your contribution. For now I mainly looked at the coding style. There's a timespec vs. timeval confusion in the code.

I'll have a look at the timer handling later.

canplayer.c Outdated Show resolved Hide resolved
canplayer.c Outdated Show resolved Hide resolved
canplayer.c Outdated Show resolved Hide resolved
canplayer.c Outdated Show resolved Hide resolved
canplayer.c Outdated Show resolved Hide resolved
canplayer.c Outdated Show resolved Hide resolved
canplayer.c Outdated Show resolved Hide resolved
canplayer.c Outdated Show resolved Hide resolved
canplayer.c Outdated Show resolved Hide resolved
canplayer.c Outdated Show resolved Hide resolved
GGZ8 added a commit to SECloudUNIMORE/can-utils that referenced this pull request May 14, 2024
@delta512
Copy link
Contributor

Hello,
there you have two iterations of changes as per your requests. My commit is just every trivial correction to the code style, while @GGZ8 reshaped the real code stuff.

Let us know if there's anything more to tweak.

Cheers

@marckleinebudde
Copy link
Member

Please squash everything into one commit.

GGZ8 added a commit to SECloudUNIMORE/can-utils that referenced this pull request May 15, 2024
Replay pre-recorded CAN Bus dumps respecting the original relative timestamps.

Fix code style

Conform to codebase style as per maintainer's suggestions.

Refactor code logic according to suggestions for PR linux-can#521
@GGZ8
Copy link
Contributor Author

GGZ8 commented May 15, 2024

Please squash everything into one commit.

Done! 👍

@marckleinebudde
Copy link
Member

please rebase to latest master, which includes #522 and causes a conflict now.

@marckleinebudde
Copy link
Member

If I create a logfile with candump any,0:0,#FFFFFFFF -l and replay it with canplayer -I, canplayer will respect the timestamps in the created logfile.

  • What problem fixes the new '-r' option?
  • What benefit does it bring?
  • How is it supposed to be used?

canplayer.c Outdated
@@ -252,6 +261,38 @@ static int add_assignment(char *mode, int socket, char *txname,
return 0;
}

void load_gaps_from_file(FILE *fp, struct sleep *timestamps)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void load_gaps_from_file(FILE *fp, struct sleep *timestamps)
static void load_gaps_from_file(FILE *fp, struct sleep *timestamps)

canplayer.c Outdated
free(line);
}

void init_timestamps(struct sleep *timestamps, size_t init_size)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void init_timestamps(struct sleep *timestamps, size_t init_size)
static void init_timestamps(struct sleep *timestamps, size_t init_size)

canplayer.c Outdated

while ((read = getline(&line, &len, fp)) != -1) {
if (timestamps->idx == timestamps->size) {
timestamps->size++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be beneficial to double the size instead of incrementing. How many lines does your usual files have?

canplayer.c Outdated
exit(1);
}
}
sscanf(line, "(%lu.%lu)", &timestamps->sleep_vector[timestamps->idx].tv_sec, &timestamps->sleep_vector[timestamps->idx].tv_usec);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: an intermediate variable, e.g. tv = &timestamps->sleep_vector[timestamps->idx] might be beneficial here.

canplayer.c Outdated
timestamps->size++;
timestamps->sleep_vector = realloc(timestamps->sleep_vector, timestamps->size * sizeof(struct timeval));
if (!timestamps->sleep_vector) {
fprintf(stderr, "Failed to create the timestamps vector!\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fprintf(stderr, "Failed to create the timestamps vector!\n");
fprintf(stderr, "Failed to realloc the timestamps vector!\n");

canplayer.c Outdated
while ((read = getline(&line, &len, fp)) != -1) {
if (timestamps->idx == timestamps->size) {
timestamps->size++;
timestamps->sleep_vector = realloc(timestamps->sleep_vector, timestamps->size * sizeof(struct timeval));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
timestamps->sleep_vector = realloc(timestamps->sleep_vector, timestamps->size * sizeof(struct timeval));
timestamps->sleep_vector = realloc(timestamps->sleep_vector, timestamps->size * sizeof(*(timestamps->sleep_vector));

canplayer.c Outdated
gap_from_file = true; /* using time delta from file */
init_timestamps(&timestamps, 1);
load_gaps_from_file(infile, &timestamps);
timestamps.idx = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
timestamps.idx = 0;

You already do this in init_timestamps().

@marckleinebudde
Copy link
Member

Ok, I understand which problem you want to solve. The current code sleeps for -g ms and checks for any CAN frames to send. That design is, let's say, sub-optimal.

Properly fixing this requires several steps on the existing code:

  • convert everything to struct timespec
  • read the first timestamp from the file
  • calculate the offset between the first timestamp and now clock_gettime(CLOCK_MONOTONIC)
  • in the inner loop: add the offset to the next timestamp and check against clock_gettime(CLOCK_MONOTONIC)
  • use clock_nanosleep() with TIMER_ABSTIME to wait for the next event

See f55ea38, where I added absolute timing to cangen.c

@hartkopp
Copy link
Member

Dear maintainers,

we are a research team at the University of Modena and Reggio Emilia, and we would like to contribute upstream our recent development on top of the canplayer tool.

This pull request introduces a new extension that we dubbed RealCAN, which enables replaying an existing CAN logfile respecting the original relative timestamps. Our use case is to replay the given CAN trace in a more realistic way, so as to exercise our test network with the correct inter-arrival times that the ECUs expect.

I have a general question about this patch.
IIRC the gaps between timestamps are already taken into account. Either when getting the logfile format from a file or from stdin:

(1715790289.248142) vcan0 5A8#EE41B4359BC73D
(1715790289.448425) vcan0 636#C47F
(1715790289.648726) vcan0 6FE#0E91243BAB988F49
(1715790289.849038) vcan0 118#7EB83B5D
(1715790290.049331) vcan0 32C#CB

So what functionality does your patch add?

@marckleinebudde
Copy link
Member

canplayer doesn't use proper sleep between 2 CAN messages. By default it sleep 1ms (configurable by -g parameter) and sends all CAN messages in a row that have an expired timestamp. The proper approach would be to convert canplayer to absolute time handling and sleep until the next CAN frame should be send.

@GGZ8
Copy link
Contributor Author

GGZ8 commented May 17, 2024

Sounds good, the changes you suggested enable a more precise replay of the original trace. The use of the nanosleep family of functions ensure a more efficient CPU sharing.

However, during our developments we could not saturate the bus, be it a virtual vcan network or a physical one with real CAN adapters. We observed that this limitation is actually due to the nanosleep function, which - as per its man page - suspends the execution of the calling thread until at least the time specified by request has elapsed.

We overcame this limitation using a spin-lock while waiting for a given deadline. Comparing our implementation against the upstream canplayer version, we found that the nanosleep approach introduces non-negligible latency between messages.
The test has been performed using a set of values for the -g option and a CAN trace with fixed inter-arrival time between CAN messages. This should behave the same as if the changes you suggested were already in place.

Thus, the -r option enables replaying with soft real-time capabilities when required by the scenario, leaving the nanosleep mechanism as the default behavior.

Since the paper is still under review, we can only share the results privately at the moment.

Let us know what you think about this evaluation, while we apply the minor correction stil pending.

@marckleinebudde
Copy link
Member

marckleinebudde commented May 17, 2024

spin_lock is a concept from the Linux kernel for short term locking. What do you mean by spin_lock?

If the overhead by nanosleep() is too big, you might want to use SO_TXTIME, see c6f2cf7 for an example.

Can you illustrate in a few lines of shell code, how the canplayer with -r should be used?

Works:

./canplayer -I ./candump-absolute.log

Adding -r doesn't work:

./canplayer -I ./candump-absolute.log -r 
Specify an input file for option -r !

Maybe -r earlier?

./canplayer -r -I ./candump-absolute.log

it hangs in:

#0  0x00007f8b3be8741d in __GI___libc_read (fd=0, buf=0x559ca2367340, nbytes=1024) at ../sysdeps/unix/sysv/linux/read.c:26
        sc_ret = -512
        __arg3 = <optimized out>
        _a2 = <optimized out>
        sc_ret = <optimized out>
        sc_ret = <optimized out>
        __arg1 = <optimized out>
        _a3 = <optimized out>
        sc_cancel_oldtype = <optimized out>
        resultvar = <optimized out>
        __arg2 = <optimized out>
        _a1 = <optimized out>
#1  0x00007f8b3be0ad93 in _IO_new_file_underflow (fp=0x7f8b3bf5f8e0 <_IO_2_1_stdin_>) at ./libio/libioP.h:1030
        count = <optimized out>
#2  0x00007f8b3be00090 in __GI___getdelim (lineptr=lineptr@entry=0x7ffdc14321c0, n=n@entry=0x7ffdc14321c8, delimiter=delimiter@entry=10, fp=fp@entry=0x7f8b3bf5f8e0 <_IO_2_1_stdin_>) at ./libio/iogetdelim.c:77
        _IO_acquire_lock_file = 0x7f8b3bf5f8e0 <_IO_2_1_stdin_>
        result = <optimized out>
        cur_len = 0
        len = 0
#3  0x0000559ca21e73b9 in getline (__lineptr=0x7ffdc14321c0, __n=0x7ffdc14321c8, __stream=0x7f8b3bf5f8e0 <_IO_2_1_stdin_>) at /usr/include/x86_64-linux-gnu/bits/stdio.h:120
No locals.
#4  load_gaps_from_file (fp=fp@entry=0x7f8b3bf5f8e0 <_IO_2_1_stdin_>, timestamps=timestamps@entry=0x7ffdc1432278) at ../canplayer.c:270
        line = 0x559ca23672c0 ""
        len = 120
        read = <optimized out>
#5  0x0000559ca21e766b in main (argc=4, argv=0x7ffdc1432408) at ../canplayer.c:388
        buf = '\000' <repeats 6344 times>
        device = '\000' <repeats 21 times>
        afrbuf = '\000' <repeats 6299 times>
        cu = {cc = {can_id = 0, {len = 0 '\000', can_dlc = 0 '\000'}, __pad = 0 '\000', __res0 = 0 '\000', len8_dlc = 0 '\000', data = "\000\000\000\000\000\000\000"}, fd = {can_id = 0, len = 0 '\000', flags = 0 '\000', __res0 = 0 '\000', 
            __res1 = 0 '\000', data = '\000' <repeats 63 times>}, xl = {prio = 0, flags = 0 '\000', sdt = 0 '\000', len = 0, af = 0, data = '\000' <repeats 2047 times>}}
        today_tv = {tv_sec = 0, tv_usec = 0}
        log_tv = {tv_sec = 0, tv_usec = 0}
        last_log_tv = {tv_sec = 0, tv_usec = 0}
        diff_tv = {tv_sec = 0, tv_usec = 0}
        verbose = 0
        opt = <optimized out>
        delay_loops = 0
        skipgap = 0
        loopback_disable = 0
        infinite_loops = 0
        loops = 1
        addr = {can_family = 0, can_ifindex = 0, can_addr = {tp = {rx_id = 0, tx_id = 0}, j1939 = {name = 0, pgn = 0, addr = 0 '\000'}}}
        vcid_opts = {flags = 2 '\002', tx_vcid = 0 '\000', rx_vcid = 0 '\000', rx_vcid_mask = 0 '\000'}
        sleep_ts = {tv_sec = 0, tv_nsec = 0}
        timestamps = {sleep_vector = 0x559ca23672a0, idx = 0, size = 1}
        act_time = {tv_sec = 140727845855936, tv_usec = 4194304}
        init_time = {tv_sec = 0, tv_usec = 0}
        infile = 0x7f8b3bf5f8e0 <_IO_2_1_stdin_>
        gap = 1
        use_timestamps = 1
        interactive = 0
        count = 0
        gap_from_file = true
        assignments = <optimized out>
        s = <optimized out>
        i = <optimized out>
        j = <optimized out>
        fret = <optimized out>
        eof = <optimized out>
        sec = <optimized out>
        usec = <optimized out>
        txidx = <optimized out>
        txmtu = <optimized out>
        send_time = <optimized out>

GGZ8 added a commit to SECloudUNIMORE/can-utils that referenced this pull request May 17, 2024
Replay pre-recorded CAN Bus dumps respecting the original relative timestamps.

Fix code style

Conform to codebase style as per maintainer's suggestions.

Refactor code logic according to suggestions for PR linux-can#521
@marckleinebudde
Copy link
Member

marckleinebudde commented May 17, 2024

e49499d throws the following warnings on amd64:

➜ (pts/57) frogger@hardanger:can-utils-tmp/x86 (pr-realcan) make
  CC       canplayer.o
../canplayer.c:283:29: warning: format specifies type 'unsigned long *' but the argument has type '__time_t' (aka 'long') [-Wformat]
                sscanf(line, "(%lu.%lu)", current->tv_sec, current->tv_usec);
                               ~~~        ^~~~~~~~~~~~~~~
../canplayer.c:283:46: warning: format specifies type 'unsigned long *' but the argument has type '__suseconds_t' (aka 'long') [-Wformat]
                sscanf(line, "(%lu.%lu)", current->tv_sec, current->tv_usec);
                                   ~~~                     ^~~~~~~~~~~~~~~~

on 32 bit arm:

../canplayer.c: In function ‘load_gaps_from_file’:
../canplayer.c:283:34: warning: format ‘%lu’ expects argument of type ‘long unsigned int *’, but argument 3 has type ‘__time_t’ {aka ‘long int’} [-Wformat=]
  283 |                 sscanf(line, "(%lu.%lu)", current->tv_sec, current->tv_usec);
      |                                ~~^        ~~~~~~~~~~~~~~~
      |                                  |               |
      |                                  |               __time_t {aka long int}
      |                                  long unsigned int *
../canplayer.c:283:38: warning: format ‘%lu’ expects argument of type ‘long unsigned int *’, but argument 4 has type ‘__suseconds_t’ {aka ‘long int’} [-Wformat=]
  283 |                 sscanf(line, "(%lu.%lu)", current->tv_sec, current->tv_usec);
      |                                    ~~^                     ~~~~~~~~~~~~~~~~
      |                                      |                            |
      |                                      long unsigned int *          __suseconds_t {aka long int}
../canplayer.c: In function ‘main’:
../canplayer.c:574:59: warning: ‘timestamps.idx’ may be used uninitialized [-Wmaybe-uninitialized]
  574 |                                         if (gap_from_file && timestamps.idx > 0) {
      |                                             ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
../canplayer.c:327:22: note: ‘timestamps.idx’ was declared here
  327 |         struct sleep timestamps;
      |                      ^~~~~~~~~~

GGZ8 added a commit to SECloudUNIMORE/can-utils that referenced this pull request May 17, 2024
Replay pre-recorded CAN Bus dumps respecting the original relative timestamps.

Fix code style

Conform to codebase style as per maintainer's suggestions.

Refactor code logic according to suggestions for PR linux-can#521
@hartkopp
Copy link
Member

hartkopp commented May 19, 2024

I still have a problem with the new restriction, that requires a logfile and does not work with stdin anymore.

E.g. a common use-case to use a "remote" CAN interface on your local host is:
ssh user@host candump -L can0 | canplayer vcan0=can0

When looking at this code, which seems to be the relevant difference to the former nanosleep approach:

while (timercmp(&act_time, &send_time, <)){
    gettimeofday(&act_time, NULL);
    timersub(&act_time, &init_time, &act_time);
}

Why shouldn't this work with stdin too?

@marckleinebudde
Copy link
Member

If you read the timestamps into the array before stating the sending loop you need to rewind the input stream, this is not possible with stdin. Yes, it makes no sense to read the timestamps in the first place.

And the while loop Oliver mentioned in #521 (comment) is better coded with absolute timeouts, as I mentioned in #521 (comment).

GGZ8 added a commit to SECloudUNIMORE/can-utils that referenced this pull request May 22, 2024
Replay pre-recorded CAN Bus dumps respecting the original relative timestamps.

Fix code style

Conform to codebase style as per maintainer's suggestions.

Refactor code logic according to suggestions for PR linux-can#521
@GGZ8
Copy link
Contributor Author

GGZ8 commented May 22, 2024

I agree with marckleinebudde that is not possible to implement real time sending from stdin.
I modified the implementation to accept as input traces in which messages have absolute timestamps, rather than relative timestamps as before.
Rather than doing so in the main sending loop, I added some code to the initial pre-processing phase.

This solution does not introduce any delay in the sending phase.

@marckleinebudde
Copy link
Member

The question remains, why do you need to read all the time stamps before into the array?

And it's still not clear how the new option should be used:

$ ./canplayer -r -I ./candump-absolute.log
Specify an input file for option -r !
$ ./canplayer -r ./candump-absolute.log -I ./candump-absolute.log 
Specify an input file for option -r !

It only starts this way:

$ ./canplayer -I ./candump-absolute.log -r

I'm trying to replay candump-absolute.log. A CAN frame should be send every 100ms, but it doesn't work as expected:

$ candump any,0:0,#FFFFFFFF -cexdtd
 (000.000000)  vcan0  TX - -  002   [1]  00
 (000.100105)  vcan0  TX - -  002   [1]  01
 (000.099997)  vcan0  TX - -  002   [2]  02 00
 (000.099976)  vcan0  TX - -  002   [3]  03 00 00
 (000.100123)  vcan0  TX - -  002   [4]  04 00 00 00
 (000.599798)  vcan0  TX - -  002   [5]  05 00 00 00 00
 (000.000026)  vcan0  TX - -  002   [6]  06 00 00 00 00 00
 (000.000004)  vcan0  TX - -  002   [7]  07 00 00 00 00 00 00
 (000.000003)  vcan0  TX - -  002   [8]  08 00 00 00 00 00 00 00
 (000.000004)  vcan0  TX - -  002   [1]  09
 (000.000115)  vcan0  TX - -  002   [1]  0A
 (000.099956)  vcan0  TX - -  002   [2]  0B 00
 (000.100037)  vcan0  TX - -  002   [3]  0C 00 00
 (000.099992)  vcan0  TX - -  002   [4]  0D 00 00 00
 (000.099942)  vcan0  TX - -  002   [5]  0E 00 00 00 00
 (000.599921)  vcan0  TX - -  002   [6]  0F 00 00 00 00 00
 (000.000026)  vcan0  TX - -  002   [7]  10 00 00 00 00 00 00
 (000.000003)  vcan0  TX - -  002   [8]  11 00 00 00 00 00 00 00
 (000.000003)  vcan0  TX - -  002   [1]  12
 (000.000004)  vcan0  TX - -  002   [1]  13
 (000.000031)  vcan0  TX - -  002   [2]  14 00
 (000.100153)  vcan0  TX - -  002   [3]  15 00 00
 (000.099989)  vcan0  TX - -  002   [4]  16 00 00 00
 (000.100035)  vcan0  TX - -  002   [5]  17 00 00 00 00
 (000.099923)  vcan0  TX - -  002   [6]  18 00 00 00 00 00
 (000.599833)  vcan0  TX - -  002   [7]  19 00 00 00 00 00 00
 (000.000032)  vcan0  TX - -  002   [8]  1A 00 00 00 00 00 00 00
 (000.000007)  vcan0  TX - -  002   [1]  1B
 (000.000005)  vcan0  TX - -  002   [1]  1C
 (000.000004)  vcan0  TX - -  002   [2]  1D 00
 (000.000168)  vcan0  TX - -  002   [3]  1E 00 00
 (000.099850)  vcan0  TX - -  002   [4]  1F 00 00 00

@hartkopp
Copy link
Member

Additionally the canplayer has to be able to cope with timestamps that go back and forth as the timestamps created with candump any might have some jitter (at least) over different CAN interfaces. So there is no monotonic increasing timestamp.

The question to me is whether we can increase the accuracy on demand with some improvement that goes away from the "gap" approach.
But the features to read from stdin and the possibility to handle potential timestamp jitters have to remain,

@marckleinebudde
Copy link
Member

See: #521 (comment)

Replay pre-recorded CAN Bus dumps respecting the original relative timestamps.

Fix code style

Conform to codebase style as per maintainer's suggestions.

Refactor code logic according to suggestions for PR linux-can#521
@GGZ8
Copy link
Contributor Author

GGZ8 commented May 27, 2024

Since the -r option check the presence of an input file, the only way to use is -I input_file -r.

I implemented the solution you suggested in #521 (comment) removing the pre-processing phase while keeping the main loop implementation using the log_tv variable as absolute reference timestamp.
I tested the implementation using a CAN trace extracted from a real vehicle including over 1 billion of CAN frames and the results are comparable between the two implementations.

Thank you for the advise.

@hartkopp
Copy link
Member

Of course you name the variable to switch to the real time mode gap_from_file - but is this still the right naming?
When you now read the CAN log input in one single pass, are you sure you need to check whether you get this log file content from a file or from stdin ?

@GGZ8
Copy link
Contributor Author

GGZ8 commented Jun 4, 2024

I think that the variable name should remain gap_from_file since we are still reading the gaps from the input trace, just avoiding the preprocessing phase.

keep in mind that reading the messages directly from the stdin cannot guarantee any soft real-time requirements, due to all the overhead introduced by the OS from the sending program up to the exact moment we receive the message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants