Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

outchannel: eleminate type cast for compatibility reasons #5056

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

SchorppDA
Copy link

@SchorppDA SchorppDA commented Jan 3, 2023

Fixes #5047

@rgerhards rgerhards self-assigned this Jan 4, 2023
@rgerhards
Copy link
Member

Thanks for the PR, but we need to dig deeper into the root cause. I do not see why this can validly fix the issue. Note: I do not say "it does not", I just say "I do not understand why it works". I dug up some ARM v7 references, and char is by default unsigned so the same as unsigned char == uchar. So there is no reason why switching from char to uchar should make a difference.

Even worse: if it really does, we have probably many more places inside the codebase that we would need to look at. As such, I think it is vital to understand the root cause.

@SchorppDA would yo be willing to do a small rounds of debugging with more instrumentation to help nail this issue (I do not have a reproducer at hand)?

Besides that, the patch is also a kind of cleanup. It makes little sense to use char inside a function that is always called with `uchar's - but that's a different point. IOW: once we know the root cause, it would probably make sense to merge this PR for the cleanup :-)

@SchorppDA
Copy link
Author

I also did not really understand the cause, and therefore can understand why the PR is not yet accepted.
Yes I am also willing to do some debugging.
As I mentioned in Issues #5047 I feared a connection with the compiler and maybe some optimizations. Since also a debug line suddenly fixes the problem.

@rgerhards
Copy link
Member

Thanks!

If you just recompile the original code - without changes: does the problem persist? The reason is that I wonder if some compile time options by the packager influence the behavior. Also, can you confirm that skip_Comma() is actually never executed at all? Maybe via dbgprintf() at entry.

@rgerhards
Copy link
Member

side-note: I see invalid CI failures. It looks like some external testbench artifacts went away. I need to fix this via separate PR. Must have happened recently, maybe on Jan, 1st.

@SchorppDA
Copy link
Author

It is still not clear to me what is meant by "packager". But without the change the error is persistent in my setup.
If I make dbgprintf inside skip_Comma the problem is also suddenly solved, therefore also the suspicion with the compiler.

@rgerhards
Copy link
Member

It is still not clear to me what is meant by "packager".

The person/process which creates the binary package. Or did you manually compile and install?

If I make dbgprintf inside skip_Comma the problem is also suddenly solved

Do you see that dbgprintf inside the debug log? From the accompanying issue, I thought it was said that that skip_Comma was never executed.

Anyhow, can you apply this patch and post the relevant debug output:

diff --git a/outchannel.c b/outchannel.c
index 2f456b53b..46126edd4 100644
--- a/outchannel.c
+++ b/outchannel.c
@@ -75,12 +75,20 @@ static void skip_Comma(char **pp)
        assert(*pp != NULL);
 
        p = *pp;
-       while(isspace((int)*p))
+       dbgprintf("skip_Comma: enter *pp %p, p %p line: '%s'\n", *pp, p, p);
+       dbgprintf("skip_Comma: *p: %d '%c'\n", *p, *p);
+       while(isspace((int)*p)) {
+               dbgprintf("skip_Comma: seen space, next p %d '%c'\n", *p, *p);
                ++p;
+       }
+       dbgprintf("skip_Comma 2: *p: %d '%c' line: '%s'\n", *p, *p, p);
        if(*p == ',')
                ++p;
-       while(isspace((int)*p))
+       while(isspace((int)*p)) {
                ++p;
+               dbgprintf("skip_Comma 2: seen space, next p %d '%c'\n", *p, *p);
+       }
+       dbgprintf("skip_Comma: exit line: '%s'\n", p);
        *pp = p;
 }
 

@rgerhards rgerhards changed the title eleminate type cast outchannel: eleminate type cast for compatibility reasons Jan 11, 2023
@rgerhards
Copy link
Member

Just FYI: #5063

I would still be interested in the output of proposed patch above so that we can finally debug the issue in $outchannel

@rgerhards
Copy link
Member

@SchorppDA do you have any updates for me?

@patrickdepinguin
Copy link

patrickdepinguin commented Jan 31, 2023

Hi @rgerhards , I am in the same situation as OP, also finding this problem after upgrading the Yocto embedded build system to the Kirkstone (4) release.

I applied your debug patch, but with that in place the issue does not occur.
Just for reference, here is the output:

1993.256422375:main thread    : ../../rsyslog-8.2206.0/runtime/rsconf.c: cnf:global:cfsysline: $outchannel messages,/var/log/messages,1000000,/usr/sbin/logrotate /etc/logrotate-messages.conf
1993.256425785:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: skip_Comma: enter *pp 0x55cc59b3b3c5, p 0x55cc59b3b3c5 line: '/var/log/messages,1000000,/usr/sbin/logrotate /etc/logrotate-messages.conf'
1993.256427165:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: skip_Comma: *p: 47 '/'
1993.256428525:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: skip_Comma 2: *p: 47 '/' line: '/var/log/messages,1000000,/usr/sbin/logrotate /etc/logrotate-messages.conf'
1993.256429645:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: skip_Comma: exit line: '/var/log/messages,1000000,/usr/sbin/logrotate /etc/logrotate-messages.conf'
1993.256431295:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: skip_Comma: enter *pp 0x55cc59b3b3d6, p 0x55cc59b3b3d6 line: ',1000000,/usr/sbin/logrotate /etc/logrotate-messages.conf'
1993.256432455:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: skip_Comma: *p: 44 ','
1993.256433645:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: skip_Comma 2: *p: 44 ',' line: ',1000000,/usr/sbin/logrotate /etc/logrotate-messages.conf'
1993.256434735:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: skip_Comma: exit line: '1000000,/usr/sbin/logrotate /etc/logrotate-messages.conf'
1993.256436065:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: skip_Comma: enter *pp 0x55cc59b3b3de, p 0x55cc59b3b3de line: ',/usr/sbin/logrotate /etc/logrotate-messages.conf'
1993.256437215:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: skip_Comma: *p: 44 ','
1993.256438375:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: skip_Comma 2: *p: 44 ',' line: ',/usr/sbin/logrotate /etc/logrotate-messages.conf'
1993.256439465:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: skip_Comma: exit line: '/usr/sbin/logrotate /etc/logrotate-messages.conf'
[..]
1993.258258956:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: Outchannel: Name='messages'
1993.258260526:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: 	File Template: '/var/log/messages'
1993.258262246:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: 	Max Size.....: 1000000
1993.258263766:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: 	OnSizeLimtCmd: '/usr/sbin/logrotate /etc/logrotate-messages.conf'

With the changes of this PR in mind, I found the following note in 'man isspace':

NOTES
       The  standards  require  that  the argument c for these functions is either EOF or a value that is 
       representable in the type unsigned char.  If the argument c is of type char, it must be cast to 
       unsigned char, as in the following example:

           char c;
           ...
           res = toupper((unsigned char) c);

       This is necessary because char may be the equivalent of signed char, in which case a byte where
       the top bit is set would be sign extended when converting to int, yielding a value that is outside the
       range of unsigned char.

Going back to the original code, I tried the following:

  • compile without any change
  • changed the cast to 'int' in the isspace call to 'signed char' explicitly
  • changed the cast to 'int' in the isspace call to 'unsigned char'

In all the above cases, the error still occurred and the debug output is:

4080.547161184:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: Outchannel: Name='messages'
4080.547162304:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: 	File Template: '/var/log/messages'
4080.547163474:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: 	Max Size.....: 0
4080.547164624:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: 	OnSizeLimtCmd: ',1000000,/usr/sbin/logrotate /etc/logrotate-messages.conf'

I then applied basically the same patch as proposed in this PR, additionally removing the cast to 'int' (which makes the call to isspace in line with the example in the man page):

diff --git a/outchannel.c b/outchannel.c
index 2f456b53b..236281019 100644
--- a/outchannel.c
+++ b/outchannel.c
@@ -67,19 +67,19 @@ struct outchannel* ochConstruct(void)
 /* skips the next comma and any whitespace
  * in front and after it.
  */
-static void skip_Comma(char **pp)
+static void skip_Comma(uchar **pp)
 {
-	register char *p;
+	register uchar *p;
 
 	assert(pp != NULL);
 	assert(*pp != NULL);
 
 	p = *pp;
-	while(isspace((int)*p))
+	while(isspace(*p))
 		++p;
 	if(*p == ',')
 		++p;
-	while(isspace((int)*p))
+	while(isspace(*p))
 		++p;
 	*pp = p;
 }
@@ -98,7 +98,7 @@ static rsRetVal get_Field(uchar **pp, uchar **pField)
 	assert(*pp != NULL);
 	assert(pField != NULL);
 
-	skip_Comma((char**)pp);
+	skip_Comma(pp);
 	p = *pp;
 
 	CHKiRet(cstrConstruct(&pStrB));
@@ -135,7 +135,7 @@ static int get_off_t(uchar **pp, off_t *pOff_t)
 	assert(*pp != NULL);
 	assert(pOff_t != NULL);
 
-	skip_Comma((char**)pp);
+	skip_Comma(pp);
 	p = *pp;
 
 	val = 0;
@@ -166,7 +166,7 @@ static rsRetVal get_restOfLine(uchar **pp, uchar **pBuf)
 	assert(*pp != NULL);
 	assert(pBuf != NULL);
 
-	skip_Comma((char**)pp);
+	skip_Comma(pp);
 	p = *pp;
 
 	CHKiRet(cstrConstruct(&pStrB));

In this case, the error is gone and debug output is:

966:0075.888533410:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: Outchannel: Name='messages'
967:0075.888534540:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: 	File Template: '/var/log/messages'
968:0075.888535700:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: 	Max Size.....: 1000000
969:0075.888536820:main thread    : ../../rsyslog-8.2206.0/runtime/../outchannel.c: 	OnSizeLimtCmd: '/usr/sbin/logrotate /etc/logrotate-messages.conf'

Finally, I compared the assembly for skip_Comma between this patched version and the original unchanged version.
The difference is one instruction (occurring several times):

  • original: movsbq (%rbx),%rax
  • patched without casts: movzbl (%rbx),%eax

Please find the full assembly for skip_Comma attached:
dump-original-skipcomma.txt
dump-nocast-skipcomma.txt

As far as I can see, movsbq uses sign-extension, and movzbl uses zero-extension. This difference is exactly what the isspace manpage refers to.

So in conclusion, I do think that removing the casts between char and unsigned char is a correct fix.

If there are other calls to such functions (isalnum, isalpha, isascii, isblank, iscntrl, isdigit, isgraph, islower, isprint, ispunct, isspace, isupper, isxdigit, isalnum_l, isalpha_l, isascii_l, isblank_l, iscntrl_l, isdigit_l, isgraph_l, islower_l, isprint_l, is‐ punct_l, isspace_l, isupper_l, isxdigit_l) then they should best be audited to make sure their input is unsigned.

@SchorppDA
Copy link
Author

Hi @rgerhards,

unfortunately I could not work on the problem the last weeks.
For the completeness the output with the debug patch attached.
output.txt
With the debug patch, however, the problem no longer occurs.

I will update my PR with the fixes from @patrickdepinguin and make an attempt of a better commit message

According to the manpage the input for isspace must be of type unsigend char.
Remove all unnecessary type cast to achieve this.
@patrickdepinguin
Copy link

Hi @rgerhards , would you have some time to look at this again please?

@rgerhards
Copy link
Member

sry, I have somehow overlooked the recent activity on this tracker. Even more unfortunately, I will probably not be able to check details before mid-April (but will try to be faster).

Please note that with the above-mentioned new style config in place, this issue has lower priority for me. Every can apply that patch and use new style with the new config. While I agree that this patch here seems to solve the problem, I have not yet a sound explanation why it does so. I do not like to merge just on the "seems to work" basis, even if it for sure is an improvement over the current situation: I would like to understand the core issue , so that I can also check if there are related areas in code. And also once I merge, everybody else probably looses interest in the issue ;-)

Again: #5063 is the solution for anyone looking to solve the situation right now.

@rgerhards
Copy link
Member

@patrickdepinguin I read your great post #5056 (comment) in-depth. It really is helpful. Just let me check some other things.

@rgerhards
Copy link
Member

OK, the int cast stems back to some compatibility issue with BSD build - but that was 16 years ago. I wonder if someone could confirm the new patch on BSD. It was only a warning druing build, if my mail archive doesn't misguide me.

489a51b#diff-7035bccafe4326c231eb577068f80a66304550db628ede2ceb9a12b7c000b243

Alternatively, we could do a "stupid save bet" by casting ((int) ((uchar *) c)

@patrickdepinguin what's you take on this?

@rgerhards
Copy link
Member

TBH, I cannot outrule that the BSD warning 16 yrs ago was for exactly the issue we now see. Unfortunately, I do not have any more info available. :-(

@patrickdepinguin
Copy link

Hi @rgerhards, sorry for the late reply, this slipped my mind.

I think the best approach should be to align with the man page of isspace. All of the man pages that I find (on my system as well as on the internet), specify something along these lines:

The c argument is an int, the value of which the application shall ensure is a character
representable as an unsigned char or equal to the value of the macro EOF. If the argument
has any other value, the behavior is undefined. 

The NetBSD man page does also: (https://man.netbsd.org/NetBSD-9.1/isspace.3)

The argument to isspace() must be EOF or representable as an unsigned
     char; otherwise, the behavior is undefined.  See the CAVEATS section of
     [ctype(3)](https://man.netbsd.org/NetBSD-9.1/ctype.3) for more details.

and that linked section says: (https://man.netbsd.org/NetBSD-9.1/ctype.3)

[CAVEATS]

     The argument of these functions is of type int, but only a very
     restricted subset of values are actually valid.  The argument must either
     be the value of the macro EOF (which has a negative value), or must be a
     non-negative value within the range representable as unsigned char.
     Passing invalid values leads to undefined behavior.

     Values of type int that were returned by [getc(3)](https://man.netbsd.org/NetBSD-9.1/getc.3), [fgetc(3)](https://man.netbsd.org/NetBSD-9.1/fgetc.3), and similar
     functions or macros are already in the correct range, and may be safely
     passed to these ctype functions without any casts.

     Values of type char or signed char must first be cast to unsigned char,
     to ensure that the values are within the correct range.  Casting a nega-
     tive-valued char or signed char directly to int will produce a negative-
     valued int, which will be outside the range of allowed values (unless it
     happens to be equal to EOF, but even that would not give the desired
     result).

     Because the bugs may manifest as silent misbehavior or as crashes only
     when fed input outside the US-ASCII range, the NetBSD implementation of
     the ctype functions is designed to elicit a compiler warning for code
     that passes inputs of type char in order to flag code that may pass nega-
     tive values at runtime that would lead to undefined behavior:

           #include <ctype.h>
           #include <locale.h>
           #include <stdio.h>

           int
           main(int argc, char **argv)
           {

                   if (argc < 2)
                           return 1;
                   setlocale(LC_ALL, "");
                   printf("%d %d\n", *argv[1], isprint(*argv[1]));
                   printf("%d %d\n", (int)(unsigned char)*argv[1],
                       isprint((unsigned char)*argv[1]));
                   return 0;
           }

     When compiling this program, GCC reports a warning for the line that
     passes char.  At runtime, you may get nonsense answers for some inputs
     without the cast -- if you're lucky and it doesn't crash:

           % gcc -Wall -o test test.c
           test.c: In function 'main':
           test.c:12:2: warning: array subscript has type 'char'
           % LC_CTYPE=C ./test $(printf '\270')
           -72 5
           184 0
           % LC_CTYPE=C ./test $(printf '\377')
           -1 0
           255 0
           % LC_CTYPE=fr_FR.ISO8859-1 ./test $(printf '\377')
           -1 0
           255 2

     Some implementations of libc, such as glibc as of 2018, attempt to avoid
     the worst of the undefined behavior by defining the functions to work for
     all integer inputs representable by either unsigned char or char, and
     suppress the warning.  However, this is not an excuse for avoiding con-
     version to unsigned char: if EOF coincides with any such value, as it
     does when it is -1 on platforms with signed char, programs that pass char
     will still necessarily confuse the classification and mapping of EOF with
     the classification and mapping of some non-EOF inputs.

Going back to the rsyslog code, all users of skip_Comma already had a pointer-pointer to uchar, which is the required type for isspace. The fact that skip_Comma takes a (pointer-pointer to) signed char, and then casting it back to uchar, is not correct and not needed. Instead, skip_Comma can simply change its signature and accept pointer-pointer to uchar, so that it can be passed unchanged to isspace. This is what the patch I proposed in my comment does. I think it should pass the NetBSD compilation as well.

@century6
Copy link

May I ask if there is any news?

@multiple1
Copy link

Any update on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Outchannel parsing is not working
5 participants