Implementation of mp_printf #550

Open
wants to merge 4 commits into
base: develop

czurnieden
Contributor

Implementation of mp_(f)printf with the additional type modifiers Z (signed big integer), M (unsigned limb), and N (a printout of big_integer->dp), all formattable (to some extent) as in printf(3), and with a binary representation, too. See the documentation and source for details.

Uses fprintf(3) for output and some of the formatting, so it is bracketed inside #ifndef MP_NO_FILE.

It is rather large (6k - 12k stripped, depending on MP_xxBIT), but the folks over at GMP were right: you basically have to write a complete printf. Well, almost complete.

@czurnieden
Contributor Author

Ah, it is the C89 mode that gives me trouble!
Ok, but not tonight.

@sjaeckel
Member

Nice!

How about adding an option to have a stripped down version that can hook into glibc's printf via register_printf_function()? I can see the case where one wants a full printf implementation, but I guess a lot of our users are already on a glibc based system, so that'd bring less baggage for them.

@czurnieden
Contributor Author

> How about adding an option to have a stripped down version that can hook into glibc's printf via register_printf_function()?

Yes, that was exactly what I tried first. I'm quite lazy, y'know ;-)
But it isn't that simple. If I can find it… ah, here it is, a quick & dirty trial:

#include <stdio.h>
#include <stdlib.h>
#include <printf.h>
#include <tommath.h>

static int print_mp_int(FILE *stream,  const struct printf_info *info,  const void *const *args)
{
   mp_err err = MP_OKAY;
   const mp_int *a;
   char *buf;
   char hexbuf[3] = {0};
   int base = 10;
   size_t size, written;
   int len;

   a = *((const mp_int * const *)(args[0]));

   if(info->alt == 1u) {
      base = 16;
   }
   if ((err = mp_radix_size_overestimate(a, base, &size)) != MP_OKAY) {
      return -1;
   }
   buf = malloc(size);
   if (buf == NULL) {
      return -1;
   }
   if ((err = mp_to_radix(a, buf, size, &written, base)) != MP_OKAY) {
      free(buf);
      return -1;
   }
   if(base == 16) {
      hexbuf[0] = '0';
      hexbuf[1] = 'x';
   }
   /* It's not that simple, sadly. We need to do the whole string, cannot cobble it together with printf */
   len = fprintf(stream, "%s%*s", hexbuf, (info->left ? -info->width : info->width), buf);
   free(buf);
   return len;
}
static int print_mp_int_arginfo(const struct printf_info *info, size_t n,
                         int *argtypes, int *size)
{
   if (n > 0) {
      argtypes[0] = PA_POINTER;
      size[0] = sizeof(mp_int *);
   }
   return 1;
}
int main(void)
{
   mp_err err;
   mp_int a;

   if ((err = mp_init(&a)) != MP_OKAY)        goto LTM_ERR;
   if ((err = mp_rand(&a,2)) != MP_OKAY)        goto LTM_ERR;

   /* 'Z' is already taken (legacy, alternative to 'z') */
   register_printf_specifier('N', print_mp_int, print_mp_int_arginfo);

   printf("Right aligned AAA %50N BBB\n", &a);
   printf("Left aligned  AAA %-50N BBB\n", &a);
   printf("hex with right align  AAA %#50N BBB\n", &a);
   printf("hex with left align   AAA %#-50N BBB\n", &a);

   mp_clear(&a);
   exit(EXIT_SUCCESS);
LTM_ERR:
   mp_clear(&a);
   exit(EXIT_FAILURE);
}

$ clang-12 -Weverything ext_printf.c -o printf  /home/czurnieden/GITHUB/libtommath/libtommath.a
ext_printf.c:40:59: warning: unused parameter 'info' [-Wunused-parameter]
static int print_mp_int_arginfo(const struct printf_info *info, size_t n,
                                                          ^
ext_printf.c:62:33: warning: invalid conversion specifier 'N' [-Wformat-invalid-specifier]
   printf("Right aligned AAA %50N BBB\n", &a);
                             ~~~^
ext_printf.c:63:34: warning: invalid conversion specifier 'N' [-Wformat-invalid-specifier]
   printf("Left aligned  AAA %-50N BBB\n", &a);
                             ~~~~^
ext_printf.c:64:42: warning: invalid conversion specifier 'N' [-Wformat-invalid-specifier]
   printf("hex with right align  AAA %#50N BBB\n", &a);
                                     ~~~~^
ext_printf.c:65:43: warning: invalid conversion specifier 'N' [-Wformat-invalid-specifier]
   printf("hex with left align   AAA %#-50N BBB\n", &a);
$ ./printf
Right aligned AAA              1308344121715993167873922727409124568 BBB
Left aligned  AAA 1308344121715993167873922727409124568              BBB
hex with right align  AAA 0x                    FBFA58737F302F9A32DA26303884D8 BBB
hex with left align   AAA 0xFBFA58737F302F9A32DA26303884D8                     BBB

The parsed format specification (the mark-up, for lack of a better word) is already in the struct printf_info, so we do not need to parse it, "only" build the correct output string. That is most likely shorter than what I did here, where I did the parsing myself and left all the formatting to printf.

But I can complete my example (N for bigint) and see how it works out.
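The "build the correct output string" step mentioned above could be sketched roughly as follows. This is only a minimal illustration, not code from this PR: `format_field` is a hypothetical helper, and it assumes the digits have already been produced (for example by mp_to_radix) and that the prefix, width, and alignment come from the struct printf_info. It places the padding outside the prefix, so the "0x" stays attached to the digits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libtommath or glibc): lay out
 * prefix + digits inside a field of at least `width` characters.
 * `left` != 0 pads on the right (left aligned), otherwise on the left.
 * Returns the number of characters written (without the NUL) or -1
 * if `out` is too small.
 */
int format_field(char *out, size_t outsz, const char *prefix,
                 const char *digits, int width, int left)
{
   size_t plen, dlen, body, pad, pos = 0;

   assert(out != NULL && prefix != NULL && digits != NULL);

   plen = strlen(prefix);
   dlen = strlen(digits);
   body = plen + dlen;
   /* Pad only if the field is wider than prefix and digits together */
   pad = ((width > 0) && ((size_t)width > body)) ? (size_t)width - body : 0;

   if ((body + pad + 1) > outsz) {
      return -1;
   }
   if (!left) {
      memset(out + pos, ' ', pad);
      pos += pad;
   }
   /* The prefix is glued to the digits, the padding never splits them */
   memcpy(out + pos, prefix, plen);
   pos += plen;
   memcpy(out + pos, digits, dlen);
   pos += dlen;
   if (left) {
      memset(out + pos, ' ', pad);
      pos += pad;
   }
   out[pos] = '\0';
   return (int)pos;
}
```

In a handler like print_mp_int, the finished buffer could then be written to the stream in one go with fputs(3), instead of combining the prefix and the padded digits in the fprintf format string.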

@czurnieden
Contributor Author

@sjaeckel Seems to work, so where do we go from here?

  • Implementation of mp_printf #550 (this one): nearly complete but also quite large. Standard C and almost POSIXly correct.
  • Print big integers with (f)printf #551: small but not yet complete (expect it to about double in size for mp_digit and a.dp, although it would still be smaller), glibc only, and a bit awkward to use (although some change is possible).
  • None of the above, we're good.

@czurnieden
Contributor Author

The standardization of printf is an outright mess. There are POSIX, ISO 9899:xxxx, glibc, and other libcs (mainly the BSD ones) that all do their own thing. Below is what I would implement, which is more or less what is expected from a standard printf.

; Extended and simplified Posix file format to support big integers

; This ABNF has been extended with curly braces to mark a kind of
; set. The example
;
;     {a, b, c}
;
; translates into all subsets and their permutations:
;
;     ()/a/b/c/(a b)/(b a)/(a c)/(c a)/(b c)/(c b)/(a b c)/(b a c)/
;     (c a b)/(a c b)/(b c a)/(c b a)
;

; PRINTABLE   = %x20-7E

HEXDIGIT    = DIGIT / ( %x41-46 / %x61-66 )      ; 0-9 / (A-F/a-f)
HASH        = %x23                               ; '#'
PERIOD      = %x2E                               ; '.' (locale dependent if used as a decimal point)
FORMSTART   = %x25                               ; '%'
MINUS       = %x2D                               ; '-'
PLUS        = %x2B                               ; '+'
ZERO        = %x30                               ; '0' (for padding)
QUOT        = %x27                               ; "'" single quote

; The compiler will, in most cases, do the un-escaping.
ESCSIGN     = %x5C                               ; '\'
ESCCHAR     =  %s'a' / %s'b' / %s'f' / %s'n' / %s'r'
ESCCHAR     =/ %s't' / %s'v'
ESCCHAR     =/ DQUOTE / ESCSIGN                  ; '\"' and  '\\'
ESCAPE      = ESCSIGN ESCCHAR

EXCOCTAL    = ESCSIGN ZERO 0*3DIGIT              ; no digits -> zero
EXCHEX      = ESCSIGN %i'x' 0*2HEXDIGIT          ; extension. no digits -> zero

FLOATSPEC   =  %s'a' / %s'A' / %s'e' / %s'E'         ; Floating point conversion specifiers
FLOATSPEC   =/ %s'f' / %s'F' / %s'g' / %s'G'         ; Floating point conversion specifiers
INTSPEC     =  %s'd' / %s'i' / %s'o' / %s'u' / %s'x' / %s'X' ; Integer conversion specifiers (d == i)
INTSPEC     =/ %s'b' / %s'B'                         ; changed from the Posix definition to binary representation
INTSPEC     =/ %x40                                  ; ('@' sign) extension for base64 representation
CHARSPEC    = %s'c'                                  ; one byte (integer)
STRINGSPEC  = %s's'                                  ; nul-terminated string
PTRSPEC     = %s'p'                                  ; value of a pointer
LENSPEC     = %s'n'                                  ; puts number of printed characters up to FORMSTART
                                                     ; in the argument given
ERRSPEC     = %s'm'                                  ; Glibc extension to print output of strerror(errno)
                                                     ; without argument. (reads global variable errno)

INTMOD      =  %s'l' / %s"ll" / %s'h' / %s"hh" ; 'l' also extends "char" to a "wint_t", char* to wchar_t*
INTMOD      =/ %s'j' / %s'z' / %s't'           ; (u)intmax, (s)size_t, (signed?) ptrdiff
FLOATMOD    = %s'L'                            ; "long double". Extended "double".
                                               ; Most likely binary80, but can be binary128, too.
                                               ; Glibc's extension 'll == L' is not supported

; The non-standard extensions "q == ll", "C == lc", "S == ls" are not supported

; Big integer extensions
BIGINTMOD   =  %s'Z' ; The big integer itself (also a very old synonym for 'z')
BIGINTMOD   =/ %s'M' ; A single limb
BIGINTMOD   =/ %s'N' ; The raw limb-array

; Formatting

; The non-standard extension 'I' (upper case i) for locale dependent digits is not supported

; Flags
; Without the new "set" formatting {...} it would be this abomination
;FRMTFLAGS   =  ([(SP / PLUS)] [(MINUS / ZERO)] [HASH] [QUOT] )
;FRMTFLAGS   =/ ([(MINUS / ZERO)] [(SP / PLUS)] [HASH] [QUOT] )
;FRMTFLAGS   =/ ([HASH] [(SP / PLUS)] [(MINUS / ZERO)] [QUOT] )
;FRMTFLAGS   =/ ([(SP / PLUS)] [HASH] [(MINUS / ZERO)] [QUOT] )
;FRMTFLAGS   =/ ([(MINUS / ZERO)] [HASH] [(SP / PLUS)] [QUOT] )
;FRMTFLAGS   =/ ([HASH] [(MINUS / ZERO)] [(SP / PLUS)] [QUOT] )
;FRMTFLAGS   =/ ([HASH] [(MINUS / ZERO)] [QUOT]  [(SP / PLUS)])
;FRMTFLAGS   =/ ([(MINUS / ZERO)] [HASH] [QUOT]  [(SP / PLUS)])
;FRMTFLAGS   =/ ([QUOT]  [HASH] [(MINUS / ZERO)] [(SP / PLUS)])
;FRMTFLAGS   =/ ([HASH] [QUOT]  [(MINUS / ZERO)] [(SP / PLUS)])
;FRMTFLAGS   =/ ([(MINUS / ZERO)] [QUOT]  [HASH] [(SP / PLUS)])
;FRMTFLAGS   =/ ([QUOT]  [(MINUS / ZERO)] [HASH] [(SP / PLUS)])
;FRMTFLAGS   =/ ([QUOT]  [(SP / PLUS)] [HASH] [(MINUS / ZERO)])
;FRMTFLAGS   =/ ([(SP / PLUS)] [QUOT]  [HASH] [(MINUS / ZERO)])
;FRMTFLAGS   =/ ([HASH] [QUOT]  [(SP / PLUS)] [(MINUS / ZERO)])
;FRMTFLAGS   =/ ([QUOT]  [HASH] [(SP / PLUS)] [(MINUS / ZERO)])
;FRMTFLAGS   =/ ([(SP / PLUS)] [HASH] [QUOT]  [(MINUS / ZERO)])
;FRMTFLAGS   =/ ([HASH] [(SP / PLUS)] [QUOT]  [(MINUS / ZERO)])
;FRMTFLAGS   =/ ([(MINUS / ZERO)] [(SP / PLUS)] [QUOT]  [HASH])
;FRMTFLAGS   =/ ([(SP / PLUS)] [(MINUS / ZERO)] [QUOT]  [HASH])
;FRMTFLAGS   =/ ([QUOT]  [(MINUS / ZERO)] [(SP / PLUS)] [HASH])
;FRMTFLAGS   =/ ([(MINUS / ZERO)] [QUOT]  [(SP / PLUS)] [HASH])
;FRMTFLAGS   =/ ([(SP / PLUS)] [QUOT]  [(MINUS / ZERO)] [HASH])
;FRMTFLAGS   =/ ([QUOT]  [(SP / PLUS)] [(MINUS / ZERO)] [HASH])
FRMTFLAGS   = { (SP / PLUS), (MINUS / ZERO), HASH, QUOT }

; Width
FRMTWIDTH   = 1*DIGIT                          ; length is not arbitrary but restricted (min. 16 bit)

; Precision
FRMTPREC    = PERIOD *DIGIT                    ; no digits -> zero

; Specifier
FRMTINTSPEC = [INTMOD / BIGINTMOD] INTSPEC
FRMTFLTSPEC = [FLOATMOD] FLOATSPEC
FRMTCHRSPEC = [%s'l'] CHARSPEC
FRMTSTRSPEC = [%s'l'] STRINGSPEC
FRMTPTRSPEC = [INTMOD] PTRSPEC
FRMTLENSPEC = LENSPEC
FRMTERRSPEC = ERRSPEC

FRMTSPEC    = FRMTINTSPEC / FRMTFLTSPEC / FRMTCHRSPEC / FRMTSTRSPEC / FRMTPTRSPEC / FRMTLENSPEC / FRMTERRSPEC

; All together
FORMAT      =  (FORMSTART FORMSTART)                                   ; Print a '%'
FORMAT      =/ (FORMSTART [FRMTFLAGS] [FRMTWIDTH] [FRMTPREC] FRMTSPEC) ; Min. format e.g.: "%d", "%u", "%x"

*pooh*
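To make the grammar a bit more tangible, here is a rough sketch of a recognizer for a single FORMAT in C. It is not the parser from this PR. The names `fmt_spec`, `parse_format`, `in_set`, `read_number`, and the limit `FMT_MAX_FIELD` are made up for illustration. The sketch assumes that the compiler has already handled the escapes, that each flag group of the set may appear at most once in any order, that a '0' following an already seen MINUS or ZERO starts the width, and that 'd' is an integer conversion only.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define FMT_MAX_FIELD 32767 /* assumed upper bound for width and precision */

/* Hypothetical result type; all names here are illustrative only. */
typedef struct {
   char sign;    /* ' ' or '+', 0 if absent: (SP / PLUS)    */
   char pad;     /* '-' or '0', 0 if absent: (MINUS / ZERO) */
   int  hash;    /* 1 if HASH was given                     */
   int  quote;   /* 1 if QUOT was given                     */
   int  width;   /* -1 if absent                            */
   int  prec;    /* -1 if absent, 0 for a bare PERIOD       */
   char mod[3];  /* length modifier, "" if absent           */
   char spec;    /* conversion specifier, '%' for "%%"      */
} fmt_spec;

static int in_set(const char *set, char c)
{
   return (c != '\0') && (strchr(set, c) != NULL);
}

/* Read *DIGIT into *v; returns 0 if the value would exceed FMT_MAX_FIELD. */
static int read_number(const char **p, int *v)
{
   *v = 0;
   while (isdigit((unsigned char)**p)) {
      int d = **p - '0';
      if (*v > ((FMT_MAX_FIELD - d) / 10)) return 0;
      *v = (*v * 10) + d;
      (*p)++;
   }
   return 1;
}

/* Parse one FORMAT at s; returns the characters consumed or 0 on error. */
size_t parse_format(const char *s, fmt_spec *f)
{
   const char *p = s;
   int m_none, m_int, m_big, m_flt, m_l, ok;

   memset(f, 0, sizeof(*f));
   f->width = -1;
   f->prec = -1;
   if (*p++ != '%') return 0;
   if (*p == '%') { f->spec = '%'; return 2; }

   /* FRMTFLAGS: any order, every group at most once */
   for (;; p++) {
      if ((*p == ' ') || (*p == '+')) {
         if (f->sign != 0) return 0;
         f->sign = *p;
      } else if ((*p == '-') || (*p == '0')) {
         if (f->pad != 0) {
            if (*p == '0') break; /* leading zero of FRMTWIDTH */
            return 0;
         }
         f->pad = *p;
      } else if (*p == '#') {
         if (f->hash) return 0;
         f->hash = 1;
      } else if (*p == '\'') {
         if (f->quote) return 0;
         f->quote = 1;
      } else {
         break;
      }
   }
   /* FRMTWIDTH and FRMTPREC */
   if (isdigit((unsigned char)*p) && !read_number(&p, &f->width)) return 0;
   if (*p == '.') {
      p++;
      if (!read_number(&p, &f->prec)) return 0;
   }
   /* INTMOD / FLOATMOD / BIGINTMOD */
   if (((p[0] == 'l') || (p[0] == 'h')) && (p[1] == p[0])) {
      f->mod[0] = f->mod[1] = *p;
      p += 2;
   } else if (in_set("lhjztLZMN", *p)) {
      f->mod[0] = *p++;
   }
   m_none = (f->mod[0] == '\0');
   m_int  = in_set("lhjzt", f->mod[0]);
   m_big  = in_set("ZMN", f->mod[0]);
   m_flt  = (f->mod[0] == 'L');
   m_l    = (strcmp(f->mod, "l") == 0);

   /* FRMTSPEC: the modifier has to fit the conversion specifier */
   if (in_set("diouxXbB@", *p))     ok = m_none || m_int || m_big;
   else if (in_set("aAeEfFgG", *p)) ok = m_none || m_flt;
   else if (in_set("cs", *p))       ok = m_none || m_l;
   else if (*p == 'p')              ok = m_none || m_int;
   else if (in_set("nm", *p))       ok = m_none;
   else                             ok = 0;
   if (!ok) return 0;
   f->spec = *p++;
   return (size_t)(p - s);
}
```

Tracking one field per flag group is what replaces the 24 commented-out FRMTFLAGS alternatives: the order does not matter, only that no group shows up twice. Note that this is stricter than most libc implementations, which tolerate repeated flags.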

@czurnieden
Contributor Author

It might be my purely subjective perception, but the CI seems to have become a bit unstable lately, hasn't it?


2 participants