Quoted next to / vs Binary String #34

Open
ktomk opened this issue Mar 14, 2020 · 0 comments

request for comments

I use the exporter library via PHPUnit.

When a comparison against expected output fails and the output contains characters from the C0 / C1 range (control characters), the string is hex-dumped.

This is rather cumbersome because the output is no longer easily readable. I can copy the hex string (the part after the leading 0x) and convert it back to a readable string, e.g.:

$ xclip -o -selection clipboard | xxd -r -p | less

This works but needs extra manual labour.

In these cases I can also patch the exporter by replacing, in src/Exporter.php,

                return 'Binary String: 0x' . \bin2hex($value);

with

                return 'Quoted String: "' . \addcslashes($value, "\0..\11!@\14..\37!@\177..\377") . '"';

which fits my needs perfectly. It suits my case especially well because I mostly deal with standard output that contains very few control characters (mainly flow control), some of which are fine to render as-is (like \n) and some of which are fine to escape (in octal notation, as the example code above does).
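To illustrate what the replacement line produces, here is a minimal sketch. The function name `quoteString` is hypothetical and only wraps the patched line; it is not part of the exporter. Note that the character list also names `!` and `@`, so those get a backslash too, while `\n` (and `\v`) fall outside both ranges and are emitted unchanged.

```php
<?php
// Hypothetical wrapper around the patched return statement; not exporter API.
function quoteString(string $value): string
{
    // Escapes bytes 0x00-0x09, 0x0C-0x1F and 0x7F-0xFF (plus "!" and "@").
    // Bytes with a C-style escape become e.g. \t or \r, the rest become octal.
    return 'Quoted String: "' . \addcslashes($value, "\0..\11!@\14..\37!@\177..\377") . '"';
}

// ESC (0x1B) is shown as \033, the tab as \t, the newline stays a real newline.
echo quoteString("col1\tcol2\n\x1b[0m done"), "\n";
```

The sketch leaves `"` and `\` in the input unescaped, just as the patched line does, so the result is meant for reading rather than for pasting back into PHP source.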

I'm filing this issue because I'd like to get some comments on this. My main aim here is readability for the user, which I think is what the exporter is for, but this is a broad topic and needs may vary across users.

I'm also asking myself what a good strategy would be. For purely binary strings the octal representation can be more verbose than the hexits (up to four characters per byte instead of two), so hex notation is still preferable there.

The routine could look for a NUL byte in the string and, if one is found, keep the previous behaviour of "Binary String: 0x....". This is comparable to how grep decides whether a file is binary or not.
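A small sketch of that size difference, using a made-up 16-byte sample in which every byte needs an octal escape. The sample data and variable names are illustrative assumptions; the character list is the one from the patch above.

```php
<?php
// Illustrative sample: 16 bytes, none printable and none with a C-style escape.
$binary = \str_repeat("\x00\xff\x80\x1b", 4);

// Hex dump: two characters per byte.
$hex = \bin2hex($binary);

// Octal escapes: backslash plus three digits, i.e. four characters per byte.
$escaped = \addcslashes($binary, "\0..\11!@\14..\37!@\177..\377");

\printf("hex: %d chars, escaped: %d chars\n", \strlen($hex), \strlen($escaped));
```

For mostly printable text the relation flips, since printable bytes cost one character when quoted but two when hex-dumped.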
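The NUL-byte idea could be sketched roughly as follows. The function `exportString` is hypothetical and stands in for the relevant branch of the exporter; how it would fit with the exporter's existing check for non-printable bytes is left open here.

```php
<?php
// Hypothetical function illustrating the proposed strategy; not exporter API.
function exportString(string $value): string
{
    // A NUL byte marks the string as binary: keep the existing hex dump.
    if (\strpos($value, "\0") !== false) {
        return 'Binary String: 0x' . \bin2hex($value);
    }

    // Otherwise assume mostly readable text and escape the control bytes.
    return 'Quoted String: "' . \addcslashes($value, "\0..\11!@\14..\37!@\177..\377") . '"';
}

echo exportString("a\0b"), "\n";    // takes the hex-dump branch
echo exportString("a\x1bb"), "\n";  // takes the quoted branch
```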

I normally only need the part of Unicode that UTF-8 shares with the US-ASCII character set, so UTF-8 encoding is not specifically addressed here, but it might play a role for others.
