Feature Request: Word List Substitution #718

Open
Ne0nd0g opened this issue Apr 25, 2023 · 9 comments
Labels
enhancement New feature or request

Comments


Ne0nd0g commented Apr 25, 2023

To counter entropy-based detection, it would be useful to have Garble combine N words from a provided word list and use the result for string replacement instead of random characters. It could also accept a maximum string length and trim the result to that length.
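As a rough illustration of the request, here is a minimal sketch in Go. The function `wordListName`, its parameters, and the sample word list are all hypothetical and not part of garble; the sketch also assumes the word list contains only non-empty ASCII words.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"strings"
)

// wordListName is a hypothetical helper, not garble's API.
// It picks n words using a hash of seed and name, joins them in
// camelCase, and trims the result to maxLen bytes (0 means no limit).
// Assumes words is non-empty and holds non-empty ASCII words.
func wordListName(seed, name string, words []string, n, maxLen int) string {
	sum := sha256.Sum256([]byte(seed + "\x00" + name))
	var b strings.Builder
	for i := 0; i < n; i++ {
		// Each word consumes 4 bytes of the hash; a SHA-256 sum has
		// room for 8 picks, after which the bytes are reused.
		idx := binary.BigEndian.Uint32(sum[(i%8)*4:]) % uint32(len(words))
		w := words[idx]
		if i > 0 {
			w = strings.ToUpper(w[:1]) + w[1:]
		}
		b.WriteString(w)
	}
	out := b.String()
	if maxLen > 0 && len(out) > maxLen {
		out = out[:maxLen]
	}
	return out
}

func main() {
	words := []string{"parse", "config", "buffer", "handler", "client", "index"}
	fmt.Println(wordListName("build-seed", "myFunc", words, 3, 16))
}
```

Note that trimming discards part of the hash-derived choice, so a short maximum length makes collisions between names more likely.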

Member

lu4p commented Apr 26, 2023

This could surely be done, however I don't quite understand why we would do this. Can you elaborate?

Member

mvdan commented Apr 27, 2023

This reminds me of #593, which was designed to make it a little less trivial to detect that a binary was built with garble. I'm fine with those kinds of changes in general, as long as they don't have downsides like noticeably bigger binaries.

Right now, the names get replaced by hashes, and we have enough bits that collisions are extremely unlikely, and this allows us to not need to have book-keeping in terms of how we obfuscated each name. We simply hash again as needed.
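The "hash again as needed" idea can be sketched as follows. This is only an illustration, not garble's actual hashing code: `hashedName`, the seed string, the 64-bit truncation, and the `z` prefix are all assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashedName is a hypothetical stand-in for hash-based renaming.
// The same seed and name always give the same result, so no table
// of previously assigned names has to be stored or shared.
func hashedName(seed, name string) string {
	sum := sha256.Sum256([]byte(seed + "\x00" + name))
	// Keep 8 bytes (64 bits) of the hash. The "z" prefix makes sure
	// the result starts with a letter, as Go identifiers must.
	return "z" + hex.EncodeToString(sum[:8])
}

func main() {
	// Two independent call sites recompute the same obfuscated name.
	fmt.Println(hashedName("build-seed", "parseConfig"))
	fmt.Println(hashedName("build-seed", "parseConfig"))
}
```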

My only worry with this approach is that, with a word list, we would need to pick many words to have enough bits using the same mechanism. And since some words can be long, this could make names very long, and binaries noticeably bigger as well.

Maybe this is OK if the word list is long enough and we aggressively abbreviate some of the longer words (without causing duplicates). We'd have to experiment a bit.
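To put rough numbers on that worry: a list of W words carries log2(W) bits per word, so reaching B bits takes ceil(B / log2(W)) words. The sketch below computes this; the helper `wordsNeeded` is hypothetical and the 64-bit target is an assumption for illustration, since the thread does not say how many bits garble uses.

```go
package main

import (
	"fmt"
	"math"
)

// wordsNeeded is a hypothetical helper. It returns how many words
// from a list of listSize entries are needed to carry at least
// `bits` bits, or 0 if the list is too small to carry any.
func wordsNeeded(bits, listSize int) int {
	if listSize < 2 {
		return 0
	}
	perWord := math.Log2(float64(listSize))
	return int(math.Ceil(float64(bits) / perWord))
}

func main() {
	// 64 bits is an assumed target, not a figure from garble.
	for _, size := range []int{256, 1000, 1327195} {
		fmt.Printf("list of %d words: %d words per name\n", size, wordsNeeded(64, size))
	}
}
```

Under these assumptions a 1,000-word list needs 7 words per name, which is where the concern about name length and binary size comes from; a much larger list needs fewer words per name.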

Member

mvdan commented Apr 27, 2023

We could always add the obfuscated name book-keeping as well, and to some degree we already record what names we did not obfuscate, which is the opposite. This would allow for shorter obfuscated names, but we would need to be very careful to assign names in a deterministic order.
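One way to picture book-keeping with a deterministic order is the sketch below. It is not how garble works or would necessarily work; `assignNames` is hypothetical, and it assumes a non-empty list of unique words that do not end in digits, so the numeric suffix cannot collide with another word.

```go
package main

import (
	"fmt"
	"sort"
)

// assignNames is a hypothetical helper. It sorts the original names
// first, so the mapping depends only on the set of names and the
// word list, not on the order in which names were encountered.
func assignNames(names, words []string) map[string]string {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)

	out := make(map[string]string, len(sorted))
	next := 0
	for _, name := range sorted {
		if _, seen := out[name]; seen {
			continue // duplicate input name, already assigned
		}
		w := words[next%len(words)]
		// Once the list is exhausted, reuse words with a numeric suffix.
		if round := next / len(words); round > 0 {
			w = fmt.Sprintf("%s%d", w, round)
		}
		out[name] = w
		next++
	}
	return out
}

func main() {
	words := []string{"reader", "writer"}
	fmt.Println(assignNames([]string{"c", "a", "b"}, words))
	fmt.Println(assignNames([]string{"b", "c", "a"}, words))
}
```

Both calls produce the same mapping despite the different input order. The names stay short, but every part of the build has to see the same table, which is the extra state the hash-based approach avoids.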

Member

pagran commented May 5, 2023

> very careful to assign names in a deterministic order.

My attempt to implement this is stuck on the //linkname obfuscation

Member

pagran commented May 12, 2023

I seem to have got something usable. Here's an example of how "realistic" naming works before and after

Names are generated from scraped identifiers: https://github.com/pagran/go-identifiers-database

Member

mvdan commented May 12, 2023

You might find https://github.com/mvdan/corpus/blob/master/top-1000.tsv useful in terms of collecting more "top" modules, although it only scrapes GitHub right now.

Member

pagran commented May 12, 2023

After x2

1,327,195 identifiers!

Member

mvdan commented Jun 13, 2023

We already have two large PRs in flight. If you want us to work faster, sponsor us, particularly @pagran in this case :)

lu4p added the enhancement label on Dec 6, 2023

5 participants
@mvdan @Ne0nd0g @lu4p @pagran and others