Regex not working as expected: last 7 characters are not matching #81

Open
VityaSchel opened this issue Jun 29, 2022 · 2 comments

@VityaSchel

Hello. First of all, I would like to thank you for your work and for adding regex support. However, it seems I can't match the last 7 characters of an onion address (the ones right before .onion).

This example works:

> ./mkp224o -d ../test "...............................................ffff"
set workdir: ../test/
filters:
	...............................................ffff
in total, 1 filter
using 4 threads
5h6qylfywhcrm5cbjtdy6cxq5f7ao4hkxjtq4p2hd2vczxzffffoxryd.onion
oz4mpcslnjkuznvzjzkykhe2hhs777i4jko5un5cyipb33rfffftmsyd.onion
rc4eq533eu2ub7qyhdrbnknegstpsbrrpz22eajwroesunkffhfieead.onion
4vi6mkfrg52idbkw3xokx72n6m3sovuqoxi22gdfsnuw7uzffff3moqd.onion

But when I add more characters after that and try to match the final characters of the address, it never finds a match:

> ./mkp224o -d ../test "......................................................yd"
set workdir: ../test/
filters:
	......................................................yd
in total, 1 filter
using 4 threads
^Cwaiting for threads to finish... done.

I noticed that all addresses end in "d", so I tried matching the character before it:

> ./mkp224o -d ../test "......................................................y"
set workdir: ../test/
filters:
	......................................................y
in total, 1 filter
using 4 threads
^Cwaiting for threads to finish... done.

Not working :(

Also, why is there no "$" to match the end of the string?

And why does it sometimes replace one character in the matched string?

@scribblemaniac
Contributor

scribblemaniac commented Jun 29, 2022

Most of this has been answered in #5:

Matching the end of the onion address is not supported because of a performance optimization.
The checksum is stored at the end of the address, and since it is SHA3, computing it before checking would slow down cases where checking the end of the address isn't needed.
The regex filter you mentioned doesn't end with $, therefore only the smallest portion (the first 2 characters) is matched.
They all end with d because of the constant version byte, which is included in all v3 addresses (specification, [ONIONADDRESS] section).

The last 3 bytes, or the last 5 base32 characters, are not calculated until filtering is complete, so you cannot filter on them.
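To make the layout concrete, here is a minimal Python sketch of how a v3 address is assembled according to the [ONIONADDRESS] section of the specification. It is not mkp224o's code: the helper names `onion_label` and `key_only_prefix` are made up for illustration, and the key in the demo is just random bytes standing in for a real ed25519 public key.

```python
import base64
import hashlib
import os

VERSION = b"\x03"  # constant version byte for v3 addresses


def onion_label(pubkey: bytes) -> str:
    """Return the 56 characters that precede ".onion" (hypothetical helper)."""
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte ed25519 public key")
    # Checksum: first 2 bytes of SHA3-256 over a fixed prefix, the key and the version.
    checksum = hashlib.sha3_256(b".onion checksum" + pubkey + VERSION).digest()[:2]
    # Address: base32 of key (32 bytes) + checksum (2 bytes) + version (1 byte).
    return base64.b32encode(pubkey + checksum + VERSION).decode("ascii").lower()


def key_only_prefix(pubkey: bytes) -> str:
    """Characters that follow from the key alone, with no SHA3 needed."""
    # 32 bytes = 256 bits = 51 complete 5-bit groups plus one leftover bit.
    return base64.b32encode(pubkey).decode("ascii").lower()[:51]


if __name__ == "__main__":
    fake_key = os.urandom(32)  # stand-in bytes, NOT a real ed25519 key
    label = onion_label(fake_key)
    print(label + ".onion")
    print(label.startswith(key_only_prefix(fake_key)))
    print(label.endswith("d"))
```

The first 51 characters depend only on the public key, while the remaining 5 mix in the checksum and the version byte. The low 5 bits of the version byte 0x03 encode to "d", which is why every address ends with that letter.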

The regex $ character does work. There is an implicit ^ at the start of every regex query, so if you want to use $ as well, the pattern has to match the entire string. For example: .*nyan.{2}$
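The anchoring behaviour can be tried out in Python as a rough sketch. This only imitates the semantics described here: `re.match` stands in for the implicit ^, Python's `re` module is not the regex engine mkp224o uses, and both `filter_matches` and the short candidate string are hypothetical.

```python
import re


def filter_matches(pattern: str, candidate: str) -> bool:
    """Emulate a filter with an implicit ^ (hypothetical helper)."""
    # re.match only succeeds if the pattern matches at the start of the string.
    return re.match(pattern, candidate) is not None


CANDIDATE = "abcnyanxy"  # made-up string, far shorter than a real address

print(filter_matches("nyan", CANDIDATE))         # False: "nyan" is not at the start
print(filter_matches(".*nyan", CANDIDATE))       # True: .* absorbs the leading "abc"
print(filter_matches(".*nyan$", CANDIDATE))      # False: two characters follow "nyan"
print(filter_matches(".*nyan.{2}$", CANDIDATE))  # True: the pattern covers the whole string
```

With both anchors in play, the pattern has to account for every character of the string being tested, which is why the example pads both sides of "nyan".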

As for your question about one character being replaced in the matching string: if you're referring to any of the last 5 characters, that is expected for the reasons already mentioned. The 50th character also sometimes fails to match the filter. I do not have an explanation for that, and it probably deserves its own GitHub issue.
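The 50th-character mismatch can be checked against the output posted above with a few lines of Python. This is only a sketch: `mismatched_positions` is a hypothetical helper that treats "." as a wildcard and every other filter character as a literal, and the filter is rebuilt on the assumption that it consists of 47 dots followed by "ffff".

```python
from typing import List


def mismatched_positions(filter_text: str, address: str) -> List[int]:
    """Return 1-based positions where a literal filter character differs."""
    return [
        pos
        for pos, (f, a) in enumerate(zip(filter_text, address), start=1)
        if f != "." and f != a  # "." is treated as "any character"
    ]


FILTER_TEXT = "." * 47 + "ffff"  # assumed shape of the first filter in this issue
# Third address from the first example, without the ".onion" suffix.
ADDRESS = "rc4eq533eu2ub7qyhdrbnknegstpsbrrpz22eajwroesunkffhfieead"

print(mismatched_positions(FILTER_TEXT, ADDRESS))  # [50]
```

Only position 50 differs ("h" where the filter asks for "f"). It lies inside the first 51 characters, so the checksum explanation does not cover it.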

@VityaSchel
Author

VityaSchel commented Jun 29, 2022 via email
