
High Number of Junk Lines #1

Open
berzerk0 opened this issue May 28, 2017 · 1 comment

Comments

@berzerk0 (Owner)

BEWGor makes a ton of lines, maybe too many.

In stark contrast to the Probable-Wordlists, the dictionaries created by BEWGor contain many lines that just don't seem to be of good quality.

What kind of junk?

BEWGor goes through given dates, creates variations and extracts specifics.
If you fed it today's date, 28052017, with a maximum permutation length of 2, the lines produced would include the following:

2805, 285, 2017, 28517, 52817, 5282017 - These are legitimate, quality variations.
2852805, 201717, 528285 - These are NOT quality variations.
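To make the junk concrete, here is a minimal sketch of how blindly concatenating date variations produces those lines. The list of variations and the max-length-2 permutation loop are illustrative, not BEWGor's actual implementation:

```python
from itertools import permutations

# Illustrative variations extracted from the date 28/05/2017
# (day, month, year fragments, and combined forms):
variations = ["28", "5", "05", "2017", "17",
              "2805", "285", "28517", "52817", "5282017"]

# Concatenating every ordered pair (max permutation length of 2)
# mixes formats of the SAME date into a single line:
lines = set(variations)
for a, b in permutations(variations, 2):
    lines.add(a + b)

print("2852805" in lines)  # True: "285" + "2805", two formats of one date
print("201717" in lines)   # True: "2017" + "17", the year twice
```

Every junk line above is a mechanical byproduct of treating all variations as independent tokens, even though they all encode the same underlying fact.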

If someone is going to include a date in their password, they might do it in a number of different formats (5/28, 28/5, 05/28, 28/05, 28/05/2017, 28/05/17...), but it is highly unlikely they would include more than one format in the same password!

Now, I predict it would be RARE to have this kind of redundancy, but ultimately it is POSSIBLE.
Here we get to the age-old balance of security - there are always more steps you could take, but how many of them are practical? How many of the steps become overkill, not worth the trouble?

What can be done about the junk? Isn't this problem going to get worse?

As the detail increases, and more specific details are added about the Subjects, the permutations are going to grow exponentially and simply get out of hand. As a result, I will need to refine this process to do things like weed out alternative formats of redundant information.
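One possible way to weed out same-fact redundancy, sketched here purely as an idea: tag every variation with the fact it was derived from, and skip any pair that shares a tag. The tag names and data model below are hypothetical, not BEWGor's actual structures:

```python
from itertools import permutations

# Hypothetical: each variation carries a tag naming its source fact.
tagged = [("2805", "birthday"), ("285", "birthday"), ("2017", "birthday"),
          ("password", "base_word")]

lines = set()
for (a, tag_a), (b, tag_b) in permutations(tagged, 2):
    if tag_a == tag_b:
        # Both variations encode the same underlying fact,
        # e.g. "285" + "2805" - skip the redundant combination.
        continue
    lines.add(a + b)
```

This keeps a single permutation function rather than per-format special cases; the cost is carrying provenance metadata alongside every string.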

So far the ideas I have had would require intensely specific creation of password formats, which has plenty of room for design holes. Instead of one implementation of a permutation function, I may end up having a gigantic bundle of nested for loops with conditional exclusions and re-writing of strings that would eat up all the RAM.

For example, I'd need to have a section that uses 'Initials + Birthday(no year),' then 'Birthday(no year) + Initials' then 'Birthday(with year) + Initials,' ...but for every. single. kind. of. in.for.ma.tion.
Nightmarish. CUPP, the program that inspired this one, may have limited the amount of information prompted for exactly this reason.
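One way to avoid the gigantic bundle of nested loops would be to express those "Initials + Birthday" rules as data rather than code: a list of templates applied to a dictionary of facts. This is only a sketch of the idea; the field names and facts are made up for illustration:

```python
from itertools import product

# Hypothetical facts about a Subject (field names are illustrative):
facts = {
    "initials": ["jd", "JD"],
    "bday_no_year": ["2805", "285"],
    "bday_with_year": ["28052017"],
}

# Each template is an ordered sequence of fact fields to concatenate.
templates = [
    ("initials", "bday_no_year"),
    ("bday_no_year", "initials"),
    ("bday_with_year", "initials"),
]

lines = set()
for template in templates:
    for parts in product(*(facts[field] for field in template)):
        lines.add("".join(parts))
```

Adding a new combination then means appending one tuple to `templates`, not writing another loop, and a template never pairs two formats of the same fact unless you explicitly list one that does.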

The answer here might be some kind of machine learning: some way for the program to recognize that a given string contains redundant information. Unfortunately, I predict this is far above my head at this time.

But all is not lost, I will keep brainstorming and hunting down ways to slim the output down.
Additionally, BEWGor exists on the World's Largest Collaborative Software platform, so I have access to an excellently helpful community. Beyond my own pursuits, any outside suggestions on how to slim down the wordlist without sacrificing too much fidelity would be much appreciated!

Who is asking these questions?

Is posting my own issue like retweeting myself? I mean, I am asking these questions to myself and then answering them. It's a real rhetorical device, right?

TLDR - BEWGor has junk lines. Some of them contain redundant information. I'm trying to put a stop to that - suggestions are appreciated.

@glozanoa

Hello @berzerk0 .
I think splitting the BEWGor.py script into smaller scripts would make your repository more maintainable. I have added a PR to do that.
