Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor OUI and IAB database handling. #145

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

mpontillo
Copy link

  • Collapse test case duplicates.
  • Allow OUI class to optionally take in the index and registry file.
  • Add test cases to ensure strangely-encoded entries work.

 * Collapse test case duplicates.
 * Allow OUI class to optionally take in the index and registry file.
 * Add test case to ensure strangely-encoded entries work.
@mpontillo
Copy link
Author

Hm, this one needs some work. Tests are failing on Python 2. (It might be that I wrote a legitimately failing test case, but I think I also missed a difference between Python 2 and Python 3.)

For some context, this change was motivated by my desire to ensure that the OUI database can handle strange non-UTF-8 artifacts in the IEEE data. Our software is seeing some tracebacks from netaddr, but after writing these tests, I'm no longer convinced there is a bug in netaddr. Rather, I think it has to do with a long-running application and the OUI index cache. If netaddr is updated while our application is running, the OUI index on disk would change, but not in memory.

Also, Debian/Ubuntu may have a separate issue; they chose to decouple the OUI database from netaddr itself, which means that there is a greater chance that the index will be out-of-sync with the IEEE source data file.

Missing implicit_prefix arg was causing unnecessary calls to
cidr_abbrev_to_verbose, when the prefix is already normalized. Was also
causing an unnecessary guess of the IP version. These make a difference
when expanding many items.
@drkjam
Copy link
Collaborator

drkjam commented Jan 22, 2017

Merge conflicts with this PR. Please can you update your fork and resubmit.

@drkjam drkjam modified the milestones: 0.7.19, 0.7.20 Jan 23, 2017
@mpontillo
Copy link
Author

I resolved the conflicts using GitHub's editor, but I didn't get a chance to test the branch again.

@drkjam
Copy link
Collaborator

drkjam commented Jan 26, 2017

I've been reviewing this pull request.

Some recent changes I made to parse the IEEE files and generate the indexes under both Python 2.x and 3.x correctly mean that merging this PR is no longer clean and introduces breaking changes to the codebase. For this reason, I can't accept it as is.

However, I certainly do want to provide the flexibility for users to introduce their own versions of the underlying data files so they can stay up-to-date on their own release cycles without necessarily being chain linked to netaddr's own releases just to gain access to new data files.

I did have an idea related to this.

Would it make life easier if I exposed the paths to the various IEEE data and index files as environment variables so their location could be more easily overridden? That way you could use netaddr as is and (assuming you've generated the indexes correctly from the underlying files) not have to write a bunch of additional code to try and override the files programatically.

I'm happy to support both a code override method via the constructors (as you have demonstrated in the PR) and environment variable overrides.

Let me know what you think and I'll have a go implementing the necessary changes.

@mpontillo
Copy link
Author

mpontillo commented Jan 26, 2017

Thanks for the reply.

My apologies; I think my intentions with this weren't entirely clear.

When I started working on this pull request, my goal was to write a unit test that demonstrated a bug in netaddr, and then go fix it. However, the bug turned out to be not at all what I was expecting. So in the end, all this PR accomplished was to refactor the index handling a little, so that I could enhance the unit testing. (I don't think functionality like this is strictly needed for production use, but I thought it was probably a good step in the right direction.)

With regard to environment variables, I'm neutral on the idea. I think it would be nice to have the flexibility, but it would neither help nor hurt our particular use case.

So I feel like I should take a step back and talk about the potential requirements I think could be addressed to make things better for netaddr users. I went ahead and wrote down in a spreadsheet what I believe to be the user stories/requirements we have for this functionality. I've given you edit access, so feel free to use this document as you see fit, and/or comment as appropriate.

To me, the quick and easy fix is to do rows 8 and 9 (in the current spreadsheet); that is, the ability to regenerate the index on the first lookup, and the ability to keep the data file descriptor open. It would be nice to have some sort of integrity check, too, but in Debian's use case, I think it would be more reliable to just assume the index is out of date, since the data file is supplied by a completely different .deb package.

@jstasiak jstasiak changed the base branch from rel-0.7.x to master June 21, 2020 01:12
@jstasiak
Copy link
Contributor

I'll think about this, I have some ideas how to tackle that.

@jstasiak
Copy link
Contributor

@mpontillo Thank you for the contribution! I decided I won't merge it as a whole but I'm happy to pick at least one or two pieces from this PR in separate commits, you'll be appropriately credited of course. Please let me know if that's ok with you.

@mpontillo
Copy link
Author

No worries @jstasiak; please feel free to use whatever you'd like from this PR. I'm happy to discuss further, if needed - though unfortunately I don't work with code that uses netaddr on a day-to-day basis any more, so it's not top-of-mind.

@jstasiak
Copy link
Contributor

Thank you! I don't want your work to go to waste, so I'll try to get at least part of that merged. I'll ask questions if needed.

@jstasiak
Copy link
Contributor

PS. That's a great writeup you provided (https://docs.google.com/spreadsheets/d/1_yfJSQ2H-CwmkExyYPzms-e9qCm16ymraWCjn8MeFGQ/edit), it'll be useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants