Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question concerning License of PyASN's ancestors #25

Open
dmth opened this issue Apr 25, 2016 · 32 comments
Open

Question concerning License of PyASN's ancestors #25

dmth opened this issue Apr 25, 2016 · 32 comments

Comments

@dmth
Copy link

dmth commented Apr 25, 2016

Hi,
I've got a Question concerning the License of PyASN's ancestors:
Is there any code left from the mrtd-project which was licensed under 4-Clause BSD?

Because clause 3 would be interesting to manage:

    1. All advertising materials mentioning features or use of
  • this software must display the following acknowledgement:
    *
  • This product includes software developed by the University of
  • Michigan, Merit Network, Inc., and their contributors.

BR dmth

@hadiasghari
Copy link
Owner

Hi @dmth, a few updates on this:

I checked the specific file you mention (_radix.c), and talked with a FOSS licensing expert. The inclusion is fine in pyasn given its referenced in both source and documentation. But the 'advertising' clause is indeed restrictive and might cause issues for derivative projects. I further contacted the chain of authors of the code to see if they can release it with another license; it turns out that indeed it originates in MRT which I am next going to contact.

BR Hadi

@dmth
Copy link
Author

dmth commented Jun 17, 2016

Hadi, thank you very much for your efforts!

@hadiasghari hadiasghari mentioned this issue Jun 17, 2016
@hadiasghari hadiasghari self-assigned this Jun 17, 2016
@sebix
Copy link
Contributor

sebix commented Aug 10, 2016

Is there any progress on this?

@hadiasghari
Copy link
Owner

Hi @dmth , @sebix . I've filed an open issue with the MRT project (see: deepfield/MRT#1).

@hadiasghari
Copy link
Owner

Deepfield has not responded to the issue or email contact. So @dmth @sebix, shall we close this issue, try to find another way to contact them, or leave it be for now?

@sebix
Copy link
Contributor

sebix commented Sep 16, 2016

This is blocking any kind of bundling and/or shipping this software. If this is not solved, we are forced to switch the backend for certtools/intelmq, which I'd like to avoid if possible.

It therefore blocks #20

@jkrauska
Copy link
Contributor

Without knowing anything about the code you're using -- are these viable alternatives without the same restrictions?

BSD
https://github.com/freebsd/freebsd/blob/master/sys/net/radix.c

Apple Variant
https://opensource.apple.com/source/xnu/xnu-1699.24.8/bsd/net/radix.c

I too would love to have a Debian package (#20).

@sebix
Copy link
Contributor

sebix commented Sep 16, 2016

First one should be no problem at all, then both the python and the c code have MIT.

Second uses [APSL-2.0]. This is incompatible with GPL. Tough it's problematic, linking it from code with other licences is allowed

@jkrauska
Copy link
Contributor

Neglecting the licenses -- Does the MIT licensed code provide similar capabilities? I sadly admit that I only have a basic understanding of what a radix tree is -- and I haven't looked at how pyasn or MRTs use them at all.

@sebix
Copy link
Contributor

sebix commented Sep 16, 2016

I guess I have less knowledge on this topic than you have.

@hadiasghari
Copy link
Owner

The radix tree is a fundamental part of the module. It's what enables the longest prefix to be matched. Changing it is possible, but it would require much testing for both correctness and performance issues. So I guess we should try to reach-out to DeepField via other ways.

Btw, @sebix , the FOSS lawyer I talked with said that the advertising clause only holds if the particular functionality is your product's main functionality.

@jkrauska
Copy link
Contributor

@sebix FWIW this python library https://github.com/YoshiyukiYamauchi/mrtparse can parse MRT datafiles (both gz and bz2) and DOES NOT use Merit's radix code. It's pure python (no c), so it's much slower than pyasn, but it includes a mrt2bgpdump.py script which could be trimmed down to be pretty much the same as the text file you get from the convert tool.

Update: I spent a little time hacking this, and it's DOG slow on the routeviews table. (so many peers) and can result in HUGE bgpdump outputs. (I killed it after >10 min and had a 1.2GB text file)

@hadiasghari
Copy link
Owner

@jkrauska just for clarification, pyasn uses the radix tree for its .lookup() function, not for the parsing .

@hadiasghari
Copy link
Owner

Hi all @sebix @jkrauska @dmth @anoroozian, an update and question/poll.

I've contacted the radix.c author directly (in addition to previously contacting deepfield), and received absolutely no response. I'm not sure how to interpret a non-response, other than that the license is what it is.

About the next step: On a scale of 1-5 how much are you in favor of switching to another of radix implementations, or have other suggestions? And could you contribute to the coding or testing?

Thanks!

@jkrauska
Copy link
Contributor

jkrauska commented Feb 8, 2017

+5 switch -- the original author is probably on a beach somewhere...

Best to find/use implementations without any licensing concerns.

There seems to be an unencumbered one in the linux kernel.

https://github.com/torvalds/linux/blob/master/lib/radix-tree.c

@dmth
Copy link
Author

dmth commented Feb 9, 2017

@hadiasghari thank you very much for keeping this issue alive and spending so much time for solving this issue.

Did anyone reach out to https://github.com/mjschultz/py-radix ?
Because he must have the same problem like us. Maybe we can combine efforts here.

Right now I cannot tell if switching to another radix implementation is the way to go.

Is the use of a radix-tree a requirement? Do other data-structures serve our needs, which would allow a similar speed?

A first approach to measure speed might be:

import pyasn
from timeit import default_timer as timer

testdata = '/tmp/latest-bview.dat'

asndb = None
print("Creating Radix-Tree")
start = timer()
asndb = pyasn.pyasn(testdata)
end = timer()
time_spent = end - start
end = start = None
print("Time spend for creating Data-Structure: %s seconds"% time_spent)


print("Testing 100 IP Lookups")
start = timer()
for i in range(100):
    look = ('192.%i.1.1'% i)
    asndb.lookup(look)
end = timer()
time_spent = end - start
end = start = None
print("Time spend for 100 Lookups: %s seconds"% time_spent)


print("Testing 100 Prefix Lookups")
start = timer()
for i in range(100):
    asndb.get_as_prefixes(i)
end = timer()
time_spent = end - start
end = start = None
print("Time spend for 100 Prefix-lookups: %s seconds"% time_spent)

Results:

Creating Radix-Tree
Time spend for creating Data-Structure: 0.27312132100087183 seconds
Testing 100 IP Lookups
Time spend for 100 Lookups: 0.00023630400028196163 seconds
Testing 100 Prefix Lookups
Time spend for 100 Prefix-lookups: 1.406222734000039 seconds

@dmth
Copy link
Author

dmth commented Feb 9, 2017

@jkrauska

There seems to be an unencumbered one in the linux kernel.
https://github.com/torvalds/linux/blob/master/lib/radix-tree.c

Thanks for finding this one, but I'm unsure if adding GPL-code does serve the needs of all users of PyASN.

@dmth
Copy link
Author

dmth commented Feb 15, 2017

Alternative pure python implementations, found by @bernhardreiter:

@anoroozian
Copy link
Collaborator

Ok, joining this thread a bit late but here is my 2 cents:

It seems to me that with respect to this question we are discussing two separate things:

  1. Issue Packages #20 - (Debian packages)
  2. Potential licensing problem of including pyasn alongside other software.

The latest version of Debian already includes the problematic code and (going by the advice of the FOSS license expert that @hadiasghari talked to) the code it is not a main feature of Debian, therefore no problem on the advertising clause. Pyasn itself has no problem either and correctly includes the acknowledgements. Even if it were packaged along side Debian it would not be a main feature. So following the same logic, with respect to issue #20, I think that we should be able to move on as far as my understanding goes. Of course I'm in no way a licensing expert. Does anybody agree with this or am I completely missing something here?

With respect to the second issue, which I think is the subject of the Hadi's question/poll, I would vote +3 on changing the radix tree code, simply because it is such a core part of pyasn. Unless we have a reliable and efficient replacement that we can test I would be leaning slightly towards not changing the core part. At the moment, I won't be able to help out because of different priorities, but just in case we actually go in that direction I'm in favor of the "linux kernel" alternative pointed out

#25 (comment)

@ghost
Copy link

ghost commented Feb 15, 2017

The latest version of Debian already includes the problematic code

Do you refer to python-radix?

@anoroozian
Copy link
Collaborator

The latest version of Debian already includes the problematic code

Do you refer to python-radix?

@wagner-certat Exactly

@bernhardreiter
Copy link

Hi,
as far as I understood @wagner-certat is mainly interested in using pyasn for intelmq (so am I) and his packaging efforts are related to it. As intelmq is under GNU AGPL v>=3 the "original 4-clause BSD license" is incompatible with it, see https://www.gnu.org/licenses/license-list.en.html#OriginalBSD.

As this "Original BSD" license is considered Free Software, it can be shipped by Debian, still no
GNU GPL or AGPL software products can be derived from it.

As cidrtree and pytricia both promise performance improvements over python-radix when being used for network addresses, I think looking into it would be a well spend effort. I expect the interface to the lookup part of pyasn to be small, which should make this feasable.

@hadiasghari
Copy link
Owner

The radix tree is a fundamental part of the module: it's what enables the most-specific-prefix match to happen when doing a lookup.

Changing it is possible, but it would require much testing. I originally picked a C implementation to ensure fast db loads and memory efficiency. I will compare the pure Python alternatives.

A new question for pytricia: is its GPL license compatible with pyasn's MIT license? Or would we want two branches if we pick it?

@dmth
Copy link
Author

dmth commented Feb 16, 2017

A new question for pytricia: is its GPL license compatible with pyasn's MIT license?

@bernhardreiter already asked the author to consider the licensing: jsommers/pytricia#10 let's see what happens.

@ghost
Copy link

ghost commented Feb 16, 2017 via email

@bernhardreiter
Copy link

But I create separate packages for intelmq and pyasn.

You can do this and help pyasn alone. The licensing problem with using it in intelmq is not solved by separate packaging. The legal question is whether the work 'intelmq product' is derived from 'pyasn' which probably is yes, because you cannot leave out pyasn for good.

@hadiasghari
Copy link
Owner

I have good news and a suggestion. I tested pytricia and the performance is on par with pyasn's radix tree.

  • Initial load (750k prefixes): pyasn-radix=0.28s, pytricia=0.96s
  • Memory use: pyasn-radix=140MB, pytricia=129MB
  • 100,000 V4+V6 lookups: pyasn-radix=0.17s, pytricia=0.21s

My suggestion is to:

  • add a switch to pyasn setup to allow building without radix.c
  • add a backend parameter to pyasn.__init__ that selects the desired radix backend; if None, it is set automatically
  • add unit-tests for both backends

Will this work for everyone? IMO it allows having multiple licenses, and avoid breaking code in case of bugs in alternative backends.

@bernhardreiter
Copy link

@hadiasghari that is really good news! Thanks for testing pytricia!

As far as I can say this will work for us (@dmth and IntelMQ).

We would be using only pytricia and my suggestion for Debian packaging would be to go in this direction as well. It is fine to keep the ability to use the old code with the less wanted 4-clause BSD license in, as long as it does not pose a maintenance burden. (This is also the main criteria for possibly removing it, even small options would usually have to tested and thus add to the complexity.)

@ghost
Copy link

ghost commented Feb 21, 2017

@hadiasghari
Great news. Thanks for all your work!

@bernhardreiter

You can do this and help pyasn alone. The licensing problem with using it in intelmq is not solved by separate packaging. The legal question is whether the work 'intelmq product' is derived from 'pyasn' which probably is yes, because you cannot leave out pyasn for good.

The functionality is also provided by other tools, so pyasn is not the only option.

@hadiasghari
Copy link
Owner

@bernhardreiter,

We would be using only pytricia and my suggestion for Debian packaging would be to go in this direction as well. It is fine to keep the ability to use the old code with the less wanted 4-clause BSD license in, as long as it does not pose a maintenance burden. (This is also the main criteria for possibly removing it, even small options would usually have to tested and thus add to the complexity.)

I do not want to change the default backend for everyone, because I don't want anything to break if pytricia changes. Also, the load-time is slower, and I noticed a possible memory leak. Regarding the maintenance burden with two-backends, I don't think it'll be a big issue, as the radix-code is stable and doesn't change. To enable multiple radix-backends, I'll create a common abstract class that wraps their functionality.

@bernhardreiter
Copy link

bernhardreiter commented Feb 22, 2017

@hadiasghari

I do not want to change the default backend for everyone [..] I'll create a common abstract class that wraps their functionality.

works for me. 👍

I noticed a possible memory leak

It would be cool if you could report this to https://github.com/jsommers/pytricia @jsommers is certainly interested in such a report.

I do not want to change the default backend for everyone

Personally I would change the default or even remove the code that is less-desirably licensed even if it is a tiny bit slower because I think this licensing trouble will cause problems with some people later where the efforts to resolve it will have multiplied. If you would want to test pytricia before removal I'd still change the default to it to get more exposure and deprecate the old-code to remove it with some following release. (This is just me, it is your call, I'd just wanted to give you my take on it. :) )

Thanks for pyasn as Free Software!

@ghost
Copy link

ghost commented Jul 12, 2020

Shouldn't the BSD-4-Clause license of the radix code be reflected in the license specification of pyasn? (for simplicity and as the plan is to go with pytricia anyway, I would only adapt

pyasn/setup.py

Line 41 in bebed27

license='MIT',
to MIT and BSD-4-Clause)

Repository owner deleted a comment from ha0min Feb 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants