Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Wolfram's primitive polynomials #464

Open
wants to merge 21 commits into
base: release/0.3.x
Choose a base branch
from

Conversation

iyanmv
Copy link
Contributor

@iyanmv iyanmv commented Jan 18, 2023

In the same line as #462 regarding #140

  • New script to create a primitive polynomials database
  • New PrimitivePolyDatabase class to handle access to the database
  • Add Wolfram's primitive polynomials: Up to GF(2^1_200), GF(3^660), GF(5^430), and GF(7^358)
  • Integrate with galois.primitive_poly
  • Add unit tests

@iyanmv iyanmv marked this pull request as draft January 18, 2023 15:41
@codecov
Copy link

codecov bot commented Jan 18, 2023

Codecov Report

Base: 96.40% // Head: 96.35% // Decreases project coverage by -0.05% ⚠️

Coverage data is based on head (d3c6f6d) compared to base (0c4386b).
Patch coverage: 88.57% of modified lines in pull request are covered.

Additional details and impacted files
@@                Coverage Diff                @@
##           release/0.3.x     #464      +/-   ##
=================================================
- Coverage          96.40%   96.35%   -0.05%     
=================================================
  Files                 43       44       +1     
  Lines               5701     5733      +32     
=================================================
+ Hits                5496     5524      +28     
- Misses               205      209       +4     
Impacted Files Coverage Δ
src/galois/_polys/_primitive.py 97.11% <78.57%> (-2.89%) ⬇️
src/galois/_databases/_primitive.py 95.00% <95.00%> (ø)
src/galois/_databases/__init__.py 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@iyanmv
Copy link
Contributor Author

iyanmv commented Jan 18, 2023

How do you propose handling accessing this database? In general, Wolfram's primitive polynomials are not the same as the ones obtained using galois.primitive_poly(). For example,

from galois import Poly, primitive_poly
from galois._databases import PrimitivePolyDatabase

db = PrimitivePolyDatabase()

for degree in range(2, 50):
    degrees, coeffs = db.fetch(2, degree)
    poly = Poly.Degrees(degrees, coeffs)
    poly_ = primitive_poly(2, degree)
    if poly != poly_:
        print(f"GF(2^{degree})")
        print(f"primitive_poly(2, {degree}) = {poly_}")
        print(f"Wolfram's primitive poly = {poly}\n")
        assert poly.is_primitive()

GF(2^18)
primitive_poly(2, 18) = x^18 + x^5 + x^2 + x + 1
Wolfram's primitive poly = x^18 + x^7 + 1

GF(2^32)
primitive_poly(2, 32) = x^32 + x^7 + x^5 + x^3 + x^2 + x + 1
Wolfram's primitive poly = x^32 + x^7 + x^6 + x^2 + 1

GF(2^33)
primitive_poly(2, 33) = x^33 + x^6 + x^4 + x + 1
Wolfram's primitive poly = x^33 + x^13 + 1

GF(2^34)
primitive_poly(2, 34) = x^34 + x^7 + x^6 + x^5 + x^2 + x + 1
Wolfram's primitive poly = x^34 + x^8 + x^4 + x^3 + 1

GF(2^36)
primitive_poly(2, 36) = x^36 + x^6 + x^5 + x^4 + x^2 + x + 1
Wolfram's primitive poly = x^36 + x^11 + 1

GF(2^37)
primitive_poly(2, 37) = x^37 + x^5 + x^4 + x^3 + x^2 + x + 1
Wolfram's primitive poly = x^37 + x^6 + x^4 + x + 1

GF(2^42)
primitive_poly(2, 42) = x^42 + x^5 + x^4 + x^3 + x^2 + x + 1
Wolfram's primitive poly = x^42 + x^7 + x^4 + x^3 + 1

GF(2^46)
primitive_poly(2, 46) = x^46 + x^8 + x^5 + x^3 + x^2 + x + 1
Wolfram's primitive poly = x^46 + x^8 + x^7 + x^6 + 1

GF(2^48)
primitive_poly(2, 48) = x^48 + x^7 + x^5 + x^4 + x^2 + x + 1
Wolfram's primitive poly = x^48 + x^9 + x^7 + x^4 + 1

GF(2^49)
primitive_poly(2, 49) = x^49 + x^6 + x^5 + x^4 + 1
Wolfram's primitive poly = x^49 + x^9 + 1

I don't know if the ones generated by galois.primitive_poly() have some desired properties. If not, then I guess the simplest way is to query the database when method="min" is used. If there is a match, use it. Otherwise, compute it as it happens right now. What do you think?

@mhostetter mhostetter added poly Related to polynomials performance Affects speed/performance labels Jan 18, 2023
@mhostetter
Copy link
Owner

This is great, @iyanmv. Thanks for submitting this!

How do you propose handling accessing this database? In general, Wolfram's primitive polynomials are not the same as the ones obtained using galois.primitive_poly()

I was hoping these polynomials were the lexicographically-first primitive polynomial with minimal terms. I believe this is the case from the spot checks below (using my #463 branch). Although, we should do more confirmation and testing.

In [1]: import galois

In [2]: galois.primitive_poly(2, 18)
Out[2]: Poly(x^18 + x^5 + x^2 + x + 1, GF(2))

In [3]: galois.primitive_poly(2, 18, terms=3)
Out[3]: Poly(x^18 + x^7 + 1, GF(2))

In [4]: galois.primitive_poly(2, 32)
Out[4]: Poly(x^32 + x^7 + x^5 + x^3 + x^2 + x + 1, GF(2))

In [5]: galois.primitive_poly(2, 32, terms=5)
Out[5]: Poly(x^32 + x^7 + x^6 + x^2 + 1, GF(2))

In [6]: galois.primitive_poly(2, 33)
Out[6]: Poly(x^33 + x^6 + x^4 + x + 1, GF(2))

In [7]: galois.primitive_poly(2, 33, terms=3)
Out[7]: Poly(x^33 + x^13 + 1, GF(2))

In [8]: galois.primitive_poly(2, 34)
Out[8]: Poly(x^34 + x^7 + x^6 + x^5 + x^2 + x + 1, GF(2))

In [9]: galois.primitive_poly(2, 34, terms=5)
Out[9]: Poly(x^34 + x^8 + x^4 + x^3 + 1, GF(2))

If this is generally true, I suggest we hook into primitive_poly() the same way we intend for irreducible_poly() -- perform a database lookup if method="min" and terms="min".

@iyanmv
Copy link
Contributor Author

iyanmv commented Jan 18, 2023

I was hoping these polynomials were the lexicographically-first primitive polynomial with minimal terms. I believe this is the case from the spot checks below (using my #463 branch). Although, we should do more confirmation and testing.

I think that's the case, but I will check a bit more.

If this is generally true, I suggest we hook into primitive_poly() the same way we intend for irreducible_poly() -- perform a database lookup if method="min" and terms="min".

Okay, sounds good. Last commit makes primitive_poly() to check the database with method="min" (just to see what tests break). When #463 is ready I will modify it to only use the database with method="min" and terms="min".

mhostetter and others added 10 commits January 22, 2023 15:03
- New script to create a primitive polynomials database
- New PrimitivePolyDatabase class to handle access to the database
- Add Wolfram's primitive polynomials
- Try to fetch nonzero degrees and coefficients from the database when
  primitive_poly() is called with terms="min". If poly is not in the
  database, fallback to compute it.
- Add LUTs for high degree primitive polynomials from Wolfram's database
- Add test_minimum_terms_from_database with a timeout=1s
@iyanmv iyanmv marked this pull request as ready for review January 22, 2023 21:01
@iyanmv
Copy link
Contributor Author

iyanmv commented Jan 22, 2023

Windows doesn't like the pytest-timeout. Should I remove it? Or do you have any suggestions? My thinking here was to test not only correctness but also speed since it should be fast to fetch from the database. I guess some of those polynomials would take forever to compute without the database, so the timeout given by Github Actions should be enough.

@mhostetter
Copy link
Owner

Windows doesn't like the pytest-timeout.

It looks like pytest-timeout calls signal.SIGALM, which is only available on Unix systems (https://docs.python.org/3/library/signal.html#signal.SIGALRM). Perhaps that might be a bug in or improvement that could be made to pytest-timeout.

My thinking here was to test not only correctness but also speed since it should be fast to fetch from the database. I guess some of those polynomials would take forever to compute without the database, so the timeout given by Github Actions should be enough.

Yeah. We could always implement a poor man's timeout.

@pytest.mark.parametrize("order,degree,polys", PARAMS_DB)
def test_minimum_terms_from_database(order, degree, polys):
    tick = time.time()
    p = galois.primitive_poly(order, degree, terms="min")
    tock = time.time()
    assert tock - tick < 1.0
    db_degrees = p.nonzero_degrees.tolist()
    db_coeffs = p.nonzero_coeffs.tolist()
    exp_degrees, exp_coeffs = polys
    assert db_degrees == exp_degrees and db_coeffs == exp_coeffs

@mhostetter
Copy link
Owner

What's the best way for me to locally test your changes? Should I clone your forks and checkout your branch? Is there an easier way to do it?

@iyanmv
Copy link
Contributor Author

iyanmv commented Jan 22, 2023

What's the best way for me to locally test your changes? Should I clone your forks and checkout your branch? Is there an easier way to do it?

I think you can directly fetch from your repo without having to clone my fork.

git fetch origin pull/464/head:iyanmv.primitive-wolfram
git checkout iyanmv.primitive-wolfram

@iyanmv
Copy link
Contributor Author

iyanmv commented Jan 22, 2023

I updated previous command to create a new local branch.

@mhostetter
Copy link
Owner

mhostetter commented Jan 22, 2023

I'm running a local test to verify that the polynomials returned by the database are the same as those found by this library on its own. I likely won't get through testing all of them, but I'd like to test a bunch.

import galois

order = 2
for degree in range(1, 1200):
    p1 = galois.primitive_poly(order, degree, use_database=False)
    p2 = galois.primitive_poly(order, degree, use_database=True)
    print(f"{order}^{degree}: {p1 == p2}\t{p1}\t{p2}")

I'm running each order in parallel. I'll let it run for a few hours (or maybe overnight). I'll report back with findings.

@iyanmv
Copy link
Contributor Author

iyanmv commented Jan 22, 2023

Is use_database a kwarg you would like to add to the irreducible_poly() and primitive_poly() functions, or you just added now for testing?

@mhostetter
Copy link
Owner

Is use_database a kwarg you would like to add to the irreducible_poly() and primitive_poly() functions, or you just added now for testing?

I just did this for testing. I believe we should always use the database once we are confident it always gives us the answer we expect.

    if use_database and terms == "min" and method == "min":

Currently up to this state, and still running strong 💪:

2^327: True     x^327 + x^7 + x^6 + x^5 + x^3 + x^2 + 1 x^327 + x^7 + x^6 + x^5 + x^3 + x^2 + 1
3^40: True      x^40 + x + 2    x^40 + x + 2
5^109: True     x^109 + x^2 + 2x + 3    x^109 + x^2 + 2x + 3
7^107: True     x^107 + 6x^2 + x + 2    x^107 + 6x^2 + x + 2

@mhostetter
Copy link
Owner

mhostetter commented Jan 22, 2023

One thing I think we should do is add a blurb discussing the performance benefit (for select orders and degrees) that comes with the database when using terms="min" (which at present is not the default -- that could be a separate discussion. I see pros/cons of making terms="min" the new default.)

The reason I think the blurb is important is because someone may call galois.primitive_poly(2, 1000) and it will take forever, but they could've called galois.primitive_poly(2, 1000, terms="min") and it would've returned instantly.

We can place an admonition inline with the argument like this.

    Arguments:
        order: The prime power order :math:`q` of the field :math:`\mathrm{GF}(q)` that the polynomial is over.
        degree: The degree :math:`m` of the desired irreducible polynomial.
        terms: The desired number of non-zero terms :math:`t` in the polynomial.

            - `None` (default): Disregards the number of terms while searching for the polynomial.
            - `int`: The exact number of non-zero terms in the polynomial.
            - `"min"`: The minimum possible number of non-zero terms.

            .. important::
                If `terms="min"` and `method="min"`, a database lookup is used for...

        method: The search method for finding the irreducible polynomial.

            - `"min"` (default): Returns the lexicographically-first polynomial.
            - `"max"`: Returns the lexicographically-last polynomial.
            - `"random"`: Returns a random polynomial.

@mhostetter
Copy link
Owner

mhostetter commented Jan 22, 2023

Well that was embarrassing... my script didn't even test the code it was supposed to! It needed a terms="min" kwarg.

I updated the script
import pprint
import time

import galois

order = 2
if order == 2:
    degrees = range(1, 1200)
elif order == 3:
    degrees = range(1, 660)
elif order == 5:
    degrees = range(1, 430)
elif order == 7:
    degrees = range(1, 358)
else:
    raise AssertionError

different_polys = []

for degree in degrees:
    print(f"{order}^{degree}:")
    tick = time.time_ns()
    p1 = galois.primitive_poly(order, degree, terms="min", use_database=False)
    tock = time.time_ns()
    print(f"\tWithout database: {(tock - tick)*1e-9} seconds")

    tick = time.time()
    p2 = galois.primitive_poly(order, degree, terms="min", use_database=True)
    tock = time.time()
    print(f"\tWith database: {(tock - tick)*1e-9} seconds")

    print(f"\t{p1 == p2}\t{p1}\t{p2}")
    if p1 != p2:
        different_polys.append((order, degree, p1, p2))
        pprint.pprint(different_polys)

However, I've run into some discrepancies.

(The binary cases haven't disagreed yet.)

3^17:
        Without database: 0.023376024000000002 seconds
        With database: 5.013704299926758e-12 seconds
        True    x^17 + 2x + 1   x^17 + 2x + 1
3^18:
        Without database: 1.1321898590000001 seconds
        With database: 4.955530166625977e-12 seconds
        False   x^18 + x^9 + x^5 + 2    x^18 + x^13 + x + 2
Traceback (most recent call last):
  File "test_3.py", line 30, in <module>
    assert p1 == p2
AssertionError
5^3:
        Without database: 0.05411400500000001 seconds
        With database: 5.683422088623047e-12 seconds
        True    x^3 + 3x + 2    x^3 + 3x + 2
5^4:
        Without database: 0.12074545 seconds
        With database: 5.507707595825196e-12 seconds
        False   x^4 + x^2 + 2x + 2      x^4 + 4x^2 + x + 2
Traceback (most recent call last):
  File "test_3.py", line 30, in <module>
    assert p1 == p2
AssertionError
7^3:
        Without database: 0.060279397000000005 seconds
        With database: 5.127191543579102e-12 seconds
        True    x^3 + 3x + 2    x^3 + 3x + 2
7^4:
        Without database: 0.20767921 seconds
        With database: 5.424976348876954e-12 seconds
        False   x^4 + x^2 + 3x + 5      x^4 + 6x^2 + x + 3
Traceback (most recent call last):
  File "test_3.py", line 30, in <module>
    assert p1 == p2
AssertionError

I haven't debugged the root issues yet, but wanted to share my findings. And share the script so you can run it locally too.

@iyanmv
Copy link
Contributor Author

iyanmv commented Jan 23, 2023

I think the problem is that we are not doing what we want when the database doesn't contain the polynomial:

if terms == "min" and method == "min":
try:
db = PrimitivePolyDatabase()
degrees, coeffs = db.fetch(order, degree)
field = _factory.FIELD_FACTORY(order)
poly = Poly.Degrees(degrees, coeffs, field=field)
return poly
except LookupError:
pass
try:
if method == "min":
return next(primitive_polys(order, degree, terms))
if method == "max":
return next(primitive_polys(order, degree, terms, reverse=True))
# Random search
if terms is None:
return next(_random_search(order, degree, "is_primitive"))
if terms == "min":
terms = _minimum_terms(order, degree, "is_primitive")
return next(_random_search_fixed_terms(order, degree, terms, "is_primitive"))

For example, when you do primitive_poly(3, 18, terms="min", use_database=False), current code goes to line 147 instead of 155.

@iyanmv
Copy link
Contributor Author

iyanmv commented Jan 23, 2023

Okay, there were two issues. First, the try/if statements were not right. And second, Wolfram's database does not contain lexicographically-first primitive polynomials, but lexicographically-last ones!

@iyanmv
Copy link
Contributor Author

iyanmv commented Jan 23, 2023

Actually no, it looks like it's something in between. Can you share with me some reference to understand better the lexicographically-first and lexicographically-last? I understand your code, but I don't know how these terms are used in the literature. Maybe there are different conventions?

@iyanmv
Copy link
Contributor Author

iyanmv commented Jan 23, 2023

Last commit fixes what I meant by wrong order in the if statements. I think it's better to compute the number of min terms in the except part. I kept the use_database to help debug but I will remove later.

@iyanmv
Copy link
Contributor Author

iyanmv commented Jan 23, 2023

Last commit fixes what I meant by wrong order in the if statements. I think it's better to compute the number of min terms in the except part. I kept the use_database to help debug but I will remove later.

That was not a good idea. Now I think I got right all possible cases.

@mhostetter
Copy link
Owner

I think the problem is that we are not doing what we want when the database doesn't contain the polynomial. For example, when you do primitive_poly(3, 18, terms="min", use_database=False), current code goes to line 147 instead of 155.

I think the original code and order of if statements was correct. In your example, we do want to go to line 147. Lines 147 and 149 handle the deterministic search for the lexicographically first or last polynomial. In those cases, primitive_polys() handles the terms value. Lines 152-156 handle the random search case, in which we must first find the number of minimum terms if terms="min" is specified.

Can you share with me some reference to understand better the lexicographically-first and lexicographically-last? I understand your code, but I don't know how these terms are used in the literature.

See https://en.wikipedia.org/wiki/Lexicographic_order. My understanding of lexicographical order is that it is equivalent to ordering the polynomials by their integer representation. See the example below of degree-3 polynomials over GF(3) in lexicographical order.

In [6]: field = galois.GF(3)

In [7]: for i in range(field.order**3, 2*field.order**3):
   ...:     poly = galois.Poly.Int(i, field=field)
   ...:     print(poly)
   ...: 
x^3
x^3 + 1
x^3 + 2
x^3 + x
x^3 + x + 1
x^3 + x + 2
x^3 + 2x
x^3 + 2x + 1
x^3 + 2x + 2
x^3 + x^2
x^3 + x^2 + 1
x^3 + x^2 + 2
x^3 + x^2 + x
x^3 + x^2 + x + 1
x^3 + x^2 + x + 2
x^3 + x^2 + 2x
x^3 + x^2 + 2x + 1
x^3 + x^2 + 2x + 2
x^3 + 2x^2
x^3 + 2x^2 + 1
x^3 + 2x^2 + 2
x^3 + 2x^2 + x
x^3 + 2x^2 + x + 1
x^3 + 2x^2 + x + 2
x^3 + 2x^2 + 2x
x^3 + 2x^2 + 2x + 1
x^3 + 2x^2 + 2x + 2

Actually no, it looks like it's something in between.

I think this is correct. I think Wolfram often gives the lexicographically-first polynomial, but sometimes it is close to the first. I don't know why or how they did this. I would understand randomly-found polynomials or deterministically-first polynomials (as with the HP irreducible list you found), but an assortment is perplexing. 😞

@mhostetter
Copy link
Owner

Since the Wolfram database doesn't neatly plug in to our existing cases (lexicographically first or last, minimal terms or any terms), I see a couple of options.

Approach 1

Maybe we offer a wolfram_primitive_poly() function. I'm not the biggest fan of this because it expands the namespace and isn't the most extensible, especially if future tables/libraries don't have such nice names.

Approach 2

Perhaps we offer the Wolfram database as a separate option inside primitive_poly(). Maybe method="wolfram" to access the polynomial that Wolfram would provide. We could then raise a LookupError to the user if the polynomial does not exist in Wolfram's database.

@export
def primitive_poly(
    order: int,
    degree: int,
    terms: int | str | None = None,
    method: Literal["min", "max", "random", "matlab", "wolfram"] = "min",
    use_database: bool = True,
) -> Poly:

If we do this, then I may later graft matlab_primitive_poly(p, m) into primitive_poly(p, m, method="matlab"). It, too, is often the lexicographically-first primitive polynomial with minimal terms (but not always 😞 ). I think conway_poly() should stay separate because it is a different kind of polynomial -- primitive but also satisfying special consistency checks with Conway polynomials of smaller degree.

What are your thoughts on those two approaches? Do you have any other ideas?

@iyanmv
Copy link
Contributor Author

iyanmv commented Jan 23, 2023

I think the original code and order of if statements was correct. In your example, we do want to go to line 147. Lines 147 and 149 handle the deterministic search for the lexicographically first or last polynomial. In those cases, primitive_polys() handles the terms value. Lines 152-156 handle the random search case, in which we must first find the number of minimum terms if terms="min" is specified.

Oh, I see, I didn't realize primitive_polys() was computing the number of minimum terms. Then I think I broke it without realizing and then I unbroke it in a different and less efficient way (I guess I'm computing many times the number of min terms). I will fix this later.

See https://en.wikipedia.org/wiki/Lexicographic_order. My understanding of lexicographical order is that it is equivalent to ordering the polynomials by their integer representation. See the example below of degree-3 polynomials over GF(3) in lexicographical order.

Yes, it makes sense.

I think this is correct. I think Wolfram often gives the lexicographically-first polynomial, but sometimes it is close to the first. I don't know why or how they did this. I would understand randomly-found polynomials or deterministically-first polynomials (as with the HP irreducible list you found), but an assortment is perplexing. 😞

I contacted their support (taking advantage of the uni license) asking about how that resource was computed. I will let you know when (if) they answer.

What are your thoughts on those two approaches? Do you have any other ideas?

I guess I prefer second approach. Let me think a bit about it.

@mhostetter
Copy link
Owner

I contacted their support (taking advantage of the uni license) asking about how that resource was computed. I will let you know when (if) they answer.

Hey, that's great! I'm curious if and how they'll respond.

@mhostetter
Copy link
Owner

Regarding approaches moving forward, here's another thing to think about.

I previously had the idea of making custom polynomial databases. Basically running common cases (primitive_poly(p, m) and maybe primitive_poly(p, m, terms="min") for small primes (2, 3, 5, 7) for many degrees. Basically just using personal compute power to cache results that would otherwise take multiple minutes to find again. So even if the Wolfram database doesn't plug directly into primitive_poly(p, m, terms="min"), maybe another custom database can.

Just another idea for you to consider.

@iyanmv
Copy link
Contributor Author

iyanmv commented Feb 23, 2023

I finally got this response. At least they are looking at it.

Hello Iyán,

Thank you for the follow-up email.

It does appear that primitive polynomials in ResourceData is not behaving properly. I have forwarded an issue report to our developers with the information you provided, and added your contact information to the report so that you can be notified when it is resolved.

We are always interested in improving Mathematica, and I want to thank you once again for bringing this issue to our attention. If you run into any other problems with any of our products or have any additional questions, please do not hesitate to contact us.

Sincerely,
Mi Yan, PhD
Wolfram Technical Support
Wolfram Research Inc.
http://support.wolfram.com

lines = self._consume_to_next_section()

# Added: strip whitespace from lines to satisfy _parse_numpydoc_see_also_section()
for i in range(len(lines)):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can just do

for line in lines:
    line = line.strip()
...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Affects speed/performance poly Related to polynomials
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants