Automate mirror submission #448

Open
wants to merge 19 commits into
base: master

Torxed
Member

@Torxed Torxed commented Feb 17, 2023

Concept so far:

  • Public endpoint that doesn't require an arch account
  • Captcha to combat most spam
  • When submitted, the mirror defaults to Active: False and Public: False. These change to True:
    • Upon (automated?) verification of Completion: 100.0%
    • After an email to the maintainers is sent and they manually change the two values after inspection.
  • Tier 1 requests get automatically rejected (with a nice email explaining why) if:
    • There is no valid Tier 2 in the database matching the same name and URL endpoints.
    • The Tier 2 has been registered for fewer than MIN days.
    • The Tier 2 has a log containing multiple suspicious errors.
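
As a rough illustration of the Tier 1 rejection rules listed above, here is a minimal Python sketch. The `Tier2Record` class, the `tier1_rejection_reason` function, and both thresholds (`MIN_TIER2_AGE`, `MAX_SUSPICIOUS_ERRORS`) are hypothetical: the concept does not define MIN days or what counts as a suspicious error, and a real check would query archweb's models rather than a list of records.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import FrozenSet, Iterable, Optional

# Assumed values: the concept only says "MIN days" and "multiple" errors.
MIN_TIER2_AGE = timedelta(days=30)
MAX_SUSPICIOUS_ERRORS = 1


@dataclass(frozen=True)
class Tier2Record:
    """Hypothetical stand-in for a registered Tier 2 mirror."""
    name: str
    urls: FrozenSet[str]
    registered: date
    suspicious_errors: int = 0


def tier1_rejection_reason(
    name: str,
    urls: Iterable[str],
    tier2_records: Iterable[Tier2Record],
    today: date,
) -> Optional[str]:
    """Return why a Tier 1 request is rejected, or None if it may proceed."""
    wanted = frozenset(urls)
    match = next(
        (r for r in tier2_records if r.name == name and r.urls == wanted),
        None,
    )
    if match is None:
        return "no Tier 2 mirror with the same name and URLs is registered"
    if today - match.registered < MIN_TIER2_AGE:
        return "the Tier 2 mirror has not been registered long enough"
    if match.suspicious_errors > MAX_SUSPICIOUS_ERRORS:
        return "the Tier 2 mirror's log contains multiple suspicious errors"
    return None
```

Returning the reason as text keeps the "nice email explaining why" part simple: the same string can go straight into the rejection mail.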

Member

@jelly jelly left a comment

Cool! I've made some minor comments in the PR. Would be nice to also add a test later, happy to help if you have some questions.

.gitignore Outdated
requirements.txt Outdated
templates/mirrors/mirror_submit.html Outdated
mirrors/views/mirrorlist.py Outdated

<ul>
<li><a href="/mirrors/tier/1/">Tier 1 mirrors</a></li>
<li><a href="/mirrors/tier/2/">Tier 2 mirrors</a></li>
Member

ideally this uses the url template tag

Member Author

Good idea, I'll learn how those work and see if I can get it working.

Member Author

Not sure how to work around the fact that the tier URLs use re_path(r'^tier/(?P<tier>\d+)/$', mirrors, name='mirror-list-tier'), with one common name (I assume that's how those links work after reading up).
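
One common name is not a problem in itself: in Django the captured group is supplied as an argument when the URL is built, for example `{% url 'mirror-list-tier' tier=1 %}` in a template or `reverse('mirror-list-tier', kwargs={'tier': 1})` in Python. The stdlib-only sketch below imitates that lookup to show the idea without needing a Django project. `NAMED_PATTERNS`, `reverse_named`, and the `/mirrors/` prefix are illustrative stand-ins, not archweb code.

```python
import re

# One named pattern with a captured group, as in the re_path() quoted above.
NAMED_PATTERNS = {"mirror-list-tier": r"^tier/(?P<tier>\d+)/$"}

# Matches a named group such as (?P<tier>\d+) so it can be filled in.
NAMED_GROUP = re.compile(r"\(\?P<(\w+)>[^)]*\)")


def reverse_named(name: str, prefix: str = "/mirrors/", **kwargs) -> str:
    """Build a path by filling each named group with the matching kwarg."""
    pattern = NAMED_PATTERNS[name]
    filled = NAMED_GROUP.sub(lambda m: str(kwargs[m.group(1)]), pattern)
    path = filled.lstrip("^").rstrip("$")
    # Reject values the pattern itself would not accept (e.g. tier="x").
    if re.match(pattern, path) is None:
        raise ValueError(f"{kwargs!r} does not fit pattern {pattern!r}")
    return prefix + path


# The same name yields a different link for each tier value.
tier_links = [reverse_named("mirror-list-tier", tier=n) for n in (1, 2)]
```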

templates/mirrors/mirror_submit.html Outdated
@Torxed
Member Author

Torxed commented Feb 18, 2023

Cool! I've made some minor comments in the PR. Would be nice to also add a test later, happy to help if you have some questions.

Thanks for the feedback, me and the missus will get onto fixing the last few things before I mark this as ready instead of WIP.

The whole Django thing is still new to me and I have no idea where things are stored in the db 🙈
Tests are coming too, once I know it's all working in manual tests.

@Torxed
Member Author

Torxed commented Feb 18, 2023

Oh, and I will run a code linter once I feel ready. Just can't stand the syntax while developing :)

@Torxed
Member Author

Torxed commented Feb 18, 2023

Mail function looks good:
screenshot

Aside from the automatic check to make the mirror public once certain criteria are met, I think the PR is mostly ready.

… should be ok as some errors such as faulty DNS lookups do occur on some hosts momentarily. That we have to account for
@Torxed
Member Author

Torxed commented Feb 18, 2023

@jelly I consider the PR ready, as I might need help with writing tests for this.
But as mentioned, I can't figure out why the pip+git breaks (even in the runner):

Obtaining cssmin from git+git://github.com/fredj/cssmin.git@master#egg=cssmin (from -r requirements.txt (line 1))
  Cloning git://github.com/fredj/cssmin.git (to revision master) to ./src/cssmin
  Running command git clone --filter=blob:none --quiet git://github.com/fredj/cssmin.git /home/runner/work/archweb/archweb/src/cssmin
  fatal: unable to connect to github.com:
  github.com[0: 140.82.112.3]: errno=Connection timed out

  error: subprocess-exited-with-error
  
  × git clone --filter=blob:none --quiet git://github.com/fredj/cssmin.git /home/runner/work/archweb/archweb/src/cssmin did not run successfully.
  │ exit code: 128
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet git://github.com/fredj/cssmin.git /home/runner/work/archweb/archweb/src/cssmin did not run successfully.
│ exit code: 128
╰─> See above for output.
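
The timeout is most likely because GitHub no longer serves the unauthenticated `git://` protocol, so the clone only works over `https://`. A small sketch of the rewrite for a requirements line follows; `use_https_for_github` is a hypothetical helper, not part of archweb or pip.

```python
def use_https_for_github(requirement_line: str) -> str:
    """Rewrite a pip VCS requirement from git:// to https:// for github.com."""
    return requirement_line.replace(
        "git+git://github.com/", "git+https://github.com/", 1
    )


old = "-e git+git://github.com/fredj/cssmin.git@master#egg=cssmin"
new = use_https_for_github(old)
```

Lines that already use `https://` pass through unchanged, so the helper is safe to run over a whole requirements file.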

@Torxed Torxed marked this pull request as ready for review February 18, 2023 20:37
@@ -0,0 +1,3 @@
+def test_mirror_registration(client, mirrorurl):
+    response = client.get('/mirrorlist/submit/?name=test3&tier=2&upstream=1&admin_email=anton%40hvornum.se&alternate_email=&isos=on&rsync_user=&rsync_password=&notes=&active=True&public=True&url1-url=rsync%3A%2F%2Ftest3.com%2Farchlinux&url1-country=SE&url1-bandwidth=1234&url1-active=on&url2-url=&url2-country=&url2-bandwidth=&url2-active=on&url3-url=&url3-country=&url3-bandwidth=&url3-active=on&ip=&captcha_0=d5a017cc3851fb59898167f666759c99b42afd52&captcha_1=tdof')
Member Author

It's a crude test, but will at least test that the endpoint exists and that it accepts the request data (captcha will fail)

@Torxed
Member Author

Torxed commented Feb 19, 2023

A thought that hit me while administering the manual process was that adding a bunch of URLs to the same mirror might be desirable.

That currently won't work with this approach, but very well could in v2.0 of this endpoint.
I feel it's a bit of an undertaking to add support for 3+ URLs, especially dynamically added URLs.
So if no one objects, I'd prefer if this goes in before I start working on v2.0 of this.
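
To make the v2.0 idea a little more concrete: the current form uses numbered prefixes (`url1-`, `url2-`, `url3-`), so one way to accept any number of URLs is to group fields by prefix instead of hard-coding three. This is a minimal sketch, assuming that naming scheme is kept; `collect_mirror_urls` is a hypothetical helper and not part of this PR.

```python
import re
from collections import defaultdict
from typing import Dict, List, Mapping

# Field names look like "url1-url", "url1-country", "url2-bandwidth", ...
FIELD_RE = re.compile(r"^url(\d+)-(\w+)$")


def collect_mirror_urls(form_data: Mapping[str, str]) -> List[Dict[str, str]]:
    """Group "urlN-<field>" entries by N and drop groups without a URL."""
    grouped: Dict[int, Dict[str, str]] = defaultdict(dict)
    for key, value in form_data.items():
        match = FIELD_RE.match(key)
        if match:
            grouped[int(match.group(1))][match.group(2)] = value
    # Keep submission order by index; empty rows (no URL entered) are skipped.
    return [grouped[i] for i in sorted(grouped) if grouped[i].get("url")]
```

Because the index is parsed rather than enumerated, dynamically added rows (`url4-`, `url5-`, and so on) would be picked up without further changes.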

@jelly
Member

jelly commented Mar 12, 2023

This PR still has CI issues.

Member

@jelly jelly left a comment

Please rebase / clean up the comments and I'll take another look

requirements.txt Outdated
@@ -1,4 +1,4 @@
--e git+git://github.com/fredj/cssmin.git@master#egg=cssmin
+-e git+https://github.com/fredj/cssmin.git@master#egg=cssmin
Member

Are you on an old branch? On master it's

    -e git+https://github.com/fredj/cssmin.git@master#egg=cssmin

Member Author

Changing.

.gitignore Outdated
@@ -6,6 +6,7 @@ local_settings.py
 archweb.db
 archweb.db-*
 database.db
+/*.tar.gz
Member

Please drop this, this shouldn't be required anymore. It's also unrelated to the other changes.

Member Author

Sure it's "unrelated", but the fact remains that whenever you work from the project folder and follow some instructions about downloading a database or something similar you run the risk of pushing it globally.

If the project does not need .tar.gz files to be built or executed, there's really no risk in adding it to the ignore list. And I thought, while I'm at it, I'll add it.

@@ -3,6 +3,7 @@

 urlpatterns = [
     path('', views.generate_mirrorlist, name='mirrorlist'),
+    path('submit/', views.submit_mirror, name='mirrorsubmit'),
Member

This is not linked anywhere on archlinux.org, maybe we should add a link with some text on https://archlinux.org/mirrors.

requirements.txt Outdated
@@ -1,5 +1,5 @@
 cssmin==0.2.0
-Django==4.0.1
+Django==4.1.6
Member

This should not be in here? Please rebase and drop it.

Member Author

Sure, but saw an opportunity to update old stuff seeing as they have a few vulns in 4.0 that should be fixed in 4.1. But I'll revert it and someone can update later.

@@ -37,3 +37,5 @@ Thayer Williams <thayer@archlinux.org>
 Thomas Bächler <thomas@archlinux.org>
 Tom Willemsen <tom.willemsen@archlinux.us>
 Tyler Dence <tyzoid.d@gmail.com>
+Anton Hvornum <anton@hvornum.se>
+Nina Nick <5041877+ninchester@users.noreply.github.com>
Member

Please add this in a separate commit.

Member Author

Separate commit to the same PR, or separate PR with a separate commit?

@@ -0,0 +1,3 @@
+def test_mirror_registration(client, mirrorurl):
+    response = client.get('/mirrorlist/submit/?name=test3&tier=2&upstream=1&admin_email=anton%40hvornum.se&alternate_email=&isos=on&rsync_user=&rsync_password=&notes=&active=True&public=True&url1-url=rsync%3A%2F%2Ftest3.com%2Farchlinux&url1-country=SE&url1-bandwidth=1234&url1-active=on&url2-url=&url2-country=&url2-bandwidth=&url2-active=on&url3-url=&url3-country=&url3-bandwidth=&url3-active=on&ip=&captcha_0=d5a017cc3851fb59898167f666759c99b42afd52&captcha_1=tdof')
Member

This looks wrong, a form should never be able to be posted with a GET, it should always be a POST.

Member

Also please create the url programmatically so you can easily assert the data when expanding the test.
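
One way to follow this suggestion is to keep the form data in a dictionary and encode it only when needed, so the same values can be asserted on later. The sketch below uses only the standard library. `build_submission` and the placeholder e-mail address are assumptions, and in the real test the dictionary would presumably be passed to the test client's `post()` rather than encoded by hand.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlencode


def build_submission(overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return form data for a mirror submission; overrides replace defaults."""
    data = {
        "name": "test3",
        "tier": "2",
        "upstream": "1",
        "admin_email": "admin@example.com",  # placeholder address
        "isos": "on",
        "url1-url": "rsync://test3.com/archlinux",
        "url1-country": "SE",
        "url1-bandwidth": "1234",
        "url1-active": "on",
    }
    data.update(overrides or {})
    return data


payload = build_submission()
query = urlencode(payload)  # percent-encodes "rsync://" as "rsync%3A%2F%2F"

# Decoding gives back what was put in, so assertions can reuse `payload`.
decoded = {key: values[0] for key, values in parse_qs(query).items()}
```

Individual cases then only state what differs, for example `build_submission({"tier": "1"})` for a Tier 1 request.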

Member Author

Same as below:

@@ -19,7 +19,6 @@

import random


Member

Unneeded change

@@ -234,41 +233,64 @@ def find_mirrors_simple(request, protocol):
proto = get_object_or_404(MirrorProtocol, protocol=protocol)
return find_mirrors(request, protocols=[proto])


Member

Same as above.

Member Author

I'm sorry, but PEP8#Blank-Lines dictates two blank lines between function definitions. The rest of the code is also spaced out this way.

For reference:

return urls
def find_mirrors(request, countries=None, protocols=None, use_status=False,


<p>Before you can submit a <b>Tier 1</b> request the mirror in question must first be a registered <b>Tier 2</b> for a certain amount of time with proven reliability. Once the submitted information is verified the mirror will be visible under the appropriate tier list above. This process usually takes 5 minutes.</p>

<form id="list-generator" method="get">
Member

A Form should never be a GET, but always a POST and should have:

    <form method="post">{% csrf_token %}

Member Author

And just for reference, I took the method from this form:

@@ -5,6 +5,7 @@ Markdown==3.3.7
 bencode.py==4.0.0
 django-countries==7.3.2
 django-extensions==3.1.3
+django-simple-captcha==0.5.17
Member

Not sure if I like the captcha dependency; we don't have it for flagging packages. Do you think it's really required?

Member Author

Well, this is what people said I should put in when I asked in the Arch channels. I don't mind dropping it, but this is what was suggested to reduce spam.

To quote: "Just make a captcha thing"
