[GH-395] Port to Python 3 #409

Draft
wants to merge 360 commits into
base: master
from

Conversation

elsiehupp

@elsiehupp elsiehupp commented Jun 9, 2021

Fixes #395.

EDIT 2: I've removed the entire body of this comment because it's difficult to keep it up to date with progress elsewhere.

EDIT 3: I've personally unsubscribed from this thread because I was getting an email for every single commit.

To interact with this draft pull request, please consult the README on the forked repository. If you run into any problems, opening an issue there will be more effective than commenting about it here.

@nemobis
Member

nemobis commented Jun 9, 2021 via email

On 09/06/21 23:34, Elsie Hupp wrote:
> I had to port both poster and wikitools to Python 3 in order to get this to work, so I included those in their own folders in the repository.

Thank you very much, but can you also post them upstream?

@elsiehupp
Author

Probably yes? I figured it was easiest to do it here to begin with.

@elsiehupp elsiehupp force-pushed the python3 branch 2 times, most recently from ea6bfea to d3d26f0 Compare June 9, 2021 23:03
@elsiehupp elsiehupp marked this pull request as draft June 9, 2021 23:03
@elsiehupp
Author

Currently what I’m stuck on is URL encoding, which can probably be simplified by porting to Requests and/or urllib3.
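
For reference, a minimal sketch of URL encoding in Python 3 using only the standard library, on the assumption that the trouble comes from non-ASCII page titles. The parameter names are illustrative MediaWiki API parameters, not necessarily what dumpgenerator.py sends:

```python
from urllib.parse import parse_qs, urlencode

# Illustrative query parameters; the title is deliberately non-ASCII.
params = {"action": "query", "titles": "新聞", "format": "json"}

# In Python 3, urlencode() accepts str values and percent-encodes them
# as UTF-8 by default, so no manual .encode() call is needed.
query = urlencode(params)
print(query)

# parse_qs() reverses the encoding, which makes round trips easy to check.
decoded = parse_qs(query)
print(decoded["titles"])
```

Requests performs an equivalent encoding step when it is given a `params` dict, which is why porting to it could remove most of the hand-written encoding code.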

@elsiehupp elsiehupp force-pushed the python3 branch 2 times, most recently from f56b1bc to 66fd814 Compare June 15, 2021 13:42
@nemobis
Member

nemobis commented Aug 23, 2021

You probably didn't want to commit .vscode/settings.json

@elsiehupp
Author

> You probably didn't want to commit .vscode/settings.json

I think I just forgot to revert that when I flattened my commits (though the change stopped being relevant once I started using pipenv).

FWIW I think it might be worth my migrating from pipenv to poetry for the purpose of facilitating distribution on, e.g., PyPI.

Do you have any other immediate feedback? IIRC the main issue I was running into was with the test suite, so I haven’t been able to fully validate the new code.

@elsiehupp elsiehupp changed the title [GH-395] Port to Python 3 (including poster and wikitools) [GH-395] Port to Python 3 Aug 24, 2021
@elsiehupp
Author

wikitools seems to be abandoned by its maintainer (though I haven’t tried particularly hard to reach him), so I went ahead and published the version from this pull request on PyPI as wikitools3, which allowed me to specify it as an external dependency. I did some digging, and it turned out that someone else had already made a Python 3 version of poster called poster3, so I used that as the dependency for wikitools3.

I migrated wikitools3 to use poetry, which seems to come with a lot of advantages, so I might want to migrate from pipenv to poetry here, as well.

@elsiehupp
Author

Hi @GreenReaper—can you try the updated version?

In the cloned wikiteam directory, try:

$ git pull
$ poetry install
$ poetry run python dumpgenerator.py --xml --xmlrevisions https://furry.wiki.opencura.com

I ran the above commands myself several times, so the encoding issues should be fixed?

Thanks again for helping me find bugs!

@GreenReaper

GreenReaper commented Sep 28, 2021

That works, thanks! However I tried it with an older wiki, in an attempt to ensure that encoding was saving correctly, and it seems the --xml case (no --xmlrevisions) is still broken on xmlfile.write in generateXMLDump, both on this wiki and the opencura one.

For this wiki I needed to add |class="mediawiki to the search regex in getWikiEngine, because it was otherwise detected as Unknown (since we removed the generator head lines as superfluous :-) - I tried --force but it didn't seem to do anything:

# poetry run python dumpgenerator.py --xml --curonly https://zh.wikifur.com/ --api https://zh.wikifur.com/w/api.php --index https://zh.wikifur.com/w/index.php
Checking API... https://zh.wikifur.com/w/api.php
API is OK: https://zh.wikifur.com/w/api.php
Checking index.php... https://zh.wikifur.com/w/index.php
index.php is OK
#########################################################################
# Welcome to DumpGenerator 0.4.0-alpha by WikiTeam (GPL v3)                   #
# More info at: https://github.com/WikiTeam/wikiteam                    #
#########################################################################

#########################################################################
# Copyright (C) 2011-2021 WikiTeam developers                           #

# This program is free software: you can redistribute it and/or modify  #
# it under the terms of the GNU General Public License as published by  #
# the Free Software Foundation, either version 3 of the License, or     #
# (at your option) any later version.                                   #
#                                                                       #
# This program is distributed in the hope that it will be useful,       #
# but WITHOUT ANY WARRANTY; without even the implied warranty of        #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         #
# GNU General Public License for more details.                          #
#                                                                       #
# You should have received a copy of the GNU General Public License     #
# along with this program.  If not, see <http://www.gnu.org/licenses/>. #
#########################################################################

Analysing https://zh.wikifur.com/w/api.php
Trying generating a new dump into a new directory...
Loading page titles from namespaces = all
Excluding titles from namespaces = None
20 namespaces found
    Retrieving titles in the namespace 0
    541 titles retrieved in the namespace 0
    Retrieving titles in the namespace 1
    30 titles retrieved in the namespace 1
    Retrieving titles in the namespace 2
    61 titles retrieved in the namespace 2
    Retrieving titles in the namespace 3
    93 titles retrieved in the namespace 3
    Retrieving titles in the namespace 4
    183 titles retrieved in the namespace 4
    Retrieving titles in the namespace 5
    2 titles retrieved in the namespace 5
    Retrieving titles in the namespace 6
    422 titles retrieved in the namespace 6
    Retrieving titles in the namespace 7
    1 titles retrieved in the namespace 7
    Retrieving titles in the namespace 8
    36 titles retrieved in the namespace 8
    Retrieving titles in the namespace 9
    1 titles retrieved in the namespace 9
    Retrieving titles in the namespace 10
    140 titles retrieved in the namespace 10
    Retrieving titles in the namespace 11
    4 titles retrieved in the namespace 11
    Retrieving titles in the namespace 12
    49 titles retrieved in the namespace 12
    Retrieving titles in the namespace 13
    2 titles retrieved in the namespace 13
    Retrieving titles in the namespace 14
    241 titles retrieved in the namespace 14
    Retrieving titles in the namespace 15
    2 titles retrieved in the namespace 15
    Retrieving titles in the namespace 828
    0 titles retrieved in the namespace 828
    Retrieving titles in the namespace 829
    0 titles retrieved in the namespace 829
    Retrieving titles in the namespace 100
    4 titles retrieved in the namespace 100
    Retrieving titles in the namespace 101
    1 titles retrieved in the namespace 101
Titles saved at... zhwikifurcom_w-20210928-titles.txt
1813 page titles loaded
https://zh.wikifur.com/w/api.php
Retrieving the XML for every page from "start"
Traceback (most recent call last):
  File "dumpgenerator.py", line 2850, in <module>
    main()
  File "dumpgenerator.py", line 2841, in main
    createNewDump(config=config, other=other)
  File "dumpgenerator.py", line 2361, in createNewDump
    generateXMLDump(config=config, titles=titles, session=other["session"])
  File "dumpgenerator.py", line 852, in generateXMLDump
    xmlfile.write(bytes(header, 'utf-8'))
TypeError: write() argument must be str, not bytes

Incidentally, it says it saved in "a new directory" but it doesn't say which directory, which can be confusing.

@elsiehupp
Author

@GreenReaper Okay!

I’m not 100% sure what you’re describing with |class="mediawiki, so I didn’t do anything with that.

I fixed two more encoding bugs. I also changed the default path to be a subdirectory of the parent directory rather than the working directory (so that the default path isn’t inside the cloned repository) and added a console message that prints when the --path argument is not used:

No --path argument provided. Defaulting to:
  ../[domain_prefix]-[date]-wikidump
Which expands to:
  ../zhwikifurcom_w-20210928-wikidump

(I could probably make the argument parsing more verbose across the board.)

Anyway, in the cloned wikiteam directory, try:

$ git pull
$ poetry install
$ poetry run python wikiteam3/dumpgenerator.py [args]

Note that dumpgenerator.py is now wikiteam3/dumpgenerator.py (a change that has to do with packaging and isn’t quite relevant here, yet).

I tried the following, and while I didn’t let it run its entire course, I didn’t get any errors for the first minute or two it was running:

$ poetry run python wikiteam3/dumpgenerator.py --xml --curonly https://zh.wikifur.com/ --api https://zh.wikifur.com/w/api.php --index https://zh.wikifur.com/w/index.php

@GreenReaper

GreenReaper commented Sep 28, 2021

Yeah, I could have been clearer there. I meant getWikiEngine's detection, without which it refused to proceed; I changed to:

    elif re.search(
        '(?im)(alt="Powered by MediaWiki"|<meta name="generator" content="MediaWiki|class="mediawiki)',
        result,
    ):
        wikiengine = "MediaWiki"

There was a similar regex in checkIndex:

     if re.search(
        '(This wiki is powered by|<h2 id="mw-version-license">|meta name="generator" content="MediaWiki|class="mediawiki)',
         raw,
     ):
         return True
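
To show the effect in isolation, here is a small standalone sketch using the same pattern as the getWikiEngine change above. `looks_like_mediawiki` is a hypothetical name for illustration, and the HTML snippets are made up:

```python
import re

# Same alternatives as in the modified getWikiEngine check, including the
# added class="mediawiki" marker for wikis that drop the generator meta tag.
MEDIAWIKI_MARKERS = re.compile(
    r'(?im)(alt="Powered by MediaWiki"'
    r'|<meta name="generator" content="MediaWiki'
    r'|class="mediawiki)'
)


def looks_like_mediawiki(html: str) -> bool:
    """Hypothetical helper: True if the page carries any MediaWiki marker."""
    return MEDIAWIKI_MARKERS.search(html) is not None


# A page without the generator line is still detected via the body class.
print(looks_like_mediawiki('<body class="mediawiki ltr sitedir-ltr">'))
print(looks_like_mediawiki('<body class="dokuwiki">'))
```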

I tried the commands above and it worked for a while, then broke (trying to save the constant footer string?):

Downloaded 1810 pages
    新聞:Krystal的三明治在Fur Affinity爆红, 1 edit
    新聞:中文WikiFur的前綴名全面中文化, 1 edit
    新聞:英语 WikiFur 迁入 wikifur.com, 1 edit
    新聞討論:羽鲨, 1 edit
Traceback (most recent call last):
  File "wikiteam3/dumpgenerator.py", line 2854, in <module>
    main()
  File "wikiteam3/dumpgenerator.py", line 2845, in main
    createNewDump(config=config, other=other)
  File "wikiteam3/dumpgenerator.py", line 2365, in createNewDump
    generateXMLDump(config=config, titles=titles, session=other["session"])
  File "wikiteam3/dumpgenerator.py", line 883, in generateXMLDump
    xmlfile.write(bytes(footer, 'utf-8'))
TypeError: write() argument must be str, not bytes

@GreenReaper

GreenReaper commented Sep 28, 2021

Unfortunately the config file was allegedly not written so it had to start again. In fact, it looks like it was written, but if it's meant to be a text file, it's unreadable, so maybe that bit needs to be changed?

As for the footer, I tried changing the existing line close to the end of generateXMLDump that mentions the footer to

xmlfile.write(str(footer))

and will see how that goes... though on consideration, it really should already be a str, so perhaps that is unnecessary? Anyway, adjusting that line resulted in a completed XML file, so it's definitely the issue.
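
For what it's worth, a minimal sketch of the underlying rule, assuming xmlfile is opened in text mode (which the traceback suggests). The header and footer strings are placeholders, not the real dump contents:

```python
import os
import tempfile

# Placeholder strings standing in for the real XML header and footer.
header = '<mediawiki xml:lang="zh">\n'
footer = "</mediawiki>\n"

path = os.path.join(tempfile.mkdtemp(), "dump.xml")

# Text mode: write str and let the file object do the UTF-8 encoding.
with open(path, "w", encoding="utf-8") as xmlfile:
    xmlfile.write(header)
    xmlfile.write(footer)

# Binary mode: write bytes that were encoded explicitly.
with open(path, "ab") as xmlfile:
    xmlfile.write("<!-- 新聞 -->\n".encode("utf-8"))

# Mixing the two (bytes into a text-mode file) raises the TypeError above.
mixing_error = None
try:
    with open(path, "a", encoding="utf-8") as xmlfile:
        xmlfile.write(bytes(footer, "utf-8"))
except TypeError as error:
    mixing_error = str(error)
    print(mixing_error)

with open(path, encoding="utf-8") as xmlfile:
    content = xmlfile.read()
```

So if footer is already a str, a plain xmlfile.write(footer) should be enough and the str() wrapper is harmless but redundant.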

@elsiehupp
Author

I’m actually running the test again myself, though I added --delay 1 to avoid a timeout.

You can pull the latest changes again if you want.

Regarding the config file, I ran into the same issue myself, so that’s another thing I need to fix, lol.

Also, by the way, it can be helpful if you refer to line numbers, like, e.g. with the blocks where you added |class="mediawiki. (I was able to find them on my own, but line numbers can make it easier.)

@elsiehupp
Author

--delay 1 slows things down pretty dramatically, so you might want to try a smaller fraction of a second.

@elsiehupp
Author

Aaaaand the delay printout doesn’t display fractional seconds, so I fixed that.
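
A sketch of the kind of fix involved, assuming the printout previously used integer formatting. `format_delay` is a hypothetical helper, not the name used in the code:

```python
def format_delay(delay: float) -> str:
    """Hypothetical helper: render a delay without losing fractions."""
    # The "g" format drops trailing zeros, so 1.0 prints as "1"
    # while 0.25 still prints as "0.25".
    return f"{delay:g}"


# Integer formatting truncates the fractional part entirely.
print("%d" % 0.5)

# General formatting keeps it.
print(format_delay(0.5))
print(format_delay(1.0))
```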

robkam and others added 30 commits August 21, 2023 10:27
too many insteads
Fixes
#82.

This branch is based on
#177, so
it cannot be merged before the CONTRIBUTING branch is merged.

Note that I believe it is less than ideal for me, Elsie Hupp, to have
sole responsibility for CoC reports, but I also think it's better to
have an initial framework in place rather than none at all.

---------

Signed-off-by: Elsie Hupp <github@elsiehupp.com>
Co-authored-by: Rob Kam <robkam@gmx.com>
Re: How to find api.php.
Branch `python3` refactored by [Sourcery](https://sourcery.ai/github/).

If you're happy with these changes, merge this Pull Request using the
*Squash and merge* strategy.

See our documentation
[here](https://docs.sourcery.ai/GitHub/Using-Sourcery-for-GitHub/).

<details>
<summary>Run Sourcery locally</summary>
<p>
Reduce the feedback loop during development by using the Sourcery editor
plugin:
</p>
<ul>
<li><a href="https://sourcery.ai/download/?editor=vscode">VS
Code</a></li>
<li><a
href="https://sourcery.ai/download/?editor=pycharm">PyCharm</a></li>
</ul>
</details>

<details>
<summary>Review changes via command line</summary>
<p>To manually merge these changes, make sure you're on the
<code>python3</code> branch, then run:</p>
<pre>
git fetch origin sourcery/python3
git merge --ff-only FETCH_HEAD
git reset HEAD^
</pre>
</details>

Help us
[improve](https://research.typeform.com/to/j06Spdfr?type=branch_refactor&github_login=elsiehupp&base_repo=https%3A%2F%2Fgithub.com%2Fmediawiki-client-tools%2Fmediawiki-scraper.git&base_remote_ref=python3&base_ref=python3&base_sha=6d044c0c62c509751f57dfcb8edeca0906a974ab&head_repo=https%3A%2F%2Fgithub.com%2Fmediawiki-client-tools%2Fmediawiki-scraper.git&head_ref=sourcery%2Fpython3)
this pull request!

---------

Co-authored-by: Sourcery AI <>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
I've created [a new
repository](https://github.com/mediawiki-client-tools/upstream-to-sort)
for these files, in case we ever feel like doing anything with them.

Signed-off-by: Elsie Hupp <github@elsiehupp.com>
)

Fixes
#65.

Addresses @yzqzss'
[comment](https://github.com/orgs/mediawiki-client-tools/discussions/61#discussioncomment-6831973):

> * `scraper` is an evil name. (for webmasters)

Uses similar naming to
[`mediawiki-dump`](https://github.com/macbre/mediawiki-dump), from one
of the past contributors to `wikitools`. (I'm not 100% sure, but this
might be a more modern replacement for `wikitools`... either way,
potentially someone to be friendly with!)

I already created [a placeholder on
PyPI](https://pypi.org/project/mediawiki-dump-generator/), and it seems
like we're like 99% of the way there to being able to publish there.

I can change the name of this repository to match the new name right
when I merge this.

Signed-off-by: Elsie Hupp <github@elsiehupp.com>
I didn't complete placating `mypy` quite yet, but it isn't strictly
necessary if the tests pass. (I can keep working on this, but if the
tests still pass, it would be nice to be able to merge this for now.)

Signed-off-by: Elsie Hupp <github@elsiehupp.com>
Reverts #186

This shouldn't have been merged quite yet, since it didn't pass the
tests.
rm line that doesn't make sense.
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
scraper to dump-generator

---------

Co-authored-by: Elsie Hupp <github@elsiehupp.com>
Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.16 to
1.26.17.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/urllib3/urllib3/releases">urllib3's
releases</a>.</em></p>
<blockquote>
<h2>1.26.17</h2>
<ul>
<li>Added the <code>Cookie</code> header to the list of headers to strip
from requests when redirecting to a different host. As before, different
headers can be set via <code>Retry.remove_headers_on_redirect</code>.
(GHSA-v845-jxx5-vc9f)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/urllib3/urllib3/blob/main/CHANGES.rst">urllib3's
changelog</a>.</em></p>
<blockquote>
<h1>1.26.17 (2023-10-02)</h1>
<ul>
<li>Added the <code>Cookie</code> header to the list of headers to strip
from requests when redirecting to a different host. As before, different
headers can be set via <code>Retry.remove_headers_on_redirect</code>.
(<code>[#3139](urllib3/urllib3#3139)
&lt;https://github.com/urllib3/urllib3/pull/3139&gt;</code>_)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/urllib3/urllib3/commit/c9016bf464751a02b7e46f8b86504f47d4238784"><code>c9016bf</code></a>
Release 1.26.17</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/01220354d389cd05474713f8c982d05c9b17aafb"><code>0122035</code></a>
Backport GHSA-v845-jxx5-vc9f (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3139">#3139</a>)</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/e63989f97d206e839ab9170c8a76e3e097cc60e8"><code>e63989f</code></a>
Fix installing <code>brotli</code> extra on Python 2.7</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/2e7a24d08713a0131f0b3c7197889466d645cc49"><code>2e7a24d</code></a>
[1.26] Configure OS for RTD to fix building docs</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/57181d6ea910ac7cb2ff83345d9e5e0eb816a0d0"><code>57181d6</code></a>
[1.26] Improve error message when calling urllib3.request() (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3058">#3058</a>)</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/3c0148048a523325819377b23fc67f8d46afc3aa"><code>3c01480</code></a>
[1.26] Run coverage even with failed jobs</li>
<li>See full diff in <a
href="https://github.com/urllib3/urllib3/compare/1.26.16...1.26.17">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=urllib3&package-manager=pip&previous-version=1.26.16&new-version=1.26.17)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/mediawiki-client-tools/mediawiki-dump-generator/network/alerts).

</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.17 to
1.26.18.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/urllib3/urllib3/releases">urllib3's
releases</a>.</em></p>
<blockquote>
<h2>1.26.18</h2>
<ul>
<li>Made body stripped from HTTP requests changing the request method to
GET after HTTP 303 &quot;See Other&quot; redirect responses.
(GHSA-g4mx-q9vg-27p4)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/urllib3/urllib3/blob/main/CHANGES.rst">urllib3's
changelog</a>.</em></p>
<blockquote>
<h1>1.26.18 (2023-10-17)</h1>
<ul>
<li>Made body stripped from HTTP requests changing the request method to
GET after HTTP 303 &quot;See Other&quot; redirect responses.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/urllib3/urllib3/commit/9c2c2307dd1d6af504e09aac0326d86ee3597a0b"><code>9c2c230</code></a>
Release 1.26.18 (<a
href="https://redirect.github.com/urllib3/urllib3/issues/3159">#3159</a>)</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/b594c5ceaca38e1ac215f916538fb128e3526a36"><code>b594c5c</code></a>
Merge pull request from GHSA-g4mx-q9vg-27p4</li>
<li><a
href="https://github.com/urllib3/urllib3/commit/944f0eb134485f41bc531be52de12ba5a37bca73"><code>944f0eb</code></a>
[1.26] Use vendored six in urllib3.contrib.securetransport</li>
<li>See full diff in <a
href="https://github.com/urllib3/urllib3/compare/1.26.17...1.26.18">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=urllib3&package-manager=pip&previous-version=1.26.17&new-version=1.26.18)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)


---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
One intra-documentation link gave me a 404 so I updated it to the
filename. All other markdown links look ok.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
I found that some URLs in the test file redirect to non-wiki URLs, so I
updated and removed them to avoid unnecessary errors when running the check.
…228)

This commit is a backport from
[saveweb/wikiteam3](https://github.com/saveweb/wikiteam3); all credit
goes to the original author.

Closes #170.
Fixes the size mismatch error that occurs when a wiki does server-side image
resizing/compression without re-uploading or updating the data in the wiki.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
In commit 265855d, I forgot to change the MediaWiki version in
site_info_test.py, so this commit fixes that.

I also renamed the function in this file from `test_mediawiki_1_16` to
`test_mediawiki_version_match` to make it clear what it really does.
When I downloaded a Fandom wiki, I found that it sometimes throws HTTP error
403 at random.

After investigation, this error is caused by some of the user agents that
DumpGenerator uses, so this PR fixes the problem by using the latest Chrome
version on Windows, and by using only one user agent so that it is easier
to debug in future.

This PR also updates the MediaWiki version in the test to match the version
of the wiki we use for testing.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
L113 make command use pip3 - because it doesn't work with pip
fixes #211 
Uses Path(wikidir) to get the path compatible for both Windows and Linux
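
A minimal sketch of the idea, not the actual change. The directory name is taken from the earlier example run, and the `images` subdirectory and file name are assumptions for illustration. `Path` picks the flavour of the host OS, so the pure variants are used here to show both results on any machine:

```python
from pathlib import PurePosixPath, PureWindowsPath

wikidir = "zhwikifurcom_w-20210928-wikidump"

# Joining with "/" on a path object inserts the right separator for the
# flavour, instead of hard-coding "/" or "\\" in a string.
posix_path = PurePosixPath(wikidir) / "images" / "Example.png"
windows_path = PureWindowsPath(wikidir) / "images" / "Example.png"

print(posix_path)
print(windows_path)
```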

---------

Co-authored-by: Elsie Hupp <github@elsiehupp.com>
Bumps [idna](https://github.com/kjd/idna) from 3.4 to 3.7.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/kjd/idna/releases">idna's
releases</a>.</em></p>
<blockquote>
<h2>v3.7</h2>
<h2>What's Changed</h2>
<ul>
<li>Fix issue where specially crafted inputs to encode() could take
exceptionally long amount of time to process. [CVE-2024-3651]</li>
</ul>
<p>Thanks to Guido Vranken for reporting the issue.</p>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/kjd/idna/compare/v3.6...v3.7">https://github.com/kjd/idna/compare/v3.6...v3.7</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/kjd/idna/blob/master/HISTORY.rst">idna's
changelog</a>.</em></p>
<blockquote>
<p>3.7 (2024-04-11)
++++++++++++++++</p>
<ul>
<li>Fix issue where specially crafted inputs to encode() could
take exceptionally long amount of time to process. [CVE-2024-3651]</li>
</ul>
<p>Thanks to Guido Vranken for reporting the issue.</p>
<p>3.6 (2023-11-25)
++++++++++++++++</p>
<ul>
<li>Fix regression to include tests in source distribution.</li>
</ul>
<p>3.5 (2023-11-24)
++++++++++++++++</p>
<ul>
<li>Update to Unicode 15.1.0</li>
<li>String codec name is now &quot;idna2008&quot; as overriding the
system codec
&quot;idna&quot; was not working.</li>
<li>Fix typing error for codec encoding</li>
<li>&quot;setup.cfg&quot; has been added for this release due to some
downstream
lack of adherence to PEP 517. Should be removed in a future release
so please prepare accordingly.</li>
<li>Removed reliance on a symlink for the &quot;idna-data&quot; tool to
comport
with PEP 517 and the Python Packaging User Guide for sdist
archives.</li>
<li>Added security reporting protocol for project</li>
</ul>
<p>Thanks Jon Ribbens, Diogo Teles Sant'Anna, Wu Tingfeng for
contributions
to this release.</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/kjd/idna/commit/1d365e17e10d72d0b7876316fc7b9ca0eebdd38d"><code>1d365e1</code></a>
Release v3.7</li>
<li><a
href="https://github.com/kjd/idna/commit/c1b3154939907fab67c5754346afaebe165ce8e6"><code>c1b3154</code></a>
Merge pull request <a
href="https://redirect.github.com/kjd/idna/issues/172">#172</a> from
kjd/optimize-contextj</li>
<li><a
href="https://github.com/kjd/idna/commit/0394ec76ff022813e770ba1fd89658790ea35623"><code>0394ec7</code></a>
Merge branch 'master' into optimize-contextj</li>
<li><a
href="https://github.com/kjd/idna/commit/cd58a23173d2b0a40b95ee680baf3e59e8d33966"><code>cd58a23</code></a>
Merge pull request <a
href="https://redirect.github.com/kjd/idna/issues/152">#152</a> from
elliotwutingfeng/dev</li>
<li><a
href="https://github.com/kjd/idna/commit/5beb28b9dd77912c0dd656d8b0fdba3eb80222e7"><code>5beb28b</code></a>
More efficient resolution of joiner contexts</li>
<li><a
href="https://github.com/kjd/idna/commit/1b121483ed04d9576a1291758f537e1318cddc8b"><code>1b12148</code></a>
Update ossf/scorecard-action to v2.3.1</li>
<li><a
href="https://github.com/kjd/idna/commit/d516b874c3388047934938a500c7488d52c4e067"><code>d516b87</code></a>
Update Github actions/checkout to v4</li>
<li><a
href="https://github.com/kjd/idna/commit/c095c75943413c75ebf8ac74179757031b7f80b7"><code>c095c75</code></a>
Merge branch 'master' into dev</li>
<li><a
href="https://github.com/kjd/idna/commit/60a0a4cb61ec6834d74306bd8a1fa46daac94c98"><code>60a0a4c</code></a>
Fix typo in GitHub Actions workflow key</li>
<li><a
href="https://github.com/kjd/idna/commit/5918a0ef8034379c2e409ae93ee11d24295bb201"><code>5918a0e</code></a>
Merge branch 'master' into dev</li>
<li>Additional commits viewable in <a
href="https://github.com/kjd/idna/compare/v3.4...v3.7">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=idna&package-manager=pip&previous-version=3.4&new-version=3.7)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)

You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/mediawiki-client-tools/mediawiki-dump-generator/network/alerts).

</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.1 to 4.66.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/tqdm/tqdm/releases">tqdm's
releases</a>.</em></p>
<blockquote>
<h2>tqdm v4.66.3 stable</h2>
<ul>
<li><code>cli</code>: <code>eval</code> safety (fixes CVE-2024-34062,
GHSA-g7vv-2v7x-gj9p)</li>
</ul>
<h2>tqdm v4.66.2 stable</h2>
<ul>
<li><code>pandas</code>: add <code>DataFrame.progress_map</code> (<a
href="https://redirect.github.com/tqdm/tqdm/issues/1549">#1549</a>)</li>
<li><code>notebook</code>: fix HTML padding (<a
href="https://redirect.github.com/tqdm/tqdm/issues/1506">#1506</a>)</li>
<li><code>keras</code>: fix resuming training when
<code>verbose&gt;=2</code> (<a
href="https://redirect.github.com/tqdm/tqdm/issues/1508">#1508</a>)</li>
<li>fix <code>format_num</code> negative fractions missing leading zero
(<a
href="https://redirect.github.com/tqdm/tqdm/issues/1548">#1548</a>)</li>
<li>fix Python 3.12 <code>DeprecationWarning</code> on
<code>import</code> (<a
href="https://redirect.github.com/tqdm/tqdm/issues/1519">#1519</a>)</li>
<li>linting: use f-strings (<a
href="https://redirect.github.com/tqdm/tqdm/issues/1549">#1549</a>)</li>
<li>update tests (<a
href="https://redirect.github.com/tqdm/tqdm/issues/1549">#1549</a>)
<ul>
<li>fix <code>pandas</code> warnings</li>
<li>fix <code>asv</code> (<a
href="https://redirect.github.com/airspeed-velocity/asv/issues/1323">airspeed-velocity/asv#1323</a>)</li>
<li>fix macos <code>notebook</code> docstring indentation</li>
</ul>
</li>
<li>CI: bump actions (<a
href="https://redirect.github.com/tqdm/tqdm/issues/1549">#1549</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/tqdm/tqdm/commit/4e613f84ed2ae029559f539464df83fa91feb316"><code>4e613f8</code></a>
Merge pull request from GHSA-g7vv-2v7x-gj9p</li>
<li><a
href="https://github.com/tqdm/tqdm/commit/b53348c73080b4edeb30b4823d1fa0d8d2c06721"><code>b53348c</code></a>
cli: eval safety</li>
<li><a
href="https://github.com/tqdm/tqdm/commit/cc372d09dcd5a5eabdc6ed4cf365bdb0be004d44"><code>cc372d0</code></a>
bump version, merge pull request <a
href="https://redirect.github.com/tqdm/tqdm/issues/1549">#1549</a> from
tqdm/devel</li>
<li><a
href="https://github.com/tqdm/tqdm/commit/e9f0c05097dc167031575391d83240d37556f098"><code>e9f0c05</code></a>
use PyPI trusted publishing</li>
<li><a
href="https://github.com/tqdm/tqdm/commit/7323d5bcc9b032d525f9d6468a9713f5be9c4174"><code>7323d5b</code></a>
slight makefile clean</li>
<li><a
href="https://github.com/tqdm/tqdm/commit/5306125133d76e0f9326d747d29781fefe273c77"><code>5306125</code></a>
tests: bump pre-commit</li>
<li><a
href="https://github.com/tqdm/tqdm/commit/4a6fd4f690a4add231f4bef601521ed9bee513fb"><code>4a6fd4f</code></a>
fix datetime.utcfromtimestamp py3.12 warning (<a
href="https://redirect.github.com/tqdm/tqdm/issues/1519">#1519</a>)</li>
<li><a
href="https://github.com/tqdm/tqdm/commit/6f13759f4a0e1047a09732e72f6d07e44d3e6855"><code>6f13759</code></a>
tests: fix macos notebook indentation</li>
<li><a
href="https://github.com/tqdm/tqdm/commit/3abcd2ac90ecb01ac7f64071af600f803eab6a21"><code>3abcd2a</code></a>
tests: fix asv</li>
<li><a
href="https://github.com/tqdm/tqdm/commit/a4d15c8e2f6c7322c1a1cd1d845927f037281da1"><code>a4d15c8</code></a>
tests: fix pandas warnings</li>
<li>Additional commits viewable in <a
href="https://github.com/tqdm/tqdm/compare/v4.66.1...v4.66.3">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=tqdm&package-manager=pip&previous-version=4.66.1&new-version=4.66.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Bumps [requests](https://github.com/psf/requests) from 2.31.0 to 2.32.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/psf/requests/releases">requests's
releases</a>.</em></p>
<blockquote>
<h2>v2.32.0</h2>
<h2>2.32.0 (2024-05-20)</h2>
<h2>🐍 PYCON US 2024 EDITION 🐍</h2>
<p><strong>Security</strong></p>
<ul>
<li>Fixed an issue where setting <code>verify=False</code> on the first
request from a
Session will cause subsequent requests to the <em>same origin</em> to
also ignore
cert verification, regardless of the value of <code>verify</code>.
(<a
href="https://github.com/psf/requests/security/advisories/GHSA-9wx4-h78v-vm56">https://github.com/psf/requests/security/advisories/GHSA-9wx4-h78v-vm56</a>)</li>
</ul>
<p><strong>Improvements</strong></p>
<ul>
<li><code>verify=True</code> now reuses a global SSLContext which should
improve
request time variance between first and subsequent requests. It should
also minimize certificate load time on Windows systems when using a
Python
version built with OpenSSL 3.x. (<a
href="https://redirect.github.com/psf/requests/issues/6667">#6667</a>)</li>
<li>Requests now supports optional use of character detection
(<code>chardet</code> or <code>charset_normalizer</code>) when
repackaged or vendored.
This enables <code>pip</code> and other projects to minimize their
vendoring
surface area. The <code>Response.text()</code> and
<code>apparent_encoding</code> APIs
will default to <code>utf-8</code> if neither library is present. (<a
href="https://redirect.github.com/psf/requests/issues/6702">#6702</a>)</li>
</ul>
<p><strong>Bugfixes</strong></p>
<ul>
<li>Fixed bug in length detection where emoji length was incorrectly
calculated in the request content-length. (<a
href="https://redirect.github.com/psf/requests/issues/6589">#6589</a>)</li>
<li>Fixed deserialization bug in JSONDecodeError. (<a
href="https://redirect.github.com/psf/requests/issues/6629">#6629</a>)</li>
<li>Fixed bug where an extra leading <code>/</code> (path separator)
could lead
urllib3 to unnecessarily reparse the request URI. (<a
href="https://redirect.github.com/psf/requests/issues/6644">#6644</a>)</li>
</ul>
<p><strong>Deprecations</strong></p>
<ul>
<li>Requests has officially added support for CPython 3.12 (<a
href="https://redirect.github.com/psf/requests/issues/6503">#6503</a>)</li>
<li>Requests has officially added support for PyPy 3.9 and 3.10 (<a
href="https://redirect.github.com/psf/requests/issues/6641">#6641</a>)</li>
<li>Requests has officially dropped support for CPython 3.7 (<a
href="https://redirect.github.com/psf/requests/issues/6642">#6642</a>)</li>
<li>Requests has officially dropped support for PyPy 3.7 and 3.8 (<a
href="https://redirect.github.com/psf/requests/issues/6641">#6641</a>)</li>
</ul>
<p><strong>Documentation</strong></p>
<ul>
<li>Various typo fixes and doc improvements.</li>
</ul>
<p><strong>Packaging</strong></p>
<ul>
<li>Requests has started adopting some modern packaging practices.
The source files for the projects (formerly <code>requests</code>) is
now located
in <code>src/requests</code> in the Requests sdist. (<a
href="https://redirect.github.com/psf/requests/issues/6506">#6506</a>)</li>
<li>Starting in Requests 2.33.0, Requests will migrate to a PEP 517
build system
using <code>hatchling</code>. This should not impact the average user,
but extremely old
versions of packaging utilities may have issues with the new packaging
format.</li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/matthewarmand"><code>@​matthewarmand</code></a>
made their first contribution in <a
href="https://redirect.github.com/psf/requests/pull/6258">psf/requests#6258</a></li>
<li><a href="https://github.com/cpzt"><code>@​cpzt</code></a> made their
first contribution in <a
href="https://redirect.github.com/psf/requests/pull/6456">psf/requests#6456</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/psf/requests/commit/d6ebc4a2f1f68b7e355fb7e4dd5ffc0845547f9f"><code>d6ebc4a</code></a>
v2.32.0</li>
<li><a
href="https://github.com/psf/requests/commit/9a40d1277807f0a4f26c9a37eea8ec90faa8aadc"><code>9a40d12</code></a>
Avoid reloading root certificates to improve concurrent performance (<a
href="https://redirect.github.com/psf/requests/issues/6667">#6667</a>)</li>
<li><a
href="https://github.com/psf/requests/commit/0c030f78d24f29a459dbf39b28b4cc765e2153d7"><code>0c030f7</code></a>
Merge pull request <a
href="https://redirect.github.com/psf/requests/issues/6702">#6702</a>
from nateprewitt/no_char_detection</li>
<li><a
href="https://github.com/psf/requests/commit/555b870eb19d497ddb67042645420083ec8efb02"><code>555b870</code></a>
Allow character detection dependencies to be optional in post-packaging
steps</li>
<li><a
href="https://github.com/psf/requests/commit/d6dded3f00afcf56a7e866cb0732799045301eb0"><code>d6dded3</code></a>
Merge pull request <a
href="https://redirect.github.com/psf/requests/issues/6700">#6700</a>
from franekmagiera/update-redirect-to-invalid-uri-test</li>
<li><a
href="https://github.com/psf/requests/commit/bf24b7d8d17da34be720c19e5978b2d3bf94a53b"><code>bf24b7d</code></a>
Use an invalid URI that will not cause httpbin to throw 500</li>
<li><a
href="https://github.com/psf/requests/commit/2d5f54779ad174035c5437b3b3c1146b0eaf60fe"><code>2d5f547</code></a>
Pin 3.8 and 3.9 runners back to macos-13 (<a
href="https://redirect.github.com/psf/requests/issues/6688">#6688</a>)</li>
<li><a
href="https://github.com/psf/requests/commit/f1bb07d39b74d6444e333879f8b8a3d9dd4d2311"><code>f1bb07d</code></a>
Merge pull request <a
href="https://redirect.github.com/psf/requests/issues/6687">#6687</a>
from psf/dependabot/github_actions/github/codeql-act...</li>
<li><a
href="https://github.com/psf/requests/commit/60047ade64b0b882cbc94e047198818ab580911e"><code>60047ad</code></a>
Bump github/codeql-action from 3.24.0 to 3.25.0</li>
<li><a
href="https://github.com/psf/requests/commit/31ebb8102c00f8cf8b396a6356743cca4362e07b"><code>31ebb81</code></a>
Merge pull request <a
href="https://redirect.github.com/psf/requests/issues/6682">#6682</a>
from frenzymadness/pytest8</li>
<li>Additional commits viewable in <a
href="https://github.com/psf/requests/compare/v2.31.0...v2.32.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=requests&package-manager=pip&previous-version=2.31.0&new-version=2.32.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Bumps [pymysql](https://github.com/PyMySQL/PyMySQL) from 1.1.0 to 1.1.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/PyMySQL/PyMySQL/releases">pymysql's
releases</a>.</em></p>
<blockquote>
<h2>v1.1.1</h2>
<blockquote>
<p>[!WARNING]
This release fixes a vulnerability (CVE-2024-36039).
All users are recommended to update to this version.</p>
<p>If you can not update soon, check the input value from untrusted
source has an expected type.
Only dict input from untrusted source can be an attack vector.</p>
</blockquote>
<h2>What's Changed</h2>
<ul>
<li>Prohibit dict parameter for <code>Cursor.execute()</code>. It didn't
produce valid SQL
and might cause SQL injection. (CVE-2024-36039)</li>
<li>Added ssl_key_password param by <a
href="https://github.com/svaskov"><code>@​svaskov</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1145">PyMySQL/PyMySQL#1145</a></li>
</ul>
<h2>Merged PRs</h2>
<ul>
<li>Add support for Python 3.12 by <a
href="https://github.com/hugovk"><code>@​hugovk</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1134">PyMySQL/PyMySQL#1134</a></li>
<li>chore(deps): update actions/checkout action to v4 by <a
href="https://github.com/renovate"><code>@​renovate</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1136">PyMySQL/PyMySQL#1136</a></li>
<li>Update codecov/codecov-action action to v4 by <a
href="https://github.com/renovate"><code>@​renovate</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1137">PyMySQL/PyMySQL#1137</a></li>
<li>ci: use codecov@v3 by <a
href="https://github.com/methane"><code>@​methane</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1142">PyMySQL/PyMySQL#1142</a></li>
<li>chore(deps): update dessant/lock-threads action to v5 by <a
href="https://github.com/renovate"><code>@​renovate</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1141">PyMySQL/PyMySQL#1141</a></li>
<li>doc: use rtd theme by <a
href="https://github.com/methane"><code>@​methane</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1143">PyMySQL/PyMySQL#1143</a></li>
<li>use Ruff as formatter by <a
href="https://github.com/methane"><code>@​methane</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1144">PyMySQL/PyMySQL#1144</a></li>
<li>chore(deps): update dependency sphinx-rtd-theme to v2 by <a
href="https://github.com/renovate"><code>@​renovate</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1147">PyMySQL/PyMySQL#1147</a></li>
<li>chore(deps): update actions/setup-python action to v5 by <a
href="https://github.com/renovate"><code>@​renovate</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1152">PyMySQL/PyMySQL#1152</a></li>
<li>chore(deps): update github/codeql-action action to v3 by <a
href="https://github.com/renovate"><code>@​renovate</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1154">PyMySQL/PyMySQL#1154</a></li>
<li>chore(deps): update codecov/codecov-action action to v4 by <a
href="https://github.com/renovate"><code>@​renovate</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1158">PyMySQL/PyMySQL#1158</a></li>
<li>Support error packet without sqlstate by <a
href="https://github.com/methane"><code>@​methane</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1160">PyMySQL/PyMySQL#1160</a></li>
<li>test json - mariadb without JSON type by <a
href="https://github.com/grooverdan"><code>@​grooverdan</code></a> in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1165">PyMySQL/PyMySQL#1165</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/hugovk"><code>@​hugovk</code></a> made
their first contribution in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1134">PyMySQL/PyMySQL#1134</a></li>
<li><a href="https://github.com/svaskov"><code>@​svaskov</code></a> made
their first contribution in <a
href="https://redirect.github.com/PyMySQL/PyMySQL/pull/1145">PyMySQL/PyMySQL#1145</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/PyMySQL/PyMySQL/compare/v1.1.0...v1.1.1">https://github.com/PyMySQL/PyMySQL/compare/v1.1.0...v1.1.1</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/PyMySQL/PyMySQL/blob/main/CHANGELOG.md">pymysql's
changelog</a>.</em></p>
<blockquote>
<h2>v1.1.1</h2>
<p>Release date: 2024-05-21</p>
<blockquote>
<p>[!WARNING]
This release fixes a vulnerability (CVE-2024-36039).
All users are recommended to update to this version.</p>
<p>If you can not update soon, check the input value from
untrusted source has an expected type. Only dict input
from untrusted source can be an attack vector.</p>
</blockquote>
<ul>
<li>Prohibit dict parameter for <code>Cursor.execute()</code>. It didn't
produce valid SQL
and might cause SQL injection. (CVE-2024-36039)</li>
<li>Added ssl_key_password param. <a
href="https://redirect.github.com/PyMySQL/PyMySQL/issues/1145">#1145</a></li>
</ul>
</blockquote>
</details>
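
The interim mitigation mentioned in the notes above (checking that input from an untrusted source has the expected type before it reaches `Cursor.execute()`) might look roughly like the sketch below. `ensure_safe_params` is a hypothetical helper, not part of PyMySQL, and it assumes query parameters should always be a tuple or list of scalar values.

```python
from typing import Any, Sequence

# Hypothetical helper (not part of PyMySQL). Assumption: parameters passed to
# Cursor.execute() should be a tuple or list containing only scalar values.
_SCALAR_TYPES = (str, int, float, bytes, bool, type(None))


def ensure_safe_params(params: Any) -> Sequence[Any]:
    """Return params unchanged if it is a tuple/list of scalars, else raise TypeError."""
    # Reject dicts (and any other container type) passed as the parameter set.
    if not isinstance(params, (tuple, list)):
        raise TypeError(
            f"expected tuple or list of parameters, got {type(params).__name__}"
        )
    # Reject dicts or other nested structures used as individual values.
    for value in params:
        if not isinstance(value, _SCALAR_TYPES):
            raise TypeError(f"unsupported parameter type: {type(value).__name__}")
    return params
```

A caller would wrap the untrusted value before executing, for example `cursor.execute("SELECT name FROM page WHERE id = %s", ensure_safe_params(user_input))`, where the query and `user_input` are placeholders. Upgrading to 1.1.1 remains the actual fix; this only narrows what untrusted input can reach the driver in the meantime.
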
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/PyMySQL/PyMySQL/commit/2cab9ecc641e962565c6254a5091f90c47f59b35"><code>2cab9ec</code></a>
v1.1.1</li>
<li><a
href="https://github.com/PyMySQL/PyMySQL/commit/521e40050cb386a499f68f483fefd144c493053c"><code>521e400</code></a>
forbid dict parameter</li>
<li><a
href="https://github.com/PyMySQL/PyMySQL/commit/7f032a699d55340f05101deb4d7d4f63db4adc11"><code>7f032a6</code></a>
remove coveralls from requirements</li>
<li><a
href="https://github.com/PyMySQL/PyMySQL/commit/69f6c7439bee14784e0ea70ae107af6446cc0c67"><code>69f6c74</code></a>
ruff format</li>
<li><a
href="https://github.com/PyMySQL/PyMySQL/commit/b4ed6884a1105df0a27f948f52b3e81d5585634f"><code>b4ed688</code></a>
test json - mariadb without JSON type (<a
href="https://redirect.github.com/PyMySQL/PyMySQL/issues/1165">#1165</a>)</li>
<li><a
href="https://github.com/PyMySQL/PyMySQL/commit/bbd049f40db9c696574ce6f31669880042c56d79"><code>bbd049f</code></a>
Support error packet without sqlstate (<a
href="https://redirect.github.com/PyMySQL/PyMySQL/issues/1160">#1160</a>)</li>
<li><a
href="https://github.com/PyMySQL/PyMySQL/commit/9694747ae619e88b792a8e0b4c08036572452584"><code>9694747</code></a>
pyupgrade</li>
<li><a
href="https://github.com/PyMySQL/PyMySQL/commit/1f0b7856de4008e7e4c1e8c1b215d5d4dfaecd1a"><code>1f0b785</code></a>
chore(deps): update codecov/codecov-action action to v4 (<a
href="https://redirect.github.com/PyMySQL/PyMySQL/issues/1158">#1158</a>)</li>
<li><a
href="https://github.com/PyMySQL/PyMySQL/commit/1e28be81c24dde66f8acbf4c5e24f60d6b5e72e7"><code>1e28be8</code></a>
chore(deps): update github/codeql-action action to v3 (<a
href="https://redirect.github.com/PyMySQL/PyMySQL/issues/1154">#1154</a>)</li>
<li><a
href="https://github.com/PyMySQL/PyMySQL/commit/f13f054abcc18b39855a760a84be0a517f0da658"><code>f13f054</code></a>
chore(deps): update actions/setup-python action to v5 (<a
href="https://redirect.github.com/PyMySQL/PyMySQL/issues/1152">#1152</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/PyMySQL/PyMySQL/compare/v1.1.0...v1.1.1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=pymysql&package-manager=pip&previous-version=1.1.0&new-version=1.1.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Using dumpgenerator.py with Python 3