Merge pull request #163 from ConorMacBride/improve-subtests

Improve `tests/subtests`
matplotlib · Jun 14, 2022 · 1c58479 · 1c58479
2 parents b06a22c + 34fa1bb
commit 1c58479

Showing 36 changed files with 601 additions and 612 deletions.
diff --git a/setup.cfg b/setup.cfg
@@ -30,6 +30,7 @@ install_requires =
     importlib_resources;python_version<'3.8'
     packaging
     Jinja2
+    Pillow
 
 [options.entry_points]
 pytest11 =

diff --git a/tests/subtests/README.rst b/tests/subtests/README.rst
@@ -0,0 +1,78 @@
+Testing ``pytest-mpl`` using the ``tests/subtests``
+**************************************************
+
+``pytest-mpl`` can output JSON summaries (``--mpl-generate-summary=json``) which contain lots of machine readable information relating to the internal state of the plugin while it was run.
+This test module (``test_subtest.py``) runs the test file ``subtest.py`` multiple times with different combinations of
+``pytest-mpl`` arguments.
+After each test, it compares the outputted JSON summary to a "baseline" JSON summary for that specific combination of arguments (``summaries/*.json``).
+
+These tests are very sensitive to deviations from the documented behaviour of the ``pytest-mpl`` configuration arguments.
+The exact behaviour of each comparison mode (images, hashes or both) can also be asserted.
+If the format of the hash libraries or the baseline summaries changes, ``test_subtest.py`` and ``helpers.py`` may require modifications.
+
+By using various helper functions defined in ``helpers.py``, the baseline summaries are not specific to the MPL/FreeType versions.
+This is implemented through regex in the log output, and by replacing baseline hashes with hashes in a version specific baseline hash library ``hashes/*.json`` and replacing result hashes with hashes in a version specific "baseline" result hash library ``result_hashes/*.json``.
+The baseline images used for the image comparison tests are included in ``baseline/*.png``.
+
+Generating baseline data
+========================
+
+The baseline images, hashes and summaries are generated automatically, so the data that makes the expected-to-fail tests fail does not need to be set manually.
+All test names should follow the existing convention (e.g., ``test_hdiff_imatch``) and include one flag from each of the two categories below.
+This ensures that the script generates baseline data which produces the expected test result.
+Full details on how the baselines are modified for each case are given below:
+
+**Hash comparison status flags:**
+
+:``hmatch``: Hash comparison must pass, so same hash in baseline and result hash libraries.
+
+:``hdiff``: Hash comparison must fail, so baseline hash is set to the same as the result hash except the first four characters are changed to ``d1ff``.
+
+:``hmissing``: Baseline hash must be missing, so baseline hash is deleted from the baseline hash library but not the result hash library.
+
+**Image comparison status flags:**
+
+:``imatch``: Image comparison must pass, so correct image is included in the baseline directory.
+
+:``idiff``: Image comparison must fail, so baseline image is edited to include a red cross such that the RMS is greater than the tolerance.
+
+:``idiffshape``: Image comparison must fail due to a different shape, so baseline image is resized to be half the generated width and height before saving.
+
+:``imissing``: Baseline image must be missing, so baseline image is deleted from the baseline directory.
+
+Generating for each version of matplotlib
+-----------------------------------------
+
+Baseline data should be generated for each version of matplotlib separately.
+For each version of matplotlib (defined within the tox environments in ``tox.ini``), follow the three steps in this section. (Only update one version at a time.)
+
+So that the baseline data can be recreated easily, do not make any manual adjustments to the generated files.
+Instead, update the functions which generate the baseline data.
+
+To generate the baseline hashes, result hashes and baseline images, run the following command.
+If you are generating for a new version of matplotlib, first create empty files such as ``hashes/mpl39_ft261.json`` and ``result_hashes/mpl39_ft261.json`` so that the tests know you require hashes for this version.
+
+::
+
+  MPL_UPDATE_BASELINE=1 tox -e <envname>
+
+Make sure this command runs without any failures or errors.
+Inspect the generated data to ensure it looks correct, then ``git add`` it.
+Then generate baseline summaries for the baseline hashes and images by running:
+
+::
+
+  MPL_UPDATE_SUMMARY=1 tox -e <envname>
+
+This will update/create baseline summaries in the ``summaries`` directory.
+Make sure this command runs without any failures or errors.
+It is very important that you check every change made to the baseline summaries, as these summaries define how the plugin should run internally for each test and each plugin configuration.
+If the summaries are correct, ``git add`` them.
+
+Now run tox normally to ensure the tests pass:
+
+::
+
+  tox -e <envname>
+
+If the tests pass, ``git commit`` the updated baselines.
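The naming convention described in the README can be illustrated with a small sketch. The flag names come from the README; the helper `parse_flags` is hypothetical and is not part of this PR. Note that it matches whole underscore-separated tokens, whereas the helpers in this PR use substring and glob matching.

```python
# Flag names are taken from the README; parse_flags itself is a hypothetical helper.
HASH_FLAGS = ("hmatch", "hdiff", "hmissing")
IMAGE_FLAGS = ("imatch", "idiff", "idiffshape", "imissing")


def parse_flags(test_name):
    """Return (hash_flag, image_flag) for a name such as ``test_hdiff_imatch``."""
    parts = test_name.split("_")
    # Pick the first token that is a known flag in each category.
    hash_flag = next((p for p in parts if p in HASH_FLAGS), None)
    image_flag = next((p for p in parts if p in IMAGE_FLAGS), None)
    if hash_flag is None or image_flag is None:
        raise ValueError(
            f"{test_name!r} must include one hash flag and one image flag"
        )
    return hash_flag, image_flag


print(parse_flags("test_hdiff_imatch_tolerance"))
```

A check like this could flag a misnamed test before any baseline data is generated, since the generated data depends entirely on the flags in the name.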
diff --git a/tests/subtests/baseline/test_hdiff_idiff.png b/tests/subtests/baseline/test_hdiff_idiff.png
diff --git a/tests/subtests/baseline/test_hdiff_idiff_tolerance.png b/tests/subtests/baseline/test_hdiff_idiff_tolerance.png
diff --git a/tests/subtests/baseline/test_hdiff_idiffshape.png b/tests/subtests/baseline/test_hdiff_idiffshape.png
diff --git a/tests/subtests/baseline/test_hdiff_imatch.png b/tests/subtests/baseline/test_hdiff_imatch.png
diff --git a/tests/subtests/baseline/test_hdiff_imatch_removetext.png b/tests/subtests/baseline/test_hdiff_imatch_removetext.png
diff --git a/tests/subtests/baseline/test_hdiff_imatch_savefig.png b/tests/subtests/baseline/test_hdiff_imatch_savefig.png
diff --git a/tests/subtests/baseline/test_hdiff_imatch_style.png b/tests/subtests/baseline/test_hdiff_imatch_style.png
diff --git a/tests/subtests/baseline/test_hdiff_imatch_tolerance.png b/tests/subtests/baseline/test_hdiff_imatch_tolerance.png
diff --git a/tests/subtests/baseline/test_hmatch_idiff.png b/tests/subtests/baseline/test_hmatch_idiff.png
diff --git a/tests/subtests/baseline/test_hmatch_idiffshape.png b/tests/subtests/baseline/test_hmatch_idiffshape.png
diff --git a/tests/subtests/baseline/test_hmatch_imatch.png b/tests/subtests/baseline/test_hmatch_imatch.png
diff --git a/tests/subtests/baseline/test_hmissing_idiff.png b/tests/subtests/baseline/test_hmissing_idiff.png
diff --git a/tests/subtests/baseline/test_hmissing_idiffshape.png b/tests/subtests/baseline/test_hmissing_idiffshape.png
diff --git a/tests/subtests/baseline/test_hmissing_imatch.png b/tests/subtests/baseline/test_hmissing_imatch.png
diff --git a/tests/subtests/hashes/mpl33_ft261.json b/tests/subtests/hashes/mpl33_ft261.json
@@ -1,15 +1,15 @@
 {
-  "subtests.subtest.test_hmatch_imatch": "42c391b37022e2c4edb53f5fd988e94f421905b40cea1544e62ffb3c049292a8",
-  "subtests.subtest.test_hmatch_idiff": "c14ba098dbda0988e35be5724ffb15b8e666253a4b37dec6a21203607c17473d",
-  "subtests.subtest.test_hmatch_idiffshape": "d23fa57068c6888307575623e5bdbe5e577d935910fee8d41deab426677acecb",
-  "subtests.subtest.test_hmatch_imissing": "6c07931bac1a926c88bea5d07c40c8c1ce30648712e3fc963028193863e3ae65",
-  "subtests.subtest.test_hdiff_imatch": "d1ff383721a0c395c856302be7de8a8138a2693651425dc181ede262860aef7b",
-  "subtests.subtest.test_hdiff_idiff": "d1fff55ace5ef7e45dcd9913b54e0d9970028cae59666e937ccb3586d0f76e9a",
-  "subtests.subtest.test_hdiff_idiffshape": "d1ff76e20951e78fd3dedfff6a6f8f2eab4c569860d1a0da7867114cdcdf7c2c",
-  "subtests.subtest.test_hdiff_imissing": "d1ff35845c5887c034230e02aa4b60e053c779c693867e4803e1d72dde9240f7",
-  "subtests.subtest.test_hdiff_imatch_tolerance": "d1ff6912989a4b47ea910b04edfa58cf5d756d60825ea52ad59dcde8e03d4d8b",
-  "subtests.subtest.test_hdiff_idiff_tolerance": "d1ff6912989a4b47ea910b04edfa58cf5d756d60825ea52ad59dcde8e03d4d8b",
-  "subtests.subtest.test_hdiff_imatch_savefig": "d1ff5dc3f9e8acda06b0097ee893819be62ca9adbbcca7d2300602f079a93b92",
-  "subtests.subtest.test_hdiff_imatch_style": "d1ff7692747ec72d3c8669cdb3d66468426b83ecf49a214cd918b8f5a0752a1f",
-  "subtests.subtest.test_hdiff_imatch_removetext": "d1ff0d60d794a7cdfec884463c4fe14612ab1fe7fda4bc7fa702c8f1615e1539"
+  "subtests.subtest.test_hmatch_imatch": "d21af7f9a2c1cbaf3c9bca3598f1b32b36891ac9d5db47e81a7bcaa342f7d4fc",
+  "subtests.subtest.test_hmatch_idiff": "085fcb22e9d6cfbb2bb6e0efbf749fa598be27e837c348130adc21a6dc2fc5fe",
+  "subtests.subtest.test_hmatch_idiffshape": "a8f866c3b765e274c217d49ba72c9ce3bd4b316491ffd34a124ef03643ce45b8",
+  "subtests.subtest.test_hmatch_imissing": "f06e910b6c80db28e1eb08fdb8e1ab9211434498c134d00820900a13a4f2568c",
+  "subtests.subtest.test_hdiff_imatch": "d1ff5c6bc631fbdaffa23d3d57fc027768fcded889f3b269941da859110ce282",
+  "subtests.subtest.test_hdiff_idiff": "d1ff014f73cdfea555e46a29aaac43c4394c3c4c21998e54971edb773eee6c95",
+  "subtests.subtest.test_hdiff_idiffshape": "d1ff3bafdcc8350c612bc925269fc4332dd9062a6399701067863b178568b219",
+  "subtests.subtest.test_hdiff_imissing": "d1ffd5868d14547557653c051d23d3fd48d198d3f59006dc5ba390433d6670ff",
+  "subtests.subtest.test_hdiff_imatch_tolerance": "d1ffa66a7c02ae64c8b2512021e0450cbe64c084c9d5f7e2600a7342a559c0b1",
+  "subtests.subtest.test_hdiff_idiff_tolerance": "d1ffa66a7c02ae64c8b2512021e0450cbe64c084c9d5f7e2600a7342a559c0b1",
+  "subtests.subtest.test_hdiff_imatch_savefig": "d1ff14c35f1da18de3f4ceb1901501e5a8a5a0d18eb8a7b4db5cfde170b57423",
+  "subtests.subtest.test_hdiff_imatch_style": "d1ffd00c4b99c6087d04f84ca071a5997b4ecf76cf859ce3548634e67841a79b",
+  "subtests.subtest.test_hdiff_imatch_removetext": "d1ffd7512c6d886262b1bcb4501374bfc61ef8569d24930b0258dab08e6eca9a"
 }
diff --git a/tests/subtests/hashes/mpl34_ft261.json b/tests/subtests/hashes/mpl34_ft261.json
@@ -1,15 +1,15 @@
 {
-  "subtests.subtest.test_hmatch_imatch": "573f4c1482192b7b15bbe4f2bd370ae3b5ab40c9afa441543b87e15a71e9f672",
-  "subtests.subtest.test_hmatch_idiff": "8b5c00365e6f784d5c8947091f09a92fd7d222e790431f297b4848f173e64857",
-  "subtests.subtest.test_hmatch_idiffshape": "2ee75301c4de2dcb9f839b278c6371be2e751de40b131213e375d4dcc5542382",
-  "subtests.subtest.test_hmatch_imissing": "fd069e642e3b154c24077a4996b545e1c4dbffdbed34ea5ad34c7b36873af68f",
-  "subtests.subtest.test_hdiff_imatch": "d1ffdde5a6682dc6abba7121f5df702c3664b1ce09593534fc0d7c3514eb07e1",
-  "subtests.subtest.test_hdiff_idiff": "d1ff61bdd0efd1cdd343eabf73af6f20439d4834ab5503a574ac7ec28e0c2b43",
-  "subtests.subtest.test_hdiff_idiffshape": "d1ffae8ab2b65de3fa297be17ce973ff871e703c9550679e9566179dd785f6eb",
-  "subtests.subtest.test_hdiff_imissing": "d1ff63d656d7a586cc4e498bc32b970f8cb7c7c47bbd2fec33b931219fc0690e",
-  "subtests.subtest.test_hdiff_imatch_tolerance": "d1ffe85fda98298347c274adae98ca7728f9bb2444ca8a49295145b0727b8c96",
-  "subtests.subtest.test_hdiff_idiff_tolerance": "d1ffe85fda98298347c274adae98ca7728f9bb2444ca8a49295145b0727b8c96",
-  "subtests.subtest.test_hdiff_imatch_savefig": "d1ffc2c68c2d34c03a89ab394e3c11349b76594d0c8837374daef299ac227568",
-  "subtests.subtest.test_hdiff_imatch_style": "d1ffd1b702c7bbd810370b12e46ecea4b9c9eb87b743397f1d4a50177e7ba7f7",
-  "subtests.subtest.test_hdiff_imatch_removetext": "d1fff83a43cb89f5e13923f532fe5c9bedbf7d13585533efef2f4051c4968b5e"
+  "subtests.subtest.test_hmatch_imatch": "d21af7f9a2c1cbaf3c9bca3598f1b32b36891ac9d5db47e81a7bcaa342f7d4fc",
+  "subtests.subtest.test_hmatch_idiff": "085fcb22e9d6cfbb2bb6e0efbf749fa598be27e837c348130adc21a6dc2fc5fe",
+  "subtests.subtest.test_hmatch_idiffshape": "a8f866c3b765e274c217d49ba72c9ce3bd4b316491ffd34a124ef03643ce45b8",
+  "subtests.subtest.test_hmatch_imissing": "f06e910b6c80db28e1eb08fdb8e1ab9211434498c134d00820900a13a4f2568c",
+  "subtests.subtest.test_hdiff_imatch": "d1ff5c6bc631fbdaffa23d3d57fc027768fcded889f3b269941da859110ce282",
+  "subtests.subtest.test_hdiff_idiff": "d1ff014f73cdfea555e46a29aaac43c4394c3c4c21998e54971edb773eee6c95",
+  "subtests.subtest.test_hdiff_idiffshape": "d1ff3bafdcc8350c612bc925269fc4332dd9062a6399701067863b178568b219",
+  "subtests.subtest.test_hdiff_imissing": "d1ffd5868d14547557653c051d23d3fd48d198d3f59006dc5ba390433d6670ff",
+  "subtests.subtest.test_hdiff_imatch_tolerance": "d1ffa66a7c02ae64c8b2512021e0450cbe64c084c9d5f7e2600a7342a559c0b1",
+  "subtests.subtest.test_hdiff_idiff_tolerance": "d1ffa66a7c02ae64c8b2512021e0450cbe64c084c9d5f7e2600a7342a559c0b1",
+  "subtests.subtest.test_hdiff_imatch_savefig": "d1ff14c35f1da18de3f4ceb1901501e5a8a5a0d18eb8a7b4db5cfde170b57423",
+  "subtests.subtest.test_hdiff_imatch_style": "d1ffd00c4b99c6087d04f84ca071a5997b4ecf76cf859ce3548634e67841a79b",
+  "subtests.subtest.test_hdiff_imatch_removetext": "d1ffd7512c6d886262b1bcb4501374bfc61ef8569d24930b0258dab08e6eca9a"
 }
diff --git a/tests/subtests/hashes/mpl35_ft261.json b/tests/subtests/hashes/mpl35_ft261.json
@@ -1,15 +1,15 @@
 {
-  "subtests.subtest.test_hmatch_imatch": "4a47c9b7920779cc83eabe2bbb64b9c40745d9d8abfa57857f93a5d8f12a5a03",
-  "subtests.subtest.test_hmatch_idiff": "2b48790b0a2cee4b41cdb9820336acaf229ba811ae21c6a92b4b92838843adfa",
-  "subtests.subtest.test_hmatch_idiffshape": "e3fed4ad2d22aff2cd771c5503dcb30c6161b21d154430ededa5faa1ec54366e",
-  "subtests.subtest.test_hmatch_imissing": "e937fa1997d088c904ca35b1ab542e2285ea47b84df976490380f9c5f5b5f8ae",
-  "subtests.subtest.test_hdiff_imatch": "d1ff8f315d44b06de8f45d937af46a67bd1389edd6e4cde32f9feb4b7472284f",
-  "subtests.subtest.test_hdiff_idiff": "d1ff21206ef454a25417e3ba0bd3235c84518cb202c2d1fa7afcfdfcde5fdcde",
-  "subtests.subtest.test_hdiff_idiffshape": "d1ff745bbdf2aac6743dbd830afb1877c2ac25a5f926d4f6483c1e24d19a0580",
-  "subtests.subtest.test_hdiff_imissing": "d1ff11cfa34db3a5819ac4127704e86acf27d24d1ea2410718853d3d7e1d6ae0",
-  "subtests.subtest.test_hdiff_imatch_tolerance": "d1ff3273d63a2a26a27e788ff0f090e86c9df7f9f191b7c566321c57de8266d6",
-  "subtests.subtest.test_hdiff_idiff_tolerance": "d1ff3273d63a2a26a27e788ff0f090e86c9df7f9f191b7c566321c57de8266d6",
-  "subtests.subtest.test_hdiff_imatch_savefig": "d1ff803a4b4026d8c6dc0ab950228793ea255cd9b6c629c39db9e6315a9af6bc",
-  "subtests.subtest.test_hdiff_imatch_style": "d1ffde36c2bad7dca131e4cbbfe229f882b5beec62750fb7da29314fd6a1ff13",
-  "subtests.subtest.test_hdiff_imatch_removetext": "d1ff6cf613c6836c1b1202abaae69cf65bc2232a8e31ab1040454bedc8e31e7a"
+  "subtests.subtest.test_hmatch_imatch": "d21af7f9a2c1cbaf3c9bca3598f1b32b36891ac9d5db47e81a7bcaa342f7d4fc",
+  "subtests.subtest.test_hmatch_idiff": "085fcb22e9d6cfbb2bb6e0efbf749fa598be27e837c348130adc21a6dc2fc5fe",
+  "subtests.subtest.test_hmatch_idiffshape": "a8f866c3b765e274c217d49ba72c9ce3bd4b316491ffd34a124ef03643ce45b8",
+  "subtests.subtest.test_hmatch_imissing": "f06e910b6c80db28e1eb08fdb8e1ab9211434498c134d00820900a13a4f2568c",
+  "subtests.subtest.test_hdiff_imatch": "d1ff5c6bc631fbdaffa23d3d57fc027768fcded889f3b269941da859110ce282",
+  "subtests.subtest.test_hdiff_idiff": "d1ff014f73cdfea555e46a29aaac43c4394c3c4c21998e54971edb773eee6c95",
+  "subtests.subtest.test_hdiff_idiffshape": "d1ff3bafdcc8350c612bc925269fc4332dd9062a6399701067863b178568b219",
+  "subtests.subtest.test_hdiff_imissing": "d1ffd5868d14547557653c051d23d3fd48d198d3f59006dc5ba390433d6670ff",
+  "subtests.subtest.test_hdiff_imatch_tolerance": "d1ffa66a7c02ae64c8b2512021e0450cbe64c084c9d5f7e2600a7342a559c0b1",
+  "subtests.subtest.test_hdiff_idiff_tolerance": "d1ffa66a7c02ae64c8b2512021e0450cbe64c084c9d5f7e2600a7342a559c0b1",
+  "subtests.subtest.test_hdiff_imatch_savefig": "d1ff14c35f1da18de3f4ceb1901501e5a8a5a0d18eb8a7b4db5cfde170b57423",
+  "subtests.subtest.test_hdiff_imatch_style": "d1ffd00c4b99c6087d04f84ca071a5997b4ecf76cf859ce3548634e67841a79b",
+  "subtests.subtest.test_hdiff_imatch_removetext": "d1ffd7512c6d886262b1bcb4501374bfc61ef8569d24930b0258dab08e6eca9a"
 }
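The regenerated hash libraries above follow the README convention: every `hdiff` entry starts with `d1ff` and no `hmissing` entry is present. A minimal sketch of a check for that convention follows; `convention_violations` is a hypothetical helper, not part of this PR, and the two sample hashes are copied from the diff above.

```python
def convention_violations(hashes):
    """Return keys in a baseline hash library that break the flag convention."""
    violations = []
    for test, value in hashes.items():
        name = test.rsplit(".", 1)[-1]  # e.g. "test_hdiff_imatch"
        if "hmissing" in name:
            # hmissing hashes should have been deleted from the baseline library
            violations.append(test)
        elif "hdiff" in name and not value.startswith("d1ff"):
            # hdiff hashes should have their first four characters set to d1ff
            violations.append(test)
    return violations


# Two entries copied from the regenerated hash libraries in this diff.
library = {
    "subtests.subtest.test_hmatch_imatch":
        "d21af7f9a2c1cbaf3c9bca3598f1b32b36891ac9d5db47e81a7bcaa342f7d4fc",
    "subtests.subtest.test_hdiff_imatch":
        "d1ff5c6bc631fbdaffa23d3d57fc027768fcded889f3b269941da859110ce282",
}
print(convention_violations(library))
```

For these two entries the list of violations is empty.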
diff --git a/tests/subtests/helpers.py b/tests/subtests/helpers.py
@@ -1,8 +1,12 @@
+import os
 import re
 import json
 from pathlib import Path
 
-__all__ = ['diff_summary', 'assert_existence', 'patch_summary', 'apply_regex']
+from PIL import Image, ImageDraw
+
+__all__ = ['diff_summary', 'assert_existence', 'patch_summary', 'apply_regex',
+           'remove_specific_hashes', 'transform_hashes', 'transform_images']
 
 
 class MatchError(Exception):
@@ -255,3 +259,86 @@ def apply_regex(file, regex_paths, regex_strs):
 
     with open(file, 'w') as f:
         json.dump(summary, f, indent=2)
+
+
+def remove_specific_hashes(summary_file):
+    """Replace all hashes in a summary file with placeholder values.
+
+    This is done because the actual hashes used for testing are taken from
+    separate files for each specific matplotlib version.
+    """
+
+    baseline_placeholder = "###_BASELINE_HASH_###"
+    result_placeholder = "###_RESULT_HASH_###"
+
+    with open(summary_file, "r") as f:
+        summary = json.load(f)
+
+    for test in summary.keys():
+
+        # Get actual hashes
+        baseline = summary[test]["baseline_hash"]
+        result = summary[test]["result_hash"]
+
+        # Replace with placeholders (if summary has hashes)
+        if baseline is not None:
+            summary[test]["baseline_hash"] = baseline_placeholder
+            summary[test]["status_msg"] = \
+                summary[test]["status_msg"].replace(baseline, baseline_placeholder)
+        if result is not None:
+            summary[test]["result_hash"] = result_placeholder
+            summary[test]["status_msg"] = \
+                summary[test]["status_msg"].replace(result, result_placeholder)
+
+    with open(summary_file, "w") as f:
+        json.dump(summary, f, indent=2)
+
+
+def transform_hashes(hash_file):
+    """Make hash comparison tests fail correctly.
+
+    Makes hashes of *hdiff* tests in hash_file fail hash comparison
+    and removes hashes of *hmissing* tests, which should be missing.
+    """
+
+    with open(hash_file, "r") as f:
+        hashes = json.load(f)
+
+    for test in list(hashes.keys()):
+        h = hashes[test]
+        if "hdiff" in test and h is not None:
+            # Replace first four characters with d1ff to force mismatch
+            hashes[test] = "d1ff" + h[4:]
+        if "hmissing" in test and h is not None:
+            # Remove hashes that should be missing
+            del hashes[test]
+
+    with open(hash_file, "w") as f:
+        json.dump(hashes, f, indent=2)
+
+
+def transform_images(baseline_path):
+    """Make image comparison tests fail correctly.
+
+    Makes *idiff* images under baseline_path fail image comparison, halves the
+    size of *idiffshape* images, and deletes images for *imissing* tests.
+    """
+
+    # Delete imissing files
+    for file in baseline_path.glob("**/*imissing*.png"):
+        file.unlink()
+
+    # Add red cross to idiff files
+    for file in baseline_path.glob("**/*idiff*.png"):
+        with Image.open(file) as im:
+            draw = ImageDraw.Draw(im)
+            draw.line((0, 0) + im.size, "#f00", 3)
+            draw.line((0, im.size[1], im.size[0], 0), "#f00", 3)
+            im.save(file)
+
+    # Resize idiffshape files
+    for file in baseline_path.glob("**/*idiffshape*.png"):
+        with Image.open(file) as im:
+            (width, height) = (im.width // 2, im.height // 2)
+            im_resized = im.resize((width, height))
+            im_resized.save(file)
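The placeholder substitution performed by `remove_specific_hashes` can be sketched in memory, without reading or writing a summary file. This is an illustrative variant only: `with_placeholders` is a hypothetical name, the sample summary entry and its `status_msg` wording are made up, and the sketch uses `dict.get` where the helper above indexes the keys directly.

```python
import copy

BASELINE_PLACEHOLDER = "###_BASELINE_HASH_###"
RESULT_PLACEHOLDER = "###_RESULT_HASH_###"


def with_placeholders(summary):
    """Return a copy of ``summary`` with concrete hashes replaced by placeholders."""
    summary = copy.deepcopy(summary)  # leave the caller's data untouched
    for entry in summary.values():
        for key, placeholder in (
            ("baseline_hash", BASELINE_PLACEHOLDER),
            ("result_hash", RESULT_PLACEHOLDER),
        ):
            value = entry.get(key)
            if value is not None:
                entry[key] = placeholder
                # The hash may also be quoted in the status message.
                entry["status_msg"] = entry["status_msg"].replace(value, placeholder)
    return summary


# Made-up entry for illustration; real summaries contain more keys.
example = {
    "subtests.subtest.test_hdiff_imatch": {
        "status_msg": "Hash aaaa1111 does not match hash bbbb2222.",
        "baseline_hash": "bbbb2222",
        "result_hash": "aaaa1111",
    }
}
print(with_placeholders(example))
```

Because the stored summaries then contain only placeholders, the same baseline summary can be reused across matplotlib versions, with the real hashes substituted from the version-specific hash libraries at test time.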