Support for yaml databags #912

e8johan · 2021-06-05T21:00:27Z

Description of Changes

Wrote at least one-line docstrings (for any new functions)
Added unit test(s) covering the changes (if testable)

Included a screenshot or animation (if affecting the UI, see Licecap)
Link to corresponding documentation pull request for getlektor.com

Added support for yaml databags, and added a test case for databags in general (testing ini, json, and yaml).

Documentation update: lektor/lektor-website#321

The databag test case checks so that a single text value can be read from databags of the supported formats (ini, json, and yaml).

lektor/databags.py

dairiki · 2022-02-24T05:40:39Z

lektor/databags.py

@@ -2,6 +2,7 @@
 import json
 import os
 from collections import OrderedDict
+import yaml


Need to add PyYAML to install_requires.

(Also, should install/run pre-commit, which, via reorder-python-imports will, err, reorder these imports.)

dairiki · 2022-02-24T15:22:27Z

lektor/databags.py

 def load_databag(filename):
    try:
        if filename.endswith(".json"):
            with open(filename, "r") as f:
                return json.load(f, object_pairs_hook=OrderedDict)
        elif filename.endswith(".ini"):
            return decode_flat_data(IniFile(filename).items(), dict_cls=OrderedDict)
+        elif filename.endswith(".yaml"):
+            with open(filename, "r") as f:
+                return ordered_load(f, yaml.SafeLoader)
        else:
            return None
    except (OSError, IOError) as e:


I realize this isn't something touched by your PR, but might as well clean it up here...

This could be more cleanly written:

except FileNotFoundError: return None

dairiki · 2022-02-24T15:42:41Z

lektor/databags.py

+# By: https://stackoverflow.com/users/650222/coldfix
+# License: CC-BY-SA 4.0 (https://stackoverflow.com/help/licensing)
+# URL: https://stackoverflow.com/questions/5121931/in-python-how-can-you-load-yaml-mappings-as-ordereddicts/
+def ordered_load(stream, Loader=yaml.SafeLoader, object_pairs_hook=OrderedDict):


I'd suggest doing away with ordered_load and constructing the OrderedLoader class just once. E.g, at top level:

class _OrderedLoader(yaml.SafeLoader): def construct_ordered_dict(self, node): self.flatten_mapping(node) return OrderedDict(self.construct_pairs(node)) _OrderedLoader.add_constructor( yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG, _OrderedLoader.construct_ordered_dict, )

Then, in load_databag (below), call yaml.load(stream, _OrderedLoader) directly.

I don't think this ordered_load hack is necessary at all (the same is written in the linked stackoverflow question). All dicts are ordered since CPython 3.6 (implementation-wise) or Python 3.7 (language spec).

All dicts are ordered since CPython 3.6 (implementation-wise) or Python 3.7 (language spec).

True, but there are still subtle differences, e.g. equality comparison for OrderedDict is order sensitive, while for dict it is not, reversed() iteration does not work for dict before python3.8 (but works for OrderedDict). In general, using OrderedDict seems more explicit when one really does care about order.

Currently, our load_databag is careful to parse to OrderedDicts for JSON and INI databags.

I guess I'd be happy either way, so long as we have tests that verify that order is preserved.

dairiki · 2022-02-24T15:44:01Z

lektor/databags.py

 def load_databag(filename):
    try:
        if filename.endswith(".json"):
            with open(filename, "r") as f:
                return json.load(f, object_pairs_hook=OrderedDict)
        elif filename.endswith(".ini"):
            return decode_flat_data(IniFile(filename).items(), dict_cls=OrderedDict)
+        elif filename.endswith(".yaml"):
+            with open(filename, "r") as f:
+                return ordered_load(f, yaml.SafeLoader)


Per suggestion in previous comment, this becomes

return yaml.load(stream, _OrderedLoader)

dairiki · 2022-02-24T16:02:12Z

tests/test_databag.py

+from lektor.environment import Environment
+from lektor.project import Project
+
+


+1 for the tests, but I would suggest these tests be converted to unit tests.

Test the data loading functionality with tests that call lektor.databags.load_databag directly.

If feeling ambitious, test the methods of the Databags class by constructing a Databags instance and calling the methods directly.

This makes it easier to check for specific issues (e.g., is the parsing to OrderedDicts working correctly?) It allows for more tests (to check various aspects of loading) while keeping total test run time low.

If you want to include an integration test to ensure that the bag() works in a template, that's okay. My personal feeling is that is the lowest priority of the tests — if that's broken, someone is likely to notice, even without the test in place.

dairiki · 2022-02-24T16:19:36Z

tests/test_databag.py

+from lektor.project import Project
+
+
+def get_databag_builder(tmpdir):


As a pytest style note, this could be structured as a fixture.

@pytest.fixture def databag_pad(): project = Project.from_path(Path(__file__).parent / "databag-project") return project.make_env(load_plugins=False).new_pad() @pytest.fixture def databag_builder(databag_pad, tmp_path): return Builder(databag_pad, tmp_path / "output") def test_databag_project_folder(databag_pad, databag_builder): ...

Or, perhaps, just override the project fixture (defined in conftest.py) which would allow for the direct use of the pad and builder fixtures from conftest.py:

@pytest.fixture def project(): return Project.from_path(Path(__file__).parent / "databag-project") def test_databag_project_folder(pad, builder): ...

nixjdm · 2022-03-03T00:08:42Z

If it's too much work to see this PR through the finish line, I think you can disregard this for now:

Having dealt with some very large amounts of yaml before, I've come to really prefer people not use the pyyaml lib, in favor of ruamel.yaml. The primary reason being that pyyaml is not yaml 1.2 compliant, which was published all the way back in 2009, and I've tripped on that and seen others trip on it many times. They aren't wildly different, but it's enough for me to prefer ruamel. Databags generally won't be very complicated, but if it's all the same, I'd rather help people avoid the pitfalls of yaml 1.1.

Secondarily, ruamel supports round-tripping yaml in a way that preserves comments, though I'm not sure that would matter here. At least it doesn't right now.

e8johan added 2 commits June 5, 2021 21:44

Add support for yaml databags

60a2877

Add tests for databags

41a4fec

The databag test case checks so that a single text value can be read from databags of the supported formats (ini, json, and yaml).

dairiki reviewed Aug 13, 2021

View reviewed changes

lektor/databags.py Outdated Show resolved Hide resolved

Fix typo

bbce5e8

dairiki force-pushed the yaml-databags branch from d361e88 to bbce5e8 Compare February 24, 2022 15:10

dairiki reviewed Feb 24, 2022

View reviewed changes

dairiki force-pushed the master branch from 4aaae00 to 1999510 Compare March 1, 2022 16:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for yaml databags #912

Support for yaml databags #912

e8johan commented Jun 5, 2021 •

edited

dairiki Feb 24, 2022

dairiki Feb 24, 2022

dairiki Feb 24, 2022

yagebu Feb 24, 2022

dairiki Feb 24, 2022

dairiki Feb 24, 2022

dairiki Feb 24, 2022

dairiki Feb 24, 2022

nixjdm commented Mar 3, 2022

		from lektor.environment import Environment
		from lektor.project import Project

		from lektor.project import Project


		def get_databag_builder(tmpdir):

Support for yaml databags #912

Are you sure you want to change the base?

Support for yaml databags #912

Conversation

e8johan commented Jun 5, 2021 • edited

Description of Changes

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nixjdm commented Mar 3, 2022

e8johan commented Jun 5, 2021 •

edited