New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for yaml databags #912
base: master
Are you sure you want to change the base?
Conversation
The databag test case checks so that a single text value can be read from databags of the supported formats (ini, json, and yaml).
d361e88
to
bbce5e8
Compare
@@ -2,6 +2,7 @@ | |||
import json | |||
import os | |||
from collections import OrderedDict | |||
import yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to add PyYAML to install_requires.
(Also, should install/run pre-commit, which, via reorder-python-imports will, err, reorder these imports.)
def load_databag(filename): | ||
try: | ||
if filename.endswith(".json"): | ||
with open(filename, "r") as f: | ||
return json.load(f, object_pairs_hook=OrderedDict) | ||
elif filename.endswith(".ini"): | ||
return decode_flat_data(IniFile(filename).items(), dict_cls=OrderedDict) | ||
elif filename.endswith(".yaml"): | ||
with open(filename, "r") as f: | ||
return ordered_load(f, yaml.SafeLoader) | ||
else: | ||
return None | ||
except (OSError, IOError) as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize this isn't something touched by your PR, but might as well clean it up here...
This could be more cleanly written:
except FileNotFoundError:
return None
# By: https://stackoverflow.com/users/650222/coldfix | ||
# License: CC-BY-SA 4.0 (https://stackoverflow.com/help/licensing) | ||
# URL: https://stackoverflow.com/questions/5121931/in-python-how-can-you-load-yaml-mappings-as-ordereddicts/ | ||
def ordered_load(stream, Loader=yaml.SafeLoader, object_pairs_hook=OrderedDict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest doing away with ordered_load
and constructing the OrderedLoader
class just once. E.g, at top level:
class _OrderedLoader(yaml.SafeLoader):
def construct_ordered_dict(self, node):
self.flatten_mapping(node)
return OrderedDict(self.construct_pairs(node))
_OrderedLoader.add_constructor(
yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
_OrderedLoader.construct_ordered_dict,
)
Then, in load_databag
(below), call yaml.load(stream, _OrderedLoader)
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this ordered_load hack is necessary at all (the same is written in the linked stackoverflow question). All dicts are ordered since CPython 3.6 (implementation-wise) or Python 3.7 (language spec).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All dicts are ordered since CPython 3.6 (implementation-wise) or Python 3.7 (language spec).
True, but there are still subtle differences, e.g. equality comparison for OrderedDict
is order sensitive, while for dict
it is not, reversed()
iteration does not work for dict
before python3.8 (but works for OrderedDict
). In general, using OrderedDict
seems more explicit when one really does care about order.
Currently, our load_databag
is careful to parse to OrderedDict
s for JSON and INI databags.
I guess I'd be happy either way, so long as we have tests that verify that order is preserved.
def load_databag(filename): | ||
try: | ||
if filename.endswith(".json"): | ||
with open(filename, "r") as f: | ||
return json.load(f, object_pairs_hook=OrderedDict) | ||
elif filename.endswith(".ini"): | ||
return decode_flat_data(IniFile(filename).items(), dict_cls=OrderedDict) | ||
elif filename.endswith(".yaml"): | ||
with open(filename, "r") as f: | ||
return ordered_load(f, yaml.SafeLoader) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per suggestion in previous comment, this becomes
return yaml.load(stream, _OrderedLoader)
from lektor.environment import Environment | ||
from lektor.project import Project | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for the tests, but I would suggest these tests be converted to unit tests.
Test the data loading functionality with tests that call lektor.databags.load_databag
directly.
If feeling ambitious, test the methods of the Databags
class by constructing a Databags
instance and calling the methods directly.
This makes it easier to check for specific issues (e.g., is the parsing to OrderedDicts working correctly?) It allows for more tests (to check various aspects of loading) while keeping total test run time low.
If you want to include an integration test to ensure that the bag()
works in a template, that's okay. My personal feeling is that is the lowest priority of the tests — if that's broken, someone is likely to notice, even without the test in place.
from lektor.project import Project | ||
|
||
|
||
def get_databag_builder(tmpdir): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a pytest style note, this could be structured as a fixture.
@pytest.fixture
def databag_pad():
project = Project.from_path(Path(__file__).parent / "databag-project")
return project.make_env(load_plugins=False).new_pad()
@pytest.fixture
def databag_builder(databag_pad, tmp_path):
return Builder(databag_pad, tmp_path / "output")
def test_databag_project_folder(databag_pad, databag_builder):
...
Or, perhaps, just override the project
fixture (defined in conftest.py
) which would allow for the direct use of the pad
and builder
fixtures from conftest.py
:
@pytest.fixture
def project():
return Project.from_path(Path(__file__).parent / "databag-project")
def test_databag_project_folder(pad, builder):
...
If it's too much work to see this PR through the finish line, I think you can disregard this for now: Having dealt with some very large amounts of yaml before, I've come to really prefer people not use the pyyaml lib, in favor of ruamel.yaml. The primary reason being that pyyaml is not yaml 1.2 compliant, which was published all the way back in 2009, and I've tripped on that and seen others trip on it many times. They aren't wildly different, but it's enough for me to prefer ruamel. Databags generally won't be very complicated, but if it's all the same, I'd rather help people avoid the pitfalls of yaml 1.1. Secondarily, ruamel supports round-tripping yaml in a way that preserves comments, though I'm not sure that would matter here. At least it doesn't right now. |
Description of Changes
Added support for yaml databags, and added a test case for databags in general (testing ini, json, and yaml).
Documentation update: lektor/lektor-website#321