Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent exposure of multi-value parameters #292

Open
marcfrederick opened this issue Feb 20, 2023 · 1 comment
Open

Inconsistent exposure of multi-value parameters #292

marcfrederick opened this issue Feb 20, 2023 · 1 comment

Comments

@marcfrederick
Copy link
Member

marcfrederick commented Feb 20, 2023

Problem

We are currently exposing MediaWiki API parameters, that allow for multiple values, inconsistently. In some cases, we expect a str of the form param1|param2, whereas in other cases, we expect a list. This inconsistency is confusing and there seems to be no clear logic behind when we use which approach.

Example

Site.revisions is an example where two multi-value parameters are used. However, one of them (revids) expects a list, while the other (prop) expects a str.

def revisions(self, revids, prop='ids|timestamp|flags|comment|user'):
    ... 

Proposed Solution

We should standardize our approach to multi-value API parameters and be consistent around the library. IMO the cleanest (and backwards compatible) way to resolve this would be, to consistently accept both types for multi-value parameters (i.e. str and list[str]) and have a single function that converts them as necessary before being passed to the API.

From a quick look at API:Data formats, this function could look something like the following:

def encode_multivalue(value):
    if isinstance(value, str) or not isinstance(value, Iterable):
        # Strings and values that consist of a single item are returned as-is to 
        # maintain backwards-compatibility.
        encoded_value = value
    elif not any('|' in x for x in value):
        # Parameters that take multiple values are normally submitted with the values
        # separated using the pipe character.
        encoded_value = '|'.join(value)
    else:
        # If a value must contain the pipe character, use U+001F (Unit Separator) as the
        # separator and prefix the value with U+001F.
        encoded_value = '\x1f' + '\x1f'.join(value)

    return encoded_value
@AdamWill
Copy link
Member

yep, that sounds like the right way to do it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants