Collector service cannot process np.nan, np.inf, np.NINF values passed to endpoints #1031

Open
c0t0ber opened this issue Mar 18, 2024 · 0 comments

c0t0ber commented Mar 18, 2024

import uuid

import numpy as np
import pandas as pd

from evidently.collector.client import CollectorClient
from evidently.collector.config import RowsCountTrigger, CollectorConfig, ReportConfig
from evidently.metrics import DatasetMissingValuesMetric
from evidently.report import Report


# Values that trigger the failure. Note: np.NINF was removed in NumPy 2.0;
# use -np.inf there instead.
missing_vals = (np.nan, np.inf, np.NINF)

for bug_val in missing_vals:
    # Build a report whose metric config contains the problematic value.
    rep = Report(
        metrics=[
            DatasetMissingValuesMetric(missing_values=[bug_val]),
        ],
    )
    rep.run(reference_data=None, current_data=pd.DataFrame())
    conf = CollectorConfig(
        trigger=RowsCountTrigger(),
        report_config=ReportConfig.from_report(rep),
        project_id=str(uuid.uuid4()),
    )
    collector_id = str(uuid.uuid4())
    collector_client = CollectorClient("http://localhost:8001")
    try:
        # The config is sent to the collector service as JSON; this call fails.
        collector_client.create_collector(collector_id, conf)
    except Exception as e:
        print(e)

Client error: 400 Client Error: Bad Request for url: http://localhost:8001/7ff10857-346c-474f-b557-1fa1040d53fa
Collector error:

INFO:     127.0.0.1:61679 - "POST /7ff10857-346c-474f-b557-1fa1040d53fa HTTP/1.1" 400 Bad Request
ERROR - 2024-03-18 21:21:40,458 - litestar - config - exception raised on http connection to route /7ff10857-346c-474f-b557-1fa1040d53fa

Traceback (most recent call last):
  File "evidently\venv38\lib\site-packages\litestar\serialization\msgspec_hooks.py", line 186, in decode_json
    return _msgspec_json_decoder.decode(value)
msgspec.DecodeError: JSON is malformed: invalid character (byte 306)

litestar.exceptions.base_exceptions.SerializationException: JSON is malformed: invalid character (byte 306)

The issue is caused by litestar decoding request bodies with msgspec, whose JSON decoder does not accept np.nan, np.inf, or np.NINF values. Supporting these values without substituting them with something else (strings, for example) does not seem feasible.
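To illustrate the mismatch, here is a minimal sketch using only the standard library. Python's json module emits the non-standard tokens NaN, Infinity, and -Infinity by default, and a decoder that sticks to the JSON grammar has no way to parse them. The strict_loads helper below is a hypothetical stand-in for a strict decoder such as msgspec's; it is not msgspec's actual implementation.

```python
import json
import math

payload = {"missing_values": [float("nan"), float("inf"), float("-inf")]}

# By default (allow_nan=True) json.dumps writes tokens that are not valid JSON.
encoded = json.dumps(payload)
print(encoded)  # {"missing_values": [NaN, Infinity, -Infinity]}


def reject_constant(token):
    # Called by json.loads for the tokens NaN, Infinity and -Infinity.
    raise ValueError(f"non-standard JSON token: {token}")


def strict_loads(text):
    # Hypothetical stand-in for a strict decoder: refuse non-standard tokens.
    return json.loads(text, parse_constant=reject_constant)


# The lenient default decoder accepts the payload...
lenient = json.loads(encoded)

# ...while the strict one rejects it, similar to the 400 response above.
try:
    strict_loads(encoded)
    strict_ok = True
except ValueError as exc:
    strict_ok = False
    print(exc)
```

Passing allow_nan=False to json.dumps makes the client side fail early with a ValueError instead of sending a body the server cannot decode.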

Possible solutions:

  • Drop support for np.inf and np.NINF values and use the pyarrow backend for pandas, which converts nan to None so that it is then processed normally.
  • Change the transport format, since nan and inf are not part of the JSON standard and are not supported by most libraries. msgpack could be considered as an alternative.
  • Switch to fastapi, which uses Python's built-in json encoder and therefore supports nan and inf. This is a very controversial option because it depends on the framework's implementation.
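For comparison, the string substitution mentioned above could look roughly like the sketch below. It is only an illustration and not part of evidently: the sentinel strings and the encode_special / decode_special helpers are hypothetical names, and both the client and the collector would have to agree on them.

```python
import json
import math

# Hypothetical sentinel strings; any unambiguous values would do.
NAN, POS_INF, NEG_INF = "__NaN__", "__Infinity__", "__-Infinity__"
_FROM_SENTINEL = {
    NAN: float("nan"),
    POS_INF: float("inf"),
    NEG_INF: float("-inf"),
}


def encode_special(value):
    """Recursively replace non-finite floats with sentinel strings."""
    if isinstance(value, float):
        if math.isnan(value):
            return NAN
        if math.isinf(value):
            return POS_INF if value > 0 else NEG_INF
        return value
    if isinstance(value, dict):
        return {key: encode_special(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [encode_special(item) for item in value]
    return value


def decode_special(value):
    """Recursively restore sentinel strings to non-finite floats."""
    if isinstance(value, str):
        return _FROM_SENTINEL.get(value, value)
    if isinstance(value, dict):
        return {key: decode_special(item) for key, item in value.items()}
    if isinstance(value, list):
        return [decode_special(item) for item in value]
    return value


payload = {"missing_values": [float("nan"), float("inf"), float("-inf"), 1.5]}

# allow_nan=False guarantees the wire format contains only standard JSON.
wire = json.dumps(encode_special(payload), allow_nan=False)
restored = decode_special(json.loads(wire))
```

The drawback is the one the issue already hints at: a genuine string that happens to equal a sentinel would be decoded as a float, so the values are substituted rather than truly supported.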