New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run python tests in Pyodide build #11914
Conversation
pytestmark = pytest.mark.skipif( | ||
platform.system() == "Emscripten", | ||
reason="Not supported on Emscripten", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is skipped even though technically it shouldn't be. Without the skipping this actually crashes the interpreter with a funky WASM stack trace about BigInt
. I can dig around in a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, this is a clear improvement on current status, tests are run and failure are mapped.
@@ -64,6 +64,7 @@ def test_timestamp_timedelta(self): | |||
df_from_duck = duckdb.from_df(df).df() | |||
assert df_from_duck.equals(df) | |||
|
|||
@pytest.mark.xfail(condition=platform.system() == "Emscripten", reason="time zones not working") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should pass, happy to dig into it in a follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like timezones other than UTC don't seem to work, not sure why yet.
9e81771
to
99acc87
Compare
@Tishj Were there some spark-shim-api union tests added recently? Not sure why they are suddenly failing. |
Nope, looks like the last changes there were 8 months ago |
a7fed18
to
4161683
Compare
@carlopi Would you mind giving this a review? |
Not sure why some multithread tests are failing, I only changed whether the test skips instead of checking first. |
Failure might be related to the fact that |
Oh, I can look into it |
Hm, I don't see where |
Ah, found it |
Probably time someone added some basic static checks (other than formatting) to the python codebase 😅 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main concern is non changing behaviour of Python regular tests, if those pass, I would mostly check with @Tishj on what he prefers for skipping the tests.
@@ -47,6 +47,7 @@ def test_timedelta_negative(self, duckdb_cursor): | |||
@pytest.mark.parametrize('minutes', [0, 60]) | |||
@pytest.mark.parametrize('hours', [0, 24]) | |||
@pytest.mark.parametrize('weeks', [0, 51]) | |||
@pytest.mark.skipif(platform.system() == "Emscripten", reason="Bind parameters are broken when running on Pyodide") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you expand on this?
It seems to be working on https://duckdb.github.io/duckdb-pyodide/console by doing:
import duckdb
duckdb.execute("SELECT $2::date - $1::date", ['2024-01-02', '2024-03-04'])
unsure if the problem is in the testing (possibly due to handling of timezones / else) or this is actually broken.
Which tests are you referring to? I don't think I changed any actual material test function bodies, I'm just skipping based some tests known to not work on Emscripten builds, unless you consider the pyarrow imports, which @Tishj had already said was desirable (replacing the top level import with an |
I meant that once regular Python CI is green (unsure if there might be other unforeseen surprises like pyarrow -> pa, it's basically good to go. |
Ah ok, great! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me! Thanks
I would say if @Tishj can give a second look to the tests, but if it makes sense to him this looks great. |
@@ -469,10 +469,12 @@ def test_read_csv_combined(self, duckdb_cursor): | |||
assert rel.columns == rel2.columns | |||
assert rel.types == rel2.types | |||
|
|||
def test_read_csv_names(self): | |||
def test_read_csv_names(self, tmp_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not entirely confident this will be unique for every test
I have more faith in this fixture, do you mind using this?
@pytest.fixture(scope="function")
def temp_file_name(request, tmp_path_factory):
return str(tmp_path_factory.mktemp(request.function.__name__, numbered=True) / 'file.csv')
tmp_path_factory creates unique paths, and numbers them, + for good measure I included the name of the test as another unique token
We are creating more than one files in certain places, perhaps this fixture should return a generator then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not entirely confident this will be unique for every test
The pytest documentation for tmp_path
says almost verbatim exactly that, which is why I chose it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of tmp_path
is very similar to your proposed implementation (I removed the docstring and clean up code that follows the yield
):
def _mk_tmp(request: FixtureRequest, factory: TempPathFactory) -> Path:
name = request.node.name
name = re.sub(r"[\W]", "_", name)
MAXVAL = 30
name = name[:MAXVAL]
return factory.mktemp(name, numbered=True)
@fixture
def tmp_path(
request: FixtureRequest, tmp_path_factory: TempPathFactory
) -> Generator[Path, None, None]:
path = _mk_tmp(request, tmp_path_factory)
yield path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes and making it as less invasive as possible, looks good to me 👍
Thanks! |
Merge pull request duckdb/duckdb#12081 from Maxxen/type-metadata-redux Merge pull request duckdb/duckdb#11914 from cpcloud/run-pyodide-tests
This PR now runs as much of the Python test suite under Pyodide as possible, to give some confidence about its functionality.