`TypeError: dcmread: Expected a file ... but got NoneType` test failures on `=pydicom-2.3.1` #1800

TheChymera · 2023-05-25T05:39:53Z

Full build log: https://ppb.chymera.eu/084827.log

Any ideas what's going on?

mrbean-bremen · 2023-05-25T05:53:10Z

At a first glance it looks like the test data cannot be accessed - maybe this warning:

_Warning: Package 'pydicom.data.test_files' is absent from the `packages` configuration.

has something to do with it.
Are you packaging a distribution?

TheChymera · 2023-05-25T05:58:41Z

@mrbean-bremen yes, I'm packaging it for Gentoo Linux. Is there perhaps a static testdata archive somewhere? The log above is using the PyPI tarball, I also tried it with the GH one, but I get the same errors.

mrbean-bremen · 2023-05-25T06:18:43Z

Yes, there is both testdata in the package (the one mentioned in the warning), and a static test data archive in a separate repository (pydicom/pydicom-data). I thought that the tests that use the external data are excluded if it cannot be accessed, but I may be wrong. I may have a closer look tonight (at work right now), or @darcymason may check this if he finds the time.

darcymason · 2023-05-25T13:17:54Z

Instead of trying to completely understand what is going on here, I have been wondering anyway about being more explicit for optional packages in pydicom's new pyproject.toml, which might help solve this.

Specifically, I'm thinking to options so one could pip install pydicom[test_files], or pip install pydicom[all], etc. although I'm not clear what the script does here. If it starts with the source tarball, then maybe that is not possible. Alternatively, can your packaging be done without running pydicom's tests?

I'm hoping to do a pydicom 2.4 release in the next week or so. If you could wait for that, we could ensure that it is configured correctly to work for your build.

I thought that the tests that use the external data are excluded if it cannot be accessed, but I may be wrong.

I'll try to have a look at that too later today if @mrbean-bremen doesn't get to it first

TheChymera · 2023-05-25T16:48:48Z

@darcymason yes, I can temporarily disable the tests. Can you ping me on this with the new release?

darcymason · 2023-05-25T17:26:17Z

I thought that the tests that use the external data are excluded if it cannot be accessed, but I may be wrong.

I've had a look at this and there is are classes of tests of the data manager code itself that skip if EXT_PYDICOM (whether pydicom-data installed) is True and another class if it is False, but no such check in the other tests AFAICT. I haven't followed it through completely, but I believe the files will just be downloaded as needed by each test.

darcymason · 2023-05-25T17:27:26Z

Can you ping me on this with the new release?

@TheChymera, sure, we will likely enlist your help pre-release, to make sure it is okay before publishing.

mrbean-bremen · 2023-05-25T19:28:51Z

Thanks @darcymason - I got distracted with other not pydicom-related stuff and didn't get to it.

I've had a look at this and there is are classes of tests of the data manager code itself that skip if EXT_PYDICOM (whether pydicom-data installed) is True and another class if it is False, but no such check in the other tests AFAICT.

Yes, you are right. It is probably the best to also skip all tests relying on this data, maybe using an auto-use fixture that detects if the files are available.
Speaking of fixtures - I also noticed a whole lot of pytest warnings about the obsolete setup method. We may also want to fix this. I'm not sure if I find the time before the release, but these should be easy things to do.

mrbean-bremen · 2023-05-27T07:38:43Z

@darcymason - I've started to try this out locally. Depending on an environment variable, get_test_data will skip the test if the data is not available and external data shall not be fetched (probably making this the default, so that for packaging it will never be attempted).
Unfortunately, there are a lot of get_test_data used globally in the test modules, which will not work for skipping single tests, so this will take a bit to change. Nevertheless, I think it is better to do this in a general way, so we don't have to mark each single test that uses external data. What do you think?

I will try to put something together over the weekend.

darcymason · 2023-05-27T14:56:18Z

What do you think?

I will try to put something together over the weekend.

Sure, if you are willing to come up with something, that would be great.

skip the test if the data is not available and external data shall not be fetched (probably making this the default, so that for packaging it will never be attempted).

I'm not sure how this should work -- it seems making the default skip a large number of tests might be awkward. How do we set that for our CI, etc.? Does that mean adding an environment variable change to all the workflows? How does that work for a user who just installed and wants to run the tests? -- in partial answer to my own question, I think we could have a [dev] extras option in the install, or [test], and we could add that to the contributing guidelines (and/or installation documentation)

darcymason · 2023-05-27T15:00:54Z

Unfortunately, there are a lot of get_test_data used globally in the test modules,

Forget to mention this in my previous comment. I did wonder whether we should simplify that anyway. I don't see the point of setting a global variable, e.g. rtplan_name = get_testdata_file("rtplan.dcm") when each test using it could just call the get_testdata_file directly. I doubt there is a significant time cost in getting the full filepath again for files that are used in multiple tests, and if even if there is, some kind of caching could be done to avoid that penalty.

darcymason · 2023-05-27T18:20:32Z

So, another thought...what if the tests simply required pydicom-data to be installed and no tests are run if it is not, or the user has to specify it is okay to install pydicom-data. I think all or almost all files are used in the tests, so it would be more efficient anyway to just get them all at once.

I'm not sure if that can work in this issue's build from a source tarball scenario ... but if there were a way to do something parallel to pip install[tests] or pip install[all] kind of behavior (maybe produce two different source builds - one with pydicom-data, one without), then perhaps it could work out.

mrbean-bremen · 2023-05-27T19:03:43Z

I'm not sure how this should work -- it seems making the default skip a large number of tests might be awkward.

Agreed.

I doubt there is a significant time cost in getting the full filepath again for files that are used in multiple tests, and if even if there is, some kind of caching could be done to avoid that penalty.

Yes, I ended up doing exactly this for many cases (specifically the pixel handler tests, as they use the names to parametrize the tests). I started by adding fixtures for the common cases, but I'm now not so sure if this is even needed.

So, another thought...what if the tests simply required pydicom-data to be installed and no tests are run if it is not, or the user has to specify it is okay to install pydicom-data.

You mean no tests at all are run? This would make some sense, as with my current implementation, only the tests with external data are skipped, which is a rather arbirary collection of tests. I'm inclined to throw away my local changes in favour of an easier way that will just skip all tests if pydicom-data is not available. I'm not sure about the best way yet...

darcymason · 2023-05-27T20:28:03Z

You mean no tests at all are run?

Yes, that is the suggestion. Throw an error message stating that testing requires pydicom-data to be installed. But as I said, I'm not sure how that can work in a source distribution as in this issue.

darcymason · 2023-06-02T20:49:16Z

@TheChymera, I'm attaching a source tarball from my latest attempts with the prep2.4 branch. Do you mind giving it a try and making sure it works in your environment? It should be the same as a normal sdist, just without the tests folder.

pydicom-2.4.0.tar.gz

This was referenced May 28, 2023

[WIP] Add possibility to switch off use of external data in tests #1801

Closed

Pydicom v3.0 packaging/testing #1802

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`TypeError: dcmread: Expected a file ... but got NoneType` test failures on `=pydicom-2.3.1` #1800

`TypeError: dcmread: Expected a file ... but got NoneType` test failures on `=pydicom-2.3.1` #1800

TheChymera commented May 25, 2023

mrbean-bremen commented May 25, 2023

TheChymera commented May 25, 2023

mrbean-bremen commented May 25, 2023

darcymason commented May 25, 2023

TheChymera commented May 25, 2023

darcymason commented May 25, 2023

darcymason commented May 25, 2023

mrbean-bremen commented May 25, 2023

mrbean-bremen commented May 27, 2023

darcymason commented May 27, 2023

darcymason commented May 27, 2023

darcymason commented May 27, 2023

mrbean-bremen commented May 27, 2023

darcymason commented May 27, 2023

darcymason commented Jun 2, 2023

TypeError: dcmread: Expected a file ... but got NoneType test failures on =pydicom-2.3.1 #1800

TypeError: dcmread: Expected a file ... but got NoneType test failures on =pydicom-2.3.1 #1800

Comments

TheChymera commented May 25, 2023

mrbean-bremen commented May 25, 2023

TheChymera commented May 25, 2023

mrbean-bremen commented May 25, 2023

darcymason commented May 25, 2023

TheChymera commented May 25, 2023

darcymason commented May 25, 2023

darcymason commented May 25, 2023

mrbean-bremen commented May 25, 2023

mrbean-bremen commented May 27, 2023

darcymason commented May 27, 2023

darcymason commented May 27, 2023

darcymason commented May 27, 2023

mrbean-bremen commented May 27, 2023

darcymason commented May 27, 2023

darcymason commented Jun 2, 2023

`TypeError: dcmread: Expected a file ... but got NoneType` test failures on `=pydicom-2.3.1` #1800

`TypeError: dcmread: Expected a file ... but got NoneType` test failures on `=pydicom-2.3.1` #1800