Expected failing test_feature_availability_profiler tests on Linux (others?) #449

Open
jlitzingerdev opened this issue Dec 1, 2017 · 1 comment
jlitzingerdev commented Dec 1, 2017

This is likely more a question than an issue, but an issue seemed more appropriate than StackOverflow for a unit test. With current master (installed in a virtualenv with all dependencies) I get a few errors and failures, one of each in test_feature_availability_profiler. Specifically:

healthcareai.tests.test_feature_availability_profiler.TestFeatureAvailabilityProfiler
healthcareai.tests.test_feature_availability_profiler.TestFeatureAvailabilityProfilerError3

raise exceptions complaining that the elements are not date types, instead of the expected exception.

I noticed some changes in this area in cb4c162. Are the failing tests expected in master, or is it likely a platform issue?

Debugging follows, feel free to ignore

Digging in, it looks as though feature_availability_profiler wants to verify that the dtype of the Series is datetime64[ns]. Since the initial type of the only element is int, the dtype becomes object once datetimes are mixed in, whereas if the Series is instantiated only with datetimes the error goes away. My quick hackery:

    from datetime import datetime, timedelta
    from random import randrange

    import numpy as np
    import pandas as pd

    def setUp(self):
        self.df = pd.DataFrame(np.random.randn(1000, 2),
                               columns=['AdmitDTS',
                                        'LastLoadDTS'])
        # generate load date
        self.df['LastLoadDTS'] = datetime(2015, 5, 20)
        # generate datetime objects for admit date
        delta = datetime(2015, 5, 20) - datetime(2015, 5, 1)
        int_delta = (delta.days * 24 * 60 * 60) + delta.seconds

        def test_time(random_second):
            return datetime(2015, 5, 1) + timedelta(seconds=random_second)

        # build the column from datetimes only so pandas infers a datetime64 dtype
        admit = [test_time(randrange(int_delta)) for _ in range(1000)]
        self.df['AdmitDTS'] = pd.Series(admit)
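
The dtype behavior described above can be reproduced in isolation. The following is a minimal sketch, not the project's test code: the variable names `mixed` and `pure` are made up for illustration, and the use of `pd.api.types.is_datetime64_any_dtype` is only an assumption about the kind of check the profiler performs.

```python
from datetime import datetime

import pandas as pd

# An int mixed with datetime objects: pandas cannot settle on a single
# datetime dtype, so the Series falls back to the generic object dtype.
mixed = pd.Series([0, datetime(2015, 5, 1), datetime(2015, 5, 20)])

# Only datetime objects: pandas infers a datetime64 dtype.
pure = pd.Series([datetime(2015, 5, 1), datetime(2015, 5, 20)])

print(mixed.dtype)
print(pure.dtype)

# A dtype-based date check (hypothetical stand-in for the profiler's own
# validation) rejects the mixed Series and accepts the pure one.
print(pd.api.types.is_datetime64_any_dtype(mixed))
print(pd.api.types.is_datetime64_any_dtype(pure))
```

If the profiler's validation is indeed dtype-based, a column that ever held a non-datetime value would fail it even when every remaining element is a valid datetime, which matches the failures seen here.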

Aylr added the bug med label Jan 22, 2018

jlitzingerdev commented

Proposed: #467
