"Datatype coercion is not allowed" when creating session with custom timeseries array #262

Open
warrickball opened this issue May 26, 2021 · 3 comments

@warrickball
Contributor

warrickball commented May 26, 2021

Here's a script that creates a basic timeseries of Gaussian noise in a 2×1000 array.

#!/usr/bin/env python3

import numpy as np
import pbjam

n = 1000
data = np.zeros((2,n))
data[0] = np.arange(n, dtype=float)
data[1] = np.random.randn(n)

s = pbjam.session(ID='mwe', numax=(100, 1), teff=(5000, 100), bp_rp=(0.7, 0.005), dnu=(5, 0.1), timeseries=data)

It fails with the following traceback:

Traceback (most recent call last):
  File "/home/wball/try/pbjam/mwe.py", line 11, in <module>
    s = pbjam.session(ID='mwe', numax=(100, 1), teff=(5000, 100), bp_rp=(0.7, 0.005), timeseries=data)
  File "/home/wball/pypi/PBjam/pbjam/session.py", line 572, in __init__
    _format_col(vardf, timeseries, 'timeseries')
  File "/home/wball/pypi/PBjam/pbjam/session.py", line 289, in _format_col
    vardf[key] = [_arr_to_lk(x, y, vardf['ID'][0], key)]
  File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 3163, in __setitem__
    self._set_item(key, value)
  File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 3239, in _set_item
    value = self._sanitize_column(key, value)
  File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 3899, in _sanitize_column
    value = maybe_convert_platform(value)
  File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 112, in maybe_convert_platform
    values = construct_1d_object_array_from_listlike(values)
  File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1638, in construct_1d_object_array_from_listlike
    result[:] = values
  File "/home/wball/.local/lib/python3.9/site-packages/astropy/table/table.py", line 853, in __array__
    raise ValueError('Datatype coercion is not allowed')
ValueError: Datatype coercion is not allowed

I had a brief look around. The problem isn't the conversion of the timeseries into a Lightkurve object; the traceback shows the error being raised when that object is added to the vardf dataframe.

I just did a git pull so I'm using the top of master (commit 0c5591a). If any other versions are relevant, they are:

  • Python 3.9.4
  • NumPy 1.20.1
  • Pandas 1.2.0
  • Astropy 4.2
  • Lightkurve 2.0.9
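
Since the error appears to depend on the installed versions, a small script can help collect them in one go when reproducing this. The following is only a sketch: `report_versions` is a hypothetical helper (not part of PBjam), the package names are assumed to be the PyPI distribution names, and `importlib.metadata` requires Python 3.8 or later.

```python
import platform
from importlib import metadata


def report_versions(packages=("numpy", "pandas", "astropy", "lightkurve", "pbjam")):
    """Return one line per component: the Python version, then each package.

    Hypothetical helper for bug reports; the default package names are
    assumed to match the installed distribution names.
    """
    lines = [f"Python {platform.python_version()}"]
    for name in packages:
        try:
            lines.append(f"{name} {metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing.
            lines.append(f"{name} (not installed)")
    return lines


if __name__ == "__main__":
    print("\n".join(report_versions()))
```

The output can be pasted directly into a comment alongside the traceback.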
@warrickball
Contributor Author

@nielsenmb couldn't reproduce this in Python 3.7, and neither can I with Python 3.7.4. I do, however, hit it with Python 3.8.2 and the following package versions:

  • NumPy 1.20.3
  • Pandas 1.2.4
  • Astropy 4.2.1
  • Lightkurve 2.0.9

@warrickball
Contributor Author

Creating the LightCurve object seems to be fine, so I had a closer look at why assigning a timeseries downloaded via Lightkurve works but passing a custom timeseries doesn't. Mimicking the code in PBjam, I tried this:

import numpy as np
import pandas as pd
import lightkurve as lk
import pbjam

n = 1000
data = np.zeros((2,n))
data[0] = np.arange(n, dtype=float)/720
data[1] = np.random.randn(n)

df = pd.DataFrame({'ID': np.array(['test']).reshape((-1,1)).flatten()})
df['timeseries'] = [lk.LightCurve(time=data[0], flux=data[1])]

which is similar to the code path followed for custom timeseries and reproduces the "Datatype coercion is not allowed" error message.

If I change the last line to this, which tries to be more like the path for downloaded objects (i.e. when you pass a string identifier), it appears to work:

df.at[0, 'timeseries'] = lk.LightCurve(time=data[0], flux=data[1], targetid='test')

@nielsenmb
Collaborator

nielsenmb commented May 27, 2021

vardf.at[0, key] = _arr_to_lk(x, y, vardf['ID'][0], key) raises an error for me on Python 3.7, but

vardf[key] = object()
vardf.at[0, key] = _arr_to_lk(x, y, vardf['ID'][0], key)

seems to work.
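
The two-step assignment above could be wrapped in a small helper along these lines. This is a minimal sketch under stated assumptions: `set_object_cell` is a hypothetical name (not part of PBjam), the placeholder column is filled with `None` rather than `object()`, and `FakeLightCurve` is a plain stand-in class so the example runs without Lightkurve. Whether this avoids the error with a real `lk.LightCurve` would still need checking against the versions listed in this thread.

```python
import pandas as pd


def set_object_cell(df, row, key, value):
    """Store an arbitrary Python object in a single DataFrame cell.

    Hypothetical helper, not part of PBjam.
    """
    if key not in df.columns:
        # Pre-create the column with object dtype, mirroring the
        # `vardf[key] = object()` step in the comment above.
        df[key] = pd.Series([None] * len(df), index=df.index, dtype=object)
    # Scalar assignment to one cell, as opposed to df[key] = [value].
    df.at[row, key] = value
    return df


class FakeLightCurve:
    """Plain stand-in for lk.LightCurve (assumption: illustration only)."""


vardf = pd.DataFrame({'ID': ['test']})
lc = FakeLightCurve()
set_object_cell(vardf, 0, 'timeseries', lc)
```

In `_format_col`, the equivalent call would presumably pass the result of `_arr_to_lk(x, y, vardf['ID'][0], key)` as `value`.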
