geography_as_object=True fails if mode="REPEATED" #1213

Closed
bnaul opened this issue Apr 12, 2022 · 1 comment · Fixed by #1220
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. priority: p3 Desirable enhancement or fix. May not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@bnaul
Contributor

bnaul commented Apr 12, 2022

Small oversight I think from #848:

[ins] In [57]: bigquery.Client().query("SELECT [ST_GEOGPOINT(-100, 30)] x").to_dataframe()
Out[57]:
                  x
0  [POINT(-100 30)]

[ins] In [58]: bigquery.Client().query("SELECT [ST_GEOGPOINT(-100, 30)] x").to_dataframe(geography_as_object=True)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-58-90093b9d3663> in <module>
----> 1 bigquery.Client().query("SELECT [ST_GEOGPOINT(-100, 30)] x").to_dataframe(geography_as_object=True)

~/model/.venv/lib/python3.9/site-packages/google/cloud/bigquery/job/query.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client, max_results, geography_as_object)
   1682         """
   1683         query_result = wait_for_query(self, progress_bar_type, max_results=max_results)
-> 1684         return query_result.to_dataframe(
   1685             bqstorage_client=bqstorage_client,
   1686             dtypes=dtypes,

~/model/.venv/lib/python3.9/site-packages/google/cloud/bigquery/table.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client, geography_as_object)
   1984             for field in self.schema:
   1985                 if field.field_type.upper() == "GEOGRAPHY":
-> 1986                     df[field.name] = df[field.name].dropna().apply(_read_wkt)
   1987
   1988         return df

~/model/.venv/lib/python3.9/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwargs)
   4431         dtype: float64
   4432         """
-> 4433         return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
   4434
   4435     def _reduce(

~/model/.venv/lib/python3.9/site-packages/pandas/core/apply.py in apply(self)
   1080             return self.apply_str()
   1081
-> 1082         return self.apply_standard()
   1083
   1084     def agg(self):

~/model/.venv/lib/python3.9/site-packages/pandas/core/apply.py in apply_standard(self)
   1135                 # List[Union[Callable[..., Any], str]]]]]"; expected
   1136                 # "Callable[[Any], Any]"
-> 1137                 mapped = lib.map_infer(
   1138                     values,
   1139                     f,  # type: ignore[arg-type]

~/model/.venv/lib/python3.9/site-packages/pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

~/model/.venv/lib/python3.9/site-packages/shapely/geos.py in read(self, text)
    288         """Returns geometry from WKT"""
    289         if not isinstance(text, str):
--> 290             raise TypeError("Only str is accepted.")
    291         text = text.encode()
    292         c_string = c_char_p(text)

TypeError: Only str is accepted.

It's probably worth iterating over the items and converting each one, but even just skipping repeated fields instead of failing also seems fine (that would just be a one-liner). @jimfulton @tswast any preference?
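For illustration, the "convert each item" option could look roughly like the sketch below. It is not the library's code: `convert_geography_value` and `fake_read_wkt` are hypothetical names, the parser is a stand-in so the snippet runs without shapely, and it assumes a REPEATED cell arrives as a sequence of WKT strings.

```python
from typing import Any, Callable


def convert_geography_value(value: Any, read_wkt: Callable[[str], Any]) -> Any:
    """Convert one cell of a GEOGRAPHY column (hypothetical helper).

    Assumption: scalar fields hold a WKT string, REPEATED fields hold a
    sequence (list or array) of WKT strings, and missing values are None.
    """
    if value is None:
        return None
    if isinstance(value, str):
        # Scalar case: hand the WKT text straight to the reader.
        return read_wkt(value)
    # Repeated case: convert every element instead of passing the whole
    # sequence to the reader, which is what triggers the TypeError above.
    return [convert_geography_value(item, read_wkt) for item in value]


def fake_read_wkt(text: str) -> tuple:
    # Stand-in for a real WKT reader; it only mimics the str-only check.
    if not isinstance(text, str):
        raise TypeError("Only str is accepted.")
    return ("geometry", text)


scalar = convert_geography_value("POINT(-100 30)", fake_read_wkt)
repeated = convert_geography_value(["POINT(-100 30)"], fake_read_wkt)
```

With shapely installed, `shapely.wkt.loads` could be passed as `read_wkt`, and the helper applied per cell via `Series.apply`.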

EDIT: I updated the title/example here to focus on the geography_as_object flag rather than the to_geodataframe method, since that applies to both methods and is where the actual error occurs.

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Apr 12, 2022
@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Apr 13, 2022
@bnaul bnaul changed the title from 'to_geodataframe() fails if mode="REPEATED"' to 'geography_as_object=True fails if mode="REPEATED"' Apr 13, 2022
@tswast
Contributor

tswast commented Apr 13, 2022

Skipping the conversion if mode="REPEATED" sounds like a decent option to me.
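A rough sketch of that skip, assuming schema fields expose `name`, `field_type`, and `mode` as `SchemaField` does. The `Field` tuple and `geography_columns_to_convert` are hypothetical stand-ins for illustration, not the change that was merged.

```python
from collections import namedtuple

# Minimal stand-in for a schema field; only the attributes used here.
Field = namedtuple("Field", ["name", "field_type", "mode"])


def geography_columns_to_convert(schema):
    """Return names of GEOGRAPHY columns that are safe to parse as WKT.

    REPEATED fields are left out, so their cells stay as sequences of
    WKT strings instead of raising a TypeError.
    """
    return [
        field.name
        for field in schema
        if field.field_type.upper() == "GEOGRAPHY"
        and field.mode.upper() != "REPEATED"
    ]


schema = [
    Field("point", "GEOGRAPHY", "NULLABLE"),
    Field("points", "GEOGRAPHY", "REPEATED"),
    Field("label", "STRING", "NULLABLE"),
]
columns = geography_columns_to_convert(schema)
```

Only the columns returned here would go through the WKT conversion; the repeated column would be returned to the caller unchanged.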

@meredithslota meredithslota added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p3 Desirable enhancement or fix. May not be included in next release. and removed triage me I really want to be triaged. labels Apr 14, 2022
4 participants