GeoAccessor.from_table results in empty array #1784

georaleigh · 2024-04-01T18:03:14Z

I'm not sure if this is a bug or not but when I run the command to import my csv into the GeoAccessor I end up with an Empty DataFrame.

I have a script that I'm using to sort through a very large feature layer with thousands of records and create a summary table of them by type. It also calculates the count of records of the last 5 years to the current date for each year rather than the entire calendar year and then calculates the average. My output table is the type, average, and current count for the year. Right now I have this outputting to a CSV but I'm trying to get that CSV to overwrite a hosted feature table. I've referenced some posts on the Esri Community forums and it seems like this should be possible but I keep getting an error.

Here's the CSV I'm using to overwrite a hosted table in AGOL:
trendstable_test.csv

Here is my code:

import arcpy, os, uuid
import pandas as pd
import datetime
from datetime import timedelta
from arcgis.features import GeoAccessor, FeatureLayer
from arcgis.gis import GIS

# variables

url = 'arcgis online url'
username = 'username'
password = 'password'
hostedTableID = 'item ID'
inputfc = r'feature layer from sde on server'
outputCSVFile = r'csv in folder on server'

# get table from AGOL
gis = GIS(url, username, password)
CrimeTrendsTable = gis.content.get(hostedTableID)
CrimeTrendsTableLyr = CrimeTrendsTable.tables[0]
crimeLayer = FeatureLayer(CrimeTrendsTableLyr.url, gis=gis)

# truncate table
crimeLayer.manager.truncate()

# import feature class and create slice of the data set in a new dataframe
df = pd.DataFrame.spatial.from_featureclass(inputfc)
df = df.sort_values(['OFFENSES_YEAR', 'CRIME_STAT_TYPE', 'FROM_DATE'], ascending= [True, True, True], ignore_index=True)
df = df.loc[:, ['OFFENSES_YEAR', 'CRIME_STAT_TYPE', 'FROM_DATE']]

# set variables for today and the previous 5 years
today = datetime.datetime.today()
todate1 = today - timedelta(365)
todate2 = today - timedelta(730)
todate3 = today - timedelta(1095)
todate4 = today - timedelta(1460)
todate5 = today - timedelta(1825)
this_year = datetime.datetime.today().year
one_year_ago = this_year - 1
two_years_ago = this_year - 2
three_years_ago = this_year - 3
four_years_ago = this_year - 4
five_years_ago = this_year - 5

# create new dataframes for current and each previous year to date
df0 = df.loc[(df['OFFENSES_YEAR']==this_year) & (df['FROM_DATE'] < today)]
df1 = df.loc[(df['OFFENSES_YEAR']==one_year_ago) & (df['FROM_DATE'] < todate1)]
df2 = df.loc[(df['OFFENSES_YEAR']==two_years_ago) & (df['FROM_DATE'] < todate2)]
df3 = df.loc[(df['OFFENSES_YEAR']==three_years_ago) & (df['FROM_DATE'] < todate3)]
df4 = df.loc[(df['OFFENSES_YEAR']==four_years_ago) & (df['FROM_DATE'] < todate4)]
df5 = df.loc[(df['OFFENSES_YEAR']==five_years_ago) & (df['FROM_DATE'] < todate5)]

# create tables for each current and previous year grouped by crime type, add together previous year tables and average
table0 = df0.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count().reset_index()\
            .rename(columns={"OFFENSES_YEAR" : this_year})
table1 = df1.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
table2 = df2.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
table3 = df3.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
table4 = df4.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
table5 = df5.groupby('CRIME_STAT_TYPE')['OFFENSES_YEAR'].count()
avgtable = pd.concat([table1, table2, table3, table4, table5]).groupby('CRIME_STAT_TYPE').mean().reset_index()\
             .rename(columns={"OFFENSES_YEAR" : "previous5yearAverage"})
trendstable = avgtable.join(table0.set_index('CRIME_STAT_TYPE'), on='CRIME_STAT_TYPE').fillna(0)
trendstable.to_csv(outputCSVFile, sep='\t', encoding='utf-8')

# update hosted table from csv file
csvDF = GeoAccessor.from_table(outputCSVFile)
adds_fs = csvDF.spatial.to_featureset()
adds_dict = adds_fs.to_dict()
adds = adds_dict["features"]
crimeLayer.edit_features(adds=adds)

This is the error I'm getting when I run the adds_fs = csvDF.spatial.to_featureset() line:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
In  [60]:
Line 1:     adds_fs = csvDF.spatial.to_featureset()

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py, in to_featureset:
Line 3573:  return FeatureSet.from_dict(self.__feature_set__)

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py, in __feature_set__:
Line 3297:  if self.sr is None:

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py, in sr:
Line 3503:  for g in self._data[self.name]

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\core\frame.py, in __getitem__:
Line 3505:  indexer = self.columns.get_loc(key)

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\core\indexes\base.py, in get_loc:
Line 3631:  raise KeyError(key) from err

KeyError: None
---------------------------------------------------------------------------

I also tried it this way:

csvDF = pd.read_csv(outputCSVFile, sep=',')
adds = csvDF.to_dict("records")
crimeLayer.edit_features(adds=adds)

And received this error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
In  [60]:
Line 1:     adds_fs = csvDF.spatial.to_featureset()

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py, in to_featureset:
Line 3573:  return FeatureSet.from_dict(self.__feature_set__)

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py, in __feature_set__:
Line 3297:  if self.sr is None:

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\features\geo\_accessor.py, in sr:
Line 3503:  for g in self._data[self.name]

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\core\frame.py, in __getitem__:
Line 3505:  indexer = self.columns.get_loc(key)

File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\core\indexes\base.py, in get_loc:
Line 3631:  raise KeyError(key) from err

KeyError: None
---------------------------------------------------------------------------

Here's a screenshot of the table. I initially thought maybe the issue was with the index or delimiter but I tried removing the index and the header and I'm still getting an empty dataframe.

Please let me know if this is a bug or not. I'd like for the data in the csv to overwrite the table in AGOL. Thanks in advance!

System Info:

OS: Windows Server 2016
Browser: Chrome
Python API Version: 3.9.16 [MSC v.1931 64 bit (AMD64)]

nanaeaubry · 2024-04-02T06:22:18Z

@georaleigh

Following our guides here I can read in your table data:
https://developers.arcgis.com/python/guide/part2-data-io-reading-data/#read-in-non-spatial-table-data

georaleigh · 2024-04-02T15:53:27Z

@nanaeaubry thanks for giving this a shot in your environment. I wonder if it's an issue with my version of Pro or Python. Can you share the code you used to test reading in my csv so that I can try it?

Also, I did find a work around for now by using XLS instead. Here's the code I'm using:

trendstable.to_excel(outputXLSFile)
trendstable = arcpy.ExcelToTable_conversion(outputXLSFile, outputTable, 'Sheet1')
trendstable = arcpy.management.Append(inputs=[outputTable], target=CrimeTrendsTableLyr.url)[0]

I'm not sure if the issue I'm having with the CSV is the delimiter, separator, header or something else in how CSVs are created and read back in. During my research I found some weird behavior with the CSV. For example, when I open it in Excel all the data would appear fine. Same for if I viewed it using the print command in Python. However, when trying to read it in using an arcpy command or open it as a table in ArcGIS Pro some columns have null data. Using XLS instead worked great so I'm glad to have a solution but would like to understand the cause better at some point and why I can't get this to work using a CSV.

nanaeaubry · 2024-04-02T17:03:05Z

@georaleigh
Sure so I also have arcpy in my environment and here is the code I used:

from arcgis.gis import GIS
import pandas as pd
gis = GIS(profile = "your_online_profile")
tbl_df = pd.DataFrame.spatial.from_table(filename=r'C:\workspace\trendstable_test.csv')
tbl_df.head(2)

I have python 3.9 and using our latest api version

You can check the version of the python api by doing:

import arcgis
print(arcgis.__version__)

Maybe this is an issue in python api version 2.2.0.2? I will try to test with older versions and see what happens.

georaleigh added the bug label Apr 1, 2024

nanaeaubry added question and removed bug labels Apr 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GeoAccessor.from_table results in empty array #1784

GeoAccessor.from_table results in empty array #1784

georaleigh commented Apr 1, 2024

nanaeaubry commented Apr 2, 2024

georaleigh commented Apr 2, 2024

nanaeaubry commented Apr 2, 2024

GeoAccessor.from_table results in empty array #1784

GeoAccessor.from_table results in empty array #1784

Comments

georaleigh commented Apr 1, 2024

nanaeaubry commented Apr 2, 2024

georaleigh commented Apr 2, 2024

nanaeaubry commented Apr 2, 2024