I'm getting this error: original_activities_str = row["activities"] KeyError: 'activities' #25

Tetizera-zz · 2022-02-21T03:41:46Z

I'm not sure what this is about. I previously got a different error with a .csv that I had edited in Excel. This one is also a .csv that I edited in Excel, but without transforming the data with Power Query; I only used Ctrl + F to replace some of the strings.

(venv) Tet@DESKTOP: make csv

python -m dayliopy.clean_csv
Reading CSV: C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy/var/data_in/daylio_export.csv
Traceback (most recent call last):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 227, in <module>
    main()
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 223, in main
    process(csv_in, csv_out)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 207, in process
    available_activities, in_data = read_csv(csv_in)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 151, in read_csv
    activities_list = process_activities(original_activities_str)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 124, in process_activities
    activities_split = activities_str.split(" | ")
AttributeError: 'NoneType' object has no attribute 'split'
make: *** [Makefile:52: csv] Error 1

Here's my file for debugging:

daylio_export.csv

Additional information:

Windows 10
Python 3.9.10
pip 21.2.4
GNU make 4.3


MichaelCurrin · 2022-02-22T11:24:38Z

Thanks, I'll look into this.

MichaelCurrin · 2022-02-22T17:18:55Z

The CSV looks badly formatted and I can't tell if editing with Excel caused that.

Can you send your original CSV and I can see if the issue is there?

Viewing the file with bat, I found that you have a Unicode character at the start of the file, and that quoting is applied to the whole row (note the yellow quotes at the start and end of the row).

I pushed some code to help you with debugging. Here is my result for make csv. Note how most of the values are None and the row is squashed into the first column full_date.

ValueError: The activities column is present but blank - fix the formatting of your CSV - row: 
  {'full_date': '2022-02-20,February 20,Sunday,20:00,good,"","",""', 'date': None, 
  'weekday': None, 'time': None, 'mood': None, 'activities': None, 
   'note_title': None, 'note': None}
make: *** [csv] Error 1
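To make the squashing effect concrete, here is a minimal sketch using only the standard library. It is not the project's code. The sample row is taken from the error output above, the damaged variant is simulated by wrapping the whole row in one pair of quotes, and the helper names `parse` and `is_squashed` are hypothetical. The BOM part assumes the stray Unicode character at the start of the file is a byte order mark, which the thread does not confirm.

```python
import csv
import io

HEADER = "full_date,date,weekday,time,mood,activities,note_title,note"
GOOD_ROW = '2022-02-20,February 20,Sunday,20:00,good,"","",""'
# Simulate the damage: the entire row wrapped in one pair of quotes,
# with the inner quotes doubled (assumed to be how the file was saved).
BAD_ROW = '"' + GOOD_ROW.replace('"', '""') + '"'


def parse(text):
    """Parse CSV text and return the first data row as a dict."""
    return next(csv.DictReader(io.StringIO(text)))


def is_squashed(row):
    """Return True when every column after the first is None."""
    values = list(row.values())
    return len(values) > 1 and all(value is None for value in values[1:])


good = parse(HEADER + "\n" + GOOD_ROW + "\n")
bad = parse(HEADER + "\n" + BAD_ROW + "\n")

# Assumption: the leading Unicode character is a BOM.
# Decoding with "utf-8" keeps it in the first column name,
# decoding with "utf-8-sig" strips it.
raw = (HEADER + "\n" + GOOD_ROW + "\n").encode("utf-8-sig")
with_bom = parse(raw.decode("utf-8"))
without_bom = parse(raw.decode("utf-8-sig"))

print(is_squashed(good), is_squashed(bad))
print(list(with_bom)[0] == "full_date", list(without_bom)[0] == "full_date")
```

In the damaged case the reader sees a single quoted field per row, so `csv.DictReader` fills the remaining columns with `None`, which matches the row shown in the error.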

MichaelCurrin · 2022-02-22T17:19:52Z

I also added a check so that if the activities column is missing, it logs the columns it can see.
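The two checks described in this thread might look roughly like the sketch below. This is an illustration rather than the code that was pushed: the function name `get_activities` is hypothetical, the `ValueError` wording is copied from the output above, and the `" | "` separator comes from the original traceback.

```python
import logging

logger = logging.getLogger(__name__)


def get_activities(row):
    """Return the list of activities for one CSV row (hypothetical helper)."""
    if "activities" not in row:
        # Log what the reader actually saw, to help spot a broken header.
        logger.error("Missing 'activities' column. Columns found: %s", list(row.keys()))
        raise KeyError("activities")

    value = row["activities"]
    if value is None:
        # The column exists in the header but the row had too few fields.
        raise ValueError(
            "The activities column is present but blank"
            f" - fix the formatting of your CSV - row: {row}"
        )

    # An empty string means no activities were recorded for this entry.
    return value.split(" | ") if value else []
```

Separating the "missing column" case from the "present but None" case makes the message point at either the header or the row formatting.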

Tetizera-zz · 2022-02-22T18:48:41Z

I'll check if my csv will work this time, but here's the original file with no modifications
daylio_export_2022_02_22.csv

EDIT: I can report that this .csv + your code fixes worked. I created config.local.conf with the proper values too.

I had one problem with make mood. The error was UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 83: invalid continuation byte, which led me to this StackOverflow page where the solution was to change the encoding.

What it was: df = pd.read_csv(csv_in_path, usecols=["mood_label", "mood_score"])

What I added: df = pd.read_csv(csv_in_path, usecols=["mood_label", "mood_score"], encoding='latin-1')
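For anyone hitting the same error, here is a small standard-library sketch of the idea: try UTF-8 first and fall back to latin-1 only if decoding fails. The function name `decode_with_fallback` and the sample text are hypothetical. Note that latin-1 can decode any byte sequence, so the fallback never raises, but it will produce wrong characters if the file is really in some other encoding.

```python
def decode_with_fallback(data, encodings=("utf-8", "latin-1")):
    """Decode bytes with the first encoding that works.

    Returns a (text, encoding_used) tuple.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError(f"Could not decode data with any of: {encodings}")


# Hypothetical sample: "í" is the single byte 0xed in latin-1,
# which is not valid UTF-8 when followed by a plain ASCII letter.
latin_bytes = "mood_label\ndifícil\n".encode("latin-1")
utf8_bytes = "mood_label\ndifícil\n".encode("utf-8")

latin_text, latin_used = decode_with_fallback(latin_bytes)
utf8_text, utf8_used = decode_with_fallback(utf8_bytes)
print(latin_used, utf8_used)
```

The detected encoding name could then be passed as the `encoding` argument of `pd.read_csv` instead of hard-coding `'latin-1'`.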

I happen to have an issue with make fit, which I'll describe in the next comment

Tetizera-zz · 2022-02-22T20:02:00Z

I had the same encoding issue with make fit, so I changed the .py file. I still get an error, though.

python -m dayliopy.fit_model
Traceback (most recent call last):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 107, in <module>
    main()
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 100, in main
    model = fit(csv_in_path)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 84, in fit
    encoded_df = prepare_data(df)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 53, in prepare_data
    df.set_index("datetime", inplace=True, verify_integrity=True)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\venv\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\venv\lib\site-packages\pandas\core\frame.py", line 5510, in set_index
    raise ValueError(f"Index has duplicate keys: {duplicates}")
ValueError: Index has duplicate keys: DatetimeIndex(['2021-09-26 20:00:00'], dtype='datetime64[ns]', name='datetime', freq=None)
make: *** [Makefile:58: fit] Error 1

MichaelCurrin · 2022-02-23T12:35:16Z

Thanks, I'll look at adding the latin-1 encoding myself. Thanks for your file.

Regarding your last comment: the issue is that you appear to have two records with exactly the same date and time. Would you be willing to change one of your records for 26 Sep to a different time and then run your export again, to avoid that?
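To find such records before running the model, a sketch like the following could be used. It relies only on the standard library. The function name `find_duplicate_datetimes` and the sample rows are hypothetical, and it assumes the export has `full_date` in `YYYY-MM-DD` form and `time` in 24-hour `HH:MM` form, as in the row shown earlier in this thread.

```python
from collections import Counter
from datetime import datetime


def find_duplicate_datetimes(rows):
    """Return the sorted timestamps that appear on more than one row."""
    stamps = [
        datetime.strptime(f"{row['full_date']} {row['time']}", "%Y-%m-%d %H:%M")
        for row in rows
    ]
    counts = Counter(stamps)
    return sorted(stamp for stamp, count in counts.items() if count > 1)


# Hypothetical rows: two entries share 2021-09-26 20:00.
rows = [
    {"full_date": "2021-09-26", "time": "20:00", "mood": "good"},
    {"full_date": "2021-09-26", "time": "20:00", "mood": "meh"},
    {"full_date": "2021-09-27", "time": "20:00", "mood": "rad"},
]

duplicates = find_duplicate_datetimes(rows)
print(duplicates)
```

Any timestamp this returns would trigger the same `ValueError` from `set_index(..., verify_integrity=True)` seen in the traceback above.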

MichaelCurrin · 2022-02-26T08:43:35Z

@Tetizera will you review #26, which fixes the encoding issue, please?

MichaelCurrin · 2022-02-26T08:53:36Z

I found a way to stop the duplicate error: remove the verify_integrity flag from the set_index call in fit_model.py.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.set_index.html

Will you review PR #27 for me please?

MichaelCurrin · 2022-02-26T08:56:25Z

Also, regarding the Unicode error for make mood:

The error on StackOverflow was for Python 2.

Are you on Python 3? If not, can you make sure you run on Python 3 on master and see if the error goes away?

MichaelCurrin mentioned this issue Feb 26, 2022

fix: add latin-1 encoding #26

Open

MichaelCurrin mentioned this issue Feb 26, 2022

fix: remove verify_integrity flag in fit_model.py #27

Open
