I'm getting this error: original_activities_str = row["activities"] KeyError: 'activities' #25

Open
Tetizera-zz opened this issue Feb 21, 2022 · 9 comments

Comments

@Tetizera-zz

Tetizera-zz commented Feb 21, 2022

Not sure what this is about. I had another error shown to me in a .csv that I edited with Excel. This one, however, is a .csv that I edited with Excel without transforming the data with a PowerQuery, just used Ctrl + F to replace some of the strings.


(venv) Tet@DESKTOP: make csv

python -m dayliopy.clean_csv
Reading CSV: C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy/var/data_in/daylio_export.csv
Traceback (most recent call last):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 227, in <module>
    main()
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 223, in main
    process(csv_in, csv_out)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 207, in process
    available_activities, in_data = read_csv(csv_in)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 151, in read_csv
    activities_list = process_activities(original_activities_str)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\clean_csv.py", line 124, in process_activities
    activities_split = activities_str.split(" | ")
AttributeError: 'NoneType' object has no attribute 'split'
make: *** [Makefile:52: csv] Error 1

Here's my file for debugging:

Additional information:

Windows 10
Python 3.9.10
pip 21.2.4
GNU make 4.3

@MichaelCurrin
Owner

Thanks, I'll look into this.

@MichaelCurrin
Owner

The CSV looks badly formatted and I can't tell if editing with Excel caused that.

Can you send your original CSV so I can see if the issue is there?

Viewing the file with bat, I found that you have a Unicode character at the start, and quoting is applied to the whole row (note the yellow quotes at the start and end of the row).

[Screenshot: Screen Shot 2022-02-22 at 7 07 41 pm]

I pushed some code to help you with debugging. Here is my result for make csv. Note how most of the values are None and the row is squashed into the first column full_date.

ValueError: The activities column is present but blank - fix the formatting of your CSV - row: 
  {'full_date': '2022-02-20,February 20,Sunday,20:00,good,"","",""', 'date': None, 
  'weekday': None, 'time': None, 'mood': None, 'activities': None, 
   'note_title': None, 'note': None}
make: *** [csv] Error 1
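
For anyone hitting the same symptom, here is a minimal sketch of how both problems might be detected with the standard library alone. It is not the project's code: the function name `find_squashed_rows` and the sample bytes are hypothetical. Decoding with `utf-8-sig` drops a leading byte order mark if one is present, and a row whose columns after the first are all `None` suggests the whole line was wrapped in quotes.

```python
import csv
import io


def find_squashed_rows(text):
    """Return rows where every column after the first is missing (None)."""
    reader = csv.DictReader(io.StringIO(text))
    first_col = reader.fieldnames[0]
    squashed = []
    for row in reader:
        others = [value for key, value in row.items() if key != first_col]
        if others and all(value is None for value in others):
            squashed.append(row)
    return squashed


# Hypothetical sample: a BOM at the start and one data row wrapped in quotes.
raw = (
    b'\xef\xbb\xbffull_date,date,mood,activities\n'
    b'2022-02-21,February 21,good,reading | walking\n'
    b'"2022-02-20,February 20,good,"\n'
)
text = raw.decode("utf-8-sig")  # utf-8-sig strips the BOM when there is one
bad_rows = find_squashed_rows(text)
```

With this sample, `bad_rows` holds only the quoted row, with the entire line stored under `full_date`.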

@MichaelCurrin
Owner

I also added a check so that if the activities column is missing, it logs the columns it can see.
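
A rough sketch of what such a check could look like, assuming the file is read with `csv.DictReader`. The helper name `require_column` and the semicolon-delimited sample are hypothetical and only illustrate one way a header can end up without an `activities` column.

```python
import csv
import io


def require_column(reader, name):
    """Raise a KeyError that lists the columns actually found in the header."""
    found = reader.fieldnames or []
    if name not in found:
        raise KeyError(f"Column {name!r} not found; columns seen: {found}")


# Hypothetical sample: a semicolon-delimited file read with the default comma
# delimiter, so the whole header becomes a single column name.
sample = "full_date;date;mood;activities\n2022-02-20;February 20;good;reading\n"
reader = csv.DictReader(io.StringIO(sample))

try:
    require_column(reader, "activities")
    message = ""
except KeyError as err:
    message = err.args[0]
```

Listing the columns in the message makes it obvious at a glance whether the header was split on the wrong delimiter or carries stray characters.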

@Tetizera-zz
Author

Tetizera-zz commented Feb 22, 2022

I'll check if my csv will work this time, but here's the original file with no modifications
daylio_export_2022_02_22.csv

EDIT: I can report that this .csv + your code fixes worked. I created config.local.conf with the proper values too.

I had one problem with make mood. The error was UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 83: invalid continuation byte, which led me to this StackOverflow page where the solution was to change the encoding.

What it was: df = pd.read_csv(csv_in_path, usecols=["mood_label", "mood_score"])

What I added: df = pd.read_csv(csv_in_path, usecols=["mood_label", "mood_score"], encoding = 'latin-1')
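
As an alternative to hard-coding `latin-1`, here is a small sketch that tries UTF-8 first and only falls back when decoding fails. The helper name `detect_encoding` and the sample bytes are my own assumptions, not part of the project; the returned name could then be passed as the `encoding` argument shown above.

```python
def detect_encoding(data, candidates=("utf-8", "latin-1")):
    """Return the first candidate encoding that decodes `data` without error."""
    for encoding in candidates:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    raise ValueError("None of the candidate encodings could decode the data")


# Hypothetical sample: "día" saved as Latin-1, so byte 0xed is not valid UTF-8.
sample = "mood_label,note\ngood,día\n".encode("latin-1")
encoding = detect_encoding(sample)
```

One caveat: `latin-1` accepts every byte value, so it never raises. A file in some other single-byte encoding would decode without an error but could show the wrong characters.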

I happen to have an issue with make fit, which I'll describe in the next comment

@Tetizera-zz
Author

So, I had the same encoding issue with make fit, so I changed the .py file. I still get errors though.

python -m dayliopy.fit_model
Traceback (most recent call last):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 107, in <module>
    main()
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 100, in main
    model = fit(csv_in_path)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 84, in fit
    encoded_df = prepare_data(df)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\dayliopy\fit_model.py", line 53, in prepare_data
    df.set_index("datetime", inplace=True, verify_integrity=True)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\venv\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Tet\OneDrive\Documentos\github-gitlab\daylio-csv-parser\venv\lib\site-packages\pandas\core\frame.py", line 5510, in set_index
    raise ValueError(f"Index has duplicate keys: {duplicates}")
ValueError: Index has duplicate keys: DatetimeIndex(['2021-09-26 20:00:00'], dtype='datetime64[ns]', name='datetime', freq=None)
make: *** [Makefile:58: fit] Error 1

@MichaelCurrin
Owner

Thanks, I'll look at adding the Latin-1 encoding myself. Thanks for your file.

Regarding your last comment - the issue is you appear to have two records with the exact same date and time. Would you be willing to change one of your records for 26 Sep with a different time and then run your export again, to avoid that?
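
To find which records collide before re-exporting, something like the sketch below could help. It assumes the export has `full_date` and `time` columns, as in the row printed earlier in this thread; the function name `duplicate_timestamps` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def duplicate_timestamps(text, date_col="full_date", time_col="time"):
    """Return (date, time) pairs that appear in more than one row."""
    reader = csv.DictReader(io.StringIO(text))
    counts = Counter((row[date_col], row[time_col]) for row in reader)
    return [key for key, count in counts.items() if count > 1]


# Hypothetical sample with two entries at the same date and time.
sample = (
    "full_date,time,mood\n"
    "2021-09-26,20:00,good\n"
    "2021-09-26,20:00,meh\n"
    "2021-09-27,20:00,good\n"
)
dupes = duplicate_timestamps(sample)
```

Any pair reported here would trip `set_index(..., verify_integrity=True)` once the date and time are combined into the `datetime` index.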

@MichaelCurrin
Owner

@Tetizera will you review #25 to fix the encoding issue, please?

@MichaelCurrin
Owner

I found a way to stop the duplicate error

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.set_index.html

Will you review PR #27 for me please?

@MichaelCurrin
Owner

Also, regarding the Unicode error for make mood:

The error on StackOverflow was for Python 2.

Are you on Python 3? If not, can you make sure you run on Python 3 on master and see if the error goes away?
