Using `data` for providing new data (as df) in `print()` or `plot()` yields errors #91

hneth · 2022-09-04T12:46:25Z

In previous versions of FFTrees, it was possible to supply a data frame as `data` when plotting (and perhaps printing) FFTrees objects. Thus, the `data` argument in `print()` and `plot()` served two functions: it either switched between the existing "train" and "test" data or accepted a new data frame (of test data).

In the latter case, the new df was used to apply an existing FFTrees object to the new test data.

This currently fails (as data is not handled properly for dfs).

Details

We could fix this, of course, but I hesitate to restore the previous functionality, for the following reasons:

Co-opting data to accept both a simple string and an entire df is a bit ugly.

More precisely, what remains confusing and unclear is:

Does an implicit application of new data (due to taking place within a function call) replace the previous test data in the FFTrees object?

Actually, fftrees_apply() returns a modified FFTrees object. However, as long as this result is not re-assigned to the original source object, the change in the test dataset and its performance details do not appear in the original object. Thus, the user would print or plot results that are not stored in the corresponding FFTrees object.
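The re-assignment point can be illustrated with a minimal base-R sketch. The list `x` and the function `toy_apply()` below are hypothetical stand-ins, not the actual FFTrees object or `fftrees_apply()`; the sketch only assumes that the real function likewise returns a modified copy, as described above.

```r
# Hypothetical stand-in for an FFTrees object: a list with train and test data.
x <- list(train = data.frame(a = 1:3), test = NULL)

# Hypothetical stand-in for fftrees_apply(): changes and returns a COPY of obj.
toy_apply <- function(obj, newdata) {
  obj$test <- newdata
  obj
}

new_df <- data.frame(a = 4:6)

# Without re-assigning to x, the original object keeps its (empty) test data:
y <- toy_apply(x, new_df)
unchanged <- is.null(x$test)   # TRUE: x was not modified

# Only re-assignment stores the new test data in x:
x <- toy_apply(x, new_df)
stored <- identical(x$test, new_df)   # TRUE: x now holds new_df
```

Anything printed or plotted from `y` would therefore describe data that `x` itself never contained.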

Suggestion

- Keep the `data` argument simple (a string that must be either "train" or "test").
- Add a `newdata` argument (as in `predict()` and `fftrees_apply()`) to provide new test data (as df) and pass it to `fftrees_apply()`.
- Add a note when using this function that the original FFTrees object is not changed unless it is being re-assigned.
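A rough sketch of how the three suggestions could fit together in a print method, using base R only. The class `toy_fft`, the function `toy_apply()`, and the message wording are hypothetical and serve as stand-ins; this is not the FFTrees implementation.

```r
# Hypothetical stand-ins (NOT the FFTrees API): a toy object and apply function.
toy <- structure(
  list(train = data.frame(a = 1:3), test = NULL),
  class = "toy_fft"
)
toy_apply <- function(x, newdata) {
  x$test <- newdata  # modifies the local copy only
  x
}

# Print method with a string-only `data` and a separate `newdata` argument.
print.toy_fft <- function(x, data = "train", newdata = NULL, ...) {
  # match.arg() signals an error for anything but "train" or "test",
  # including a data frame passed as `data`.
  data <- match.arg(data, choices = c("train", "test"))
  if (!is.null(newdata)) {
    stopifnot(is.data.frame(newdata))
    message("Note: `newdata` is applied to a copy; re-assign the result to keep it.")
    x <- toy_apply(x, newdata)
    data <- "test"
  }
  if (is.null(x[[data]])) {
    stop("No '", data, "' data available.")
  }
  cat("Showing", data, "data with", nrow(x[[data]]), "rows\n")
  invisible(x)  # return the (possibly modified) copy so callers can re-assign it
}

print(toy)                                          # uses the training data
toy2 <- print(toy, newdata = data.frame(a = 4:8))   # toy itself stays unchanged
```

Returning the modified copy invisibly is one possible design choice: it lets the user decide whether to keep the new test data by re-assigning the result.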

hneth · 2022-09-11T09:40:11Z

PR #95 has addressed the bug, but still uses the data argument for both string and data frame inputs.

Additionally, the issue of locally vs. globally changing an FFTrees object x is not resolved. Presently, the global object x remains unchanged when plotting or printing x for new data (as a data frame).

hneth added bug question labels Sep 4, 2022

hneth self-assigned this Sep 11, 2022

hneth mentioned this issue Sep 11, 2022

Using data for providing new data (as df) in print() or plot() #95

Merged

hneth removed their assignment Jan 13, 2023
