Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

396 R6 impl #471

Draft
wants to merge 13 commits into
base: main
Choose a base branch
from
Draft

396 R6 impl #471

wants to merge 13 commits into from

Conversation

jmbarbone
Copy link
Collaborator

resolves #396

Also contains some default parameters to use more getOption() and some reformatting of tests. Biggest change in those regards is the use of na.strings = getOption("openxlsx2.na.strings").
This was done to avoid complications with missing() and using substitute() to fake missingness when passing parameters down to other functions.

  • wrap all public workbook methods
  • clean up for wrappers
  • some clean up; something is still wrong
  • formatting
  • get style from parent frame
  • fix style default
  • redoc
  • clean up tests
  • rename impl functions

@jmbarbone
Copy link
Collaborator Author

@JanMarvin , this is current a draft because I'm seeing a few test failures that I'm not quite sure about. I managed to avoid others with a few changes to default parameters (which I'll doc in NEWS) --- mostly the na.strings business.

I don't know why these are failing right now.

──────────────────────────────────────────────────────────────────────
Failure (test-wb_functions.R:219): handle 29Feb1900
`got` (`actual`) not equal to `as_date` (`expected`).

`class(actual)`:   "POSIXct" "POSIXt"
`class(expected)`: "Date"            

`attr(actual, 'tzone')` is a character vector ('')
`attr(expected, 'tzone')` is absent

  `unclass(actual)`: -2203959600 -2203873200
`unclass(expected)`:      -25509      -25508
──────────────────────────────────────────────────────────────────────
✖ | 1      53 | wb_styles [3.1s]                                      
──────────────────────────────────────────────────────────────────────
Failure (test-wb_styles.R:416): applyCellStyle works
`got` (`actual`) not equal to `exp` (`expected`).

`actual`:   "4" "2" "1" "3"
`expected`: "3" "2" "1" "3"
──────────────────────────────────────────────────────────────────────
✔ |        14 | Workbook_properties [0.1s]                            
✔ |        24 | Worksheet_naming [0.2s]                               
✖ | 1      25 | write [1.2s]                                          
──────────────────────────────────────────────────────────────────────
Failure (test-write.R:120): update_cells
`got` (`actual`) not equal to `exp` (`expected`).

`actual`:   "inlineStr" "" "b"    
`expected`: "inlineStr" "" "b" "e"

@JanMarvin
Copy link
Owner

@jmbarbone , I have seen this and the amount of code that is moved. We should be very careful applying this and should probably only do so after a previous new dot release (I opened an issue for a #472 release). But in any case, we should review this PR very thoroughly. I can't stress this enough. I really don't want to be faced with interesting surprises like those below 😄

  • Failure (test-wb_functions.R:219): handle 29Feb1900
    This is caused by posix time detection, not sure what is going on here. We can comment out overlapping date detection and protect minutes and seconds as `":m", ":mm", ":s", ":ss". Something like that (but I still want to know where this comes from and I am convinced that something is going wrong in this PR until proven otherwise; oh and our date/posix dection is far from being perfect):
 posix_fmts <- c(
    # "yy", "yyyy",
    # "m", "mm", "mmm", "mmmm", "mmmmm",
    # "d", "dd", "ddd", "dddd",
    "h", "hh", ":m", ":mm", ":s", ":ss", # not sure if hour minutes seconds can be written only as h:m:s or if h/m/s is possible
    "AM", "PM", "A", "P"
  )
  • Failure (test-wb_styles.R:416): applyCellStyle works
    Looks like something in style assignment changed. From the test: the first line removes the cell style in E5 and writes a plain date into E5 (no previous cell style survives). In the second line a date is written to A1 and should detect that there is a matching style in E5 that can be used.
  wb$add_data(dims = "E5", x = Sys.Date(), removeCellStyle = TRUE)
  wb$add_data(dims = "A1", x = Sys.Date()
  • Failure (test-write.R:120): update_cells
    This is failing, because this PR somewhere assings na.string = "#N/A" which is different to the previous na.stringassignment which wrote missings as #N/A (in this PR it is a character, previously it was and should be an expression aka a real openxml missing). We distinguish between missing (the expression is written), null (the cell is not written at all) and characters (an input character is written).

@jmbarbone
Copy link
Collaborator Author

jmbarbone commented Dec 13, 2022

@JanMarvin , yeah this is going to be a big one. But it's nearly all just organization. Trying to not make any changes to the actual functions, so errors like these are a little interesting. Unless I didn't move something correctly (I'll have to go through again to check that) there could be something else with how we're handing assignments.

For na.strings I think I understand the problem. Maybe we should use a special class to safely handle this rather than relying on missings? (maybe this is also causing other issues -- hopefully)

na_string <- function() {
  structure("na_string", class = "openxlsx2_waiver")
}

is_na_string <- function(x) {
  is_waiver(x) && isTRUE(x == "na_string")
}

If that sounds good we can try to implement that prior to this hulk.

@JanMarvin
Copy link
Owner

I know you don't like missing() and have your reasons, and I'm not even stating that working with missings or some under the hood assignments is a good idea. But I'm reluctant to introduce very large (and currently even test breaking) changes at this stage of development.

My current plan is to stick it out until the end of the year, but after that I won't be working on openxlsx2 on a daily basis. I have other things going on in life and want to focus on some more "fun" things related to the package and development. Like for example working on a general flextable wrapper or maybe even a pivot table extension.

My fear is that so many under-the-hood changes, especially those hard-to-see changes, will lead to bigger impacts. We have the three cases that we have already discovered in our very many tests. But there are still so many things that are not yet covered in the tests that are only visible in Excel, and I don't really feel like spending another couple of months looking for bugs that are introduced by cleaning up the biggest piece of code in the package ...

Therefore - if possible - I really like not to introduce behavior changes at all. In no case should we rush to implement the changes, but rather do so in a well-considered manner and with the certainty that we have examined them well and weighed up the advantages and disadvantages.

@jmbarbone
Copy link
Collaborator Author

Bringing the conversion over here...

But please, before you start updating #471, I still haven't made up my mind about the size and impact of the PR and would veto the merge at this point. I still feel that the risk of breaking things along the way still outweighs the benefit of having more organized (and possibly cleaner) code. Since I started porting some real life work over to openxlsx2 (which exposed some of the bugs I have been fixing lately), I don't like having a nice but broken mess and prefer a working mess, if you know what I mean.

Maybe we can break #471 into smaller pieces (e.g. not including the unnecessary testthat changes) or go function by function, but just having one big pull request that changes 61 files and moves ~5k lines of code is not what I like at the moment. And even if it does not make things better, I am really sorry to be so picky about this, especially since you have already spent some time on it.

I agree. The scope of this was too large and I've kept it as a draft PR knowing it would take a while. It's probably better to close this and work in smaller chunks, if needed, or decide that some release version would only be dedicated to internal changes. Since it's just about internal code management, there isn't a big necessity to include these changes.

@JanMarvin
Copy link
Owner

At some hopefully distant time in the future, further restructuring of the code will be required (such as moving code from the workbook to the worksheet, or aligning the worksheet and chart sheet #404). I think something like this PR will be very useful then.

As for the next version: I plan to release 0.5.1 at the end of the month. After that I will work on #38. After writing some openxlsx2 code, I wanted to improve the user experience a bit. There are a few bumps that can be ironed out, but for that something had to exist and work first. But we have hopefully reached this state by now and as far as my own needs are concerned, we are now complete and I am quite happy with the current state. There is always room for things to improve or to tweak, but such is life.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

internal R6 workbook implementation functions
2 participants