Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRAN issues #120

Closed
eddelbuettel opened this issue Apr 28, 2024 · 19 comments · Fixed by #121
Closed

CRAN issues #120

eddelbuettel opened this issue Apr 28, 2024 · 19 comments · Fixed by #121
Assignees

Comments

@eddelbuettel
Copy link
Owner

We apparently have two CRAN issues outside the long-fixed 'do not specify C++11 unless needed' we addressed ages ago.

One is just a typo. R-devel checks more carefully for 'isolated' pairs of {} we have one code{+} that wants to be \code{+}. I will fix that in a second. (One aside is that we do something that upsets roxygen2 on my machine. It errors with "S3 method {.arg {name}} needs @export or @exportS3method tag") and dies.)

The other is that of late Brian Ripley is chasing more uses of functions from 'internal' headers and our use of SET_S4_OBJECT is flagged, twice. The function is a simple one-liner so we could internalize, or we could replace the S4 creation with a call from C++ to R and call a proper new(...). Thoughts?

@eddelbuettel
Copy link
Owner Author

eddelbuettel commented Apr 28, 2024

Looks like the second aspect can be fixed too similar to what I found here (gotta love github search with 'cran' ...)

(But I seem to have a Heisenbug effect under tests. Running tinytest over the test directory passes, running R CMD check croaks test_nanoival.R............... 0 tests Error: Not an S4 object.. Hm.)

@lsilvest
Copy link
Collaborator

This is the function called by Rf_asS4:

SEXP asS4(SEXP s, Rboolean flag, int complete)
{
    if(flag == IS_S4_OBJECT(s))
	return s;
    PROTECT(s);
    if(MAYBE_SHARED(s)) {
	s = shallow_duplicate(s);
	UNPROTECT(1);
	PROTECT(s);
    }
    if(flag) SET_S4_OBJECT(s);
    else {
	if(complete) {
	    SEXP value;
	    /* TENTATIVE:  how much does this change? */
	    if((value = R_getS4DataSlot(s, ANYSXP))
	       != R_NilValue && !IS_S4_OBJECT(value)) {
	      UNPROTECT(1);
	      return value;
	    }
	    /* else no plausible S3 object*/
	    else if(complete == 1) /* ordinary case (2, for conditional) */
	      error(_("object of class \"%s\" does not correspond to a valid S3 object"),
		      CHAR(STRING_ELT(R_data_class(s, FALSE), 0)));
	    else {
		UNPROTECT(1);
		return s; /*  unchanged */
	    }
	}
	UNSET_S4_OBJECT(s);
    }
    UNPROTECT(1);
    return s;
}

Things work for me when changing:

SET_S4_OBJECT(res);

to:

res = Rf_asS4(res, TRUE, FALSE);

But not when there is no assignment to res. So it seems a shallow duplicate is made... I have to track down what shallow_duplicate does exactly to make sure we are not incurring a performance penalty when swapping to Rf_asS4.

@eddelbuettel
Copy link
Owner Author

By "assignment" you refer to the subsequent .S3class attribute setting? I think I found the same, but have no fix. We use SET_S4_OBJECT in two places and the first trivially replaces with Rf_asS4. I also don;t quite understand yet what is happening in there.

@lsilvest
Copy link
Collaborator

No, I meant:

res = Rf_asS4(res, TRUE, FALSE);

instead of:

Rf_asS4(res, TRUE, FALSE);

The latter works only if a shallow duplicate is not made. Otherwise the S4 bit gets set on the copy and is not assigned to res. That is what I believe creates the Error: Not an S4 object, or anyway, that's what I get when I don't do the assignment to res.

@eddelbuettel
Copy link
Owner Author

That is ... in a way even more puzzling. Maybe we can also step back and see if we can shimmy improvements into Rcpp::S4 which currently is fairly bare (but exists, mostly to get conveniently at a slot).

@lsilvest
Copy link
Collaborator

All we want to do here is set the S4 bit on a newly created object; SET_S4_OBJECT is simply:

#define SET_S4_OBJECT(x) (((x)->sxpinfo.gp) |= S4_OBJECT_MASK)

We are in full control of the object, so there should be no cases where the object is shared and no need for a duplicate (even shallow!). Looks like we'd be mainly just copying attributes and we don't have any, so it might all be OK in the end if we just replace with res = Rf_asS4(res, TRUE, FALSE);. But worth trying to understand completely what's happening here: I will spend a bit of time on MAYBE_SHARED and the duplicate function.

@eddelbuettel
Copy link
Owner Author

Yep. I got as far. Sadly the header info is hidden away. The assignment trick is already pretty good.

@lsilvest
Copy link
Collaborator

lsilvest commented May 23, 2024

So the reason MAYBE_SHARED returns true is that the ref count for the new object is indeed larger than 1. It doesn't really make sense because we've just allocated the vector. But it turns out the ref count is incremented to 3 after the copying of the names, so this code for finding the start of a nanoival:

Rcpp::NumericVector nanoival_get_start_impl(const Rcpp::ComplexVector cv) {
  Rcpp::NumericVector res(cv.size());
  Rcpp::Rcout << "Ref count just after allocation: " << REFCNT(res) << std::endl;
  for (R_xlen_t i=0; i<cv.size(); ++i) {
    interval ival;
    Rcomplex c = cv[i];
    memcpy(&ival, reinterpret_cast<const char*>(&c), sizeof(c));
    if (ival.isNA()) {
      double d;
      memcpy(&d, reinterpret_cast<const char*>(&NA_INTEGER64), sizeof(NA_INTEGER64));
      res[i] = d;
    }
    else {
      std::int64_t start = ival.s;
      double d;
      memcpy(&d, reinterpret_cast<const char*>(&start), sizeof(start));
      res[i] = d;
    }
  }
  Rcpp::Rcout << "Ref count after filling in the start vector: " << REFCNT(res) << std::endl;
  res.names() = cv.names();
  Rcpp::Rcout << "Ref count after assigning the names: " << REFCNT(res) << std::endl;
  return assignS4("nanotime", res, "integer64");
}

when executed prints:

Ref count just after allocation: 1
Ref count after filling in the start vector: 1
Ref count after assigning the names: 3

I believe this is happening in Rcc::NamesProxyPolicy; in the set function, I see a few creations of Shield<SEXP> which would increase the ref count, but they should decrease when going out of scope.

Anyway, a bit lost at this point. I will need to investigate closer the reference counting that is occurring in the names set. It is possibly happening inside the R callbacks that occur in NamesProxyPolicy::set:

void set(SEXP x) {
            Shield<SEXP> safe_x(x);

            /* check if we can use a fast version */
            if( TYPEOF(x) == STRSXP && parent.size() == Rf_length(x) ){
                Rf_namesgets(parent, x);
            } else {
                /* use the slower and more flexible version (callback to R) */
                SEXP namesSym = Rf_install( "names<-" );
                Shield<SEXP> call(Rf_lang3(namesSym, parent, x));
                Shield<SEXP> new_vec(Rcpp_fast_eval(call, R_GlobalEnv));
                parent.set__(new_vec);
            }

        }

Does any of this ring a bell?

I'm thinking we could call assignS4 before copying the names therefore guaranteeing we don't do a shallow duplicate and the code will be equivalent.

Thoughts?

@eddelbuettel
Copy link
Owner Author

I don't think Rcpp itself sets or increments the reference counters -- that is all deep down in R itself business by Luke. Shield<> etc should only be nicer-to-use and simpler-because-RAII variants of PROTECT / UNPROTECT unless I miss something. So moving assignS4 up an instruction or two seems fine to me. Can you try it?

@lsilvest
Copy link
Collaborator

Tried it and it works. That's probably the best solution. Will go ahead and commit this, do you want me to also fix the missing \ in the \code{+} hoping Roxygen2 doesn't grumble on my machine? Or are you taking care of the latter fix?

@eddelbuettel
Copy link
Owner Author

Awesome.

I can take care of roxygen2, that has come up elsewhere and gotten fixed (incl. at work).

@eddelbuettel
Copy link
Owner Author

Confirming this local state of affairs passes muster:

modified   inst/include/nanotime/utilities.hpp
@@ -48,8 +48,8 @@ namespace nanotime {
   SEXP assignS4(const char* classname, Rcpp::Vector<R>& res) {
     Rcpp::CharacterVector cl = Rcpp::CharacterVector::create(classname);
     cl.attr("package") = "nanotime";
+    Rf_asS4(res, TRUE, FALSE);
     res.attr("class") = cl;
-    SET_S4_OBJECT(res);
     return Rcpp::S4(res);
   }
 
@@ -61,11 +61,37 @@ namespace nanotime {
     Rcpp::CharacterVector cl = Rcpp::CharacterVector::create(classname);
     cl.attr("package") = "nanotime";
     res.attr("class") = cl;
+    res = Rf_asS4(res, TRUE, FALSE);
     // use 'install' here as in R source code attrib.c: LLL
     Rcpp::CharacterVector oc = Rcpp::CharacterVector::create(oldClass);
     res.attr(".S3Class") = oc;
-    SET_S4_OBJECT(res);
     return Rcpp::S4(res);

@lsilvest
Copy link
Collaborator

lsilvest commented May 24, 2024

The name copying order also needs to be changed in interval.cpp. In utilities.hpp the order is unimportant. I've opened PR #121 for these changes. Maybe you can slip in there the \code{+} fix? I also see we've got #117 open. It's a small fix that I could slip in here unless you prefer to keep fixed separate?

@eddelbuettel
Copy link
Owner Author

eddelbuettel commented May 24, 2024

No strong feelings. Separate issues (as #121 and #117 seem to be) may warrant separate PRs...

I am getting quite used to squashing PRs (as we do at work, started it for Rcpp a while back too) which makes things nice and linear. Order doesn't matter. I can wrap up #121 and was about to suggest to ship to CRAN but we may to hit #117 over the head too. So maybe create a new branch for #117, or wait til #121 is in, or ... It all works.

@eddelbuettel
Copy link
Owner Author

BTW I checked the change under r-devel and I don't think i had the interval.cpp change in and now I confused why R didn't yell at me. Also I get no \code{+} hint either under (current, from this week) r-devel or r-release.

@lsilvest
Copy link
Collaborator

OK, I'll fix #117 tonight in a different PR and like that we can submit to CRAN only once. Without the change in interval.cpp things work, but then we get a shallow duplicate for no reason at all. Switching the order means no shallow duplicate because the ref count is still at 1 before the names assignment.

@eddelbuettel
Copy link
Owner Author

Ok, I will squash merge #121 then so you have clean slate to start from.

Onwards, and forwards, and big big thanks for working through this one where I only managed half of it at best.

@eddelbuettel
Copy link
Owner Author

If found the '\code{+}' issue (reported at CRAN). Might be easiest to remove the outer backticks around \code{+} in the Rd formating.

@eddelbuettel
Copy link
Owner Author

Naaa, I already fixed that issue weeks in this commit. Sorry for a) not making that clearer and b) not going via a PR so that you saw it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants