Feature request: allow new factor levels in brandom #115

Open
kmorndahl opened this issue Nov 22, 2021 · 4 comments

@kmorndahl

I am building a gamboost model with the GammaReg() family. The model fits fine, but when it comes time to predict on a new dataset I am getting an error: Error in X %*% rowSums(cf) : Cholmod error 'X and/or Y have wrong dimensions' at file ../MatrixOps/cholmod_sdmult.c, line 90.

I attempted to create a smaller reproducible example of this, but could not replicate that exact error. Instead, this smaller dataset gives a different error: Error in f(init, x[[i]]) : non-conformable arrays. The example uses a subset of my original training data and the same modeling approach, and predicts on the full test set from my original data.

Any help troubleshooting these error(s) would be much appreciated. I can of course share the full dataset if that would be helpful.

x = c(-0.420619854880168, -0.769823976992038, -0.709316986674812, 
      -1.25099225335503, -0.618892161183838, -0.555349783432928, 
      -0.914234689796377, -1.22903701739405, -0.4833834921797, 
      -0.320848947810941, 0.135931013665819, 2.39042286987258, 
      1.73643729459268, -0.506909477648839, 1.62136146009556, 
      2.15263600603266, 1.4014282748866, 2.03401367337059, 0.877646599658447, 
      1.02535151508941, -0.837245279816666, 0.58292669901717, 
      0.602153227358826, 1.83594483207367, 1.02820280062304, 
      -0.765221508789011, -0.74152886321564, -0.354830989878368, 
      -0.282803791828277, -0.407939851800533)

y = c(3.37808396795311, 4.31703013336414, 3.62201047152382, 
      3.47647337833432, 3.57383927065914, 4.0274754006413, 5.11993857962149, 
      4.10603649459834, 3.44626699808267, 6.50187496364316, 47.364073741465, 
      49.4539651723017, 18.5584474694755, 167.384017225471, 43.854077667435, 
      20.5948980572258, 60.0090328389651, 32.2859775573889, 23.028143147698, 
      27.5759143301009, 36.8938384302345, 132.866721315487, 187.170964086464, 
      14.9766986238594, 4.54648985258215, 25.5890448582519, 48.696962593379, 
      36.9890174750545, 53.4100395759561, 49.565020848753)

group = c("1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "2", "2", 
          "2", "2", "2", "2", "2", "2", "2", "2", "3", "3", "3", "3", "3", 
          "4", "4", "4", "4", "4")

df = data.frame(y, x, group)
df$group = factor(df$group)

library(mboost)  # attaches gamboost(), bols(), brandom() and GammaReg()

mod = gamboost(y ~ bols(x) + brandom(group), data = df, family = GammaReg())

x_test = c(1.2562301, -0.4628746, -0.2848149, -0.9655805, -1.0166867, 
           1.8343589, -0.6302188, 1.1909887, -0.8064399, 0.3444268, 
           -0.4593891)

y_test = c(3.004605, 5.595847, 7.62922, 6.687553, 7.435949, 
           11.453977, 13.381522, 13.393321, 6.855579, 16.023104, 
           17.48234)
  
group_test = c("5", "5", "5", "5", "5", "5", "5", "5", "5", "5", "1")

df_test = data.frame(y = y_test, x = x_test, group = group_test)
df_test$group = factor(df_test$group)

preds = predict(mod, newdata = df_test)

# Error in f(init, x[[i]]) : non-conformable arrays

# Error when running on full dataset:
# Error in X %*% rowSums(cf) : 
# Cholmod error 'X and/or Y have wrong dimensions' at file ../MatrixOps/cholmod_sdmult.c, line 90
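
Note that in the example above, group_test contains the level "5", which does not occur among the training levels "1" to "4". A minimal base-R sketch for spotting such a mismatch before calling predict() follows. The names train_group and test_group are illustrative stand-ins for df$group and df_test$group, and nothing below uses mboost itself.

```r
# Stand-ins for df$group and df_test$group from the example above
train_group = factor(rep(c("1", "2", "3", "4"), times = c(10, 10, 5, 5)))
test_group  = factor(c(rep("5", 10), "1"))

# Levels present in the test data but never seen during fitting
new_levels = setdiff(levels(test_group), levels(train_group))
new_levels

# Number of test rows that carry an unseen level
n_affected = sum(test_group %in% new_levels)
n_affected
```

If new_levels is non-empty, the design matrix built for brandom(group) on the new data may not line up with the fitted coefficients, which would be consistent with the dimension errors reported above.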

@sbrockhaus
Member

sbrockhaus commented Nov 23, 2021 via email

@kmorndahl
Author

Ah easy fix, thanks very much @sbrockhaus!

The error (Error in X %*% rowSums(cf) : Cholmod error 'X and/or Y have wrong dimensions') still persisted in the full data set, but it seems to be resolved in the patched version available through GitHub.

Are there plans, or a timeline, for incorporating the patched version into the CRAN release? (Pardon my ignorance here, I don't know much about package development.)

Best

@kmorndahl
Author

Upon further reflection, I'm wondering whether mboost can handle new levels of a brandom() grouping variable in the test set/newdata object at prediction time. For random effects, I believe this is not uncommon. For example, the predict() method for glmer models in lme4 provides the allow.new.levels = TRUE argument for this purpose.
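
To make the request concrete, here is a rough user-side sketch in base R of the kind of handling meant here: rows whose group was seen in training are separated from rows with new levels, so the two sets can be treated differently. The helper name split_by_known_levels and its arguments are hypothetical, it is not part of mboost, and how the rows with new levels should then be predicted is left open.

```r
# Hypothetical helper (not part of mboost): split newdata into rows whose
# grouping level was seen during training and rows with new levels.
split_by_known_levels = function(newdata, train_levels, group_col = "group") {
  g = as.character(newdata[[group_col]])
  known = g %in% train_levels
  seen = newdata[known, , drop = FALSE]
  # Re-encode with the training levels so the factor matches the fitted model
  seen[[group_col]] = factor(g[known], levels = train_levels)
  list(seen = seen, unseen = newdata[!known, , drop = FALSE])
}

# Small illustrative test set: two rows from a new group, one from a known group
df_small = data.frame(x = c(1.26, -0.46, -0.28), group = c("5", "5", "1"))
parts = split_by_known_levels(df_small, train_levels = c("1", "2", "3", "4"))

nrow(parts$seen)           # rows that can be predicted with the fitted levels
nrow(parts$unseen)         # rows with a level the model has never seen
levels(parts$seen$group)   # same levels as in training
```

Splitting rather than dropping keeps the rows with new levels available for whatever fallback is appropriate.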

Thanks!

@hofnerb
Member

hofnerb commented Dec 14, 2021

I am surprised that the code breaks with the CRAN version but not the GitHub version, as they should be practically identical and we did not modify anything with respect to the functionality mentioned above.

A feature such as allow.new.levels would indeed be nice to have, but it would take quite some time to implement. I don't think anyone currently has that time. You are more than welcome to contribute code, though.

@hofnerb changed the title from "Errors using predict() after fitting gamboost GammaReg model: non-conformable arrays, Cholmod error 'X and/or Y have wrong dimensions" to "Feature request: allow new factor levels in brandom" on Dec 14, 2021