Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move implementations in src/distrs/norm.jl to Distributions #42

Open
dmbates opened this issue Mar 16, 2018 · 7 comments
Open

Move implementations in src/distrs/norm.jl to Distributions #42

dmbates opened this issue Mar 16, 2018 · 7 comments

Comments

@dmbates
Copy link
Contributor

dmbates commented Mar 16, 2018

As far as I can see the collection of "d-p-q-r" functions (to use the R terminology) in src/distrs/norm.jl is complete. I propose that the implementations (not the functions themselves) be moved to Distributions/src/univariate/continuous/normal.jl and the functions here be deprecated.

The deprecation might be a bit tricky because the Distributions package requires the StatsFuns package, not the other way around. The situation here would be that to deprecate functions in StatsFuns you need to ensure that a certain version of Distributions or later is available. Given that user code and other packages "should" not call these functions directly that may not be too much of a problem. But it is a tricky situation.

Where should a function like _norminvlogcdf_impl and its dependencies go? I think it should migrate with the other code to Distributions but then does it need a new name to avoid a conflict with the function here during the deprecation period?

I hope what I am asking makes sense. Ping me if this is too confusing. I have only had one cup of coffee so this might be complete nonsense.

@andreasnoack
Copy link
Member

I'll have to think a little more but I'm wondering if it would be easier just to move all of StatsFuns to Distributions in one go (except for the few pieces that need to go into StatsBase). Then we could just ask all users to use Distributions right away instead of having a long period where only parts of the functionality have been migrated.

@dmbates
Copy link
Contributor Author

dmbates commented Mar 16, 2018

@andreasnoack I don't think you want to move everything in StatsFuns to Distributions until the dependencies on Rmath are removed. I thought the reason for having calls from Distributions to functions in libRmath go through intermediate functions like normlogpdf in StatsFuns was to be able to transition to native Julia implementations in stages. As it stands you can't remove a call to the @_delegate_statsfuns macro unless you have a complete set of what are called in R d-p-q-r functions for the distribution.

At present there is an indirect dependence of Distributions on Rmath but the hope is to remove that. I don't think you want to reintroduce a direct dependence of Distributions on Rmath.

In my vision StatsFuns eventually shrinks down to constants and convenience functions like xlogx and log1pexp. I think there would still be a case for having such functions in a separate package. Distributions is a heavyweight dependency to just get an implementation of xlogx.

@andreasnoack
Copy link
Member

I think it would be easier to transition to native implementations if all of the functionality in StatsFuns was inside Distributions. It could be a submodule in Distributions if necessary but I'm not sure it is. I also don't think there is a substantial difference between having Rmath as an indirect dependency compared to having it as a direct dependency. Why would the direct dependency be a problem? My comment "...except the few pieces that need to go into StatsBase" was exactly about xlogx and log1pexp.

@dmbates
Copy link
Contributor Author

dmbates commented Mar 17, 2018

I checked and there are some packages that require StatsFuns but not either Distributions or "StatsBase`.

julia> setdiff(Pkg.dependents("StatsFuns"), union(Pkg.dependents("Distributions"), Pkg.dependents("StatsBase")))
6-element Array{AbstractString,1}:
 "RiemannTheta"                       
 "DynamicHMC"                         
 "Erdos"                              
 "Divergences"                        
 "HilbertSchmidtIndependenceCriterion"
 "LogProbs"                    

@andreasnoack
Copy link
Member

Yes. We'd have to open some PRs against those packages to fix them but I don't think it should too much work.

@dmbates
Copy link
Contributor Author

dmbates commented Mar 19, 2018

I agree that it is a good idea to move the functions related to distributions back into the Distributions package. I don't really recall why they were moved into StatsFuns in the first place. They seem to have been part of StatsFuns from the earliest commits by Dahua Lin (July 25, 2015).

My recollection is that the functions linking to libRmath were initially a separate package then became part of the Distributions package. Do you remember why Dahua decided to move them here? I am asking because I don't want to transfer these functions back to Distributions only to have someone come forward and say "we moved them to StatsFuns for a good reason and you should move them back".

@andreasnoack
Copy link
Member

Dahua factored these out without much discussion and announced it in https://groups.google.com/forum/#!topic/julia-stats/nOObNwKM1b8. As you can see from the thread, I also thought it was a good idea but less than a year after, I'd changed my mind, https://groups.google.com/forum/#!searchin/julia-stats/statsfuns%7Csort:date/julia-stats/SRBpQnhDw54/FzE1O9PBJgAJ. It might be that a few people like the C-ish API here in StatsFuns but I think it is better to focus only on the Distributions API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants