Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Division by zero in Probability\Distribution\Continuous\Continuous #429

Open
drjayvee opened this issue Oct 7, 2021 · 21 comments
Open

Division by zero in Probability\Distribution\Continuous\Continuous #429

drjayvee opened this issue Oct 7, 2021 · 21 comments

Comments

@drjayvee
Copy link

drjayvee commented Oct 7, 2021

Contiuous.php:39 can trigger a division by zero Warning.

Specifically, this happens in our project with the following call (new MathPHP\Statistics\Regression\Linear($points))->ci($x, 0.05);.

Call stack:

#8 /vendor/markrogoyski/math-php/src/Probability/Distribution/Continuous/Continuous.php(39): MathPHP\Probability\Distribution\Continuous\Continuous::inverse
#7 /vendor/markrogoyski/math-php/src/Probability/Distribution/Continuous/StudentT.php(130): MathPHP\Probability\Distribution\Continuous\StudentT::inverse2Tails
#6 /vendor/markrogoyski/math-php/src/Statistics/Regression/Methods/LeastSquares.php(730): MathPHP\Statistics\Regression\Linear::ci
...

I'm sorry I can't contribute a patch for this. There's a reason we're using a math library ;p

@markrogoyski
Copy link
Owner

Hi,

Thanks for your interest in MathPHP. Let's see if we can figure out what is going on. Can you provide the values for $points and $x here that are causing the issue?

Thanks,
Mark

@drjayvee
Copy link
Author

drjayvee commented Oct 7, 2021

Sure thing!

$points = [[5,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,0],[8,1],[8,0],[8,1],[8,1],[8,1],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,1],[8,1],[8,0],[8,1],[8,0],[8,1],[8,1],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1]];

$x = 5.0;

$regression = new MathPHP\Statistics\Regression\Linear($points);
$regression->ci($x, 0.05);

That should do does it!

@drjayvee
Copy link
Author

drjayvee commented Oct 7, 2021

Turns out $slope == 0.0 && $guess == 0.0

@Beakerboy
Copy link
Contributor

Beakerboy commented Oct 7, 2021

What a weird dataset. The slope in this inverse function can only be zero on a T distribution when x is positive or negative infinity‽ I’ll have to think about the best way to handle this, probably throw an exception. The fact that your dataset is basically one data point, repeated many many times, makes t very large (or very small) at any other x value. Just thinking about your data, your confidence interval is going to be extremely large.

I wonder if this is one of those cases where a data scientist would normally say this calculation is not appropriate for these data.

edit: this is all wrong, the confidence interval within reasonable values is finite and reasonable. I didn’t see that there were some (8,0) points mixed in.

@Beakerboy
Copy link
Contributor

What value of X are you attempting to evaluate the confidence interval at?

@Beakerboy
Copy link
Contributor

Beakerboy commented Oct 7, 2021

R code

library(stats)
X <- c(5,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8)

Y <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)

test.data <- data.frame(y=Y, x=X)
attach(test.data)
testing.lm = lm(Y ~ X)
newdata = data.frame(X=5)
predict(testing.lm, newdata, interval="confidence")

@drjayvee
Copy link
Author

drjayvee commented Oct 8, 2021

Thanks for digging into this so quickly.

What value of X are you attempting to evaluate the confidence interval at?

$x = 5.0 as seen in the example, right?

What a weird dataset.

Can't say I disagree. But I'll pass the blame to our users. 😄

Our app lets users create data sets and offers reporting tools on top. Our users are generally not data scientists or statisticians, by the way. (Neither is any of our developers, and certainly not me.)

Here's what happens. One page shows a map of correlation coefficients. Users can then click on any cell to display a correlation chart.
image

Our app needs to be able to support any data set, weird or sane. I think the same should be true for MathPHP, don't you agree?

Let me know if you need any more information.

@Beakerboy
Copy link
Contributor

Beakerboy commented Oct 8, 2021

$x = 5.0 as seen in the example, right?

I was just testing it at 5, but maybe you were using some really large or small number. If you were using 1E100 I would offer different advice than if you used 5.

Our app needs to be able to support any data set, weird or sane. I think the same should be true for MathPHP, don't you agree?

At a bare minimum this function should return an exception when the slope is zero, this would allow you to catch it and make a decision, like not plotting the line, or failing gracefully. I have not testing the library yet so I don’t know where the failure is actually happening.

@markrogoyski
Copy link
Owner

markrogoyski commented Oct 9, 2021

This unit test will reproduce the issue.

    public function testBugIssue429()
    {
        // Given
        $points = [[5,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,0],[8,1],[8,0],[8,1],[8,1],[8,1],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,1],[8,1],[8,0],[8,1],[8,0],[8,1],[8,1],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1]];
        $x      = 5.0;

        // And
        $regression = new Linear($points);

        // When
        $ci = $regression->ci($x, 0.05);
    }

It produces this stack trace:

There was 1 error:

1) MathPHP\Tests\Statistics\Regression\LinearTest::testBugIssue429
DivisionByZeroError: Division by zero

math-php/src/Probability/Distribution/Continuous/Continuous.php:39
math-php/src/Probability/Distribution/Continuous/StudentT.php:130
math-php/src/Statistics/Regression/Methods/LeastSquares.php:730
math-php/tests/Statistics/Regression/LinearTest.php:558

The division by zero happens when computing the guess in the Continues inverse function:

        while ($dif > $tolerance) {
            $y = $this->cdf($guess);

            // Since the CDF is the integral of the PDF, the PDF is the derivative of the CDF
            $slope = $this->pdf($guess);
            $del_y = $target - $y;
            $guess = $del_y / $slope + $guess; // <---- this is where the division by zero happens

With the test inputs, these are the values at the time of the crash:

  • del_y: 0.475
  • slope: 0
  • guess: 0
  • initial guess: 0 (From continuous distribution mean)

@Beakerboy
Copy link
Contributor

So the question becomes, why is the pdf 0 at t=0?

@markrogoyski
Copy link
Owner

The issue looks like it is in the StudentT distribution's PDF.

This unit test using the data from this bug currently fails, producing 0 instead of the expected result.

    public function testBugIssue429StudentTPdf()
    {
        // Given
        $v = 341;
        $studentT = new StudentT($v);

        // When
        $t = 0;
        $pdf = $studentT->pdf($t);

        // Then
        $expected = 0.3986499;
        $this->assertEqualsWithDelta($expected, $pdf, 0.001);
    }
There was 1 failure:

1) MathPHP\Tests\Statistics\Regression\LinearTest::testBugIssue429StudentTPdf
Failed asserting that 0.0 matches expected 0.3986499.

R code that generated the test data result:

> library(stats)
> v <- 341
> t <- 0
> dt(t, v)
[1] 0.3986499

This online calculator also confirms the result: https://keisan.casio.com/exec/system/1180573203
Producing: 0.398649908330383416563

The StudentT's PDF is calculated using the gamma special function. The denominator is calculating gamma(341/2) which is producing infinity, which is where the 0 value seems to be coming from.

For the same value, R produces an extremely large number, which for all intents and purposes, is the same as infinity.

> gamma(341/2)
[1] 5.562092e+305

Is there an alternate way to calculate the StudentT PDF? I'm not familiar with this or gamma, but numbers this high are basically at the limit of what PHP can handle on a 64-bit machine:

php > echo \PHP_INT_MAX;
9223372036854775807
php > echo \PHP_FLOAT_MAX;
1.7976931348623E+308

@Beakerboy when you have time can you investigate this further? Seems like StudentT PDF is supposed to be able to produce a result here, and maybe being smarter with calculating and applying the gamma function arithmetic, or calculating it some other way, might lead to a better result? Regardless, the StudentT PDF unit tests don't have any test data with values over 10, so maybe that would be a good place to start: add some higher values and see where it breaks down. I'll add some additional unit tests and see how high it can go and still get reasonable results matching R.

@markrogoyski
Copy link
Owner

markrogoyski commented Oct 10, 2021

I added additional test cases in the develop branch for the StudentT pdf up until the point where the tests still pass. The library works fine up until a v of 301. For values 302 up to 334 the pdf returns 0. For values above that it produces either 0, INF, or NAN. I put some commented out test cases that fail with the expected results.

@Beakerboy
Copy link
Contributor

Beakerboy commented Oct 10, 2021

In theory, at this height of degrees of freedom the standard normal could probably be used instead of T with minimal loss of precision. I’ll see how R does their T calculations.

@Beakerboy
Copy link
Contributor

Beakerboy commented Oct 11, 2021

R Source for dt function:
https://github.com/wch/r-source/blob/trunk/src/nmath/dt.c

@markrogoyski
Copy link
Owner

Looks like I didn't push the new test cases. Here they are for reference.

    /**
     * @return array [t, ν, pdf]
     * Generated with R dt(t, ν) from package stats
     */
    public function dataProviderForPdf(): array
    {
        return [
            [-4, 1, 0.01872411],
            [-3, 1, 0.03183099],
            [-2, 1, 0.06366198],
            [-1, 1, 0.1591549],
            [0, 1, 0.3183099],
            [1, 1, 0.1591549],
            [2, 1, 0.06366198],
            [3, 1, 0.03183099],
            [4, 1, 0.01872411],
            [5, 1, 0.01224269],
            [10, 1, 0.003151583],

            [-4, 2, 0.01309457],
            [-3, 2, 0.02741012],
            [-2, 2, 0.06804138],
            [-1, 2, 0.1924501],
            [0, 2, 0.3535534],
            [1, 2, 0.1924501],
            [2, 2, 0.06804138],
            [3, 2, 0.02741012],
            [4, 2, 0.01309457],
            [5, 2, 0.007127781],
            [10, 2, 0.0009707329],

            [-4, 6, 0.004054578],
            [-3, 6, 0.01549193],
            [-2, 6, 0.06403612],
            [-1, 6, 0.2231423],
            [0, 6, 0.3827328],
            [1, 6, 0.2231423],
            [2, 6, 0.06403612],
            [3, 6, 0.01549193],
            [4, 6, 0.004054578],
            [5, 6, 0.001220841],
            [10, 6, 1.651408e-05],

            [-10, 10, 7.284686e-07],
            [-5, 10, 0.0003960011],
            [-2, 10, 0.06114577],
            [-1, 10, 0.230362],
            [0, 10, 0.3891084],
            [1, 10, 0.230362],
            [2, 10, 0.06114577],
            [5, 10, 0.0003960011],
            [10, 10, 7.284686e-07],

            [-10, 20, 2.660085e-09],
            [-5, 20, 7.898911e-05],
            [-2, 20, 0.05808722],
            [-1, 20, 0.2360456],
            [0, 20, 0.3939886],
            [1, 20, 0.2360456],
            [2, 20, 0.05808722],
            [5, 20, 7.898911e-05],
            [-10, 20, 2.660085e-09],

            [0, 50, 0.3969527],
            [1, 50, 0.2395711],
            [5, 50, 1.283547e-05],

            [0, 100, 0.3979462],
            [1, 100, 0.2407659],
            [5, 100, 5.080058e-06],

            [0, 200, 0.3984439],
            [1, 200, 0.2413671],
            [5, 200, 2.88097e-06],

            [0, 250, 0.3985435],
            [1, 250, 0.2414876],
            [5, 250, 2.545035e-06],

            [0, 300, 0.39861],
            [1, 300, 0.241568],
            [5, 300, 2.338036e-06],

            [0, 301, 0.3986111],

            // Anything past 302 for v and INF comes back.
//            [0, 302, 0.3986122],
//            [0, 303, 0.3986133],
//            [0, 305, 0.3986154],
//            [0, 310, 0.3986207],
//
//            [0, 320, 0.3986307],
//            [1, 320, 0.2415931],
//            [5, 320, 2.276028e-06],
//
//            [5, 320, 2.276028e-06],
//            [5, 325, 2.261896e-06],
//            [5, 330, 2.248256e-06],
//            [5, 331, 2.245584e-06],
//            [5, 332, 2.242931e-06],
//            [5, 333, 2.240297e-06],
//            [5, 334, 2.23768e-06],
//            [5, 335, 2.235082e-06],
//
//            [0, 341, 0.3986499],
//            [0, 351, 0.3986582],
//            [0, 361, 0.3986661],
//            [0, 371, 0.3986735],
//            [0, 381, 0.3986806],
//            [0, 400, 0.398693],
        ];
    }

@Beakerboy
Copy link
Contributor

A pull request has been submitted which should fix this issue. It needs to be cleaned up to clarify what it is doing, but it should work for you while we make it pretty.

@markrogoyski
Copy link
Owner

@Beakerboy,

Thanks for providing a fix so quickly. The StudentT PDF is now always producing answers for any input, which is a great improvement.

I have a question, in @drjayvee's original input, it was previously crashing, and no longer crashes, which is great. However, what is the expected result?

    public function testBugIssue429()
    {
        // Given
        $points = [[5,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,0],[8,1],[8,0],[8,1],[8,1],[8,1],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,1],[8,1],[8,0],[8,1],[8,0],[8,1],[8,1],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,0],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1],[8,1]];
        $x      = 5.0;

        // And
        $regression = new Linear($points);

        // When
        $ci = $regression->ci($x, 0.05);

       // Then
       // ??
    }

Right now the confidence interval produces NAN, and it looks like the NAN comes from the CDF computation in the inverse. Is this the expected result? It's certainly better than crashing, but I wasn't sure if this was the expected result for that dataset or not.

Thoughts?

@Beakerboy
Copy link
Contributor

I have r code above that should give the correct result. I’ll probably just port the cdf and inverse functions from the C sources that R uses as well. I think I have some idea of what the functions are doing after reading the source paper. I found a couple typos though which makes it harder.

@Beakerboy
Copy link
Contributor

Beakerboy commented Oct 12, 2021

will need to:

  • Refactor StudentT::cdf()
  • Refactor Beta::cdf()
  • Add Special::logBeta()
  • Add Special::logGamma()
  • ?
  • PROFIT!

Beakerboy added a commit to Beakerboy/math-php that referenced this issue Oct 15, 2021
@markrogoyski
Copy link
Owner

@drjayvee,

This has been fixed in Version 2.5.0. You should not get any more division by zero errors.

Many thanks to @Beakerboy for numerous improvements in the special functions that drive these distributions.

@drjayvee
Copy link
Author

Thanks a bunch! I'm upgrading as we speak.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants