Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Passing–Bablok regression #368

Open
JulienBohy opened this issue Feb 7, 2020 · 8 comments
Open

Passing–Bablok regression #368

JulienBohy opened this issue Feb 7, 2020 · 8 comments

Comments

@JulienBohy
Copy link

JulienBohy commented Feb 7, 2020

Hello,

Is it possible to add Passing–Bablok regression in this lib?
src : https://en.wikipedia.org/wiki/Passing%E2%80%93Bablok_regression

thanks

@Beakerboy
Copy link
Contributor

In an initial glance, this seems like the Theil-Sen regression, but with Confidence Intervals. Is it similar?

@JulienBohy
Copy link
Author

JulienBohy commented Feb 7, 2020

The Passing-Bablok and Theil-Sen regression are closely related non-parametric methods to
estimate the regression coefficients and build tests on the relationship between the dependent
and independent variables. Both methods rely on the slopes of the connecting lines between
pairwise measurements. While Theil and Sen assume no measurement errors in the independent
variable, the method from Passing and Bablok accounts for errors in both variables

src: https://arxiv.org/pdf/1905.07649.pdf

I'm not a mathematician but here's what I found
;)

@Beakerboy
Copy link
Contributor

Beakerboy commented Feb 7, 2020

I’m not a mathematician either ...scientist/engineer. As an aside...neither of your links are working correctly.

@JulienBohy
Copy link
Author

Hello,

Links are updated sorry

@JulienBohy
Copy link
Author

What do you think of this proposal?
Do you think it deserves to be added to this wonderful library?

@Beakerboy
Copy link
Contributor

All submissions are welcome.

@JulienBohy
Copy link
Author

JulienBohy commented Dec 1, 2020

Hi,

A trainee wrote us a function, but the code is not very clear and does not really correspond to PHP standards.
So i cant make a pull request.

I've never been in open source projects and honestly don't really have time.

I found also Python implementation on this link

you will find below the code of our intern

<?php
        /**
         * @param array $dataPoints
         * @return array
         */
        public static function passingBablok($dataPoints)
        {
            $x = [];
            $y = [];
            $ic = [];
            $ic_low = [];
            $ic_upp = [];
            $S = [];

            foreach ($dataPoints as $point) {
                $x[] = (float)$point[0];
                $y[] = (float)$point[1];
            }

            $n = count($x);

            if ($n != count($y)) {
                throw new Exception('passingBablok(): Number of elements in coordinate arrays do not match.');
            }

            $x_min = min($x);
            $x_max = max($x);

            $x_sum = array_sum($x);
            $y_sum = array_sum($y);

            $xx_sum = 0;
            $xy_sum = 0;
            $yy_sum = 0;

            for ($i = 0; $i < count($x); ++$i) {
                $xy_sum += ($x[$i] * $y[$i]);
                $xx_sum += ($x[$i] * $x[$i]);
                $yy_sum += ($y[$i] * $y[$i]);
            }
            $r = ($xy_sum - ((1 / $n) * $x_sum * $y_sum)) / (sqrt(
                (($xx_sum) - ((1 / $n) * (pow($x_sum, 2)))) * (($yy_sum) - ((1 / $n) * (pow($y_sum, 2))))
              ));
            $r2 = $r * $r;

            $cgamma = 1.95996 * pow(($n * ($n - 1) * (2 * $n + 5) / 18), 0.5);

            $k = 0;
            for ($i = 0; $i < $n; ++$i) {
                for ($j = 0; $j < $n; ++$j) {
                    $var = null;
                    if ($i < $j && !($x[$i] == $x[$j] && $y[$i] == $y[$j])) {
                        $dx = $x[$j] - $x[$i];
                        $dy = $y[$j] - $y[$i];
                        if (0.00 !== $dx) {
                            $var = $dy / $dx;
                        }
                        if (0 == $dx) {
                            $var = MathUtils::Sgn($dy) * 99999;
                        }
                        if (!empty($var)) {
                            $S[] = $var;
                            ++$k;
                        }
                    }
                }
            }

            $nn = $k;
            sort($S);

            $k = 0;
            while ($S[$k] < -1) {
                ++$k;
            }

            $N_CIlow = round(($nn - $cgamma) / 2, 0);

            $N_CIupp = $nn - $N_CIlow + 1 + $k;

            $N_CIlow = $N_CIlow + $k;

            $nr = round($nn / 2, 0);
            if ($nn / 2 == $nr) {
                $odd_even = 1;
                $n_med = $nn / 2 + $k;
            } else {
                $odd_even = 2;
                $n_med = ($nn + 1) / 2 + $k;
            }

            $b = $S[$n_med];

            $b_CI95upp = 0;
            $b_CI95low = 0;
            $slope_b = 0;
            for ($i = 0; $i < $nn; ++$i) {
                if (1 == $odd_even && $i == $n_med) {
                    $slope_b = $S[$i] / 2;
                }
                if (1 == $odd_even && $i == $n_med + 1) {
                    $slope_b = $slope_b + $S[$i] / 2;
                }
                if (2 == $odd_even && $i == $n_med) {
                    $slope_b = $S[$i];
                }
                if ($i == $N_CIlow) {
                    $b_CI95low = $S[$i - 1];
                }
                if ($i == $N_CIupp) {
                    $b_CI95upp = $S[$i];
                }
            }

            for ($i = 0; $i < $n; ++$i) {
                $ic[$i] = $y[$i] - $slope_b * $x[$i];
                $ic_low[$i] = $y[$i] - $b_CI95upp * $x[$i];
                $ic_upp[$i] = $y[$i] - $b_CI95low * $x[$i];
            }

            $a_CI95upp = Average::median($ic_upp);
            $intercept_a = Average::median($ic);
            $a_CI95low = Average::median($ic_low);

            $results = [];
            $results['SLOPE'] = $slope_b;
            $results['R2'] = $r2;
            $results['INTERCEPT'] = $intercept_a;
            $results['B_CI95LOW'] = $b_CI95low;
            $results['B_CI95UPP'] = $b_CI95upp;
            $results['A_CI95LOW'] = $a_CI95low;
            $results['A_CI95UPP'] = $a_CI95upp;
            $results['REGRESSION_LINE_X_MIN'] = $x_min;
            $results['REGRESSION_LINE_X_MAX'] = $x_max;

            return $results;
        }
		?>

Can all of this information be useful?

@markrogoyski
Copy link
Owner

Hi @JulienBohy,

Thank you for sharing your code. Even if it is not a formal pull request, it is still helpful.

We'll update this thread if there is any progress on an official MathPHP implementation.

Thanks again,
Mark

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants