Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Computing time 'INTERVAL' with postgres #97

Open
kumy opened this issue Mar 30, 2020 · 14 comments
Open

Computing time 'INTERVAL' with postgres #97

kumy opened this issue Mar 30, 2020 · 14 comments

Comments

@kumy
Copy link
Contributor

kumy commented Mar 30, 2020

I'm having troubles selecting rows based on datetime calculation on a Postgres database.

The query I'm trying to build looks like:

SELECT *
FROM "news"
WHERE "created_on_datetime" > NOW()- INTERVAL '1 YEAR'
ORDER BY "created_on_datetime" DESC;

A simple code example to reproduce could be:

        $newsModel = new News();
        $filter = ['created_on_datetime > NOW() - INTERVAL \'1 YEAR\''];
        $news = $newsModel->find($filter);

From the debug lines I've added to the cortex code, the generated query is:

SELECT "id","title","content","author","created_on_datetime" FROM "news" WHERE "created_on_datetime" > NOW() - "INTERVAL" '1 YEAR'

Note the double quotes added automatically around the INTERVAL keyword. (Without them the query works fine.
Those leads to the Postgres error:

Error in query (7): ERROR: type "INTERVAL" does not exist
LINE 1: ...M "news" WHERE "created_on_datetime" > NOW() - "INTERVAL"...

I've tracked the quotes adder down to the function sql_quoteCondition()

f3-cortex/lib/db/cortex.php

Lines 2683 to 2706 in 364caaa

/**
* quote identifiers in condition
* @param string $cond
* @param object $db
* @return string
*/
public function sql_quoteCondition($cond, $db) {
// https://www.debuggex.com/r/6AXwJ1Y3Aac8aocQ/3
// https://regex101.com/r/yM5vK4/1
// this took me lots of sleepless nights
$out = preg_replace_callback('/'.
'\w+\((?:(?>[^()]+)|\((?:(?>[^()]+)|^(?R))*\))*\)|'. // exclude SQL function names "foo("
'(?:(\b(?<!:)'. // exclude bind parameter ":foo"
'[a-zA-Z_](?:[\w\-_.]+\.?))'. // match only identifier, exclude values
'(?=[\s<>=!)]|$))/i', // only when part of condition or within brackets
function($match) use($db) {
if (!isset($match[1]))
return $match[0];
if (preg_match('/\b(AND|OR|IN|LIKE|NOT|HAVING|SELECT|FROM|WHERE)\b/i',$match[1]))
return $match[1];
return $db->quotekey($match[1]);
}, $cond);
return $out ?: $cond;
}

Here is a regex101 fork, were we clearly see the INTERVAL captured as a column name.
https://regex101.com/r/AP2mRH/1

I'm on:

/**
 *  Cortex - the flexible data mapper for the PHP Fat-Free Framework
 *  […]
 *  @package DB
 *  @version 1.6.0
 *  @date 03.02.2019
 *  @since 24.04.2012
 */

I hope this one will not give you more sleepless nights 😉

@kumy
Copy link
Contributor Author

kumy commented Mar 30, 2020

A workaround could be to write the filter as:

        $filter = ['created_on_datetime > NOW() - \'1 YEAR\'::interval'];

But I still have to find how to properly escape it in Pdo when it's parametrized:

        $filter = ['created_on_datetime > NOW() - \'? YEAR\'::interval', $years];

This give error:

Named bind parameter `:interval` does not exist in filter arguments

Other tries:

        $filter = ['created_on_datetime > NOW() - cast(? as interval)', $year.' YEAR'];  // OK, but I would like to avoid concat
        $filter = ['created_on_datetime > NOW() - cast(\'? DAY\' as interval)', $year]; // FAIL
        $filter = ['created_on_datetime > NOW() - cast(concat(?::integer, \' DAY\') as interval)', $year]; // Fail
        $filter = ['created_on_datetime > NOW() - cast(concat(cast(?) as integer, \' DAY\') as interval)', $year]; // FAIL
        $filter = ['created_on_datetime > NOW() - cast(concat(?, \' DAY\') as interval)', $year]; // FAIL

@ikkez
Copy link
Owner

ikkez commented Apr 1, 2020

use the Date method to encapsulate the expression in a function, that'll skip the qoutation of parameters and fields inside. See duplicate: #28 (comment)

@ikkez ikkez added the duplicate label Apr 1, 2020
@kumy
Copy link
Contributor Author

kumy commented Apr 1, 2020 via email

@ikkez
Copy link
Owner

ikkez commented Apr 1, 2020

it's a SQL function as far as i know

@kumy
Copy link
Contributor Author

kumy commented Apr 1, 2020

The best I can get is:

        $filter = ['created_on_datetime > NOW() - cast(? as interval)', $year.' DAY'];
        $filter = ['created_on_datetime > DATE(NOW() - INTERVAL ?)', $year.' YEAR']; // FAIL
// PDOStatement: ERROR: syntax error at or near "$1" LINE 1: ...ERE "created_on_datetime" > DATE(NOW() - INTERVAL $1) ORDER ... ^

        $filter = ['created_on_datetime > DATE(NOW() - INTERVAL \'? YEAR\')', $year]; // FAIL
// PDOStatement: ERROR: invalid input syntax for type interval: "? YEAR" LINE 1: ...ERE "created_on_datetime" > DATE(NOW() - INTERVAL '? YEAR') ... ^

@ikkez
Copy link
Owner

ikkez commented Apr 1, 2020

seems like the DATE function is not available for postgre. Try this one for postgre:

$filter = ["created_on_datetime > date_trunc('second', INTERVAL '? YEAR')", $year];

// or begin without parameter, since I'm not sure about where to put that atm without testing
$filter = ["created_on_datetime > date_trunc('second', INTERVAL '1 YEAR')"];

@kumy
Copy link
Contributor Author

kumy commented Apr 1, 2020

$filter = ["created_on_datetime > date_trunc('second', INTERVAL '? YEAR')", GK_SITE_NEWS_DISPLAY_DAYS_VALIDITY];
// PDOStatement: ERROR: invalid input syntax for type interval: "? YEAR" LINE 1: ...ated_on_datetime" > date_trunc('second', INTERVAL '? YEAR') ... ^

$filter = ["created_on_datetime > date_trunc('second', INTERVAL '1 YEAR')"];
// PDOStatement: ERROR: operator does not exist: timestamp with time zone > interval LINE 1: ...etime" FROM "gk_news" WHERE "created_on_datetime" > date_tru... ^ HINT: No operator matches the given name and argument types. You might need to add explicit type casts.

Screenshot from 2020-04-01 15-48-43

@ikkez
Copy link
Owner

ikkez commented Apr 1, 2020

That's good! See the hint:

HINT: No operator matches the given name and argument types. You might need to add explicit type casts.

I'm not sure what type your created_on_datetime field is, so maybe you just need to cast the one or the other field to aid in comparision.. the error states that he cannot compare a timestamp incl. timezone with whatever your other field is

@ikkez
Copy link
Owner

ikkez commented Apr 1, 2020

Despite of this workaround.. it seem more complicated in postgre than other engine.. maybe i should just add an option somehow to display automatic field quotation for specific parts of the query?!

@kumy
Copy link
Contributor Author

kumy commented Apr 1, 2020

We missed the now() - in the last tests:

        $filter = ["created_on_datetime > NOW() - date_trunc('second', INTERVAL '1 YEAR')"];
// Works, but not with '? YEAR' - seems related to Pdo

        $filter = ["created_on_datetime > NOW() - date_trunc('second', INTERVAL '? YEAR')", GK_SITE_NEWS_DISPLAY_DAYS_VALIDITY];
// PDOStatement: ERROR: invalid input syntax for type interval: "? YEAR" LINE 1: ...datetime" > NOW() - date_trunc('second', INTERVAL '? YEAR') ... ^

My best option right now still seems to be

$filter = ['created_on_datetime > NOW() - cast(? as interval)', GK_SITE_NEWS_DISPLAY_DAYS_VALIDITY.' DAY'];

How would the option be used, would it be a list of keywords to not quote?
Migration to Postgres was a huge work for me 😞 with lots of gotchas

@ikkez
Copy link
Owner

ikkez commented Apr 1, 2020

Could it be that the NOW() need to be set within the function?
created_on_datetime > date_trunc('second', NOW() - INTERVAL '1 YEAR')

@kumy
Copy link
Contributor Author

kumy commented Apr 1, 2020

@ikkez thanks for yor help.

We're getting quite close to something "usable"/"clean".

This syntax may work with a little update to the library.

        $filter = ["created_on_datetime > NOW() - CAST(? || ' YEAR' as INTERVAL)", GK_SITE_NEWS_DISPLAY_DAYS_VALIDITY];

But problem here is that cortex parse it as Mysql syntax, which has completely different meaning with Postgres. "PostgreSQL, following the standard, uses || for string concatenation ('foo' || 'bar' = 'foobar')."

MySQL uses C-language operators for logic (i.e. 'foo' || 'bar' means 'foo' OR 'bar', 'foo' && 'bar' means 'foo' and 'bar'). This might be marginally helpful for C programmers, but violates database standards and rules in a significant way. PostgreSQL, following the standard, uses || for string concatenation ('foo' || 'bar' = 'foobar').
https://wiki.postgresql.org/wiki/Things_to_find_out_about_when_moving_from_MySQL_to_PostgreSQL

Problem is there:

$where = str_replace(['&&', '||'], ['AND', 'OR'], $where);

This substitution have to be conditioned to the driver used.

I've tried to workaround it without updating cortex, but it finally gets more complicated, with no luck and finally we're back to the same problem of quotes:

        $filter = ["created_on_datetime > NOW() - CAST(CONCAT(?, ' YEAR') AS INTERVAL)", GK_SITE_NEWS_DISPLAY_DAYS_VALIDITY];
// PDOStatement: ERROR: could not determine data type of parameter $1

        $filter = ["created_on_datetime > NOW() - CAST(CONCAT(CAST(? AS INTEGER), ' YEAR') AS INTERVAL)", GK_SITE_NEWS_DISPLAY_DAYS_VALIDITY];
// PDOStatement: ERROR: syntax error at or near ""AS"" LINE 1: ...NOW() - CAST(CONCAT(CAST($1 AS INTEGER), ' YEAR') "AS" "INTE... ^

I start wondering why this auto quoting is there, what are the use case behind it?

And sorry I forgot to share the table schema:
Screenshot from 2020-04-01 19-09-00

@kumy
Copy link
Contributor Author

kumy commented Apr 1, 2020

Oh! I may simply follow the doc! I'll test that…

CORTEX.quoteConditions: Default TRUE. By default, all field names in where conditions are quoted automatically according to the used database engine. This helps to work around reserved names in SQL. However the detection of fields isn't perfect yet, so in case you want to add the correct backticks or other quotation yourself, set this to FALSE.
https://github.com/ikkez/f3-cortex#additional-notes

Edit: OK it fix the example from my first post, but to have it dynamic, bind only one int, "created_on_datetime > NOW() - CAST(? || ' YEAR' as INTERVAL)" is the only way I found, but the OR vs || bug is annoying there. If we fix it, then this issue could be closed.

@ikkez Would you a dedicated issue for the OR bug?
Edit2 : ➡️ #98

@kumy
Copy link
Contributor Author

kumy commented Apr 1, 2020

An approach is to use plpgsql to create my own function:

CREATE OR REPLACE FUNCTION public.fresher_than(
	datetime timestamp with time zone,
	duration integer,
	unit character varying)
    RETURNS boolean
    LANGUAGE 'plpgsql'

AS $BODY$BEGIN
	RETURN datetime > NOW() - CAST(duration || ' ' || unit as INTERVAL);
END;$BODY$;

And use it as:

$filter = ["fresher_than(created_on_datetime, ?, 'YEAR')", $years];

This is fine for me, @ikkez you may close the issue if you wish 👍

Note: a view would probably had success there too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants