Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support postgres DBs in tt-rss #37

Open
nhoening opened this issue Aug 16, 2012 · 17 comments
Open

Support postgres DBs in tt-rss #37

nhoening opened this issue Aug 16, 2012 · 17 comments
Assignees

Comments

@nhoening
Copy link
Owner

The inserting code currently only speaks to MySQL databases.

We have MySQL-specific code that sets the charset to be utf-8 and that gets the last inserted ID. For postgres, we would need equivalents.

Also, we would need a new option in the config file to select MySQL or postgres.

@ghost ghost assigned nhoening Aug 16, 2012
@salyavin
Copy link

I would like this with Google reader shutting down in July getting this data into postgresql would be nice.

@nhoening
Copy link
Owner Author

Well have you tried it on postgres yet? I didn't ever, so it might very well be that the import actually works on postgres as-is. As far as I can tell, the SQL code I generate is not fancy on any way (well it uses sub selects, but postgres has them, too).

I'd suggest you backup your database (or use a copy for testing) and give it a try...

salyavin notifications@github.com schrieb:

I would like this with Google reader shutting down in July getting this
data into postgresql would be nice.


Reply to this email directly or view it on GitHub:
#37 (comment)

@salyavin
Copy link

When I try it I get a lot of invalid commands
and various things like this (I ran it with LC_ALL=C but my system language is Japanese
^
ERROR: 列"published"は型booleanですが、式は型integerでした
LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',...
^
HINT: 式を書き換えるかキャストしなければなりません
ERROR: 時間帯の置換が範囲外です: "2007-07-31 07-23-30"
LINE 1: ...annewbie.com/2007/07/31/asaba-ryokan-izu-japan/', '2007-07-3...
^
ERROR: 列"published"は型booleanですが、式は型integerでした
LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',...
^
HINT: 式を書き換えるかキャストしなければなりません
ERROR: "timestamp"型の入力構文が無効です: "2007-08-02 06-07-45"
LINE 1: ...08/02/japanese-inn-meal-at-asaba-ryokan-in-ise/', '2007-08-0...
^
ERROR: 列"published"は型booleanですが、式は型integerでした
LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',...
^
HINT: 式を書き換えるかキャストしなければなりません
ERROR: "timestamp"型の入力構文が無効です: "2007-08-03 12-04-21"
LINE 1: ...m/2007/08/03/6-superfoods-that-prevent-disease/', '2007-08-0...
^
ERROR: 列"published"は型booleanですが、式は型integerでした
LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',...
^
HINT: 式を書き換えるかキャストしなければなりません

@nhoening
Copy link
Owner Author

I really can't say if this is a problem that has to do with postgres or with the fact that my tool doesn't handle Japanese correctly (or that something went wrong on your end). The fact that I can't read Japanese doesn't help :) I mean, the end result should be valid utf-8, but without understanding postgres complaints, I don't know what to do. Cab you shed light on them?

salyavin notifications@github.com schrieb:

When I try it I get a lot of invalid commands
and various things like this (I ran it with LC_ALL=C but my system
language is Japanese
^
ERROR: 列"published"は型booleanですが、式は型integerでした
LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, '
',...
^
HINT: 式を書き換えるかキャストしなければなりません
ERROR: 時間帯の置換が範囲外です: "2007-07-31 07-23-30"
LINE 1: ...annewbie.com/2007/07/31/asaba-ryokan-izu-japan/',
'2007-07-3...
^
ERROR: 列"published"は型booleanですが、式は型integerでした
LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, '
',...
^
HINT: 式を書き換えるかキャストしなければなりません
ERROR: "timestamp"型の入力構文が無効です: "2007-08-02 06-07-45"
LINE 1: ...08/02/japanese-inn-meal-at-asaba-ryokan-in-ise/',
'2007-08-0...
^
ERROR: 列"published"は型booleanですが、式は型integerでした
LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, '
',...
^
HINT: 式を書き換えるかキャストしなければなりません
ERROR: "timestamp"型の入力構文が無効です: "2007-08-03 12-04-21"
LINE 1: ...m/2007/08/03/6-superfoods-that-prevent-disease/',
'2007-08-0...
^
ERROR: 列"published"は型booleanですが、式は型integerでした
LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, '
',...
^
HINT: 式を書き換えるかキャストしなければなりません


Reply to this email directly or view it on GitHub:
#37 (comment)

@nhoening
Copy link
Owner Author

Well, the most probable explanation is that the sub SELECT statement is giving postgres trouble (a syntax error). Still not sure. I don't have a postgres database with tt-rss installed nor time to get one. Would you be willing to import a test snippet from my starred items (say, one or two articles) to see if the problem is independent of japanese?

@salyavin
Copy link

Sure. I would be happy to.

@nhoening
Copy link
Owner Author

okay, I think I can send you something tonight (it's 11:45am where I am, as a reference).

----- Original Message -----
From: notifications@github.com
To: gritttt-rss@noreply.github.com
Date: 25.03.2013 11:32:41
Subject: Re: [gritttt-rss] Support postgres DBs in tt-rss (#37)

Sure. I would be happy to.


Reply to this email directly or view it on GitHub:
#37 (comment)

@salyavin
Copy link

ERROR: 時間帯の置換が範囲外です: "2007-07-30 00-23-16"
LINE 1: ...://feeds.feedburner.com/~r/tofugu/~3/139358638/', '2007-07-3...
^
ERROR: 列"published"は型booleanですが、式は型integerでした
LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',...

I am seeing errors like replacement date out of range and published row is a boolean when it should be an integer

@salyavin
Copy link

Here we go
ERROR: time zone displacement out of range: "2007-07-30 00-23-16"
行 1: ...://feeds.feedburner.com/~r/tofugu/~3/139358638/', '2007-07-3...
^
ERROR: column "published" is of type boolean but expression is of type integer
行 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',...
^
HINT: You will need to rewrite or cast the expression.
ERROR: time zone displacement out of range: "2007-07-31 07-23-30"
行 1: ...annewbie.com/2007/07/31/asaba-ryokan-izu-japan/', '2007-07-3...
^
ERROR: column "published" is of type boolean but expression is of type integer
行 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',...
^
HINT: You will need to rewrite or cast the expression.
ERROR: invalid input syntax for type timestamp: "2007-08-02 06-07-45"
行 1: ...08/02/japanese-inn-meal-at-asaba-ryokan-in-ise/', '2007-08-0...
^
ERROR: column "published" is of type boolean but expression is of type integer
行 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',...
^
HINT: You will need to rewrite or cast the expression.
ERROR: invalid input syntax for type timestamp: "2007-08-03 12-04-21"
行 1: ...m/2007/08/03/6-superfoods-that-prevent-disease/', '2007-08-0...
^
ERROR: column "published" is of type boolean but expression is of type integer
行 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',...
^
HINT: You will need to rewrite or cast the expression.

@nhoening
Copy link
Owner Author

Thanks. So maybe postgres wants a different kind of date format than MySQL. I used '%Y-%m-%d %H-%M-%S',but postgres probably wants the time separated with colons, not with dashes (e.g. see Section 8.5.1.3), so I should use '%Y-%m-%d %H:%M:%S', but I'm still just guessing bcs I have limited time.

And in MySQL I use the published flag as 0 or 1 (by casting the Python boolean), but Postgres wants a boolean expression (I think postgres uses true and false).

I think I can push out a version of the script tonight that would treat these two things differently when it's a postgres database. Then we can see how far that got us. Thanks for standing by :)

@salyavin
Copy link

Thanks a bunch, you are sure trying quickly. It is 8:20PM in here Japan, I will probably be able to try it later in the morning.

@disconn3ct
Copy link

I found a lot of issues with postgres. I managed to fix a few but it still doesn't quite work. I'll submit a pull req if I get it working completely.

Added:

I didn't see the later comments. The first problem I found is the multi-line inserts don't track 's properly, and (as mentioned above) true/false vs 0/1.

@nhoening
Copy link
Owner Author

Awesome! Then I'll hold back my blind flight efforts for now.

Dis McCarthy notifications@github.com schrieb:

I found a lot of issues with postgres. I managed to fix a few but it
still doesn't quite work. I'll submit a pull req if I get it working
completely.


Reply to this email directly or view it on GitHub:
#37 (comment)

@nhoening
Copy link
Owner Author

I accepted a pull request (see above) by @tafryn which makes the MySQL/Postgres distinction nicely. What still could be done is to

  • deal with \n symbols and the like (@disconn3ct has been working on this, first trying E strings, a Postgres specialty)
  • transaction support, such that if one INSERT statement fails for some reason, the previously imported ones are rolled back. That is also something @disconn3ct proposed. Edit: I made a new ticket for that, see Transaction support #50.

@nhoening
Copy link
Owner Author

I'd like to hear from Postgres users if they are successful with this, as I didn't test it myself.

@salyavin
Copy link

This appeared to work perfectly for me, thank you very much!

@nhoening
Copy link
Owner Author

Wow, that was fast, thanks. Hey guys, someone successfully imported freaking Japanese articles into Postgres. Woohoo :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants