Problems with Facebook on public RSS-Bridge instances #2047

Open
em92 opened this issue Apr 4, 2021 · 107 comments

@em92
Contributor

em92 commented Apr 4, 2021

Due to the many recent "You must be logged in to view this page. This is not supported by RSS-Bridge" issues coming from Facebook users (#2041, comments from #2014, #2037), I investigated those issues more closely.

If I open "https://www.facebook.com/facebook/posts" from my home laptop, everything is fine: posts are returned.
If I open "https://www.facebook.com/facebook/posts" from my public instance (https://feed.eugenemolotov.ru), it returns a redirect to the login page.

Looks like FacebookBridge has the same problems as InstagramBridge (#1891), which breaks using FacebookBridge on public RSS-Bridge instances.
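
A minimal sketch, in Python, of how the "redirect to login page" symptom described above could be recognised from a response's status code and Location header. This is not RSS-Bridge code. The function name is hypothetical, and the rule that a login URL has a path starting with "/login" is an assumption made for illustration only.

```python
from typing import Optional
from urllib.parse import urlsplit

# HTTP status codes that tell the client to follow a Location header.
REDIRECT_CODES = (301, 302, 303, 307, 308)


def is_login_redirect(status: int, location: Optional[str]) -> bool:
    """Return True if a response looks like a redirect to a Facebook login page.

    Assumption: a login page is any path starting with "/login".
    A Location without a host is treated as pointing at the same site.
    """
    if status not in REDIRECT_CODES or not location:
        return False
    parts = urlsplit(location)
    host = parts.hostname or ""
    on_facebook = host == "" or host == "facebook.com" or host.endswith(".facebook.com")
    return on_facebook and parts.path.startswith("/login")
```

Running the same check from a home connection and from a hosted instance would show whether the two are treated differently for the same URL.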

Possible solutions for users (same as in the mentioned InstagramBridge issue):

  • Deploy RSS-Bridge on your personal PC or laptop and use FacebookBridge from there.
  • Deploy RSS-Bridge on your VPS, make sure that only certain people use it and use FacebookBridge from there.
@grivanov

grivanov commented Apr 4, 2021

Thank you very much for investigating. I'm using shared hosting for mine and it's folder protected so only I have access, but it probably just checks the IP and since it's shared hosting, it's heavily used. May have to pay for a private IP in that case.

@mrtpcet

mrtpcet commented Apr 4, 2021

I installed it on my VPS (Infomaniak, Switzerland) and I am the only one using it. Unfortunately it doesn't work either.
I tried to visit a Facebook page with Firefox and it automatically redirects me to the login page.

@arcctgx

arcctgx commented Apr 5, 2021

I'm running my single-user RSS Bridge instance on Digital Ocean, and feeds which were giving me error 500 since April 1st just started working again. Let's see for how long...

Edit: stopped working two hours later.

@Noutladeesse

I have been having exactly this problem for 1 week. Over 100 feeds created through the Facebook main site bridge result in errors. I am using my personal laptop, so this cannot be the reason.
Yesterday, 3 feeds (out of the 100+) delivered lots of previously missed articles; today, only 1 out of 100+ is working. It looks random and erratic.
I use them to deliver a daily news digest. For a week I have not been able to do it properly and have had to check all sources one by one, which is inefficient and time-consuming. What should I do?

@tstanbur

tstanbur commented Apr 6, 2021

> If I open "https://www.facebook.com/facebook/posts" from my home laptop, everything is fine: posts are returned.
> If I open "https://www.facebook.com/facebook/posts" from my public instance (https://feed.eugenemolotov.ru), it returns a redirect to the login page.

Hi @em92 ,

If you remove the /posts part of the URL, you don't get the login page, even on a public instance.

e.g.

https://www.facebook.com/facebook/posts (redirect to login page)

https://www.facebook.com/facebook (no redirect, page content shown).

Did you try that?

@ghost

ghost commented Apr 7, 2021

@tstanbur I have the same problem as @Noutladeesse. I don't understand how I can change https://www.facebook.com/facebook/posts to https://www.facebook.com/facebook; I have an RSS feed without facebook inside.

@tstanbur

tstanbur commented Apr 7, 2021

> @tstanbur I have the same problem as @Noutladeesse. I don't understand how I can change https://www.facebook.com/facebook/posts to https://www.facebook.com/facebook; I have an RSS feed without facebook inside.

I have the same issue too!

I was just trying to help fix it, hopefully @em92 can (I think he's the author?)

@ghost

ghost commented Apr 7, 2021

@tstanbur understood, my English is too poor. :)

@Noutladeesse

> @tstanbur understood, my English is too poor. :)

@cborne: @tstanbur has the same problem as us, even though his RSS feeds are not Facebook feeds; he is asking whether @em92 is the author and whether he can help us solve the problem (I'm translating!)

@ghost

ghost commented Apr 7, 2021

@Noutladeesse thanks, I eventually understood. At first I didn't understand what the Facebook URLs in the middle were doing there, but they are a proposed fix for @em92. English isn't really like riding a bike: when you don't practice it, it doesn't come back on its own. :)

@Noutladeesse

> @Noutladeesse thanks, I eventually understood. At first I didn't understand what the Facebook URLs in the middle were doing there, but they are a proposed fix for @em92. English isn't really like riding a bike: when you don't practice it, it doesn't come back on its own. :)

:-D
Yes, it is a proposed fix, but it doesn't work for feeds that have already been created.

@woj-tek

woj-tek commented Apr 8, 2021

Just another small "me too". I have been running RSS-Bridge on my personal VPS (as its only user) for a long while (~2 years) and I'm also affected by the issue. It started about 1-2 weeks ago, then it started working again on Monday, was OK for about 2 days, and has now stopped again.

It does seem like a Facebook action to block RSS-Bridge (probably with their silly reasoning that this would somehow make people go back to using their awful service…)

@RealDutchie

RealDutchie commented Apr 8, 2021

Here is just one more "me too". I specifically signed up here on GitHub to ask a few things about the Facebook bridge. Until last week, I had been using a public host from Eugene Molotov to my full satisfaction for about a year (thanks a lot). I don't have any technical background, so it is sometimes difficult for me to keep up with all the terms that come up in this topic.

I wonder if the option mentioned above by em92 (quoted below) still works and how I could get it running on my own PC:

> Deploy RSS-Bridge on your personal PC or laptop and use FacebookBridge from there.

I would be very happy if I could still use the Facebook bridge in this way, but I am not sure if this still works or how to install it on my own PC. I have looked through GitHub quite a bit, but unfortunately I can't figure it out myself, which is why I decided to sign up.

If users could confirm or deny that this feature still works, I would be happy with that. Then my next question would be how I can best put the bridge on my own PC, or who I can ask for help or information on how to do so. I also think it would be a very good idea to start a donation fund to get a developer to maintain the Facebook bridge and also make the Instagram bridge work again. That way, we can all contribute to get our beloved feeds going again. Greetings from the Netherlands and thanks for your great work over the years!

@hellmachine2000

hellmachine2000 commented Apr 8, 2021

Same here. Since April I have been getting different errors in the same feeds, like:

"Facebook Bridge | Main Site was unable to receive or process the remote website's content!
Error message: `You must be logged in to view this page. This is not supported by RSS-Bridge.`"

"Facebook Bridge | Main Site was unable to receive or process the remote website's content!
Error message: `The requested resource cannot be found!`"

"Facebook Bridge | Main Site was unable to receive or process the remote website's content!
Error message: `Call to a member function children() on null`
Query string: `action=display&bridge=Facebook&u=hyperlitemountaingear&media_type=all&limit=1000&format=Atom`
Version: `dev.2020-11-10`"
Latest version of RSS-Bridge…

@ghost

ghost commented Apr 9, 2021

I've been having these errors as well and I found that changing the cache_timeout parameter in FacebookBridge seems to reset the bridge, but it only works for a little while. I've tried 86400, 43200, 21600, 1, 0, and even eliminating the parameter. Somehow resetting the cache every time the bridge is called might be the solution to this problem?
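
A minimal sketch, in Python, of how a cache timeout generally behaves. It is meant only to illustrate the idea and is not RSS-Bridge's actual cache implementation; the class name `TimedCache`, its methods, and the cache key used in the test are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class TimedCache:
    """Serve a stored value until it is older than `timeout` seconds."""

    def __init__(self, timeout: float, clock: Callable[[], float] = time.monotonic):
        self.timeout = timeout
        self.clock = clock  # injectable so the behaviour can be tested without waiting
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str, fetch: Callable[[], str]) -> str:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.timeout:
            return entry[1]  # still fresh: no request is made
        value = fetch()  # missing or stale: fetch again and store
        self._entries[key] = (now, value)
        return value
```

In this sketch a timeout of 0 means every call fetches again, so a smaller timeout produces more requests to the remote site, not fewer.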

@Noutladeesse

> I've been having these errors as well and I found that changing the cache_timeout parameter in FacebookBridge seems to reset the bridge, but it only works for a little while. I've tried 86400, 43200, 21600, 1, 0, and even eliminating the parameter. Somehow resetting the cache every time the bridge is called might be the solution to this problem?

Thank you for the suggestion, @Mthmgcn05.
How do you reset the cache? (I am not an IT professional, only a user)

@ghost

ghost commented Apr 9, 2021

After more testing and thought, it may be that it worked for five minutes every time I redeployed, so the redeploy could have been what reset it.

@miwcz

miwcz commented Apr 9, 2021

It seems that adding the cookie "c_user=XXXX", where XXXX is my ID from the Facebook cookie, helped. I don't know how to add this only via the bridge, so I did it via contents.php for all requests, which is really bad, but... maybe it points the way to a better solution :-)

EDIT: False alarm, not working again...

@em92
Contributor Author

em92 commented Apr 10, 2021

@miwcz on my public instance I used the c_user and xs values. A quick and dirty patch looks like this:

diff --git a/bridges/FacebookBridge.php b/bridges/FacebookBridge.php
index c03de4e..fafeabd 100644
--- a/bridges/FacebookBridge.php
+++ b/bridges/FacebookBridge.php
@@ -174,6 +174,8 @@ class FacebookBridge extends BridgeAbstract {
 		} else {
 			$header = array();
 		}
+		$header[] = 'Cookie: c_user=xxxx; xs=yyyy;';
+
 
 		$touchURI = str_replace(
 			'https://www.facebook',
@@ -560,11 +562,15 @@ EOD;
 				$header = array();
 			}
 
+			$header[] = 'Cookie: c_user=xxxx; xs=yyyy;';
+
+
 			$html = getSimpleHTMLDOM($this->getURI(), $header)
 				or returnServerError('No results for this query.');
 
 		}
 
 		// Handle captcha form?
 		$captcha = $html->find('div.captcha_interstitial', 0);
 

So far, so good.

@em92
Contributor Author

em92 commented Apr 10, 2021

> So far, so good.

I meant it is working on my instance at the moment.

@em92
Contributor Author

em92 commented Apr 10, 2021

@tstanbur

> hopefully @em92 can (I think he's the author?)

I am not the author of this bridge. I maintain RSS-Bridge in general (reviewing pull requests, pinging bridge maintainers in issues) and the bridges for Pikabu and Vk.

Usually the maintainer of a bridge fixes its bugs, but we don't have a maintainer for the Facebook bridge. I have little time to fix bugs in bridges that I don't maintain.

@miwcz

miwcz commented Apr 10, 2021

I have 20+ Facebook feeds and this works only for the first 4-5 requests. It seems that Facebook blocks multiple requests after a short while.

@Noutladeesse

> @tstanbur
>
> > hopefully @em92 can (I think he's the author?)
>
> I am not the author of this bridge. I maintain RSS-Bridge in general (reviewing pull requests, pinging bridge maintainers in issues) and the bridges for Pikabu and Vk.
>
> Usually the maintainer of a bridge fixes its bugs, but we don't have a maintainer for the Facebook bridge. I have little time to fix bugs in bridges that I don't maintain.

Is there anyone who maintains the Facebook bridge? @em92

@em92
Contributor Author

em92 commented Apr 11, 2021

> I meant it is working on my instance at the moment.

Now it does not. Facebook disabled my account because my account violates its community standards. It asked me to upload my photo (I did, a real photo of me) and now I am waiting for the review.

@em92
Contributor Author

em92 commented Apr 11, 2021

@Noutladeesse

> Is there anyone who maintains the Facebook bridge?

No.

@pin-grid-array

I don't have any new information to add that other users haven't already discussed. I'm only here to say that it is happening to me too. I am running the Facebook bridge on Heroku and using Feedly to save the feeds. I started getting "Bridge returned error 500!" around the beginning of April.

Some feeds only get the error occasionally. Other feeds keep getting the error constantly, which makes those feeds useless.

Example error message:

Facebook Bridge | Main Site was unable to receive or process the remote website's content!
Error message: `You must be logged in to view this page. This is not supported by RSS-Bridge.`
Query string: `action=display&bridge=Facebook&context=User&u=[REDACTED]&media_type=all&limit=-1&format=Atom`
Version: `dev.2020-02-26`

    Press Return to check your input parameters
    Press F5 to retry
    Check if this issue was already reported on GitHub (give it a thumbs-up)
    Open a GitHub Issue if this error persists

teromene, logmanoriginal

@INPoppoRTUNE

@dvikan Is the Facebook bridge still broken? Is it still without a maintainer? I cannot find the maintainers list anymore here on GitHub.

@triatic
Contributor

triatic commented Apr 6, 2022

It was totally broken for me as soon as Facebook instituted an IP address block on cloud computing providers.

@INPoppoRTUNE

> It was totally broken for me as soon as Facebook instituted an IP address block on cloud computing providers.

For me too.
I'd like to know whether any maintainer has worked on this problem since then (given the issue management done 10 days ago) and whether it makes sense to re-test the bridge, or whether I will face the same issues.

@mwalbeck

mwalbeck commented Apr 6, 2022

Same here. I initially solved it by hosting the bridge at home, but a few months later Facebook migrated the pages I was monitoring to the new design, which requires JavaScript to load and of course completely broke the bridge. So as the new Facebook site is rolled out, fewer pages will function, and at some point the bridge will be completely borked.

@Bockiii
Contributor

Bockiii commented Apr 6, 2022

Facebook is one of the hardest bridges to maintain, and it has a few key limitations even when it is maintained. Public instances almost always run into some form of Facebook detection, so it's barely possible to keep it working for the people who run it at home.

Also, because Facebook is constantly evolving and doing rolling releases, one feed might work while another is broken, because Facebook changed the site underneath.

@mdemoss
Contributor

mdemoss commented Sep 13, 2022

Is there currently a way to rate-limit locally? That is, prevent my self-hosted bridge from making too many requests to a given site too quickly? I'd probably also need my feed reader to refresh feeds in a random order.

@triatic
Contributor

triatic commented Sep 13, 2022

@mdemoss you can increase CACHE_TIMEOUT in the bridge code.

@mdemoss
Contributor

mdemoss commented Sep 15, 2022

> @mdemoss you can increase CACHE_TIMEOUT in the bridge code.

That would be a timeout per URL, right? I'm thinking a timeout per domain might be necessary since some of these sites are pretty aggressive about rate-limiting.
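
A minimal sketch, in Python, of what a per-domain limit could look like: a minimum interval between requests to the same host, regardless of which URL is requested. This is not an existing RSS-Bridge feature; the class name `DomainThrottle` and its parameters are hypothetical.

```python
import time
from typing import Callable, Dict
from urllib.parse import urlsplit


class DomainThrottle:
    """Enforce a minimum number of seconds between requests to the same host."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self._next_allowed: Dict[str, float] = {}  # host -> earliest allowed time

    def wait(self, url: str) -> float:
        """Block until `url`'s host may be contacted; return the seconds waited."""
        host = urlsplit(url).hostname or ""
        now = self.clock()
        ready_at = self._next_allowed.get(host, now)
        delay = max(0.0, ready_at - now)
        if delay > 0:
            self.sleep(delay)
        # The next request to this host must come at least min_interval later.
        self._next_allowed[host] = max(now, ready_at) + self.min_interval
        return delay
```

A caller would invoke `wait(url)` before each fetch. Shuffling the list of feed URLs with `random.shuffle` before iterating would give the random refresh order mentioned earlier in the thread.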

@triatic
Contributor

triatic commented Sep 15, 2022

> > @mdemoss you can increase CACHE_TIMEOUT in the bridge code.
>
> That would be a timeout per URL, right? I'm thinking a timeout per domain might be necessary since some of these sites are pretty aggressive about rate-limiting.

CACHE_TIMEOUT is per domain since it applies to every URL on that bridge.

@dvikan
Contributor

dvikan commented Oct 6, 2022

As far as I can tell, the Facebook posts tab is gone forever. E.g.

https://www.facebook.com/eminem/posts

https://www.facebook.com/ladygaga/posts

https://www.facebook.com/Starbucks/posts

They all redirect to the main page, which is an unscrapable SPA.

@triatic
Contributor

triatic commented Oct 6, 2022

Data is fetched via a POST request to https://www.facebook.com/api/graphql/

Finding the correct request payload could be a challenge.

@dvikan
Contributor

dvikan commented Oct 6, 2022

We need someone with a great deal of perseverance and patience to scrape the GraphQL API.

@juanjosepablos
Contributor

Maybe get some ideas from other projects:
https://github.com/kevinzg/facebook-scraper

@somini
Contributor

somini commented Jul 6, 2023

> Maybe get some ideas from other projects: https://github.com/kevinzg/facebook-scraper

That doesn't seem to work for me, from a normal residential IP.
