Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle ip behind proxies #914

Open
wants to merge 21 commits into
base: main
Choose a base branch
from
Open

Handle ip behind proxies #914

wants to merge 21 commits into from

Conversation

michield
Copy link
Member

Description

handle proxies better

Related Issue

Screenshots (if appropriate):

@michield
Copy link
Member Author

I think most of these were already done before. Just the one commit let.

@michield michield marked this pull request as ready for review December 13, 2022 21:27
@bramley
Copy link
Contributor

bramley commented Dec 18, 2022

The function hostname() seems to be used in only a few places when constructing a URL. Elsewhere the global variable $website which is taken from getConfig('website') is used for that.
Will this change use X_FORWARDED_FOR everywhere it is required?

@michield
Copy link
Member Author

You're right. We should the hostname() here, so that it initialises the website and domain config settings correctly

https://github.com/phpList/phplist3/blob/main/public_html/lists/admin/connect.php#L21

@michield
Copy link
Member Author

Not only that, I'm using the wrong value. I should use HTTP_X_FORWARDED_HOST or HTTP_X_FORWARDED_SERVER

HTTP_X_FORWARDED_FOR is the IP address of the user.

…detection and hard code the admin and frontend locations.
@michield
Copy link
Member Author

Ok. I've decided that "auto detection" and config is too complicated and open to massive errors. Instead I've now introduced "ADMIN_WWWROOT" and "USER_WWWROOT" so that the URLs can be hard coded.

This is particularly relevant in eg Docker environments, where the Proxy container may forward to the phpList container and phplist runs locally from eg "http://phplist/lists" but the frontend URL is something like https://website.com/newsletter/

@michield
Copy link
Member Author

The CI is failing, I'll find out why

@michield
Copy link
Member Author

ok, good we have CI. There were some errors, but that's fixed now.

I may add some tests with these values set in the config file as well.

@michield michield requested a review from bramley August 9, 2023 19:41
@bramley
Copy link
Contributor

bramley commented Aug 10, 2023

I'm not sure that I understand the problem this is solving. Is it that admin access is on a different host/path than public access?

From your earlier comment

public access, i.e. subscribes, link clicks, etc is on https://website.com/newsletter/
but admin access is on http://phplist/lists/admin

What is stopping admin access using https://website.com/newsletter/admin ?

Ignore the above as I have now seen the explanation in config_extended.php

The example given has the same path (newsletter) for both. Is that just the example, or are you intending to allow the paths to differ?

#define('ADMIN_WWWROOT','https://admin.mydomain.com:8080/newsletter/admin');
#define('USER_WWWROOT','https://mydomain.com/newsletter');

In an earlier comment there is an example where one uses newsletter and the other uses lists

@michield
Copy link
Member Author

In an earlier comment there is an example where one uses newsletter and the other uses lists

Yes, the idea is that in a proxy setup (eg Docker), you can have /newsletter/ as the public location, but it maps to /lists/ inside the phpList container. Additionally for security, you could map the admin pages via a different proxy, which is not public. So, it's useful to allow a mix of locations.

@bramley
Copy link
Contributor

bramley commented Aug 14, 2023

I have no experience of what you are describing, but if it is required then isn't there an external way to achieve it, using apache rewriting or some other way of mapping one to the other? What do other packages, such as Wordpress or Drupal, do to achieve this?

The code changes, although individually simple, are very widespread. I'd be inclined to not include this in the next release but continue with more testing by installing in one of your own environments that need the new feature.

Copy link
Contributor

@bramley bramley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a number of global variables or settings that now become ambiguous such as
$pageroot, $adminpages, [WEBSITE] and $website, [DOMAIN] and $domain,

I think you need to try to remove use of all those by setting new variables that are independent of whether the new defines are being used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest create a global variable $adminBaseUrl similar to $publicBaseUrl so that this and several others can be simplified to

$history_entry = $adminBaseUrl.'/?page=user&id='.$userid."\n\n";

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$progressUrl = $GLOBALS['admin_scheme'].'://'.hostName().$GLOBALS['adminpages'].'/?page=messages&tab=active';

Not sure why hostName() is used here but it can get replaced by using a new variable $adminBaseUrl

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    $clicktrack_root = sprintf('%s://%s/lt.php', $GLOBALS['public_scheme'], $website.$GLOBALS['pageroot']);

This needs to change to use $publicBaseUrl .

@@ -403,7 +403,11 @@ function sendAdminPasswordToken($adminId)
$emailBody = $GLOBALS['I18N']->get('Hello').' '.$adminName."\n\n";
$emailBody .= $GLOBALS['I18N']->get('You have requested a new password for phpList.')."\n\n";
$emailBody .= $GLOBALS['I18N']->get('To enter a new one, please visit the following link:')."\n\n";
$emailBody .= sprintf('%s://%s/?page=login&token=%s', $GLOBALS['admin_scheme'], $urlroot, $key)."\n\n";
if (defined('ADMIN_WWWROOT')) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the wrong way around, but should be replaced by using a single variable.

@@ -580,7 +580,11 @@ function processMessages($link, $max = 3000)
if ($row['user']) {
$userdata = Sql_Fetch_Array_Query("select * from {$tables['user']} where id = ".$row['user']);
}
$report_linkroot = $GLOBALS['admin_scheme'].'://'.$GLOBALS['website'].$GLOBALS['adminpages'];
if (defined('ADMIN_WWWROOT')) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one too is the wrong way around.

if (defined('USER_WWWROOT')) {
$domainParts = parse_url(USER_WWWROOT);
$D_website = $domainParts['host'];
if ($domainParts['port'] != 80 && $domainParts['port'] != 443) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$domainParts['port'] will be set only if a port was in the URL. But it seems simpler just to include whatever is in the URL, even if is one of the standard port numbers

if (isset($domainParts['port')) {
    $D_website .= ":".$domainParts['port'];
}

@@ -708,7 +700,7 @@ function mb_strtolower($string)
}

if (WARN_ABOUT_PHP_SETTINGS && !$GLOBALS['commandline']) {
if (strpos(getenv('REQUEST_URI'), $pageroot.'/admin') !== 0) {
if (!defined('USER_WWWROOT') && strpos(getenv('REQUEST_URI'), $pageroot.'/admin') !== 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (!defined('USER_WWWROOT') && strpos(getenv('REQUEST_URI'), $pageroot.'/admin') !== 0) {

This test is of the admin URL not the public URL, but should it still be a valid test when ADMIN_WWWROOT is used?

When using the new USER_WWWROOT and ADMIN_WWWROOT Which value should $pageroot in config.php be set to?

$pageroot = '/lists';
}
$publicBaseUrl = "http://[WEBSITE]$pageroot";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work for creating link tracking URLs because the variable doesn't have placeholders replaced

$publicBaseUrl = "http://[WEBSITE]$pageroot";

It does work for creating the user tracking image URL because the img element is added to the message before placeholders are replaced.

I suggest moving the handling of USER_WWWROOT to a point where getConfig() can be called to get the WEBSITE. Maybe at the start of defaultconfig.php, where a similar $adminBaseUrl can be set.

It should use the public scheme which is set earlier $public_scheme

if ($pageroot == '/') {
$pageroot = '';
}
if (defined('USER_WWWROOT')) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (defined('USER_WWWROOT')) {
  $pageroot = USER_WWWROOT;
  $publicBaseUrl = USER_WWWROOT;
} else {

$pageroot is only the path, not the whole URL. But I am not sure that is should be used, as there is ambiguity as to which path is meant - the public or admin. Hiding it with the $publicBaseUrl and $adminBaseUrl variables should mean that it is no longer used.

}

// as the "admin" in adminpages is hardcoded, don't put it in the config file
$adminpages = $GLOBALS['pageroot'].'/admin';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is using $pageroot taken from the public URL but it is used to create URLs for admin pages. Related to the previous comment.

@michield
Copy link
Member Author

I have no experience of what you are describing, but if it is required then isn't there an external way to achieve it, using apache rewriting or some other way of mapping one to the other? What do other packages, such as Wordpress or Drupal, do to achieve this?

The code changes, although individually simple, are very widespread. I'd be inclined to not include this in the next release but continue with more testing by installing in one of your own environments that need the new feature.

Ok. thanks for the review. I will go through the suggestions and check in the next few days.

I don't know Wordpress or Drupal, but I do know Moodle, which has a global CFG variable where it hardcodes the URL of the system, and it doesn't work without that set.

It's just very tricky to "detect" the URL environment, as there are many options. phpList currently does that, but it causes a lot of issues when in a proxied environment. Maybe we also need to remove the "website" and "domain" config variables, and force them to be set in the config file, but that may have a major impact on existing systems.

@michield
Copy link
Member Author

Thanks, I've reviewed your suggestions and I think the idea of an adminBaseUrl is good.

The main complication is this:

  1. publicBaseUrl is set and then used to pre-populate the default config. So, users can then change it and the final value lives in the DB. It defaults as something like "https://[WEBSITE]/lists/?p=subscribe" etc, and then [WEBSITE] is replaced on the fly.

  2. If someone changes that in the config then the DB config is taken instead.

I think we need to remove this. There's no point in having it editable, as it's application specific. The only one that could be editable is the "subscribe page" so that people can point it at their own website for example.

What do you think. The result would be:

  • publicBaseUrl: set with current "detection" of the location
  • adminBaseUrl: $publicBaseUrl + /admin
  • use the new constants to deviate from it
  • allow "subscribe URL" to be editable, but remove the [WEBSITE] stuff
  • remove all other "URL configs" and hard-code their location based on the publicBaseUrl

@bramley
Copy link
Contributor

bramley commented Aug 17, 2023

I don't know. The changes just seem to be getting larger. I'm away for a few days so won't be able to look at anything now.

@bramley
Copy link
Contributor

bramley commented Aug 21, 2023

  • publicBaseUrl: set with current "detection" of the location
  • adminBaseUrl: $publicBaseUrl + /admin
  • use the new constants to deviate from it

I think this is fine. Can the website and subscribe page changes wait until another release? The new defines can be explained that they override all other settings such as the [WEBSITE] setting and $pageroot in config.php.

What will happen to globals such as $website, $systemroot, $public_scheme etc? I think that these need to remain because they are used by some plugins, such as CKEditor, so will need to be derived when the new defines are used.

@michield
Copy link
Member Author

I think this is fine. Can the website and subscribe page changes wait until another release? The new defines can be explained that they override all other settings such as the [WEBSITE] setting and $pageroot in config.php.

Well, the issue is the two-step approach we use at the moment, which needs to go:

  1. set URL configs to default
  2. allow editing them in the UI
  3. use the DB values

With the new constants step 3 would have to change to (a) from DB or (b) from constant, so that's more code changes than changing it in step 1.

But I'll give it a try.

What will happen to globals such as $website, $systemroot, $public_scheme etc? I think that these need to remain because they are used by some plugins, such as CKEditor, so will need to be derived when the new defines are used.

Yes, we can easily parse those out of the new constants with url_parse.

I'll add that to my PR.

@michield
Copy link
Member Author

Ok, I think I have addressed all the issues, but we may need to do an extensive test if this is in an RC.

  1. publicBaseUrl
  2. publicConfigBaseUrl -> to initialise the config, but leave it at that
  3. adminBaseUrl
  4. set public_schema and website when the constants are in use
  5. systemroot is not affected here

@michield
Copy link
Member Author

Hmm the CI fails, but I think that's not the code, but Github playing up

@michield
Copy link
Member Author

Hmm the CI fails, but I think that's not the code, but Github playing up

I re-started the actions and they pass now

@bramley
Copy link
Contributor

bramley commented Aug 31, 2023

A couple of my previous comments were missed, and a few more.

  • In init.php
if (defined('USER_WWWROOT')) {
  $pageroot = USER_WWWROOT;
  $publicConfigBaseUrl = USER_WWWROOT;

$pageroot needs to be the path such as /lists

  • In sendemaillib.php there are a few lines that should be changed to use the $publicBaseUrl variable
$clicktrack_root = sprintf('%s://%s/lt.php', $GLOBALS['public_scheme'], $website.$GLOBALS['pageroot']);

$masked .= '&hm='.hash_hmac(HASH_ALGO, sprintf('%s://%s/lt.php?tid=%s', $GLOBALS['public_scheme'], $website.$GLOBALS['pageroot'], $masked), HMACKEY);

$viewurl = $GLOBALS['public_scheme'].'://'.$website.$GLOBALS['pageroot'].'/dl.php?id='.$att['id'];

  • If you grep for pageroot there are still many remaining uses that need to be reviewed to see whether they need changing or removing.

  • There is a use of $adminpages in function sendAdminPasswordToken() in file lib.php that now doesn't look correct.

     $urlroot = getConfig('website').$GLOBALS['adminpages'];
    
  • The last commit of a change to index.php omitted a '/' character

      $html .= '<p><a href="'.$GLOBALS['adminBaseUrl'].'?page=spage" class="button">'.s('Go back to admin area').'</a></p>';
    
  • Some of the code changes have been indented two characters instead of 4.

  • For a new installation the website config entry is set to the host of the admin URL which will be ADMIN_WWWROOT when that is being used, shouldn't it be the host of USER_WWWROOT instead?

I suggest that you test this on its own before trying to apply the changes. Can you test creating a campaign using ckeditor, as I don't know whether it will use the correct domain for image URLs.
Update - I have checked this and the plugin constructs the URL for an image like this, which I think is correct

            $uploadUrl = sprintf('%s://%s/%s', $public_scheme, $website, ltrim(UPLOADIMAGES_DIR, '/'));

@michield
Copy link
Member Author

michield commented Sep 4, 2023

Just a quick explanation on why this is needed. In Docker, or other "proxied" setups, you have one container (eg swag) at the front sending traffic to backend containers. The backend one has eg "phplist" as the Apache Host and additionally running on HTTP, because the proxy does the SSL. So, you end up with all links being http://phplist/lists/admin/ etc, instead of what the actual site runs on being https://mysite.com/lists/admin/

@michield
Copy link
Member Author

michield commented Sep 4, 2023

  • Pageroot needs to match the value on the backend server, so I've updated that in the explanation of the value.
  • I've fixed the init.php and sendemaillib.php
  • the urlroot line in sendAdminPasswordToken is obsolete.
  • the missing / added

But yes, CKeditor file upload stopped working. I will investigate why.

Correction, the images folder wasn't readable.

@michield
Copy link
Member Author

michield commented Sep 4, 2023

@michield
Copy link
Member Author

michield commented Sep 4, 2023

Ok, that's something local. It also happens with the "main" branch or the phplist-3.6.13 branch.

Considering this change, I think we should number the next one 3.7.0 and tell people to wait with upgrading.

I think we postpone this one, and get all the other PRs out, including #986 which is a security issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.7.x PRs for 3.7
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants