Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix shadow auth for users defined under [users] section #706

Conversation

Justin-Kwan
Copy link

@Justin-Kwan Justin-Kwan commented Apr 14, 2022

Description

When a user is not defined within userlist.txt and exists as a user in the Postgres server, PgBouncer should send an auth query to retrieve the user's md5 password hash in order to log the user in.

However, as outlined in issue: #484, if a user's password is not defined in userlist.txt and the user has overridden configurations under the [users] section in PgBouncer's ini file, the user is unable to login to the Postgres server through PgBouncer.

Root Cause

This happens because any users defined under [users] are parsed and added as new users in loader.c's parse_user. As result, when the real and live user connects, PgBouncer thinks they already exist (not NULL) and skips the shadow authorization process where the user's md5 password hash is queried from pg_shadow.

Fix

This PR introduces a new flag to track whether a user was created from ini file parsing or from a real live login. Then, we determine whether a new user logging in requires shadow authorization.

This fix should now allow PgBouncer operators to use the max_user_connections per user feature.

cc @viggy28

@Justin-Kwan Justin-Kwan changed the title Allow shadow login for users with custom ini configurations Force shadow auth against users with custom ini configurations Apr 14, 2022
@@ -931,6 +933,40 @@ test_password_server() {
return 0
}

# test password authentication from PgBouncer to PostgreSQL server using
# auth_query to retrieve user's shadow password from pg_shadow
test_shadow_password_server_login() {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These test cases are reproducible examples of bug (they fail without the fix): #484

@Justin-Kwan Justin-Kwan force-pushed the Justin-Kwan/bugfix/allow_users_section_shadow_login branch 5 times, most recently from 6b5ec9a to 2b39436 Compare April 21, 2022 21:15
original_user->is_preconfigured = false;

tmp_passwd = strdup(login_user->passwd);
safe_strcpy(original_user->passwd, tmp_passwd, sizeof(original_user->passwd));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy over password of new login user to account for a single user that may have different passwords for multiple databases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are copying every time in case a user has a different password in different databases? I think even with this code that could lead to problems, because the original_user is shared between clients of different databases. So maybe we should just document that a user should not have different passwords for different databases when using auth_query/auth_user.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the PR. find_original_user no longer exists. To support a user logging into different databases with different passwords (usually when databases are on different hosts), each PgDatabase* now stores a list of user passwords.

struct PgDatabase {
	struct List head;
	char name[MAX_DBNAME];	/* db name for clients */
	struct AATree user_passwds;	/* user passwords for this database from auth_query */
...
struct PgUserPassword {
	struct AANode tree_node;	/* used to attach user password to tree */
	char username[MAX_USERNAME];
	char passwd[MAX_PASSWORD];
};

Semantically, this should be clearer since it models reality, where each database may have different passwords per user. PgUser* still retains a password field to store a password from auth_file which will be used (forced) by default if it exists, when connecting to any database.

Storing user passwords in each PgDatabase* also conveniently frees (purges) passwords whenever the database is killed in method kill_database to prevent stale logins. This is preferred over storing a list of passwords in PgUser* (which would require another scan to free users' passwords if belonging to a database killed).
Screen+Shot+2022-07-12+at+5 58 05+PM

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems to somehow have been reverted on your branch. Possibly due to a rebase.

@Justin-Kwan Justin-Kwan force-pushed the Justin-Kwan/bugfix/allow_users_section_shadow_login branch 3 times, most recently from a9e4fdd to f6a718f Compare May 11, 2022 02:33
@Justin-Kwan Justin-Kwan changed the title Force shadow auth against users with custom ini configurations Fix shadow auth for users defined under [users] section May 16, 2022
Copy link
Member

@JelteF JelteF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new tests that you added also pass with the following simpler version of find_original_user. Are the rest of things that you added to this function really necessary?

/* find original or preconfigured user given the login user */
PgUser *find_original_user(PgUser *login_user)
{
	/*
	 * The original user will either be the current login user, a preconfigured user via
	 * ini file [users] section, a preconfigured user via global auth_file, a PAM user,
	 * a user attached to a database via auth_query, a forced user with user= entry via
	 * ini file [databases] section, or not exist yet if the user has never connected.
	 */
	PgUser *original_user = find_user(login_user->name);

	/* add new login user to reload their configurations at runtime */
	if (original_user == NULL) {
		return login_user;
	}

	/* original user is no longer preconfigured and won't send auth queries on future login */
	if (original_user->is_preconfigured) {
		original_user->is_preconfigured = false;
		if (original_user->passwd[0] == '\0') {
			safe_strcpy(original_user->passwd, login_user->passwd, sizeof(original_user->passwd));
		}
	}

	return original_user;
}

if (original_user->is_preconfigured)
original_user->is_preconfigured = false;

tmp_passwd = strdup(login_user->passwd);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this strdup seems unnecessary, you should be able to copy the password directly from user to user.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

find_original_user no longer exists now.

PgUser *original_user = find_user(login_user->name);

/* add new login user to reload their configurations at runtime */
if (original_user == NULL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what this is needed for. Can you add a test showing what you mean with the RELOAD behaviour that you want to support? i.e. one that fails without this put_in_order code.

original_user->is_preconfigured = false;

tmp_passwd = strdup(login_user->passwd);
safe_strcpy(original_user->passwd, tmp_passwd, sizeof(original_user->passwd));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are copying every time in case a user has a different password in different databases? I think even with this code that could lead to problems, because the original_user is shared between clients of different databases. So maybe we should just document that a user should not have different passwords for different databases when using auth_query/auth_user.

@@ -387,6 +387,36 @@ PgUser *add_user(const char *name, const char *passwd)
return user;
}

/* find original or preconfigured user given the login user */
PgUser *find_original_user(PgUser *login_user)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function name isn't really descriptive of what this function is doing, especially the put_in_order call.

test_shadow_password_server_login() {
$have_getpeereid || return 77

admin "set auth_type='md5'"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it makes sense to include the same tests also for scram auth?

@Justin-Kwan
Copy link
Author

Hi @JelteF! Thanks for taking a look at this PR. This PR is actually very outdated, I'm going to repush everything.

@Justin-Kwan
Copy link
Author

Justin-Kwan commented Aug 12, 2022

We've been running these PgBouncer changes in primary production for nearly a month now with no issues.

Some more details around refactoring work done here:

Context:

  • In the OSS PgBouncer, PgUser* may exist in multiple contexts: (see OSS discussion) pgbouncer crashes when user is equal to auth_user in database config #314 (comment)
  • If a user is defined in userlist.txt or pgbouncer.ini [users] section, then they were stored in global user_tree
  • Any user not defined in userlist.txt (requiring an auth_query to retrieve the user's password from Postgres) would be stored locally under PgDatabase->user_tree
  • From this, you can imagine that a single user connecting to multiple databases would be duplicated in memory
  • This is confusing to understand, because semantically, one user exists across all databases in a single Postgres cluster (host server)
  • The fragmented storage of PgUser* prevents max_user_connections per user to function correctly, because PgUser* will be loaded from [users] and stored in the global user_tree. set_pool assumes that if a user is in the global tree, it's password exists in auth_file, however this may not be true (if a user with max_user_connections is defined in [users] but it's password is not defined in auth_file)

This PR:

  • In order to correctly support global PgUser configuration settings (max_user_connections, etc.), all users now reside globally in user_tree to simplify centrally storing/enforcing configuration values against
  • Refactors PgBouncer to handle multiple passwords per user
  • PgDatabase->user_passwds contains all passwords for users to log into it (user with multiple passwords rather than multiple user instances)
  • This approach was chosen to model reality where each database keeps track of all passwords for users all users
    It is cleaner to store user to password pairs per database because each user is unique by name (easily identifiable), storing the other way around (database to password pairs in PgUser would be more convoluted because database names aren't unique, and would require appending port, etc)
  • When the database is periodically deallocated (in kill_database), all passwords will be cleanly freed as well, as opposed to having to make an expensive lookup through all users to find which have passwords for the freed database, and removing those passwords

src/admin.c Outdated

/* PAM requires passwords as well since they are not stored externally */
if (cf_auth_type == AUTH_PAM && !find_user(sk->login_user->name))
password = sk->login_user->passwd;
if (is_auth_user_present || is_pam_auth_used)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done to prevent unnecessary extra call to find_user.

src/client.c Outdated
@@ -339,13 +343,12 @@ bool set_pool(PgSocket *client, const char *dbname, const char *username, const
}
} else {
client->login_user = find_user(username);
if (!client->login_user) {
if (!client->login_user || user_requires_auth_query(client->login_user)) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Existence of login_user does not imply authentication of the user, as the user could exist due to being parsed from [users] section, without a password in auth_file. This motivates the use of method user_requires_auth_query.

src/client.c Outdated
@@ -394,7 +398,9 @@ bool handle_auth_query_response(PgSocket *client, PktHdr *pkt) {
}
break;
case 'D': /* DataRow */
memset(&user, 0, sizeof(user));
memset(&real_user, 0, sizeof(real_user));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code cleanup to use two variables (simpler) rather than partial fields of PgUser* struct.

src/client.c Outdated

client->login_user = add_db_user(client->db, user.name, user.passwd);
client->login_user = add_user(real_user, "");
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can leave PgUser*->passwd empty here because we've just received the user's password hash from Postgres pg_shadow and store it under the appropriate PgDatabase*. This means that the user's password must not be in auth_file.

src/client.c Outdated
@@ -584,6 +596,7 @@ static bool scram_client_first(PgSocket *client, uint32_t datalen, const uint8_t
char *input;
int res;
PgUser *user = client->login_user;
const char *real_passwd = user_password(user, client_database(client));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lookup user's password from the database.

src/janitor.c Outdated
@@ -703,6 +703,7 @@ void kill_database(PgDatabase *db)
kill_pool(pool);
}

aatree_destroy(&db->user_passwds);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Free user passwords associated to the database being killed.

src/loader.c Outdated
@@ -584,7 +612,8 @@ bool load_auth_file(const char *fn)
*p++ = 0; /* tag password end */

/* send them away */
unquote_add_user(user, password);
user = unquote_add_user(username, password);
user->from_auth_file = true;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flag that user was from auth_file to know whether to send off auth_query upon login.

src/server.c Outdated
void user_passwd_free(struct AANode *node, void *arg)
{
PgUserPassword *user_passwd = container_of(node, PgUserPassword, tree_node);
free(user_passwd);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PgUserPassword* allocation/deallocation could be handled by the slab allocator if you prefer.

test/test.sh Outdated
@@ -768,6 +770,22 @@ test_enable_disable() {
return 0
}

test_database_kill() {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test ensures that PgUser* still globally exists (with configurations such as max_user_connections) even if a database (which contains one of its passwords) is freed.

test/test.sh Outdated
@@ -986,6 +986,28 @@ test_shadow_password_server_login() {
echo "curuser=$curuser"
test "$curuser" = "shadowuser2" || return 1

# user defined in ini [users] section with good recently changed password from PostgreSQL server
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More test cases to ensure that users can still login with new password when it's changed at Postgres.

@Justin-Kwan Justin-Kwan force-pushed the Justin-Kwan/bugfix/allow_users_section_shadow_login branch 2 times, most recently from 79382e4 to 2516a9c Compare August 12, 2022 05:16
@@ -335,6 +336,7 @@ struct PgUser {
struct PgDatabase {
struct List head;
char name[MAX_DBNAME]; /* db name for clients */
struct AATree user_passwds; /* user passwords for this database from auth_query */
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A hashtable to store user passwords is more ideal but is more difficult as we'd have to pull in glibc.

src/client.c Outdated
@@ -35,38 +35,39 @@ static const char *hdr2hex(const struct MBuf *data, char *buf, unsigned buflen)
return bin2hex(bin, dlen, buf, buflen);
}

static bool check_client_passwd(PgSocket *client, const char *passwd)
static bool check_client_passwd(PgSocket *client, const char *provided_passwd)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More cleanup to easily discern between the user's real password (from Postgres) vs client provided password.

@Justin-Kwan Justin-Kwan force-pushed the Justin-Kwan/bugfix/allow_users_section_shadow_login branch from 2516a9c to 6303356 Compare August 12, 2022 06:07
src/server.c Outdated
if (db == NULL || user == db->forced_user || user->from_auth_file)
return (char *)user->passwd;

node = aatree_search((struct AATree *)&db->user_passwds, (uintptr_t)user->name);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cast db away from const* here. Alternatively could add const* modifier to signature aatree_search.

@Justin-Kwan Justin-Kwan force-pushed the Justin-Kwan/bugfix/allow_users_section_shadow_login branch from 6303356 to caeebbb Compare August 12, 2022 06:10
@JelteF
Copy link
Member

JelteF commented Aug 12, 2022

Thanks for sharing the updated code, I'll try to look at it soonish. If you rebase with master the Alpine tests should hopefully start passing again. And one of the valgrind tests is currently failing, but seemingly a pretty unrelated test, so maybe it's just a flaky test.

@Justin-Kwan Justin-Kwan force-pushed the Justin-Kwan/bugfix/allow_users_section_shadow_login branch from caeebbb to c8d3118 Compare August 12, 2022 08:25
@goksgie
Copy link
Contributor

goksgie commented Sep 13, 2022

If I am not misinterpreting the final version of this PR, it does not contain the proposed design of maintaining user password in databases. Instead, it seems to be storing users in user_tree and update when new login is attempted. Is that the intended behavior?

Additionally, I do not think user_tree is getting cleared unlike databases. Thus, question becomes, what is the resource implications of this design, especially for those systems that rather have limited resources?

@Justin-Kwan
Copy link
Author

Justin-Kwan commented Sep 13, 2022

Please take a look at this commit in our official fork of PgBouncer, this implements the newly proposed design (which has been running smoothly under the load of roughly 20K QPS in production): cloudflare@999133e

Hi @goksgie! In our workloads, tenants typically stay connected for long periods, and so the number of users should not be frequently changing or linearly increasing. Struct allocation should be fairly cheap even for a few thousand PgUser structs. However, if you're still concerned about active heap usage given many many users that become stale frequently (IMO an uncommon usage pattern), an eviction callback could be registered with libevent to periodically purge all users without custom configurations (with amortized cost of traversing all nodes).

I won't have any cycles to work on this PR as I am back in school now.

Copy link
Member

@JelteF JelteF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for taking way too long to look at this again. But looking at the current change it seems that it has reverted back to one of the earliest versions of the PR because find_original_user is back (even though you mentioned that it is removed now).

Other than that:

  1. This needs a rebase and should use the new testing infrastructure
  2. We should probably have some cleanup mechanism for the users like @goksgie described

But mostly it seems problematic that this PR is containing the old code again, which means I cannot review the latest version.

@veshant
Copy link

veshant commented Oct 12, 2023

Sorry for taking way too long to look at this again. But looking at the current change it seems that it has reverted back to one of the earliest versions of the PR because find_original_user is back (even though you mentioned that it is removed now).

Other than that:

  1. This needs a rebase and should use the new testing infrastructure
  2. We should probably have some cleanup mechanism for the users like @goksgie described

But mostly it seems problematic that this PR is containing the old code again, which means I cannot review the latest version.

It might be worth considering to merge the cf-pgbouncer fork. Probably requires some heavy refactoring but they have some useful fixes and features including the users section.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants