Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SSO support #1990

Open
wants to merge 56 commits into
base: master
Choose a base branch
from
Open

Add SSO support #1990

wants to merge 56 commits into from

Conversation

antepusic
Copy link
Contributor

@antepusic antepusic commented Apr 30, 2024

Description

This module adds Memgraph-side support for SSO.

[master < Task] PR

  • Provide the full content or a guide for the final git message

    Add SSO support

CI Testing Labels

Please select the appropriate CI test labels (CI -build=build-name -test=test-suite)

Documentation checklist

  • Add the documentation label tag
  • Add the bug / feature label tag
  • Add the milestone for which this feature is intended
    • If not known, set for a later milestone
  • Write a release note, including added/changed clauses
    • [Release note text]
  • Link the documentation PR here
    • [Documentation PR link]
  • Tag someone from docs team in the comments

@antepusic antepusic added this to the mg-v2.17.0 milestone Apr 30, 2024
@antepusic antepusic self-assigned this Apr 30, 2024
Copy link
Contributor

@andrejtonev andrejtonev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
Would just like to go over it with you. I think we could combine the two authentication implementations into one.

@@ -300,6 +302,102 @@ std::optional<UserOrRole> Auth::Authenticate(const std::string &username, const
return user;
}

std::optional<UserOrRole> Auth::SSOAuthenticate(const std::string &scheme,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function seems very similar to the authentication via module.
Can we reuse the code somehow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve extracted the code shared between the two into methods of their own.

@@ -145,6 +146,10 @@ bool SessionHL::Authenticate(const std::string &username, const std::string &pas
interpreter_.ResetUser();
{
auto locked_auth = auth_->Lock();
if (FLAGS_auth_sso_on && (username.empty() || password.empty())) {
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Communicate why the authentication failed to the user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Superseded (this code is now the same as before)

Comment on lines 189 to 191
// No access control -> give empty user
user_or_role_ = AuthChecker::GenQueryUser(auth_, std::nullopt);
interpreter_.SetUser(AuthChecker::GenQueryUser(auth_, std::nullopt));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't we disabling userless logins when using SSO?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -167,6 +172,33 @@ bool SessionHL::Authenticate(const std::string &username, const std::string &pas
return res;
}

bool SessionHL::SSOAuthenticate(const std::string &scheme, const std::string &identity_provider_response) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to combine the Authenticate and SSOAuthenticate into a single function.
Don't really see why they should be separate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two methods have now become significantly different

@antepusic antepusic changed the title Add SSO via SAML Add SSO support May 19, 2024
@antepusic antepusic marked this pull request as ready for review May 19, 2024 14:24
@antepusic antepusic added CI -build=coverage -test=core Run coverage build and core tests on push Ready for review PR is ready for review labels May 19, 2024
Copy link
Contributor

@andrejtonev andrejtonev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we need to sync again.
The last product meeting changes the logic quite a bit. Not sure we are all on the same page.

std::cerr << "The auth module path doesn't exist or isn't a file!" << std::endl;
return false;
for (auto &[_, path] : memgraph::ModuleMappingsToMap(value)) {
auto status = std::filesystem::status(path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename status to module_name or something similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


// NOLINTNEXTLINE(misc-unused-parameters)
DEFINE_VALIDATED_string(auth_module_mappings, "",
"Associates auth schemas to external modules. A mapping is structured as follows: <scheme>: "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that gflags supports arguments with spaces.
Something like --flag arg1 arg2 should generate the flag's value only to "arg1".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You’re right about gflags (and environment variables are like this in general), but this works when the environment variable is set as a string, e.g. --auth-module-mappings="test-admin: admin; test-reader: reader". I’ve corrected the description to clarify that.

std::unordered_map<std::string, auth::Module> PopulateModules(std::string &module_mappings) {
std::unordered_map<std::string, auth::Module> module_per_scheme;
if (!FLAGS_auth_module_executable.empty()) {
module_per_scheme.emplace("basic", FLAGS_auth_module_executable);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a basic module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basic authentication is the auth scheme that takes a username and a password.

The scheme basic requires a username principal::String and a password credentials::String

https://neo4j.com/docs/bolt/1.0/bolt/message/#messages-logon

By analogy, modules marked with basic in the scheme → module mapping handle authentication that takes an username + password and sends them to an external module. This covers LDAP.

*
* @return `true` if auth needs to run
*/
bool UsingAuthModule() const { return !modules_.empty(); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct?
We need to support multiple authentication schemes and a native authentication.
The user will write something like auth_enable: auth1, auth2, native.
We should disable the users and local authentication only if the native authentication is missing, not if we are using any module.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a bit more context for this.
Currently memgraph supports or a native authentication or one via a module.
That is why some functionalities are disabled when the module is active.
We now need to enable those feature, but only for users that have been authorized via the native method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, we do not want to handle any user data locally when a user is authorized via a module. Meaning that we send over the username/password or certificate and get back a role. This role is then used for the authorization.
Currently when updating a role or user, we check to see if we are using a module and allow only role updates in that case.
We also disable crating/deleting a user when we are using a module.
All of these things should be re-enabled, even if we are using a module.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We support native auth, and allow the user to change it as they need.
The only this the module influences is the order the authentication is tried and if a connection can be authenticated using the native auth.

If something is not clear, we can jump on a call :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explanation! I’ve applied your suggestions so that users logged in via module can modify Memgraph-internal user data, e.g. run CREATE USER or SHOW USERS.

params["scheme"] = scheme;
params["response"] = identity_provider_response;

auto ret = module_.Call(params, FLAGS_auth_module_timeout_ms);
auto ret = modules_.at(scheme).Call(params, FLAGS_auth_module_timeout_ms);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we generalize the module call and response check?
Basically have a function that takes in the params and scheme and executed everything else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea

Comment on lines +267 to +269
if (module_mappings.empty()) {
return module_per_scheme;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need this check?
Shouldn't the ModuleMappingsToMap return an empty set either way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s true – I just wanted to avoid calling the function if there’s no need

}
const auto scheme_name = std::string{utils::Trim(module_and_scheme[0])};
const auto module_path = std::string{utils::Trim(module_and_scheme[1])};
module_per_scheme.emplace(scheme_name, module_path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are supporting only a single module per scheme?

Copy link
Contributor Author

@antepusic antepusic May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s correct (and appropriate because the schemas are granular, e.g. it’s "saml-entra-id" and not just "saml"/"entra-id").

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonilastre does this cover all the use-cases you're interested in?
I know you were talking about having multiple LDAP modules. Could all those also be defined in this way?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm the above re scheme, the scheme contains both the protocol and identity provider, so it is a single module per scheme.

Re: LDAP, not sure because I haven't worked with that + Memgraph.

} // namespace memgraph

// NOLINTNEXTLINE(misc-unused-parameters)
DEFINE_VALIDATED_string(auth_module_mappings, "",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we enabling/disabling the native auth?

Copy link
Contributor Author

@antepusic antepusic May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Memgraph receives user credentials from Bolt’s LOGON/HELLO messages, and uses native auth or SSO depending on the value of the scheme field.

There’s no ambiguity as to whether native auth or SSO is used:

  • SSO → Memgraph Lab sends messages with SSO-specific values, e.g. "saml-okta".
  • Native auth → the scheme field is always "basic" (see Bolt Protocol message specification).

MigrateVersions(storage_);
}

std::optional<UserOrRole> Auth::Authenticate(const std::string &username, const std::string &password) {
if (module_.IsUsed()) {
if (!modules_.contains("basic")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should allow both LDAP and/or native auth.
Their order should be the order the user defined in the argument list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LDAP module is indexed with "basic" while processing the flags (see also #1990 (comment)), i.e. we support them both.

As for auth method ordering, we’ve avoided that (see meeting note: https://www.notion.so/memgraph/2024-04-30-SSO-Lab-Core-Sync-2-ccb4755e30be4cebb8beb7d7cc16f77d?pvs=4#8eab766c1ecd457d8a6c8cd18849b298)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!modules_.contains("basic") is ambiguous.
How are we suppose to disable or enable both?

@antepusic
Copy link
Contributor Author

antepusic commented May 20, 2024

@andrejtonev Thank you for the review!

Perhaps we need to sync again. The last product meeting changes the logic quite a bit. Not sure we are all on the same page.

In light of that meeting, I’ve updated the spec (changes in green) and we can sync together.

@gitbuda gitbuda modified the milestones: mg-v2.17.0, mg-v2.18.0 May 20, 2024
Copy link
Contributor

@andrejtonev andrejtonev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gitbuda are you okay with us enabling all native auth functionalities even when modules are used?
What I mean specifically is that any modifications to the native auth data is disable when we run modules. This made sense when the modules would modify the data. We disabled that some time ago.
We now need to allow both native and module authentication. I think we can allow the user to change the native user data at any time (even when the native auth is disabled).

}
const auto scheme_name = std::string{utils::Trim(module_and_scheme[0])};
const auto module_path = std::string{utils::Trim(module_and_scheme[1])};
module_per_scheme.emplace(scheme_name, module_path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonilastre does this cover all the use-cases you're interested in?
I know you were talking about having multiple LDAP modules. Could all those also be defined in this way?

MigrateVersions(storage_);
}

std::optional<UserOrRole> Auth::Authenticate(const std::string &username, const std::string &password) {
if (module_.IsUsed()) {
if (!modules_.contains("basic")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!modules_.contains("basic") is ambiguous.
How are we suppose to disable or enable both?

@tonilastre
Copy link

@tonilastre does this cover all the use cases you're interested in?
I know you were talking about having multiple LDAP modules. Could all those also be defined in this way?

@andrejtonev, regarding this: I am sure about LDAP, but SSO + native is something that is used often. When you have a database running you want to establish an SSO provider connected with your directory of employees who can access the database (e.g. analysts, data engineers, etc), but there will be some internal applications where you can't set up an SSO for it, so applications will need a native auth (API key = username, API secret = password).

Copy link

sonarcloud bot commented May 24, 2024

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI -build=coverage -test=core Run coverage build and core tests on push CI -build=debug -test=integration Run debug build and integration tests on push CI -build=release -test=core Run release build and core tests on push CI -build=release -test=e2e Run release build and e2e tests on push Docs needed Docs needed enterprise feature feature Ready for review PR is ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants