Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using NpgsqlDataSourceBuilder with EF Core Provider #3117

Open
BrandonDusseau opened this issue Feb 27, 2024 · 4 comments
Open

Using NpgsqlDataSourceBuilder with EF Core Provider #3117

BrandonDusseau opened this issue Feb 27, 2024 · 4 comments

Comments

@BrandonDusseau
Copy link

BrandonDusseau commented Feb 27, 2024

My team's project currently makes use of the now-deprecated JSON POCO mapping in Npgsql. I did try to migrate to the ToJson version of the mapping. Unfortunately the requirement for owned entities caused problems. EF Core was failing to track the child objects because they lack primary keys (or any identity at all). As far as I can tell, this means that if we want a strongly-typed JSON column, we have no choice but to utilize EnableDynamicJson.

This brings me to the main issue: NpgsqlConnection.GlobalTypeMapper is deprecated, and there is almost no documentation on using NpgsqlDataSource with the EF Core provider.

At the time of this writing, the Npgsql EF Core Provider documentation makes no mention at all of NpgsqlDataSource. The only example I could find for setting up a data source is here which assumes the user is managing the database without a framework. It seems there aren't too many people using this configuration with EF Core yet, so I've been working on figuring out the right way to put the pieces together for the last few hours.

I started with a fairly typical existing implementation bsed on the EF Core Getting Started documentation. Here's a sample of that:

// In ConfigureServices()
services
    .AddOptions<AnalyticsDbOptions>()
    .Bind(hostContext.Configuration.GetSection(nameof(AnalyticsDbOptions)))
    .ValidateOnStart();

services.AddDbContext<AnalyticsContext>(options =>
    options.UseNpgsql(hostContext.Configuration.GetConnectionString(nameof(AnalyticsContext))!)
);
// AnalyticsContext.cs
public class AnalyticsContext : DbContext
{
    private readonly AnalyticsDbOptions _configuration;

    public AnalyticsContext(IOptions<AnalyticsDbOptions> dbOptions)
    {
        _configuration = dbOptions.Value ?? throw new ArgumentNullException(nameof(dbOptions));
    }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseNpgsql(CreateConnectionString());
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Perform various model building operations.
    }

    private string CreateConnectionString()
    {
        var builder = new NpgsqlConnectionStringBuilder
        {
            ApplicationName = "<redacted>",
            Database = _configuration.Database,
            SearchPath = _configuration.Schema,
            Host = _configuration.Host,
            Password = _configuration.Password,
            Port = (int)_configuration.Port,
            Username = _configuration.Username,
            CommandTimeout = _configuration.CommandTimeoutInSeconds
        };

        if (_configuration.RequireSsl)
        {
            builder.SslMode = SslMode.VerifyFull;
        }

        return builder.ConnectionString;
    }
}

Initially I tried creating the NpgsqlDataSource in the AnalyticsContext.OnConfiguring() method. This probably would have worked if I was creating a single DbContext for the life of the application. However, I have a multi-threaded operation that requires me to spin up several new DbContexts on the fly. During testing, I received the ManyServiceProvidersCreatedWarning.

After a bit of searching, I was able to deduce that this was due to creating multiple NpgsqlDataSources and that the creation of that resource should go into ConfigureServices instead. I discovered AddNpgsqlDataSource which only appears in the documentation in three places (Npgsql 7.0 Release Notes, Logging, Npgsql 8.0 Release Notes) - none of which actually describe how it's used, and none of which show it being used in conjunction with EF Core.

I tried implementing it anyway, modifying the source as such:

// In ConfigureServices()
services
    .AddOptions<AnalyticsDbOptions>()
    .Bind(hostContext.Configuration.GetSection(nameof(AnalyticsDbOptions)))
    .ValidateOnStart();

services.AddNpgsqlDataSource(
    ConnectionStringBuilder.Build(hostContext.Configuration),
    builder =>
    {
        builder.EnableDynamicJson();
    });
var dataSource = services.BuildServiceProvider().GetRequiredService<NpgsqlDataSource>();
services.AddDbContext<AnalyticsContext>(options =>
    options.UseNpgsql(dataSource)
);
// AnalyticsContext.cs
public class AnalyticsContext : DbContext
{
    private readonly NpgsqlDataSource _dataSource;

    public AnalyticsContext(NpgsqlDataSource dataSource)
    {
        _dataSource = dataSource ?? throw new ArgumentNullException(nameof(dataSource));
    }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseNpgsql(_dataSource);
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Perform various model building operations.
    }
}
// ConnectionStringBuilder.cs
public static class ConnectionStringBuilder
{
    public static string Build(IConfiguration configuration)
    {
        var dbConfig = new AnalyticsDbOptions();
        configuration.GetSection(nameof(AnalyticsDbOptions)).Bind(dbConfig);

        var builder = new NpgsqlConnectionStringBuilder
        {
            ApplicationName = "<redacted>",
            Database = dbConfig.Database,
            SearchPath = dbConfig.Schema,
            Host = dbConfig.Host,
            Password = dbConfig.Password,
            Port = (int)dbConfig.Port,
            Username = dbConfig.Username,
            CommandTimeout = dbConfig.CommandTimeoutInSeconds
        };

        if (dbConfig.RequireSsl)
        {
            builder.SslMode = SslMode.VerifyFull;
        }

        return builder.ConnectionString;
    }
}

In addition, for the multi-threaded phase of the application I created a factory:

// AnalyticsContextFactory.cs
public class AnalyticsContextFactory : IAnalyticsContextFactory
{
    private readonly NpgsqlDataSource _dataSource;

    public AnalyticsContextFactory(NpgsqlDataSource dataSource)
    {
        _dataSource = dataSource ?? throw new ArgumentNullException(nameof(dataSource));
    }

    public AnalyticsContext Create()
    {
        return new AnalyticsContext(_dataSource);
    }
}

This all appears to work but I'm not confident in the implementation because I don't have any way to see what an implementation should look like with EF Core.


To sum up, here are my questions:

  1. Is there any multi-threading concern with sharing an instance of NpgsqlDataSource from DI with multiple DbContext instances?
  2. Am I using NpgsqlDataSource and DbContext in a correct way?
  3. Can you think of a way to avoid using the legacy JSON POCO mappings in this case? Does it make sense for it to be labeled as "deprecated" when not all stable JSON schemas are able to utilize owned entity mapping?
  4. Can the documentation for setting up EF Core be updated to reflect the current recommended configuration using NpgsqlDataSource? Further, can it include a section on how to properly use AddNpgsqlDataSource? Maybe the mere presence of this issue will help steer future developers in the right direction, but of course that's not ideal.
  5. I found the sudden addition of query logging surprising. Can we update the Logging page to explain which logging categories do what? I personally ended up setting Npgsql.Command to Warning in my configuration.
@roji
Copy link
Member

roji commented Feb 27, 2024

First, you're right that general documentation and guidance on using NpgsqlDataSource with EF is lacking... This is indeed something I need to address.

Now, it's a bit odd to use AddNpgsqlDataSource to configure a data source in DI, only to then get it out as in the following code:

services.AddNpgsqlDataSource(
    ConnectionStringBuilder.Build(hostContext.Configuration),
    builder =>
    {
        builder.EnableDynamicJson();
    });
var dataSource = services.BuildServiceProvider().GetRequiredService<NpgsqlDataSource>();

It's easier to simply construct it yourself as described in the Npgsql 7.0 release notes:

var dataSource = new NpgsqlDataSourceBuilder(ConnectionStringBuilder.Build(hostContext.Configuration))
{
    builder.EnableDynamicJson();
}

There's a simplification where you use AddNpgsqlDataSource(), and then omit passing the data source explicitly to UseNpgsql() (it will find it in DI explicitly for you). However, there's currently an issue in cases where there's more than one data source in the application - if that's your case avoid doing that.


Next thing... When using DI and ASP.NET, you don't typically define OnConfiguring on your DbContext class, but rather get IDbContextOptions injected via its constructor (see the general EF docs for ASP.NET/DI). Since you pass the data source to UseNpgsql() in AddDbContext, those options you get - and pass to the base constructor - will already contain the data source. In other words, there's no need for you to deal with the data source in your DbContext class in any way.


Is there any multi-threading concern with sharing an instance of NpgsqlDataSource from DI with multiple DbContext instances?

No. NpgsqlDataSource conceptually represents a connection pool, and is meant to be fully thread-safe and usable from multiple threads.

Am I using NpgsqlDataSource and DbContext in a correct way?

See above for some comments.

Can you think of a way to avoid using the legacy JSON POCO mappings in this case? Does it make sense for it to be labeled as "deprecated" when not all stable JSON schemas are able to utilize owned entity mapping?

I'm not sure what your specific problem with the owned entity mapping - you wrote "EF Core was failing to track the child objects because they lack primary keys (or any identity at all)", but I'm not sure what exactly that refers to. I suggest opening an issue with a minimal repro on the EF repo to make sure you're doing things right, before assuming you need to use the JSON POCO mapping.

Can the documentation for setting up EF Core be updated to reflect the current recommended configuration using NpgsqlDataSource? Further, can it include a section on how to properly use AddNpgsqlDataSource? Maybe the mere presence of this issue will help steer future developers in the right direction, but of course that's not ideal.

Yes, I indeed agree work on the guidance is (sorely) needed. Unfortunately it may take a bit of time before I'm able to get around to it...

I found the sudden addition of query logging surprising. Can we update the Logging page to explain which logging categories do what? I personally ended up setting Npgsql.Command to Warning in my configuration

When you use AddNpgsqlDataSource() to register an NpgsqlDataSource in DI, that data source automatically logs via the ILoggerFactory registered in the same DI container; this is likely what you're referring to (that happens in addition to logging occuring at the EF level). You can configure logging in AddNpgsqlDataSource() explicitly to prevent that, or if you stop using AddNpgsqlDataSource() it won't happen.

@pinkfloydx33
Copy link

We've been successfully using the data source builder with DI since it was first introduced. For NET8 we needed to configure more options (ie. DynamicJson) but it's been working fine. The only problem we've hit (which is unrelated to the builder, but similar to one you call out) is that we can't use ToJson due to lack of the ability to set column facets, which there's an open issue for.

Even more advanced scenarios (like a periodic passsword provider that itself needs values/options from DI) were solvable with a minimal amount of code. My point is that it's certainly possible to use the builder without a complicated setup, or building the service povider beforehand (which you shouldn't do anyways and likely recieved an analyzer warning for).

The one problem you may hit--which is one we did--would be in tests if you connect to many different servers. In our case we get a new server (docker container) per test class. The ManyServiceProviders warning will eventually rear its head, but we've been advised by the team that it's expected and safe to silence in those cases.

@roji
Copy link
Member

roji commented Feb 28, 2024

The one problem you may hit--which is one we did--would be in tests if you connect to many different servers. In our case we get a new server (docker container) per test class. The ManyServiceProviders warning will eventually rear its head, but we've been advised by the team that it's expected and safe to silence in those cases.

Yep, that should be fine - and is also expected to go away soon (#3086).

@BrandonDusseau
Copy link
Author

BrandonDusseau commented Feb 28, 2024

No. NpgsqlDataSource conceptually represents a connection pool, and is meant to be fully thread-safe and usable from multiple threads.

Excellent!

There's a simplification where you use AddNpgsqlDataSource(), and then omit passing the data source explicitly to UseNpgsql() (it will find it in DI explicitly for you). However, there's currently an issue in cases where there's more than one data source in the application - if that's your case avoid doing that.

We only use one data source, so this will work fine for us. This will mean I don't need to pull the data source out of DI just after registering it, like you said. As you noted, I was under the impression that the Npgsql configuration needed to be done both in the DI container setup and in OnConfiguring(). I thought it was a bit weird to set it up in two places, but when I read the Getting Started I saw the term "additional" and believed both were required as a result.

I'm not sure what your specific problem with the owned entity mapping - you wrote "EF Core was failing to track the child objects because they lack primary keys (or any identity at all)", but I'm not sure what exactly that refers to. I suggest opening an issue with a minimal repro on the EF repo to make sure you're doing things right, before assuming you need to use the JSON POCO mapping.

I will keep this in mind if I decide to try again at a later date. In the meantime, maybe this will provide a better understanding of what I ran into:

I read through the info Microsoft has published on using JSON columns. I threw this into my OnModelCreating:

        modelBuilder
            .Entity<Job>()
            .OwnsMany(job => job.Selections, ownedBuilder =>
            {
                ownedBuilder.ToJson("selections");
            });

However, Selections has some children that are also complex types, and it produced an error that it couldn't locate a primary key on Selections.Path. I added the [Keyless] attribute to the backing type on Selections.Path and then got
Unable to determine the relationship represented by navigation 'Selection.Path' of type 'SelectionPath'.. Okay, so I'll update the configuration above:

        modelBuilder
            .Entity<Job>()
            .OwnsMany(job => job.Selections, ownedBuilder =>
            {
                ownedBuilder.ToJson("selections");
                ownedBuilder.OwnsOne(sel => sel.Path);
            });

Now I have The navigation 'Path' cannot be added because it targets the keyless entity type 'SelectionPath'. Navigations can only target entity types with keys.. While it is possible EF Core can create a key automatically using the properties of SelectionPath, there are cases where all of those properties are null which causes EF to throw an error saying it can't track the entity.

Maybe I'm missing something fundamental here, but it seems like this is an unsupported scenario.

That aside, thanks for your feedback! I've updated the application to apply your suggestions and it's working fine (with EnableDynamicJson()` still in place). Your work on this project is much appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants