Deploy new applications with db:schema:load #118

Odaeus · 2015-04-22T15:54:42Z

Hello,

I see that someone requested this feature for a different reason and it was denied. I don't understand the reason for dismissal, as the source of truth for the database schema is the /db/schema.rb or /db/structure.sql file. Migrations are only designed to manage changes to the database and it seems a waste to run the entire migration history when deploying a new application. Also it should be possible to delete old migrations if an application has a great many.

There is no need (and it is error prone) to deploy a new instance of an app by replaying the entire migration history. It is much simpler and faster to just load into the database a description of the current schema. - Rails guide

Any chance you would reconsider this position?

The text was updated successfully, but these errors were encountered:

chewi · 2015-04-28T14:56:07Z

Just ran into this myself. Didn't notice before because I hadn't deployed any old applications on new servers lately. I think it's debatable whether it should attempt rake db:structure:load automatically. While it should just complain that everything already exists on an existing database, I'd be satisfied and feel safer with a manual option. Maybe cap deploy:new_database or cap deploy NEWDB=1 or something. Prepping it manually beforehand is a bit fiddly because you have to do cap deploy:updating and then SSH to the server to run rake db:structure:load yourself. Depending on your setup, this may require several additional steps in between to get the right environment.

P.S. @Odaeus, hello old friend. 😉

Odaeus · 2015-04-28T21:58:44Z

Hello James!

I ended up adding the following to my deploy.rb:

namespace :deploy do
  namespace :db do
    desc "Load the database schema if needed"
    task load: [:set_rails_env] do
      on primary :db do
        if not test(%Q[[ -e "#{shared_path.join(".schema_loaded")}" ]])
          within release_path do
            with rails_env: fetch(:rails_env) do
              execute :rake, "db:schema:load"
              execute :touch, shared_path.join(".schema_loaded")
            end
          end
        end
      end
    end
  end

  before "deploy:migrate", "deploy:db:load"
end

I agree it should be a manual task with the option of making it automatic, given that it could cause issues with a production database. My example above uses a hidden file as a guard to make sure it only runs once.

will-in-wi · 2016-03-06T02:40:56Z

Just saw this: https://stackoverflow.com/questions/35821777/rails-how-to-set-up-dbschemaload-for-initial-deploy-with-capistrano

Which made me look for this issue.

My thoughts:
Rails points out in their docs that the schema file is the canonical way of setting up a new database. I see two common Capistrano cases:

New app, new deployment. Capistrano just uses migrations and it works fine.
Old app, new deployment. A clean chain of migrations may not exist all the way back to initial creation. Further, it may be error-prone to set up a database this way.

For any case where the database has no schema version information, it is probably more correct to run schema (or structure) load instead of running migrations.

I'm not sure how we might go about finding out whether the database has had the schema loaded yet, but it would ideal to query this and decide based on that which method to run.

If someone wants to flesh out conceptually how this might be done, I can possibly take a crack at a PR.

mattbrictson · 2016-03-07T23:31:23Z

Typically (in my experience) the user that is running the Rails app is a non-privileged user that does not have permission to create databases. The initial step of creating the database is thus done outside of the Capistrano deploy, which probably explains why this not entirely straightforward.

Personally, I try to make sure migrations always work back to the beginning because you never know when you might need to restore an old database backup and migrate it. But, I completely understand that I'm in the minority and rake db:schema:load or db:structure:load is the better way to go. I'm just not sure what is the safe way to do it.

will-in-wi · 2016-03-07T23:34:17Z

I'm not sure either. My usual case is that the database already exists, but is completely empty. schema:load is then run to set up the initial structure.

It would probably be ideal to be able to run cap env rails:schema:load or something like that. But that presumes a preexisting deploy, which wouldn't exist yet…

mattbrictson · 2016-03-07T23:40:06Z

Maybe you could do a trick like:

bundle exec rails runner -e production "exit 1 unless ActiveRecord::SchemaMigration.table_exists?"

And put that inside a Capistrano test to see if we need to load the schema.

mattbrictson · 2016-03-07T23:42:35Z

⬆️ That's an expensive test because it needs to load the Rails environment, so never mind.

mattbrictson · 2016-03-08T01:40:03Z

We could modify the implementation of the deploy:migrate task to add some conditionals. It looks to see if a special :rails_load_schema variable is set to true (it will default to false). If so, it runs rake db:schema:load instead of rake db:migrate.

Then, we have a special rake task, load_schema, which all it does is call set :rails_load_schema, true. First-time deployment works like this:

cap production load_schema deploy

And then do something similar for db:structure:load.

This is solution is a bit out of the ordinary, though; I can't think of another Capistrano plugin that uses additional task names on the command line preceding deploy to affect conditional behavior.

Another option would be to make a special load_schema_and_deploy task that sets the variable and then does invoke :deploy.

Or use an environment variable:

CAP_RAILS_LOAD_SCHEMA=1 cap production deploy

I dunno, none of these approaches feels exactly right.

will-in-wi · 2016-03-08T01:43:32Z

Brainstorming here:

What if in Capistrano proper, a task called deploy:first_time (or something similar) was created which only invokes deploy. Then other plugins could hook into that to have tasks which should only be run the first time a deploy is done.

This would then be documented as such.

Does this sound any better?

mattbrictson · 2016-03-08T03:43:39Z

You still need the variable trick, though, which doesn't feel particularly elegant:

before "deploy:first_time", "rails:first_time" do
  set :rails_load_schema, true
end

before "db:migrate", "db:load_schema" do
  if fetch(:rails_load_schema)
    # rake db:schema:load
  end
end

will-in-wi · 2016-03-08T03:53:04Z

Could you add a task in a task?

before "deploy:first_time", "rails:first_time" do
  before "db:migrate", "db:setup"
end

task "db:setup" do
  # rake db:schema:load
end

I haven't tried this.

mattbrictson · 2016-03-08T04:03:30Z

Perhaps slightly better: deploy:first_time sets a special Capistrano variable that is exposed in the DSL as e.g. first_deploy?. Then tasks can do conditional logic based on that method without having to manage a special variable of their own.

This is sort of like the special dry_run? method proposed in capistrano/capistrano#1564.

mattbrictson · 2016-03-08T04:05:36Z

Could you add a task in a task?

Yes, I think that would work, too.

shaunakv1 · 2017-06-10T06:50:13Z

Just checking in, any formal solution/tasks within Capistrano for this? Having to ssh into server to load schama for cold starts on old apps does not seem right.

mattbrictson · 2017-06-10T20:33:33Z

Just checking in, any formal solution/tasks within Capistrano for this?

No, not to my knowledge.

wbreeze · 2018-04-11T19:40:56Z

In the way-back machine, Capistrano had a "deploy:cold".

In Capistrano 3.10.1, bundle exec cap staging deploy:updating gives me enough to shell-in and run the db:structure:load or db:schema:load task manually. Shelling into a (successful or failed) deploy that has tried deploy:migrate isn't quite the same.

I definitely wouldn't like every deploy to consider wiping the database and then skip it based on a semaphor file. That feels way too dangerous to me.

mattbrictson · 2018-04-11T21:26:33Z

You could run

cap production deploy COLD=1

And then in your deploy.rb you could customize the deployment process as:

task "deploy:db:load" do
  on primary :db do
    within release_path do
      with rails_env: fetch(:rails_env) do
        execute :rake, "db:schema:load"
      end
    end
  end
end
before "deploy:migrate", "deploy:db:load" if ENV["COLD"]

Maybe?

wbreeze · 2018-04-11T22:18:44Z

Edit: I definitely wouldn't like every deploy to consider wiping the database and then skip it ~~based on a semaphor file~~. That feels way too dangerous to me.

vifreefly · 2018-11-20T21:21:36Z

I think all what we need here is to call rails db:create at the first deploy before db:migrate. Use db:schema:load is a bit dangerous.

Anyway, this issue should be resolved. It is an official rails tasks for Capistrano. Having the first deploy always to be failed (because database is not created) and going to the server to manually type rails db:create doesn't seems for me like automation.

Or at least add to the README info about the assuming fail at the first deploy, so the Capistrano newbies will not be so scared when it's happen.

JohnSmall · 2018-11-21T15:47:57Z

I find that deploy:assets:precompile also fails on the first deploy as it runs before the database has been created.

mattbrictson · 2018-11-22T00:51:51Z

Right now I don't want to automate db:create or db:schema:load. Neither is what I personally would want for real-world deployments. Plus from developers I've spoken with there is a pretty even split between those that prefer creating an empty db and migrating it vs those that prefer loading the schema file directly. There doesn't seem to be a standard way to do it.

It's OK that capistrano-rails leaves some things up to the individual developer. For example, we can't provide a deploy:restart task out of the box, because that depends on passenger, initd, systemd, or whatever each project is using to manage the server process. Likewise for the database, maybe I can rely on db:create, or maybe I need to go through AWS to provision an RDS instance. To draw an analogy to Heroku: you can't just git push heroku master the first time without manually configuring the PostgreSQL database add-on first. I don't think that's an unreasonable step.

Or at least add to the README info about the assuming fail at the first deploy, so the Capistrano newbies will not be so scared when it's happen.

Agreed. @vifreefly would you be interested in opening a PR for that?

vifreefly · 2018-11-22T09:43:13Z

@mattbrictson Thanks, I see your point.

In this case, we can probably add another task deploy:db_create (or something like this) which will execute rake db:create.

This task will not be executed by default, but anyone (if it's ok for them) will be able to simply add the hook before 'deploy:migrate', 'deploy:db_create', and have database created automatically before deploy:migrate if it's not exists.
In case if db exists, this task will be executed successfully anyway, because Rails return message like Database 'app_production' already exists with successful (0) status code.

vifreefly · 2018-11-22T10:10:43Z

About db:migrate vs db:schema:load to call for a first time on empty db.

Maybe db:schema:load is more true but I personally had some issues with it and custom Postgres extensions (for example https://github.com/Casecommons/pg_search). With db:schema:load some required extensions described in migrations wasn't enabled in the new database.
On the other hand, with db:migrate all was fine. I think db:migrate fine as well, unless if you're have a several hundreds/thousands of migrations to migrate and wait.

wbreeze · 2018-11-22T11:38:33Z

With db:schema:load some required extensions described in migrations wasn't enabled in the new database.

Don't describe required extensions in migrations?

vifreefly · 2018-11-22T13:18:46Z

Don't describe required extensions in migrations?

Ok, what is a right way then?

wbreeze · 2018-11-22T21:44:39Z

Ouch. Pardon me. It really was a question. It probably sounded a little glib. Excuse me. And also this is a little orthoganal to the thread. Maybe we should take it offline?

Now I've read the doc for the extension you referenced and see that they suggest exactly that, using a migration. Indeed, on a fresh install, those execute "CREATE EXTENSION IF NOT EXISTS pg_trgm;" will not have run from a db:structure:load. That doc doesn't give consideration to booting up a fresh DB without running the migration they suggest. I'm not sure it's great advice, only convenient advice.

The right way? Wow. Would it be great if the structure dump captured those? Assuming they don't (and that might not be possible as it might not be standardized). Would it serve to work it into the database install and provisioning of the DB box? The doc does go on to say that the extension installation methods vary across OS's.

Please excuse me for a. pouncing on that and b. diverting this thread. The rails guide advice quoted in the original issue description: I think it's solid advice.

mulaiko · 2019-09-06T07:40:31Z

Hi,
So, no solution for this case then? (Read through the thread but maybe I missed something...)

thbar · 2019-09-06T07:48:20Z

@mulaiko you have solutions in the thread which you can adapt. Here is my current version, using an explicit RUN_DB_SCHEMA_LOAD flag at first deploy:

# adapted from https://gist.github.com/wacaw/657744af41dcf4963646
namespace :deploy do
  desc "Load the initial schema - it will WIPE your database, use with care"
  task :db_schema_load do
    on roles(:db) do
      puts <<~WARN
        ************************** WARNING ***************************
        If you type [YES], rake db:schema:load will WIPE your database
        any other input will cancel the operation.
        **************************************************************
      WARN

      ask :answer, "Are you sure you want to WIPE your database?: "

      if fetch(:answer) == 'YES'
        within release_path do
          with rails_env: fetch(:rails_env) do
            execute :rake, 'db:schema:load'
          end
        end
      else
        puts "Cancelled."
        exit
      end
    end
  end
end

if ENV["RUN_DB_SCHEMA_LOAD"] == '1'
  before "deploy:migrate", "deploy:db_schema_load"
end

Hope this helps!

This needed when deploying to a server for the first time or when you want to wipe out a database. The issue I'm fixing is this. Things have evolved and in https://github.com/coopdevs/timeoverflow-provisioning we no longer set things up to enable PostgreSQL extensions such as `pg_trgm`. Because Capistrano runs `db:migrate` when deploying very old migrations like `db/migrate/00000000000002_setup_pg_trgm.rb` don't work anymore. I prefer to make this step explicit in the code and with some context rather than having to do this manually in the server and let the dev in charge figure it out every time. Apparently, from capistrano/rails#118 I see this is a known issue for which there's no provided solution. I copied a solution proposed by a capistrano/rails member so should be good. Just run it as follows: ``` $ bundle exec cap staging deploy COLD=1 ```

remcode-remco · 2022-01-05T20:00:16Z

      with rails_env: fetch(:rails_env) do
        execute :rake, 'db:schema:load'
      end

I had to change the line to this in production:
execute :rake, 'db:schema:load DISABLE_DATABASE_ENVIRONMENT_CHECK=1'

This was referenced Aug 6, 2016

First-time deploy hook for setup capistrano/capistrano#1746

Open

Suggestion: Only migrate if current_path is present (Do not migrate on first deploy) #166

Closed

mattbrictson added new feature discuss! labels Aug 9, 2016

vifreefly mentioned this issue Nov 22, 2018

Add info how to handle deploy:migrate error at the first deploy to the README #227

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deploy new applications with db:schema:load #118

Deploy new applications with db:schema:load #118

Odaeus commented Apr 22, 2015

chewi commented Apr 28, 2015

Odaeus commented Apr 28, 2015

will-in-wi commented Mar 6, 2016

mattbrictson commented Mar 7, 2016

will-in-wi commented Mar 7, 2016

mattbrictson commented Mar 7, 2016

mattbrictson commented Mar 7, 2016

mattbrictson commented Mar 8, 2016

will-in-wi commented Mar 8, 2016

mattbrictson commented Mar 8, 2016

will-in-wi commented Mar 8, 2016

mattbrictson commented Mar 8, 2016

mattbrictson commented Mar 8, 2016

shaunakv1 commented Jun 10, 2017

mattbrictson commented Jun 10, 2017

wbreeze commented Apr 11, 2018

mattbrictson commented Apr 11, 2018

wbreeze commented Apr 11, 2018

vifreefly commented Nov 20, 2018

JohnSmall commented Nov 21, 2018

mattbrictson commented Nov 22, 2018

vifreefly commented Nov 22, 2018 •

edited

vifreefly commented Nov 22, 2018 •

edited

wbreeze commented Nov 22, 2018

vifreefly commented Nov 22, 2018

wbreeze commented Nov 22, 2018

mulaiko commented Sep 6, 2019

thbar commented Sep 6, 2019

remcode-remco commented Jan 5, 2022

Deploy new applications with db:schema:load #118

Deploy new applications with db:schema:load #118

Comments

Odaeus commented Apr 22, 2015

chewi commented Apr 28, 2015

Odaeus commented Apr 28, 2015

will-in-wi commented Mar 6, 2016

mattbrictson commented Mar 7, 2016

will-in-wi commented Mar 7, 2016

mattbrictson commented Mar 7, 2016

mattbrictson commented Mar 7, 2016

mattbrictson commented Mar 8, 2016

will-in-wi commented Mar 8, 2016

mattbrictson commented Mar 8, 2016

will-in-wi commented Mar 8, 2016

mattbrictson commented Mar 8, 2016

mattbrictson commented Mar 8, 2016

shaunakv1 commented Jun 10, 2017

mattbrictson commented Jun 10, 2017

wbreeze commented Apr 11, 2018

mattbrictson commented Apr 11, 2018

wbreeze commented Apr 11, 2018

vifreefly commented Nov 20, 2018

JohnSmall commented Nov 21, 2018

mattbrictson commented Nov 22, 2018

vifreefly commented Nov 22, 2018 • edited

vifreefly commented Nov 22, 2018 • edited

wbreeze commented Nov 22, 2018

vifreefly commented Nov 22, 2018

wbreeze commented Nov 22, 2018

mulaiko commented Sep 6, 2019

thbar commented Sep 6, 2019

remcode-remco commented Jan 5, 2022

vifreefly commented Nov 22, 2018 •

edited

vifreefly commented Nov 22, 2018 •

edited