Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

schema.rb view order changes #263

Closed
bradrobertson opened this issue Dec 15, 2018 · 30 comments
Closed

schema.rb view order changes #263

bradrobertson opened this issue Dec 15, 2018 · 30 comments
Milestone

Comments

@bradrobertson
Copy link

I'm finding that with each migration we do, the view definitions get moved around in my schema file. This leads to unnecessary schema.rb changes that make it pretty difficult to track the actual changes to the file.

Does anyone else have this issue?

@fujimura
Copy link
Contributor

This is also happening with structure.sql. pg_dump seems to be outputting the result in disk order and I'm suspecting it, but not sure the behavior of pg_dump affects the result of schema.rb.

@GeorgeKaraszi
Copy link

GeorgeKaraszi commented Jan 4, 2019

So this was driving me nuts for awhile too. After a simple investigation into the matter. I found that the schema being generated doesn't sort view tables prior to being committed. (Rails SchemaDumper sorts tables to avoid this very issue in the method this gem is overriding / appending to)

Quick fix(hack):

/config/initializers/scenic_view.rb

# frozen_string_literal: true

# ScenicView doesn't sort their view's when committing to the schema.rb file
# This is causing a lot of issue's with git since the ordering isn't guaranteed.
#
# Problem area:
# https://github.com/scenic-views/scenic/blob/3d44c625919cc327b25190752b9b3b61cab22a8e/lib/scenic/schema_dumper.rb#L16
module SchemaDumperDecorator
  private

  def dumpable_views_in_database(*args)
    super.sort_by(&:name)
  end
end

ActiveRecord::SchemaDumper.prepend(SchemaDumperDecorator)

@derekprior
Copy link
Contributor

@GeorgeKaraszi That is the code earlier versions of Scenic had. This will break as soon as you have one view that depends on another view that comes after it alphabetically. We need to use a topological sort. There is an open PR which does so. See: https://github.com/scenic-views/scenic/pull/191

It's been a long while since I looked at it, but I believe the issue is that it needs more tests to prove it does what its supposed to.

@GeorgeKaraszi
Copy link

@derekprior ugh... Yea that does make complete sense. On that account, I wonder if we could build some momentum again to wrap that PR up to avoid these irritating schema changes then.

@leewaa
Copy link

leewaa commented Aug 8, 2019

Would be nice to get moving with this one as nearly every project I work with that has scenic, faces this problem. It does get quiet tiresome having to deal with it.

@ritchiey
Copy link

ritchiey commented Sep 4, 2019

I seem to just get a 404 when I try to access that PR https://github.com/scenic-views/scenic/pull/191

@johnmosesman
Copy link

Also getting a 404 on that PR. Is this issue being considered to be worked on? Makes diffing the views if they move places in the file impossible.

@derekprior
Copy link
Contributor

It appears that the user that opened the PR deleted their account.

@sfcgeorge
Copy link

We're increasingly getting failures running rails:db schema:load because we have views depending on views.

I might try patching schema dumper so we don't have to keep manually fixing schema.rb and open an initial PR other people can use until we have a more mergeable version.

@sfcgeorge
Copy link

sfcgeorge commented Dec 16, 2019

I have written a patch that seems to work, for our schema at least. It is logically very simple, probably too simple. The basic idea is:

  • First sort alphabetically for consistency and ease of finding view definitions.
  • Then swap view around to fix dependencies.

The basic algorithm is:

  • Until the order stops changing
    • For each view
      • Get all the previous views
      • Check for their name within the current view's definition
      • If it's found swap the current view with the previous one, then loop again

It may not swap the correct 2 first time but it'll get there in the end. It only does 1 swap per iteration but it gets there in the end. This is all bad for performance but doesn't matter in practice and keeps the logic understandable.

Things to consider (that I think are fine but unchecked):

  • Comments mentioning views (Posgres strips comments anyway, not a concern).
  • Column names that might happen to be the same as a view name (the Regexp should handle this... unless it's the very last column)
  • Cyclic dependencies (I think these wouldn't be valid in Postgres itself but unsure, I've added a max sort attempts break out just in case)

I'm assuming you'd want it written very differently and with tests before its mergeable, this is just a simple as possible proof of concept.

https://gist.github.com/sfcgeorge/6b28e4f038e50a2e642fc14eacbc07c9

@jmmastey
Copy link

Hey y'all. I can't see #191 anymore, so that's sad. But I do see a topo sort in #208? If I could get some tests to prove out that code a little better, would that be a potentially acceptable solution?

@sfcgeorge
Copy link

@jmmastey no, a naive sort by name breaks if there are dependencies between views. My patch above does handle dependencies but also needs tests.

@jmmastey
Copy link

Sorry, I should have been more specific! This comment includes a query for topo sort:
#208 (comment)

@calebhearth
Copy link
Contributor

calebhearth commented Feb 16, 2020 via email

@icheishvili
Copy link

#208 was originally mine but it felt like the patch wasn't wanted or maybe I misread. What's the hurdle to clear for a decent test? View definitions are a DAG by construction so my recommendation is setting up a complex version of said structure and showing that a proper query on pg_catalog produces the expected result.

@oboxodo
Copy link

oboxodo commented May 29, 2020

@icheishvili I don't think your proposal wasn't wanted at all. It "just" needed proper automated tests.

@hansololai
Copy link

hansololai commented Aug 24, 2020

I just used the t-sort in the dumper, so far we haven't have any problems with i

class TsortableHash < Hash
 include TSort

 alias tsort_each_node each_key
 def tsort_each_child(node, &block)
 fetch(node).each(&block)
 end
end
module Scenic
  module SchemaDumperDecorator
  # query for the dependencies between views to views. 
    DEPENDENT_SQL = <<~SQL
      select distinct dependent_ns.nspname as dependent_schema
      , dependent_view.relname as dependent_view 
      , source_ns.nspname as source_schema
      , source_table.relname as source_table
      FROM pg_depend 
      JOIN pg_rewrite ON pg_depend.objid = pg_rewrite.oid 
      JOIN pg_class as dependent_view ON pg_rewrite.ev_class = dependent_view.oid 
      JOIN pg_class as source_table ON pg_depend.refobjid = source_table.oid
      JOIN pg_namespace dependent_ns ON dependent_ns.oid = dependent_view.relnamespace
      JOIN pg_namespace source_ns ON source_ns.oid = source_table.relnamespace
      where dependent_ns.nspname  = 'public' and source_ns.nspname = 'public' and source_table.relname != dependent_view.relname
      and source_table.relkind in ('v','m') and dependent_view.relkind in ('v','m')
      order by dependent_view.relname;
    SQL
    private

   # when dumping the views, their order must be topological sorted
    def dumpable_views_in_database(*args)
      return @ordered_dumpable_views_in_database if @ordered_dumpable_views_in_database
      existing_views = super(*args)
      all_view_names = Set.new(existing_views.map(&:name))
      # Need to do topological sorting
      all_dependencies = Scenic.database.execute(DEPENDENT_SQL)
      hash = TsortableHash.new
      all_dependencies.each do |r|
        source_v = r["source_table"]
        dependent = r["dependent_view"]
        hash[dependent] = [] if !hash[dependent]
        hash[source_v] = [] if !hash[source_v]
        hash[dependent] << source_v
        all_view_names.delete(source_v)
        all_view_names.delete(dependent)
      end
      # after dependencies, there might be some view's left that don't have any dependencies
      all_view_names.each do |v|
        hash[v] = []
      end
      ordered = hash.tsort
      @ordered_dumpable_views_in_database = ordered.map do |v|
        existing_views.find{|sv| sv.name == v}
      end
      @ordered_dumpable_views_in_database
    end
  end
end

Scenic::SchemaDumper.prepend(Scenic::SchemaDumperDecorator)

@gsar
Copy link

gsar commented Jul 6, 2021

@hansololai thanks! i noticed your code there doesn't maintain sorted order for views without any dependencies, so i'm using this modified version (the only meaningful change is to add .sort to existing_view.map(&:name)):

class TsortableHash < Hash
  include TSort

  alias tsort_each_node each_key

  def tsort_each_child(node, &block)
    fetch(node).each(&block)
  end
end

module Scenic
  module SchemaDumperDecorator
    # query for the dependencies between views to views.
    DEPENDENT_SQL = <<~SQL.squish.freeze
      SELECT DISTINCT dependent_ns.nspname AS dependent_schema,
                      dependent_view.relname AS dependent_view,
                      source_ns.nspname AS source_schema,
                      source_table.relname AS source_table
      FROM pg_depend
        JOIN pg_rewrite ON pg_depend.objid = pg_rewrite.oid
        JOIN pg_class AS dependent_view ON pg_rewrite.ev_class = dependent_view.oid
        JOIN pg_class AS source_table ON pg_depend.refobjid = source_table.oid
        JOIN pg_namespace dependent_ns ON dependent_ns.oid = dependent_view.relnamespace
        JOIN pg_namespace source_ns ON source_ns.oid = source_table.relnamespace
      WHERE dependent_ns.nspname  = 'public'
        AND source_ns.nspname = 'public'
        AND source_table.relname != dependent_view.relname
        AND source_table.relkind IN ('v','m')
        AND dependent_view.relkind IN ('v','m')
      ORDER BY dependent_view.relname;
    SQL

    private

    # when dumping the views, their order must be topological sorted
    def dumpable_views_in_database(*args)
      return @ordered_dumpable_views_in_database if @ordered_dumpable_views_in_database

      existing_views = super(*args)
      all_view_names = Set.new(existing_views.map(&:name).sort)
      # Need to do topological sorting
      all_dependencies = Scenic.database.execute(DEPENDENT_SQL)
      hash = TsortableHash.new
      all_dependencies.each do |r|
        source_v = r['source_table']
        dependent = r['dependent_view']
        hash[dependent] = [] unless hash[dependent]
        hash[source_v] = [] unless hash[source_v]
        hash[dependent] << source_v
        all_view_names.delete(source_v)
        all_view_names.delete(dependent)
      end
      # after dependencies, there might be some view's left that don't have any dependencies
      all_view_names.each do |v|
        hash[v] = []
      end
      ordered = hash.tsort
      @ordered_dumpable_views_in_database = ordered.map do |v|
        existing_views.find { |sv| sv.name == v }
      end
      @ordered_dumpable_views_in_database
    end
  end
end

Scenic::SchemaDumper.prepend(Scenic::SchemaDumperDecorator)

@rusikf
Copy link

rusikf commented Jul 12, 2021

Hi, can you please add this to the release?

@hansololai
Copy link

I think this repo is not maintained anymore, I think making PR is useless here, best is that you put a file in the initialization folder in rails and manually monkey patch it. @rusikf

@derekprior
Copy link
Contributor

I don't think it's fair to say the repository or project is not maintained any longer. The truth is that at this stage of development we are intentionally conservative in changes we'll accept. While the project is not without its bugs and annoyances, it does serve thousands of users just fine as is. We've tried solving this problem before only to introduce new ones with the solution, so we're going to need more certainty. I would love for this issue to be fixed as well, but doing so will require a well-tested PR and possibly a few users willing to QA a pre-release version.

@hansololai
Copy link

Ok, I apologize for saying it is not maintained. I understand the difficulty of maintaining an open source package the serves so many users, me being one of them. We definite want this project to move forward. I am just suggesting to the previous commenter that it is faster to monkey patch instead of waiting for a new release. This repo may be @maintained but is not been actively developed.

@olegykz
Copy link

olegykz commented Nov 10, 2021

Here is my adaptation of the @gsar snippet to make it work with MS SQL server:

module Scenic
  module SchemaDumperDecorator
    private

    def get_dependencies(view_name)
      Scenic.database.send(:connection).select_all(
        'SELECT DISTINCT referenced_entity_name FROM '\
        "sys.dm_sql_referenced_entities(#{ActiveRecord::Base.connection.quote("dbo.#{view_name}")}, 'OBJECT')"
      ).to_a.flat_map(&:values)
    end

    # when dumping the views, their order must be topological sorted
    def dumpable_views_in_database(*args)
      return @ordered_dumpable_views_in_database if @ordered_dumpable_views_in_database

      existing_views = super(*args)
      all_view_names = Set.new(existing_views.map(&:name).sort)

      hash = TsortableHash.new
      existing_views.each do |source_view|
        get_dependencies(source_view.name).
          select { |dependency| dependency.in?(all_view_names) }.
          each { |dependency| (hash[dependency] ||= []) << source_view.name }
      end

      all_view_names.each do |v|
        hash[v] = []
      end
      ordered = hash.tsort
      @ordered_dumpable_views_in_database = ordered.map do |v|
        existing_views.find { |sv| sv.name == v }
      end
      @ordered_dumpable_views_in_database
    end
  end
end

@alexch
Copy link

alexch commented Apr 21, 2022

cc @marcoroth

@alxmagro
Copy link

+1

@alxmagro
Copy link

@derekprior could you provide at least an interface for we configure sort by hand? Ex.

# config/initializers/scenic.rb
Scenic.config.views_order = %i[foo_view bar_view foz_view]

It would help a lot.

robcole added a commit to nomadlabsinc/scenic that referenced this issue Oct 18, 2022
@scottjacobsen
Copy link

There appears to be a problem here (at least for me) - if ActiveRecord::SchemaDumper.prepend Scenic::SchemaDumper in lib/scenic.rb runs before the prepend in the initializer then the prepended method does not run. It is essentially a noop. This is surprising to me, but it seems to be how ruby works. The code in lib/scenic.rb runs before my initializers.

class MyClass
  def name
    puts "my name is MyClass"
  end
end

module FirstPrepend
  def name
    puts "my name is FirstPrepend"
    super
  end
end

module SecondPrepend
  def name
    puts "my name is SecondPrepend"
    super
  end
end

MyClass.prepend FirstPrepend
FirstPrepend.prepend SecondPrepend
puts MyClass.new.name

outputs

my name is FirstPrepend
my name is MyClass

Therefor I had to prepend the decorator directly to ActiveRecord::SchemaDumper:

ActiveRecord::SchemaDumper.prepend(Scenic::SchemaDumperDecorator)

@jaydorsey
Copy link

@derekprior could you provide at least an interface for we configure sort by hand? Ex.

# config/initializers/scenic.rb
Scenic.config.views_order = %i[foo_view bar_view foz_view]

It would help a lot.

A configuration to opt into an breaks-for-some order would be OK in our situation. We could just easily patch it w/ the earlier config changes, but making it a config option as part of the gem would let us start gathering & testing specs collectively, without adversely affecting anyone who's using the gem right now

In our case, I don't believe we build views of views so ours are safe in any consistent order.

Scenic.configure do |config|
  # Maybe just "name" is fine, and we Rails.logger.warn?
  config.schema_dump_order = :unsafe_name 
  # Or maybe it's just a boolean?
  config.risky_schema_dump_order = true
end

I'm open to writing a PR + spec for this optional flag if it's something that would be considered for inclusion in the gem/there is interest. If not, I understand and we can just patch it for our system

Related, found this comment for future reference (kind of a bummer that PR 191 was deleted so we can't see it) #208 (comment)

@calebhearth
Copy link
Contributor

#398 should resolve this, pending one spec failure.

@calebhearth
Copy link
Contributor

#416

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests