Using SQL in Rails

Original author: Alexey Osipenko
  • Transfer

Firstly, the article is not about how I love rails, and secondly, the article is not about how I hate them. They can be treated completely differently and they will become better only if they are changed. And they can only become worse if they start to change. Well, in general, I warned you, and you understood me.


One of the basic concepts of ActiveRecord is that the database is quite utilitarian and can be changed. Well, you are sitting, writing your own models using MySQL and suddenly you read somewhere that you can take MySQL and replace it with MongoDB. Well, not so radically, but, say, with PostgreSQL you may have reasons to replace MySQL. Or vice versa, I have nothing against MySQL. Here ActiveRecord claims that it is easier to do this for you, they say, before / after filters and associations are abstract enough not to worry about database queries and take care of the logic of the application. What instead WHERE is_archived = TRUEyou happily writewhere(is_archived: true)and ActiveRecord will do everything for you. All examples will be provided for PostgreSQL, not for MySQL, so MySQL users will have to reinvent their own bicycle.



But how not so! In practice, it turns out that this layer of abstraction is completely full of holes, like a trough from the tale of the Golden Fish. And that many basic features cannot be used, like comparing dates or working with arrays. And scopes are obtained with forced where("#{quoted_table_name}.finished_at >= ?", Date.current)or where("#{quoted_table_name}.other_ids <@ ARRAY[?]", ids). To which ActiveRecord gives a quite conscious and logical answer: do not use it. Instead of arrays, use a habtm connection, and if you need to compare dates, live with it. Yes, and God forbid you to miss quoted_table_namein such a scoop - the first one includesor joinsput everything in its place. It is easier to always and everywhere write, so as not to knock down your hand.


And, of course, as soon as you decide to interfere with ActiveRecord, there will be no way back. Not that the odds, even the ghostly hopes for a painless transition to another database will be. It will be much easier to print and burn this source code. And of course, there is no other reason not to use extra database capabilities in your application. Use on health and force others!


And when it turns out that your scoops in the models folder are more than half of these extra utilities, it will become quite obvious that ActiveRecord is just a handy wrapper for integrating one piece of code with tablets with another piece of code. And, like where(is_archived: true).joins(:sprint).merge(Sprint.archived), it will work fine and it will be no more difficult to combine them than to cook an egg, right?



The next stage will be denormalization. No, denormalization, as it were, always did not disappear anywhere, but concern for her was placed on the mighty shoulders of Rails and ActiveRecord, and you know that these two guys were not distinguished by promptness and asceticism in the requirements for resources. Let's say counter_cache: true- this is the first step towards denormalization, because COUNT(*) AS sprints_countyou simply won't allow you to take ActiveRecord (you don't want to change select(), right?). And furthercounter_cachenot far from perfect, and in some cases, the real number can be out of sync from the cached one. Uncritically, of course, but unpleasant. And this is only the first candidate in order to settle in the database and not load the already loaded head of the ruby-machine. Just a couple of triggers and you're done! Firstly, when deleting and adding a new record to the plate A, you need to recalculate the number of records in plate B and that's it, right? Well, when editing, of course, if foreign_keychanged, because the query UPDATE B SET a_id = $1 WHERE id = $2will break counter_cache for both the old A and the new.


CREATEOR REPLACEFUNCTION update_#{parent_table}_#{child_table}_counter_on_insert() RETURNSTRIGGERAS $$BEGINUPDATE #{parent_table} SET #{counter_column} = COALESCE((SELECT COUNT(id) FROM #{child_table} GROUPBY #{foreign_column} HAVING #{foreign_column} = NEW.#{foreign_column}), 0) WHERE (#{parent_table}.id = NEW.#{foreign_column});
    RETURNNULL;
  END;
  $$LANGUAGE plpgsql;

The next path of the database works will be related to the date-time. And for starters, let's just field created_atand updated_atwill serve in the database, good, it is much easier. First we set defaults:


  change_column_default :table_name, :created_at, -> { 'CURRENT_TIMESTAMP' }
  change_column_default :table_name, :updated_at, -> { 'CURRENT_TIMESTAMP' }

And to immediately do it everywhere, you can organize a cycle on all the labels, where there are these fields. In addition to the tables schema_migrationsand ar_internal_metadata, of course:


  (tables - %w(schema_migrations ar_internal_metadata)).each { ... }

Everything, now the default value for these tables will be exactly the way we need it. And now is the time to take care that the rails do not touch these fields. This is done with two bolts in the right places. And yes, there is an option in setting up the framework that is responsible for this:


Rails.application.config.active_record.record_timestamps = false

So, the next step is to update the field updated_atat the time of updating the record. It's simple:


CREATEOR REPLACEFUNCTION touch_for_#{table_name}_on_update() RETURNSTRIGGERAS $$BEGINSELECTCURRENT_TIMESTAMPINTONEW.updated_at;
    RETURNNEW;
  END;
  $$LANGUAGE plpgsql;

Now you need to completely get rid of touch: truethe models. This thing is very much like a dash target - also completely full of holes. And I will not even explain why, because you already know all these cases. This is not much more complicated, all you need is to update updated_at not only for yourself:


CREATEOR REPLACEFUNCTION touch_for_#{table_name}_on_update() RETURNSTRIGGERAS $$BEGINUPDATE foreign_table_name SET updated_at = CURRENT_TIMESTAMPWHERE id = NEW.foreign_column_name;
    SELECTCURRENT_TIMESTAMPINTONEW.updated_at;
    RETURNNEW;
  END;
  $$LANGUAGE plpgsql;

Of course, the chain of calls for such triggers will do an extra action, but in the postgree of the sane mechanism it will trigger the triggers without changing the entry itself. You can try to do it SET title = title, but it turns out nothing better than SET updated_at = CURRENT_TIMESTAMP.


Exactly the same trigger will be on the insert, only it is updated_atnot necessary to update :


CREATEOR REPLACEFUNCTION touch_for_#{table_name}_on_insert() RETURNSTRIGGERAS $$BEGINUPDATE foreign_table_name SET updated_at = CURRENT_TIMESTAMPWHERE id = NEW.foreign_column_name;
    RETURNNEW;
  END;
  $$LANGUAGE plpgsql;

Of course, you could try to write this with a single function, adding a check for the current event directly inside the trigger, like IF TG_OP = 'UPDATE' THEN, but it is preferable to make all the triggers as simple as possible to reduce the likelihood of error.


You may want to somehow automate the generation of such triggers, and then you will probably need to find all foreign links between the current table and the rest. With this request you can easily do it:


SELECT ccu.table_nameAS foreign_table_name, kcu.column_nameAScolumn_nameFROM information_schema.table_constraints AS tc
    JOIN information_schema.key_column_usage AS kcu
    ON tc.constraint_name = kcu.constraint_nameJOIN information_schema.constraint_column_usage AS ccu
    ON ccu.constraint_name = tc.constraint_nameWHERE constraint_type = 'FOREIGN KEY'AND tc.table_name = '#{table_name}'ORDERBY ccu.table_name;

Another very helpful tip. Call triggers monotonously to be able to verify that one request is present or not, for example, this query will find all touch-insert triggers:


SELECT routine_name ASnameFROM information_schema.routinesWHERE
    routine_name LIKE'touch_for_%_on_insert'AND
    routine_type ='FUNCTION'AND
    specific_schema='public';

And the last thing left is the worst. The fact is that the rails are not designed for at least some clever database and absolutely did not care if the database could change at least something other than the field id, and then only when inserted. Therefore, there is no sane mechanism to add RETURNING updated_atto update requests, no, it will be necessary to dive into the insides of Rails to the ears.



The mankipatch turned out not so much neat, but first of all the goal was to damage the current work of the framework as little as possible.


I will give it in full
moduleActiveRecordmodulePersistence# https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/persistence.rb#L729-L741def_create_record(attribute_names = self.attribute_names)
      attribute_names &= self.class.column_names
      attributes_values = attributes_with_values_for_create(attribute_names)
      an_id, *affected_rows = self.class._insert_record(attributes_values).dup
      self.id ||= an_id ifself.class.primary_key
      Hash[ApplicationRecord.custom_returning_columns(self.class.quoted_table_name, :create).take(affected_rows.size).zip(affected_rows)].each do|column_name, value|
        public_send("#{column_name}=", self.class.attribute_types[column_name.to_s].deserialize(value)) if value
      end
      @new_record = falseyield(self) if block_given?
      id
    end
    private :_create_record# https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/persistence.rb#L710-L725def_update_record(attribute_names = self.attribute_names)
      attribute_names &= self.class.column_names
      attribute_names = attributes_for_update(attribute_names)
      if attribute_names.empty?
        affected_rows = []
        @_trigger_update_callback = trueelse
        affected_rows = _update_row(attribute_names)
        @_trigger_update_callback = affected_rows.any?
      end
      Hash[ApplicationRecord.custom_returning_columns(self.class.quoted_table_name, :update).take(affected_rows.size).zip(affected_rows)].each do|column_name, value|
        public_send("#{column_name}=", self.class.attribute_types[column_name.to_s].deserialize(value))
      endyield(self) if block_given?
      affected_rows.none? ? 0 : 1end
    private :_update_recordendmoduleConnectionAdaptersmodulePostgreSQLmoduleDatabaseStatements# https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/connection_adapters/postgresql/database_statements.rb#L93-L96defexec_update(sql, name = nil, binds = [])
          execute_and_clear(sql_with_returning(sql), name, binds) { |result| Array.wrap(result.values.first) }
        end# https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb#L147-L152definsert(arel, name = nil, pk = nil, _id_value = nil, sequence_name = nil, binds = [])
          sql, binds = to_sql_and_binds(arel, binds)
          exec_insert(sql, name, binds, pk, sequence_name).rows.first
        endalias create insert
        # https://github.com/rails/rails/blob/v5.2.0/activerecord/lib/active_record/connection_adapters/postgresql/database_statements.rb#L98-L111defsql_for_insert(sql, pk, id_value, sequence_name, binds)# :nodoc:
          table_ref = extract_table_ref_from_insert_sql(sql)
          if pk.nil?
            # Extract the table from the insert sql. Yuck.
            pk = primary_key(table_ref) if table_ref
          end
          returning_columns = quote_returning_column_names(table_ref, pk, :create)
          if returning_columns.any?
            sql = "#{sql} RETURNING #{returning_columns.join(', ')}"endsuperend# No source in original repodefquote_returning_column_names(table_ref, pk, action)
          returning_columns = []
          returning_columns << pk if suppress_composite_primary_key(pk)
          returning_columns += ApplicationRecord.custom_returning_columns(table_ref, action)
          returning_columns.map { |column| quote_column_name(column) }
        end# No source in original repodefsql_with_returning(sql)
          table_ref = extract_table_ref_from_update_sql(sql)
          returning_columns = quote_returning_column_names(table_ref, nil, :update)
          return sql if returning_columns.blank?
          "#{sql} RETURNING #{returning_columns.join(', ')}"end# No source in original repodefextract_table_ref_from_update_sql(sql)
          sql[/update\s("[A-Za-z0-9_."\[\]\s]+"|[A-Za-z0-9_."\[\]]+)\s*set/im]
          Regexp.last_match(1)&.strip
        endendendendend

The most important thing is that there is a call to ApplicationRecord.custom_returning_columnsfind out what kind of columns, besides id, we are interested in. And this method looks something like this:


class << selfdefcustom_returning_columns(table_ref, action)return [] if ['"schema_migrations"', '"ar_internal_metadata"'].include?(table_ref)
      res = []
      res << :created_atif action == :create
      res << :updated_at
      res += case table_ref
             when'"user_applications"'
               [:api_token]
             when'"users"'
               [:session_salt, :password_changed_at]
             # ...else
               []
             end
      res
    endend



Instead of conclusions, we can say that, the sick head of Rails became a little less sick. Such routine processes, as counter_cachewell as touchsink into oblivion, and in the next article will be able to think about something more global, such as removing hanging spaces, data validation, cascading data deletion, or paranoid deletion. If you liked this article, of course.


Also popular now: