ActiveRecord vs SQL

    Hi, %username%!


    Recently I started learning Ruby on Rails and ran into the task of importing data from a CSV file into a MySQL table. The task is not difficult and I wrote the code pretty quickly, but I was quite surprised that while it was running I managed to pour myself a coffee and smoke a cigarette.

    "Something is wrong here!" I thought, and started digging.

    For testing, I created a TestObject model and a table in the database:

    create_table :test_objects do |t|
    	t.column :field1, :integer
    	t.column :field2, :integer
    	t.column :field3, :integer
    	t.column :field4, :integer
    	t.column :field5, :integer
    end
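
    The model itself is presumably just an empty ActiveRecord class (a minimal sketch, the original post does not show it):

    class TestObject < ActiveRecord::Base
    end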
    

    and wrote a small script:

    values = []
    5000.times do
    	values.push({ :field1 => rand(10000), :field2 => rand(10000), :field3 => rand(10000),
    				  :field4 => rand(10000), :field5 => rand(10000) })
    end
    values.each do |item|
    	TestObject.new(item).save
    end
    

    Runtime: ~30 seconds in the development environment, ~22 seconds in production. Too much...
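
    The timings here and below can be reproduced with something like Ruby's Benchmark module (a sketch, the original does not show how the times were measured):

    require 'benchmark'

    # measure the one-by-one save approach
    seconds = Benchmark.realtime do
    	values.each { |item| TestObject.new(item).save }
    end
    puts "one-by-one save: #{seconds.round(1)} s"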

    Then I remembered the ar-extensions plugin, which I had come across while looking for a way to run SQL queries of the form "INSERT ... ON DUPLICATE KEY UPDATE" from Rails. It allows you to insert data into a table with a single query.
    require 'ar-extensions'
    require 'ar-extensions/adapters/mysql'
    require 'ar-extensions/import/mysql'
    ....
    objs = []
    values.each do |item|
    	objs.push(TestObject.new(item))
    end
    TestObject.import objs
    

    Runtime: ~14 seconds in the development environment, ~12 seconds in production. Already better, but still a lot.
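
    For reference, the "INSERT ... ON DUPLICATE KEY UPDATE" form mentioned above looks roughly like this in plain MySQL; a purely illustrative sketch, since my test table has no unique key:

    # assumes a unique index on field1, which test_objects does not actually have
    ActiveRecord::Base.connection.execute(
    	"INSERT INTO `test_objects` (`field1`, `field2`) VALUES (1, 2)
    	 ON DUPLICATE KEY UPDATE `field2` = VALUES(`field2`)")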

    Then I decided to try dropping ActiveRecord in this piece of code and using a plain SQL query:

    sql = ActiveRecord::Base.connection()
    values.each do |item|
    	sql.execute("INSERT INTO `test_objects` (`field1`, `field2`, `field3`, `field4`, `field5`)
    		VALUES ('#{item[:field1]}', '#{item[:field2]}', '#{item[:field3]}', '#{item[:field4]}', '#{item[:field5]}')")
    end
    

    Runtime: ~5 seconds in the development environment, ~3.5 seconds in production.
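
    Note that interpolating values straight into the SQL string is only safe here because the test data comes from rand; for real CSV input the values should be escaped, e.g. with the connection's quote method (a sketch, not part of the original measurements):

    values.each do |item|
    	# quote each value before building the statement
    	fields = [:field1, :field2, :field3, :field4, :field5].map { |f| sql.quote(item[f]) }
    	sql.execute("INSERT INTO `test_objects` (`field1`, `field2`, `field3`, `field4`, `field5`)
    		VALUES (#{fields.join(', ')})")
    end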

    And if you use a transaction:

    sql = ActiveRecord::Base.connection()
    sql.execute("START TRANSACTION")
    values.each do |item|
    	sql.execute("INSERT INTO `test_objects` (`field1`, `field2`, `field3`, `field4`, `field5`)
    		VALUES ('#{item[:field1]}', '#{item[:field2]}', '#{item[:field3]}', '#{item[:field4]}', '#{item[:field5]}')")
    end
    sql.execute("COMMIT")
    

    Runtime: ~2 seconds in the development environment, ~0.8 seconds in production! Much faster than going through ActiveRecord!
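
    The same effect can be had without raw START TRANSACTION / COMMIT by wrapping the loop in ActiveRecord's own transaction helper (a sketch, I would expect comparable timings, though I did not measure this variant):

    TestObject.transaction do
    	values.each do |item|
    		sql.execute("INSERT INTO `test_objects` (`field1`, `field2`, `field3`, `field4`, `field5`)
    			VALUES ('#{item[:field1]}', '#{item[:field2]}', '#{item[:field3]}', '#{item[:field4]}', '#{item[:field5]}')")
    	end
    end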

    And with a bulk insert (thanks to CWN and Ventura):
    objs = []
    values.each do |item|
    	objs.push("('#{item[:field1]}', '#{item[:field2]}', '#{item[:field3]}', '#{item[:field4]}', '#{item[:field5]}')")
    end
    sql.execute("INSERT INTO `test_objects` (`field1`, `field2`, `field3`, `field4`, `field5`) VALUES " + objs.join(','))
    

    Runtime: ~0.1-0.2 seconds in production!
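
    One caveat with a single huge INSERT: on really large imports the statement can exceed MySQL's max_allowed_packet, so it may be safer to send the rows in slices (a sketch, not benchmarked here):

    objs.each_slice(1000) do |chunk|
    	# one INSERT per slice of 1000 rows keeps each statement reasonably sized
    	sql.execute("INSERT INTO `test_objects` (`field1`, `field2`, `field3`, `field4`, `field5`) VALUES " + chunk.join(','))
    end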

    Conclusion: ActiveRecord is a damn convenient thing, and I am in no way urging you to give it up, but in those parts of the code where its rich capabilities are not needed and speed matters, plain SQL queries are the better choice.

    UPD: added runtimes for the production environment and a test using bulk insert.

