ActiveRecord vs SQL
Hi, %username%!
Recently I began studying Ruby on Rails and faced the task of importing data from a CSV file into a MySQL table. The task is not difficult and I wrote the code pretty quickly, but I was very surprised that while it was running I had time to pour myself a coffee and smoke a cigarette.
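For context, reading such a CSV might look like the following; a minimal sketch, assuming a headerless file whose five columns map to the integer fields used below (the path and field names are placeholders, not from the original code):

```ruby
require 'csv'

# Parse each CSV row into a hash of integer fields.
# The field names mirror the test table used later; the path is hypothetical.
def rows_from_csv(path)
  CSV.read(path).map do |row|
    { :field1 => row[0].to_i, :field2 => row[1].to_i, :field3 => row[2].to_i,
      :field4 => row[3].to_i, :field5 => row[4].to_i }
  end
end
```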
"Something is wrong here!" I thought, and started digging.
For testing, I created a TestObject model with the following table:
create_table :test_objects do |t|
  t.column :field1, :integer
  t.column :field2, :integer
  t.column :field3, :integer
  t.column :field4, :integer
  t.column :field5, :integer
end
and wrote a small script:
values = []
5000.times do
  values.push({ :field1 => rand(10000), :field2 => rand(10000), :field3 => rand(10000),
                :field4 => rand(10000), :field5 => rand(10000) })
end

values.each do |item|
  TestObject.new(item).save
end
Runtime: ~30 seconds in the development environment, ~22 seconds in production. Far too long...
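The timings here and below can be measured with Ruby's standard Benchmark module; a minimal sketch, where the block stands in for whichever insert loop is being measured:

```ruby
require 'benchmark'

# Benchmark.realtime returns the wall-clock seconds the block took.
elapsed = Benchmark.realtime do
  # stand-in workload for the insert loop being timed
  10_000.times { |i| i * i }
end
puts "Runtime: ~#{elapsed.round(2)} seconds"
```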
Then I remembered the ar-extensions plugin, which I had come across while looking for a way to run SQL queries of the form "INSERT ... ON DUPLICATE KEY UPDATE" from Rails. It lets you insert data into a table with a single query.
require 'ar-extensions'
require 'ar-extensions/adapters/mysql'
require 'ar-extensions/import/mysql'
....
objs = []
values.each do |item|
  objs.push(TestObject.new(item))
end
TestObject.import objs
Runtime: ~14 seconds in the development environment, ~12 seconds in production. Better, but still slow.
Then I decided to drop ActiveRecord in this piece of code and use a plain SQL query:
sql = ActiveRecord::Base.connection()
values.each do |item|
  sql.execute("INSERT INTO `test_objects` (`field1`, `field2`, `field3`, `field4`, `field5`) VALUES ('#{item[:field1]}', '#{item[:field2]}', '#{item[:field3]}', '#{item[:field4]}', '#{item[:field5]}')")
end
Runtime: ~5 seconds in the development environment, ~3.5 seconds in production.
And with a transaction:
sql = ActiveRecord::Base.connection()
sql.execute("START TRANSACTION")
values.each do |item|
  sql.execute("INSERT INTO `test_objects` (`field1`, `field2`, `field3`, `field4`, `field5`) VALUES ('#{item[:field1]}', '#{item[:field2]}', '#{item[:field3]}', '#{item[:field4]}', '#{item[:field5]}')")
end
sql.execute("COMMIT")
Runtime: ~2 seconds in the development environment, ~0.8 seconds in production! Much faster than with ActiveRecord!
And with a bulk insert (thanks to CWN and Ventura):
objs = []
values.each do |item|
  objs.push("('#{item[:field1]}', '#{item[:field2]}', '#{item[:field3]}', '#{item[:field4]}', '#{item[:field5]}')")
end
sql.execute("INSERT INTO `test_objects` (`field1`, `field2`, `field3`, `field4`, `field5`) VALUES " + objs.join(','))
Runtime: ~0.1-0.2 seconds in production!
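One caveat with a single giant INSERT: MySQL caps the size of a query via its max_allowed_packet setting, so for much larger imports it may be safer to split the VALUES list into chunks. A sketch using each_slice (the chunk size of 1000 is an arbitrary choice for illustration):

```ruby
# Build one INSERT statement per chunk of rows instead of one huge query.
# The chunk size is a hypothetical value; tune it to your max_allowed_packet.
def bulk_insert_statements(values, chunk_size = 1000)
  values.each_slice(chunk_size).map do |chunk|
    rows = chunk.map do |item|
      "('#{item[:field1]}', '#{item[:field2]}', '#{item[:field3]}', '#{item[:field4]}', '#{item[:field5]}')"
    end
    "INSERT INTO `test_objects` (`field1`, `field2`, `field3`, `field4`, `field5`) VALUES " + rows.join(',')
  end
end
```

Each returned statement would then be passed to sql.execute in turn, optionally inside the same transaction as above.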
Conclusion: ActiveRecord is a damn convenient thing, and I am by no means urging you to abandon it, but in those parts of the code where its rich features are not needed and speed matters, plain SQL queries are the better choice.
UPD: added production runtimes and the bulk-insert test.