iago December 26, 2011 at 21:15

Optimization of work with SQLite for iOS

“Think of SQLite not as a replacement for Oracle but as a replacement for fopen ()”
- About SQLite

And also, most likely, under Android, BlackBerry and in the sandbox of browsers for web applications, but I did not check.

Why it may be necessary to work with SQLite directly?

Any experienced iOS developer will immediately reproach me for using SQLite directly (or rather not directly, but through FmDb, but this is almost the same as directly). He will say that you need to use CoreData, because it does a lot of nishtyak automatically, such as Undo and Redo. And in it you can draw beautiful sketches, which are then pleasant to show to the customer. And in android, for example, there is OrmLite.

And I agree - but until then, until your base has exceeded, say, over 10 tables of 500,000 records each. And if there are 52 tables, and there are especially fat tables for a million or more? And the database needs to be synchronized with the server through the third format, besides the customer is critical, will the synchronization last an hour or five? If you have encountered tasks of such a volume, welcome to cut! If you haven’t met, then too, because no one is immune from projects with big data, even if there are less of such mobile ones.

The order of query operators

The fact is that when we are taught to work with a database, in most cases we are taught to work with the best of enterprise solutions. For example, I was taught at the institute on Oracle, someone on MS SQL. But SQLite is many times simpler - this, for example, follows from the epigraph to the article taken from the official SQLite website .

Quite by accident, I noticed that it

SELECT * FROM tablename WHERE col1 LIKE ‘%string%’ AND col2 = 123456

works 3-4 times slower than

SELECT * FROM tablename WHERE col2 = 123456 AND col1 LIKE ‘%string%’

on a table of, say, 300,000 records. We just swapped operators, and how the result changed!

In the textbooks on databases, most likely, they do not even focus on such features, and they do it right - in all enterprise solutions there are query optimizers. For example, on MS SQL Server 2008 Web Edition there is no difference on the same data and the same queries.

But there is one on SQLite. This must be remembered. In the SQLite world, simpler operations should always go to the left of more complex ones.

SQLite database can and should be indexed too

Thinking of SQLite as an alternative to fopen, a database without stored procedures, semaphores and users, you forget that it, like any normal database, supports indexes. So much has been written about them that it is not worth focusing on syntactic features - just remember that as soon as the size of the database exceeds 50,000 lines, it needs to be indexed. And when using complex queries - before.

I will only allow myself a small remark - indexing is best done after the basic queries are written, based on their analysis. When designing a database, when a developer does not yet know the entire business logic of the application by heart, he may be mistaken as to which field will most often be searched / selected. However, having SQL queries in front of your eyes, writing the right index is worthless.

If you often select across multiple tables, it makes sense to cache the data

On one of the projects, each time at the start of the application, it was necessary to ask the user to select the car with which he wants to work. In order to compile a complete description of the machine, it was necessary to refer to several tables:

Year of issue
Manufacturer
Model
Equipment
Some other technical characteristics that the customer wished to show when choosing a machine. For example, MPG (miles per gallon, an analogue of our kilometers per liter).

Making a query on five tables and making a list of cars, we slowed down the iPhone to a response of 6-8 seconds at each picker. There are two exits - at the first start (we have it during synchronization), you can either compose a presentation with all the possible data, or, if more convenient, save objects with data directly to the hard drive. Also once, at a convenient moment when the user knows about the delay and is ready to wait.

It can be more convenient to save objects if you had to include several fields from the picking table in a view for each, say, machine. In the database, this would lead to the inevitable duplication of lines (if one to many links to several lines of configuration parameters would have to create several identical lines with the vehicle ID and other parameters), and in the object all data would be stored in one copy.

SQLite - single threaded database

It is not directly related to optimization, but you should not forget about this either. Turning to a SQLite database of two threads at the same time will inevitably cause a crash. There are two outputs:

Synchronize calls using the @synchronized directive. This is if it’s too late to change the architecture, as I had;
If the task is laid at the design stage, start a query manager based on NSOperationQueue. It insures against errors automatically, and what is done automatically is often done without errors.

In general, it is always better to access the database from secondary threads, even if the UI thread does nothing but scroll through the wait indicator. Time to write delegate methods and asynchronize business logic will more than pay off for a positive user experience.

Conclusion

All that I listed above were only special cases. There is no universal optimization task that can be estimated in 1.5 days or 10% of the total project time. This has to be done as necessary. The main rule that always helped me to choose the right optimization path is that the time spent on sampling should not increase with the growth of the database size except in special cases . It should remain approximately the same both on one hundred records and one hundred thousand.

And one more rule, my boss’s favorite phrase - do not optimize what does not need optimization. Often the user doesn’t care if your code will be executed for one millisecond or one hundred, and it’s important for the customer whether you delayed the project for a day or delivered on time.

Tags: