We optimize the speed of SQL insertion on Android devices
Good afternoon.
During the development of my Android project, I was faced with the task of updating a large number of rows in an SQLite database directly on the device.
The initial solution to the forehead gave terribly slow results, because More than 40,000 rows were to be updated. About how I improved the performance of these row updates in the database, and the story will go.
The Android application was distributed with the SQLite database inside (inside the APK, in asset). The database had information on cities. But the information is basic, language-independent, and language-dependent fields were only in English.
It would be unrealistic to distribute the program with all languages, as each language in the database would add an installer from 1-2 mb to the original APK file. And languages were supported 11.
Therefore, language patches were invented for the database, which would be downloaded from the Internet (from the server), and already rolled onto the base on the device.
The patch was a gzip-clamped text file, each line of which had values separated by a tab (\ t).
There were no problems with downloading. It is fast. But the bottleneck of this scheme is the insertion into the database on the device.
I wrote the first option in Java using the well-known features of working with the SQLite database built into Android.
It was necessary to update 3 fields per line (add values from the patch). Here are just such lines were from 20,000 to 60,000, depending on the language.
The first option looked something like this (I don’t pretend to the reliability of the code, there was nothing left of the first versions, because they were all rewritten and not saved anywhere. I quote the code to show ideas and bottlenecks so that people don’t repeat the mistakes made by me )
Well, of course, such a solution worked super slowly. More precisely, to say that slowly is to say nothing. The code added lines to the database for dozens of minutes.
The first thing that came to my request was to add a transaction.
Added before the loop
After the cycle added
And in the finaly block
This increase was not large. Then I remembered that in Eclipse, after installing Android Developer Tools ( ADT ) there, there is an excellent DDMS perspective , in which there is an excellent function profiling function.
You can read about how to use it here . Habré, by the way, did not find the article describes this functionality ADT (can be bad looking, but found only about memory analysis .)
With the help of this profiling mechanism, it became immediately apparent that I had the performance drawdowns in the following places.
1. Spit is a terribly slow method. It may not be a revelation to anyone, but I was surprised.
2. Work with strings, in terms of gluing strings for queries. Inside execSql, as you can see, each time a bunch of new StringBuilders were made in the loop, which were then thrown out for no need. As they wrote here , never in cycles do not glue lines by plus (+). Use one predefined StringBuilder. And even better, in general, prepare the lines in advance, before the cycle. Which, in my case, is the most obvious improvement.
3. The work of SQL itself inside the android libraries. A bunch of locks and unlocks were made there. After reading the documentation, I found a method at the setLockingEnabled database , setting it to false, we get a good increase in speed.
4. Preparing SQLiteStatement for each call to execSql. This is also an expensive operation. A little googling, I found that statements can, like lines, be prepared in advance, and in a loop, just bind parameters to them, and then execute them.
Having solved all these problems, getting rid of split, taking out the preparation of SQLiteStatement from the loop, removing all work with strings from the loop, adding getDb () before the loop. SetLockingEnabled (false), I got this option
The methods
parseIdFromString (String line),
parseNameFromString (String line, int from) and
parseDataFromString (String line, int from) are extremely clumsy, but they work faster than the split option.
Here are those who are interested:
As a result, this update option in the database worked ten times faster than the original one. Those. updating the 43,000 lines this algorithm did in about 1.5-2 minutes on HTC Desire
But, this result could not satisfy me. It’s not very cool for the user of the program to wait 2 minutes until we insert the language he needs into the database.
With Java, one could still play a shaman, but it would not be possible to increase the speed by orders of magnitude, because the results of running the profiler on the last version of the code clearly showed that now the longest is done inside the native_execute () method from the Android SDK \ sources \ android- 14 \ android \ database \ sqlite \ SQLiteStatement.java. This is a native method working in c ++
But along with it, we lost a lot of time in some kind of magic logTimeStat method, the need for which I still did not understand, and how to disable it, too. In addition, the bindings in the operation did not work very fast either, and indeed, it’s Java ... what kind of performance can be in it (sarcasm, I have nothing wrong with Java)
As a result, I decided not to bother further with Java and write my insert in C ++ (as they say with blackjack and ...). Gather NDK and call it from Java via JNI .
There is one problem with this idea: where to get sqlite under NDK? Well, actually - elementary. We take the source sqlite file from of. site and just add it entirely to your libin under the NDK.
I will not write how to collect code under NDK, because on Habré, and not only on it there is a lot of information.
A small remark about the inclusion of SQLite sources in your library. While I googled this topic, I found official google groups of android developers (unfortunately the links weren’t preserved), where they discussed options for working with the database through their native libs, into which some of their own versions of SQLite are embedded (because the versions are different). So, there, official people from Android did not approve of such a practice very much, they said that in theory this could spoil the database, because the device itself may have some other version of SQLite, and that it works from your own source, and then from Java by standard means, with the same base, you can break it. But in my practice, breaking the database occurred only when the program was forced to terminate, at the moment when my database was updating the contents of the database. This is a rare case. because the program does not last long. But for me this is not a critical case, because the base always lies in my asset and I can restore it at any fault, and ask the user to download the languages again and roll them again.
So, in the case when you need to work with SQLite databases from native code, people from Android advise you to work only from it, without touching these Java databases using Android tools. In this case, with the bases everything is guaranteed to be fine, since you will work with them only with the version of SQLite that you have.
Back to the story.
I decided to write my update. SQLite dragged. The C ++ code itself repeats all the ideas that have already been improved in Java code.
Also, this article helped me a lot . It describes in great detail how to increase insertion speed.
It turned out a lot of code, who are interested, can see
By the way, I wrote and debugged this code under Windows in Visual Studio, and then I built it under the NDK, and everything worked magically on Android.
The version of the code is not final, so you should not find fault with some not the best solutions. The essence of the code is to show how to do the same in C ++ as in Java, but only in C ++ it will work many times faster.
The same insert of 43,000 lines, on the same HTC Desire under debug (with Eclipse connected), worked out about 43 seconds. Those. somewhere, one line in 1 ms. If you cut off Eclipse and debug, you get a truly fast result in the region of 20-25 seconds. On more powerful devices, such as HTC One S, the insertion process generally took about 10-15 seconds. Which, in comparison with the initial minutes, shows that all efforts to improve productivity were made not in vain.
Using the example of my task, I showed how you can speed up work with SQLite when developing for Android (up to the transition to the native level). I don’t argue that there are still plenty of options to do the same, but I think this information will help someone to make their applications even more responsive and fast.
UPD:
Thanks to_climb , mentioned another obvious improvement. Glue 3 update requests into 1n request.
Those. not 3 statutes, but some long ones like
update table set var1 =?, var2 =? where _id =?
During the development of my Android project, I was faced with the task of updating a large number of rows in an SQLite database directly on the device.
The initial solution to the forehead gave terribly slow results, because More than 40,000 rows were to be updated. About how I improved the performance of these row updates in the database, and the story will go.
A more detailed description of the task:
The Android application was distributed with the SQLite database inside (inside the APK, in asset). The database had information on cities. But the information is basic, language-independent, and language-dependent fields were only in English.
It would be unrealistic to distribute the program with all languages, as each language in the database would add an installer from 1-2 mb to the original APK file. And languages were supported 11.
Therefore, language patches were invented for the database, which would be downloaded from the Internet (from the server), and already rolled onto the base on the device.
The patch was a gzip-clamped text file, each line of which had values separated by a tab (\ t).
There were no problems with downloading. It is fast. But the bottleneck of this scheme is the insertion into the database on the device.
I wrote the first option in Java using the well-known features of working with the SQLite database built into Android.
It was necessary to update 3 fields per line (add values from the patch). Here are just such lines were from 20,000 to 60,000, depending on the language.
The first version of Java code
The first option looked something like this (I don’t pretend to the reliability of the code, there was nothing left of the first versions, because they were all rewritten and not saved anywhere. I quote the code to show ideas and bottlenecks so that people don’t repeat the mistakes made by me )
try
{
buffRead = new BufferedReader(fileIn, (1000*1024)); // 1000 KB
String line;
int lineNum = 0;
while ((line = buffRead.readLine()) != null)
{
try
{
String[] values = line.split("\t");
if (values.length < 2) // cause 3rd value van be empty
{
// some error, try next line
continue;
}
int idInt = Integer.valueOf(values[0]);
String name = values[1];
getDb().execSql("update " + getTableNabe() + " set " + lang.getColumnName() + " = ? where " + COLUMN_ID + " = ? ",
new String[] { name, String.valueOf(idInt) });
getDb().execSql("update " + getTableNabe() + " set " + lang.getColumnSort() + " = ? where " + COLUMN_ID + " = ? ",
new String[] { String.valueOf(lineNum++), String.valueOf(idInt) });
if (values.lengh == 3 && values[2].length != 0)
{
String data = values[2];
getDb().execSql("update "+ getTableNabe() + " set " + lang.getColumnData() + " = ? where " + COLUMN_ID + " = ? ",
new String[] { data, String.valueOf(idInt) });
}
}
catch (NumberFormatException e)
{
e.printStackTrace();
return false;
}
catch (SQLException e)
{
e.printStackTrace();
return false;
}
createIndexOnLang(lang);
}// end of while
}
catch (IOException e)
{
e.printStackTrace();
return false;
}
finally
{
if (buffRead != null)
{
try
{
buffRead.close();
}
catch (IOException e)
{
e.printStackTrace();
}
}
}
return true;
Well, of course, such a solution worked super slowly. More precisely, to say that slowly is to say nothing. The code added lines to the database for dozens of minutes.
The first thing that came to my request was to add a transaction.
Added before the loop
getDb().beginTransaction();
After the cycle added
getDb().setTransactionSuccessful();
And in the finaly block
getDb().endTransaction();
This increase was not large. Then I remembered that in Eclipse, after installing Android Developer Tools ( ADT ) there, there is an excellent DDMS perspective , in which there is an excellent function profiling function.
You can read about how to use it here . Habré, by the way, did not find the article describes this functionality ADT (can be bad looking, but found only about memory analysis .)
Drawdowns of performance
With the help of this profiling mechanism, it became immediately apparent that I had the performance drawdowns in the following places.
1. Spit is a terribly slow method. It may not be a revelation to anyone, but I was surprised.
2. Work with strings, in terms of gluing strings for queries. Inside execSql, as you can see, each time a bunch of new StringBuilders were made in the loop, which were then thrown out for no need. As they wrote here , never in cycles do not glue lines by plus (+). Use one predefined StringBuilder. And even better, in general, prepare the lines in advance, before the cycle. Which, in my case, is the most obvious improvement.
3. The work of SQL itself inside the android libraries. A bunch of locks and unlocks were made there. After reading the documentation, I found a method at the setLockingEnabled database , setting it to false, we get a good increase in speed.
4. Preparing SQLiteStatement for each call to execSql. This is also an expensive operation. A little googling, I found that statements can, like lines, be prepared in advance, and in a loop, just bind parameters to them, and then execute them.
Second version of Java code
Having solved all these problems, getting rid of split, taking out the preparation of SQLiteStatement from the loop, removing all work with strings from the loop, adding getDb () before the loop. SetLockingEnabled (false), I got this option
try
{
buffRead = new BufferedReader(fileIn, (1000*1024)); // 1000 KB
String line;
int lineNum = 0;
checkDbErrors();
getDb().beginTransaction();
getDb().setLockingEnabled(false);
// Prepare SQL queries
String updateStatment = "update " + getTableName() + " set ";
String whereStatment = " where " + COLUMN_ID + " = ?";
String updateNameSQL = updateStatment + lang.getColumns().getColumnName() + " = ? " + whereStatment;
String updatqDataSQL = updateStatment + lang.getColumns().getColumnData() + " = ? " + whereStatment;
String updatqSortSQL = updateStatment + lang.getColumns().getColumnSort() + " = ? " + whereStatment;
SQLiteStatement updateName = getDb().compileStatement(updateNameSQL);
SQLiteStatement updateData = getDb().compileStatement(updatqDataSQL);
SQLiteStatement updateSort = getDb().compileStatement(updatqSortSQL);
while ((line = buffRead.readLine()) != null)
{
try
{
int idInt = parseIdFromString(line);
String name = parseNameFromString(line, line.indexOf('\t') + 1);
String data= parseDataFromString(line, name.length() + 1);
updateName.bindString(1, name);
updateName.bindLong(2, idInt);
updateName.execute();
if (data.length() != 0)
{
updateWiki.bindString(1, data);
updateWiki.bindLong(2, idInt);
updateWiki.execute();
}
updateSort.bindLong(1, lineNum++);
updateSort.bindLong(2, idInt);
updateSort.execute();
}
catch (NumberFormatException e)
{
e.printStackTrace();
return false;
}
catch (SQLException e)
{
e.printStackTrace();
return false;
}
}
getDb().setTransactionSuccessful();
}
catch (IOException e)
{
e.printStackTrace();
return false;
}
finally
{
getDb().endTransaction();
if (buffRead != null)
{
try
{
buffRead.close();
}
catch (IOException e)
{
e.printStackTrace();
}
}
}
return true;
The methods
parseIdFromString (String line),
parseNameFromString (String line, int from) and
parseDataFromString (String line, int from) are extremely clumsy, but they work faster than the split option.
Here are those who are interested:
Helper Method Code
As you can see in parseNameFromString there is even a frontal translation of the String into int, but this clumsy version works faster than Integer.valueOf (verified through the profiler)
private int parseIdFromString(String line)
{
int ind = line.indexOf('\t');
if (ind == -1)
{
return 0;
}
String idStr = line.substring(0, ind);
int length = idStr.length();
if (length == 0)
{
return 0;
}
int result = 0;
int zero = '0';
for (int i = 0; i < length; ++i)
{
result += (idStr.charAt(i) - zero) * Math.pow(10, length - i - 1);
}
return result;
}
private String parseNameFromString(String line, int from)
{
int ind = line.indexOf('\t', from);
if (ind == -1)
{
return new String();
}
return line.substring(from, ind);
}
private String parseDataFromString(String line, int from)
{
int ind = line.indexOf('\t', from);
if (ind == -1)
{
return new String();
}
return line.substring(from, ind);
}
As you can see in parseNameFromString there is even a frontal translation of the String into int, but this clumsy version works faster than Integer.valueOf (verified through the profiler)
As a result, this update option in the database worked ten times faster than the original one. Those. updating the 43,000 lines this algorithm did in about 1.5-2 minutes on HTC Desire
But, this result could not satisfy me. It’s not very cool for the user of the program to wait 2 minutes until we insert the language he needs into the database.
With Java, one could still play a shaman, but it would not be possible to increase the speed by orders of magnitude, because the results of running the profiler on the last version of the code clearly showed that now the longest is done inside the native_execute () method from the Android SDK \ sources \ android- 14 \ android \ database \ sqlite \ SQLiteStatement.java. This is a native method working in c ++
But along with it, we lost a lot of time in some kind of magic logTimeStat method, the need for which I still did not understand, and how to disable it, too. In addition, the bindings in the operation did not work very fast either, and indeed, it’s Java ... what kind of performance can be in it (sarcasm, I have nothing wrong with Java)
We write everything in C ++
As a result, I decided not to bother further with Java and write my insert in C ++ (as they say with blackjack and ...). Gather NDK and call it from Java via JNI .
There is one problem with this idea: where to get sqlite under NDK? Well, actually - elementary. We take the source sqlite file from of. site and just add it entirely to your libin under the NDK.
I will not write how to collect code under NDK, because on Habré, and not only on it there is a lot of information.
A small remark about the inclusion of SQLite sources in your library. While I googled this topic, I found official google groups of android developers (unfortunately the links weren’t preserved), where they discussed options for working with the database through their native libs, into which some of their own versions of SQLite are embedded (because the versions are different). So, there, official people from Android did not approve of such a practice very much, they said that in theory this could spoil the database, because the device itself may have some other version of SQLite, and that it works from your own source, and then from Java by standard means, with the same base, you can break it. But in my practice, breaking the database occurred only when the program was forced to terminate, at the moment when my database was updating the contents of the database. This is a rare case. because the program does not last long. But for me this is not a critical case, because the base always lies in my asset and I can restore it at any fault, and ask the user to download the languages again and roll them again.
So, in the case when you need to work with SQLite databases from native code, people from Android advise you to work only from it, without touching these Java databases using Android tools. In this case, with the bases everything is guaranteed to be fine, since you will work with them only with the version of SQLite that you have.
Back to the story.
I decided to write my update. SQLite dragged. The C ++ code itself repeats all the ideas that have already been improved in Java code.
Also, this article helped me a lot . It describes in great detail how to increase insertion speed.
It turned out a lot of code, who are interested, can see
C ++ Code
std::vector& split(const std::string &s, char delim, std::vector &elems)
{
elems.clear();
std::stringstream ss(s);
std::string item;
while (std::getline(ss, item, delim))
{
elems.push_back(item);
}
return elems;
}
std::string prepareUpdateStatment(std::string columnName, std::string columnValue, std::string id)
{
std::ostringstream constructor;
constructor << "update cities set " << columnName << " = \"" << columnValue << "\" where _id = " << id;
return constructor.str();
}
std::string prepareUpdateStatmentForBind(std::string columnName)
{
std::ostringstream constructor;
constructor << "update cities set " << columnName << " = ? where _id = ? ";
return constructor.str();
}
std::string getColumnName(std::string column, std::string lang)
{
std::ostringstream constructor;
constructor << lang << "_" << column;
return constructor.str();
}
std::string parseInt(int i)
{
std::ostringstream ss;
ss << i;
return ss.str();
}
bool pushToDBWithPreparedStatments(std::string line, sqlite3* db, std::string lang, int lineNum, sqlite3_stmt* stmtnUpdateName, sqlite3_stmt* stmtnUpdateSort, sqlite3_stmt* stmtnUpdateData)
{
if (line.size() == 0)
{
return true; // end of file
}
int error = SQLITE_OK;
std::vector elems;
elems = split(line, '\t', elems);
if (elems.size() < 2)
{
log("line parse error");
return false;
}
std::string& idStr = elems[0];
int idInt = atoi(idStr.c_str());
std::string& nameStr = elems[1];
sqlite3_bind_text(stmtnUpdateName, 1 , nameStr.c_str(), -1, SQLITE_STATIC);
sqlite3_bind_int(stmtnUpdateName, 2 , idInt);
if ((error = sqlite3_step(stmtnUpdateName)) != SQLITE_DONE)
{
logError(error, sqlite3_errmsg(db));
return false;
}
sqlite3_clear_bindings(stmtnUpdateName);
sqlite3_reset(stmtnUpdateName);
sqlite3_bind_int(stmtnUpdateSort, 1 , lineNum);
sqlite3_bind_int(stmtnUpdateSort, 2 , idInt);
if ((error = sqlite3_step(stmtnUpdateSort)) != SQLITE_DONE)
{
logError(error, sqlite3_errmsg(db));
return false;
}
sqlite3_clear_bindings(stmtnUpdateSort);
sqlite3_reset(stmtnUpdateSort);
if (elems.size() == 3)
{
std::string& DataStr = elems[2];
sqlite3_bind_text(stmtnUpdateData, 1 , DataStr.c_str(), -1, SQLITE_STATIC);
sqlite3_bind_int(stmtnUpdateData, 2 , idInt);
if ((error = sqlite3_step(stmtnUpdateData)) != SQLITE_DONE)
{
logError(error, sqlite3_errmsg(db));
return false;
}
sqlite3_clear_bindings(stmtnUpdateData);
sqlite3_reset(stmtnUpdateData);
}
return true;
}
void parseAndUpdateDB(std::string databasePath, std::string patchPath, std::string lang)
{
time_t beforeStartTime = time(NULL);
sqlite3* db;
if (sqlite3_open_v2(databasePath.c_str(), &db, SQLITE_OPEN_READWRITE | SQLITE_OPEN_NOMUTEX | SQLITE_OPEN_PRIVATECACHE, NULL) != SQLITE_OK)
{
logError("Error wile opening db", sqlite3_errmsg(db));
return;
}
std::string line;
std::ifstream myfile(patchPath.c_str());
if (!myfile.is_open())
{
log("Error wile opening patch file");
return;
}
int lineNum = 0;
int error = SQLITE_OK;
// Begin transaction
if ( (error = sqlite3_exec(db, "begin", NULL, NULL, NULL)) != SQLITE_OK)
{
logError(error, sqlite3_errmsg(db));
return;
}
sqlite3_stmt* stmtnUpdateName;
std::string updateName = prepareUpdateStatmentForBind(getColumnName("name", lang));
if ( (error = sqlite3_prepare(db, updateName.c_str(), updateName.length(), &stmtnUpdateName, NULL)) != SQLITE_OK)
{
logError(error, sqlite3_errmsg(db));
return;
}
sqlite3_stmt * stmtnUpdateSort;
std::string updateSort = prepareUpdateStatmentForBind(getColumnName("sort", lang));
if ( (error = sqlite3_prepare(db, updateSort.c_str(), updateSort.length(), &stmtnUpdateSort, NULL)) != SQLITE_OK)
{
logError(error, sqlite3_errmsg(db));
return;
}
sqlite3_stmt * stmtnUpdateData;
std::string updateData = prepareUpdateStatmentForBind(getColumnName("data", lang));
if ( (error = sqlite3_prepare(db, updateData.c_str(), updateData.length(), &stmtnUpdateData, NULL)) != SQLITE_OK)
{
logError(error, sqlite3_errmsg(db));
return;
}
// For fast work
sqlite3_exec(db, "PRAGMA synchronous = OFF", NULL, NULL, NULL);
sqlite3_exec(db, "PRAGMA journal_mode = MEMORY", NULL, NULL, NULL);
while ( myfile.good() )
{
std::getline(myfile, line);
if (!pushToDBWithPreparedStatments(line, db, lang, lineNum++, stmtnUpdateName, stmtnUpdateSort, stmtnUpdateData))
{
break;
}
}
sqlite3_finalize(stmtnUpdateName);
sqlite3_finalize(stmtnUpdateSort);
sqlite3_finalize(stmtnUpdateData);
// End transaction
if ( (error = sqlite3_exec(db, "end", NULL, NULL, NULL)) != SQLITE_OK)
{
logError(error, sqlite3_errmsg(db));
return;
}
sqlite3_close(db);
myfile.close();
time_t afterFinishTime = time(NULL);
int result = afterFinishTime- beforeStartTime;
log("result of run is %d secs" , result);
}
By the way, I wrote and debugged this code under Windows in Visual Studio, and then I built it under the NDK, and everything worked magically on Android.
The version of the code is not final, so you should not find fault with some not the best solutions. The essence of the code is to show how to do the same in C ++ as in Java, but only in C ++ it will work many times faster.
So about speed.
The same insert of 43,000 lines, on the same HTC Desire under debug (with Eclipse connected), worked out about 43 seconds. Those. somewhere, one line in 1 ms. If you cut off Eclipse and debug, you get a truly fast result in the region of 20-25 seconds. On more powerful devices, such as HTC One S, the insertion process generally took about 10-15 seconds. Which, in comparison with the initial minutes, shows that all efforts to improve productivity were made not in vain.
Morality
Using the example of my task, I showed how you can speed up work with SQLite when developing for Android (up to the transition to the native level). I don’t argue that there are still plenty of options to do the same, but I think this information will help someone to make their applications even more responsive and fast.
UPD:
Thanks to_climb , mentioned another obvious improvement. Glue 3 update requests into 1n request.
Those. not 3 statutes, but some long ones like
update table set var1 =?, var2 =? where _id =?