Full-text search on App Engine now
Everyone is looking forward to full-text search arriving in App Engine, but so far it is not even on the roadmap. For GAE/Java, however, you can bolt full-text search on yourself right now.
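The answer may surprise you: it is Lucene. No, don't walk away sighing "that is way too much hassle." Making Lucene work with the App Engine datastore is certainly not a five-minute job, but if you use JDO, the compass-project developers have already done it for you in Compass 2.3, which is currently in beta.
Compass can turn your JDO objects into indexed documents and return them as search results. Below is the code of a minimal JDO class with indexing and search, followed by notes on the interesting places in it. To make it compile and run, add a few libraries to war/WEB-INF/lib: lucene-core, commons-logging and compass-2.3.0-beta1. They all ship in the Compass nightly build distribution, which is buried deep in the jungle of their continuous build system. At the time of writing, the last successful build is here: http://build.compass-project.org/download/CMPTRK-NIGHTLY/artifacts/build-786/Release . I admit that I have not checked that this particular build works.
// The class must live in a package that addScan() below covers (compass_test here).
import java.util.ArrayList;
import java.util.List;

import javax.jdo.JDOHelper;
import javax.jdo.JDOObjectNotFoundException;
import javax.jdo.PersistenceManager;
import javax.jdo.PersistenceManagerFactory;
import javax.jdo.annotations.IdGeneratorStrategy;
import javax.jdo.annotations.IdentityType;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.Persistent;
import javax.jdo.annotations.PrimaryKey;

import org.compass.annotations.Searchable;
import org.compass.annotations.SearchableId;
import org.compass.annotations.SearchableProperty;
import org.compass.core.Compass;
import org.compass.core.CompassHits;
import org.compass.core.CompassIndexSession;
import org.compass.core.CompassSearchSession;
import org.compass.core.config.CompassConfiguration;
import org.compass.core.config.CompassEnvironment;
import org.compass.gps.CompassGps;
import org.compass.gps.device.jdo.Jdo2GpsDevice;
import org.compass.gps.impl.SingleCompassGps;

@PersistenceCapable(identityType = IdentityType.APPLICATION)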
@Searchable // [1]
public class GreetingServiceUser {

    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    @SearchableId
    private Long id; // [2]

    @Persistent
    private String name;

    public GreetingServiceUser(String name) {
        this.name = name;
    }

    @SearchableProperty // [3]
    public String getName() {
        return name;
    }

    private static final PersistenceManagerFactory factory =
            JDOHelper.getPersistenceManagerFactory("transactions-optional");
    private static Compass compass;
    private static CompassGps compassGps;

    static {
        // [4]
        compass = new CompassConfiguration()
                .setConnection("gae://index")
                // [5]
                .setSetting(CompassEnvironment.ExecutorManager.EXECUTOR_MANAGER_TYPE,
                        "disabled")
                .addScan("compass_test")
                .buildCompass();
        compassGps = new SingleCompassGps(compass);
        Jdo2GpsDevice jdo2GpsDevice = new Jdo2GpsDevice("appengine", factory);
        // [6]
        jdo2GpsDevice.setMirrorDataChanges(false);
        compassGps.addGpsDevice(jdo2GpsDevice);
        // [7]
        // if (compass.getSearchEngineIndexManager().isLocked()) {
        //     compass.getSearchEngineIndexManager().releaseLocks();
        // }
        compassGps.start();
    }

    // Persists this object and then adds it to the search index by hand.
    public void save() {
        PersistenceManager pm = factory.getPersistenceManager();
        CompassIndexSession indexSession = null;
        try {
            pm.makePersistent(this);
            // [8]
            // compass.getSearchEngineIndexManager().releaseLocks();
            // [9]
            indexSession = compass.openIndexSession();
            indexSession.save(this);
            indexSession.commit();
        }
        catch (Throwable e) {
            e.printStackTrace();
        }
        finally {
            pm.close();
            if (indexSession != null) {
                indexSession.close();
            }
        }
    }

    // Runs the query against the index and loads the matching objects via JDO.
    public static List<GreetingServiceUser> search(String query) {
        CompassSearchSession searchSession = compass.openSearchSession();
        CompassHits hits = searchSession.find(query);
        List<GreetingServiceUser> results =
                new ArrayList<GreetingServiceUser>(hits.getLength());
        PersistenceManager pm = factory.getPersistenceManager();
        try {
            for (int i = 0; i < hits.getLength(); i++) {
                // [10]
                results.add(pm.getObjectById(
                        GreetingServiceUser.class,
                        Long.valueOf(hits.resource(i).getId())));
            }
        }
        catch (JDOObjectNotFoundException e) {
            e.printStackTrace();
        }
        finally {
            pm.close();
        }
        return results;
    }
}
[1] The annotations mark the JDO classes to be indexed, ...
[2] ... the properties that identify documents, ...
[3] ... and finally the properties whose values will be indexed.
[4] When constructing the Compass instance, you specify which packages should be scanned for these annotations.
[5] This subtlety is specific to App Engine. Compass normally runs on multiple threads, but App Engine does not allow creating threads, so threading must be disabled.
[6] This is also a small hack for App Engine. Compass can update the index automatically whenever persistent data it cares about changes, but on App Engine that work may not fit into the allotted 30 seconds. Calling setMirrorDataChanges(false) turns automatic updating off, so you have to update the index manually. This is not painful, and in return the update can be done, for example, in a separate task in the task queue.
[7, 8] App Engine support is still raw, and Compass sometimes seems to forget to release index locks. I have not worked out the cause or the proper fix, but explicitly asking it to release the locks helps. Keep in mind that this may not be entirely safe. If anyone finds the time to track the problem down and fix it, let me know.
[9] This is how an object is indexed manually.
[10] The search result is a CompassHits collection. In theory, the found objects (GreetingServiceUser in our case) can be extracted from it with the data(int) method, but the objects I got that way had empty values. So instead I take the Resource object that corresponds to the document, extract the document id from it, and look up the matching JDO object through JDO / App Engine.
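One consequence of the id-based lookup in [10] is worth spelling out: because the index is updated manually, it can contain ids of objects that no longer exist in the datastore, and in the listing the catch block sits outside the loop, so the first missing object ends the collection of results. Below is a minimal sketch of a per-hit lookup that skips stale entries instead. It uses only the standard library; HitResolver, resolve and the loader parameter are hypothetical names for illustration and are not part of Compass or JDO.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical helper for illustration; not part of Compass or JDO. */
final class HitResolver {

    private HitResolver() {}

    /**
     * Turns document ids (strings, as a search index hands them back) into
     * objects using a caller-supplied loader. Ids that are not numeric, or
     * for which the loader finds nothing, are skipped rather than aborting
     * the whole result list.
     */
    static <T> List<T> resolve(List<String> hitIds,
                               Function<Long, Optional<T>> loader) {
        List<T> results = new ArrayList<>(hitIds.size());
        for (String rawId : hitIds) {
            long id;
            try {
                id = Long.parseLong(rawId);
            } catch (NumberFormatException e) {
                continue; // malformed id in the index: ignore this hit
            }
            // An empty Optional means the object is gone from the datastore.
            loader.apply(id).ifPresent(results::add);
        }
        return results;
    }
}
```

In the class above, the loader would presumably wrap pm.getObjectById(GreetingServiceUser.class, id) in its own try/catch and return an empty Optional on JDOObjectNotFoundException. The sketch assumes a Java version with lambdas and Optional; on an older runtime the same idea works with a small callback interface and a null check.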