heilage June 10, 2014 at 10:04

2GIS for %browser%, or how we built the extension

We once asked ourselves: how can we help a user choose a company outside 2gis.ru? A browser extension was proposed as the implementation almost immediately, and after some research and planning we started developing 2GIS for browsers.

The main design we settled on is a button next to the address bar that opens an information window when clicked. In addition, the extension highlights phone numbers inside the page content.

For convenience, we decided to determine the user's current coordinates so that we can show a reliable distance to the company of interest and offer a quick link to directions from the current location to the company. Later, locating the user became even more important: we need to know which city the user is in to show data from the most suitable source, 2GIS or Google.

Background and content scripts

From a technical point of view, the average browser extension is a relatively simple JavaScript application with one or more entry points.

Regardless of which browser the extension is written for, two groups of scripts can be distinguished within the application, namely, background and content.

Background scripts run in a sandbox at the application level, and their data is shared across the running browser instance. They are launched once per session: either when the extension is installed or when the browser starts with the extension already installed.

Content scripts, if present, run in the context of the page being viewed and have access to its DOM tree, but their data is separate for each window and tab. They are launched every time the onload event fires for a page loaded in the browser. Background scripts have no direct access to page content, but browser engines provide mechanisms for communication between background and content scripts. Each engine organizes this data exchange in its own way.

In addition to background and content scripts, an extension can include other entry points, one for each additional resource: for example, a popup information window and one or more options pages. The scripts of an additional resource run every time that resource is initialized.

Framework for creating extensions. Kango Extensions

We wanted the extension to work in different browsers, so we looked at cross-browser development frameworks. Some of them were offered as SaaS, which did not suit us at all. Others could work only with the content area and were closer to cross-browser userscripts. Only two frameworks out of all this "variety" met our requirements. We settled on Kango Extensions, since it has:

a lower entry threshold;
cross-platform builds;
a Russian developer on the team :).

The free version, which we chose, can create extension packages for Mozilla Firefox, Safari, Google Chrome and its derivatives. The UI API is fairly limited and common to all browsers, but it is enough to create address bar buttons and popups.

The framework includes:

kango.storage, a persistent data store accessible only from background scripts. Interestingly, the store behaves slightly differently across browsers: in Firefox the data survives removing and reinstalling the extension, while in WebKit browsers it is erased when the extension is removed;
kango.browser, an interface for managing the browser, which lets you not only read the current state of windows and tabs but also control them;
kango.xhr, a wrapper around XMLHttpRequest that solves the problem of instantiating XMLHttpRequest in Firefox background scripts;
kango.console, a debugging interface with a single log method for printing debug messages from scripts in the background sandbox;
a unified interface for exchanging messages between background and content scripts. Background script methods are available through kango.invokeAsync, and messages can be dispatched from the background to the content area through the KangoBrowserTab.dispatchMessage and kango.addMessageListener methods. All of these deliver data asynchronously, so you have to embrace event-driven programming fully and watch for potential race conditions.
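
The race conditions mentioned in the last item can be illustrated with a minimal sketch. It assumes a callback-style asynchronous call of the kind described above; latestOnly, fakeBackgroundLookup and the example domains are hypothetical names invented for the illustration, and the fake lookup stands in for a real background call so that the snippet runs outside a browser.

```javascript
// Hypothetical helper: wraps a callback-style async call so that only the
// response to the most recent request reaches the caller.
function latestOnly(asyncCall) {
  let lastId = 0;
  return function (arg, onResult) {
    const id = ++lastId;
    asyncCall(arg, function (result) {
      if (id === lastId) onResult(result); // stale responses are dropped
    });
  };
}

// Stand-in for an asynchronous background call. It only stores the callbacks,
// so the demo below can answer them in any order.
const pending = [];
function fakeBackgroundLookup(domain, callback) {
  pending.push({ domain, callback });
}

const lookup = latestOnly(fakeBackgroundLookup);
const shown = [];
lookup("first.example", (data) => shown.push(data));
lookup("second.example", (data) => shown.push(data));

// The responses arrive in reverse order; only the latest request is shown.
pending[1].callback("data for second.example");
pending[0].callback("data for first.example");
console.log(shown); // [ 'data for second.example' ]
```

Without such a guard, a slow response for a tab the user has already left could overwrite fresher data.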

Data and external services used

Most of the data in 2GIS for browsers is stored in background scripts and is also cached there to avoid unnecessary requests. This includes, for example, the user's current location and the organization data obtained by searching by domain or by phone number. Organization data is pulled from 2GIS and Google Places through their APIs. The user's location is determined either with the HTML5 Geolocation API or, if the API returns an error or is unavailable, with a third-party open GeoIP service.
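
The fallback and caching described above could look roughly like the following sketch. It is not the extension's actual code: createLocator, the provider signatures and the coordinates are assumptions made for the example, and fake providers replace the Geolocation API and the GeoIP service so that the snippet runs anywhere.

```javascript
// Hypothetical factory. `primary` and `fallback` are callback-style providers
// of the form (onSuccess, onError). In a browser, `primary` could wrap
// navigator.geolocation.getCurrentPosition and `fallback` a GeoIP request.
function createLocator(primary, fallback) {
  let cached = null;
  return {
    locate(onDone) {
      if (cached) { onDone(cached); return; }
      const remember = (position) => { cached = position; onDone(position); };
      primary(remember, () => fallback(remember, () => onDone(null)));
    },
    invalidate() { cached = null; }, // e.g. after the connection reappears
  };
}

// Demo with fake providers: geolocation fails, GeoIP answers.
let geoipCalls = 0;
const failingGeolocation = (onSuccess, onError) => onError(new Error("denied"));
const fakeGeoip = (onSuccess) => {
  geoipCalls += 1;
  onSuccess({ lat: 55.03, lon: 82.92, source: "geoip" }); // made-up coordinates
};

const locator = createLocator(failingGeolocation, fakeGeoip);
const results = [];
locator.locate((position) => results.push(position));
locator.locate((position) => results.push(position)); // served from the cache
console.log(results[0].source, geoipCalls); // geoip 1
```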

Determining the user's current location required several difficult decisions, since there are quite a few possible situations. For example, what happens if the user closes the laptop lid, travels to the other end of the city, and then opens the laptop again? What if the user simply moves between two Wi-Fi access points with a brief loss of connection? Solving the problem head-on with the built-in navigator.geolocation.watchPosition method did not work: the position update callback fired much less often than we would have liked, and it did not fire at all in the laptop lid scenario. The navigator.onLine flag came to the rescue. By watching it, we can trigger an event when the Internet connection appears. Unfortunately, the flag works reliably only in WebKit browsers, but that is better than nothing.
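
Watching the flag can be sketched as a small state machine that reacts only to the offline-to-online transition. createOnlineWatcher is a hypothetical name, and the flag is simulated with a variable so that the example does not depend on navigator.onLine.

```javascript
// Hypothetical watcher. `isOnline` returns the current connectivity flag
// (in a browser: () => navigator.onLine); `onReconnect` runs on each
// offline-to-online transition.
function createOnlineWatcher(isOnline, onReconnect) {
  let wasOnline = isOnline();
  return function check() {
    const nowOnline = isOnline();
    if (nowOnline && !wasOnline) onReconnect();
    wasOnline = nowOnline;
  };
}

// Demo with a simulated flag instead of navigator.onLine.
let online = true;
let reconnects = 0;
const check = createOnlineWatcher(() => online, () => { reconnects += 1; });

check();          // still online: nothing happens
online = false;   // lid closed or Wi-Fi lost
check();
online = true;    // connection is back
check();          // this is where the location would be refreshed
check();          // no new transition, no second trigger
console.log(reconnects); // 1
```

In an extension, check would be called from a timer; the polling interval is a tuning choice that the article does not specify.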

If the organization is located in the same city as the user, or in one of the cities covered by 2GIS, we show information from 2GIS, since we consider it more reliable and verified. For areas we have no data on, we use Google Places, but with filtering: information is shown only if the domain in the search result matches the domain of the page being viewed. Search by phone works the same way, except that we filter by phone number rather than by domain. This is useful when an organization's phone number appears, for example, on a third-party classifieds site.
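
A minimal sketch of the two filters follows. The result shape ({ name, website, phone }) and the function names are assumptions made for the example and do not reflect the real Google Places response format; treating "www." as insignificant is also an assumption.

```javascript
// Hypothetical result shape: { name, website, phone }.
function normalizeHost(url) {
  return new URL(url).hostname.toLowerCase().replace(/^www\./, "");
}

// Keeps only results whose website is on the same domain as the viewed page.
function filterByDomain(results, pageUrl) {
  const pageHost = normalizeHost(pageUrl);
  return results.filter((result) => {
    try {
      return normalizeHost(result.website) === pageHost;
    } catch (error) {
      return false; // missing or malformed website field
    }
  });
}

// Keeps only results whose phone has the same digits as the one searched for.
function filterByPhone(results, phone) {
  const digits = (value) => String(value || "").replace(/\D/g, "");
  const wanted = digits(phone);
  if (wanted === "") return [];
  return results.filter((result) => digits(result.phone) === wanted);
}

const places = [
  { name: "Cafe A", website: "http://www.cafe-a.example/contacts", phone: "+1 234 345 6789" },
  { name: "Cafe B", website: "http://cafe-b.example/", phone: "+1 234 345 0000" },
  { name: "No site", phone: "+1 234 345 1111" },
];

const byDomain = filterByDomain(places, "https://cafe-a.example/menu");
const byPhone = filterByPhone(places, "+1 (234) 345-0000");
console.log(byDomain.map((p) => p.name), byPhone.map((p) => p.name));
```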

In other words, even if the user is on the other side of the globe from the organization whose website they are visiting, we still try to find and show correct data about that organization.

How it works

The extension works like a regular JavaScript application: it obtains the initial data it needs, registers handlers for certain events, and processes incoming events with them. This is how the behavior on switching windows and tabs is implemented, for example changing the indicator depending on whether information is available for the site open in the tab.

Besides tab switches, we also have to react when the user changes the URL. To do this, we handle the onBeforeNavigate event, which the browser fires before it starts loading a page. A small difficulty arose with Safari: it does not fire this event on every address change, only if the page is not in the cache or the cache has expired. We had to compromise and add a handler for the onLoad event, which is called after the page has loaded.

We also ran into redirects made through response headers, a meta tag, or a JS call: such programmatic redirects do not generate additional onBeforeNavigate events. On the other hand, we did not want to rely on onLoad in all browsers either, since that would look annoyingly slow. As a solution, we poll the URL of the current tab periodically for 30 seconds and update the data if necessary.
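
The 30-second polling can be sketched as follows. Only the 30-second window comes from the text; createUrlPoller, the 500 ms step and the example URLs are assumptions, and time is simulated so that the demo finishes immediately.

```javascript
// Hypothetical poller. `getUrl` returns the URL of the current tab and
// `onChange` receives each new URL seen during the polling window.
function createUrlPoller(getUrl, onChange, windowMs) {
  let lastUrl = getUrl();
  let elapsedMs = 0;
  // Call tick() from a timer. It returns false once the window has passed,
  // which tells the caller to stop the timer.
  return function tick(stepMs) {
    elapsedMs += stepMs;
    if (elapsedMs > windowMs) return false;
    const url = getUrl();
    if (url !== lastUrl) {
      lastUrl = url;
      onChange(url);
    }
    return true;
  };
}

// Demo with simulated time instead of a real timer.
let currentUrl = "http://shop.example/";
const seen = [];
const tick = createUrlPoller(() => currentUrl, (url) => seen.push(url), 30000);

tick(500);                                   // nothing changed yet
currentUrl = "http://shop.example/landing";  // a JS or meta redirect happened
tick(500);                                   // the change is noticed
let active = true;
while (active) active = tick(500);           // run out the 30-second window
currentUrl = "http://late.example/";
tick(500);                                   // window is over: ignored
console.log(seen); // [ 'http://shop.example/landing' ]
```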

Phone recognition in web pages

In the content area of the page, the extension has a single task: extracting sequences of digits and symbols that could be phone numbers. This lets you get information about those numbers and place a call by sending it to your smartphone.

Recognizing phone numbers is a non-trivial task in itself because of the wide variety of formats, so we had no illusions about a completely reliable solution. Examples of formats we had to deal with: 34.76.35.05.39 (France), +1 234 345 6789 (USA), 67 2354 9548 (Brazil).

The best solution turned out to be a two-stage filter based on regular expressions and simple conditions. At the first stage, character sequences that could in theory be phone numbers are extracted from the text. This stage uses relatively lax regular expressions and conditions, and it also checks the surrounding context of each occurrence, which can provide additional information. Long sequences can be split into shorter ones at this stage, and parts of a sequence with an unsuitable context can be excluded.

The second stage takes the sequences found and decides, based on stricter rules, whether to accept each one. The rules include checks against known masks and simple conditions, for example: is the sequence a string representation of a floating-point number?
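
The two stages can be illustrated with a deliberately small sketch. The regular expressions, the 7 to 15 digit range and the date rule are assumptions made for the example; the real parser has many more rules and also looks at the context, which is omitted here.

```javascript
// Stage 1: lax pattern. An optional "+", a digit, at least five more digits
// or separators, and a closing digit. This pattern is an assumption.
const CANDIDATE = /\+?\d[\d\s().-]{5,}\d/g;

function findCandidates(text) {
  return text.match(CANDIDATE) || [];
}

// Stage 2: stricter rules, also assumptions made for this sketch.
function isLikelyPhone(candidate) {
  const digitCount = candidate.replace(/\D/g, "").length;
  if (digitCount < 7 || digitCount > 15) return false;             // implausible length
  if (/^\d+\.\d+$/.test(candidate)) return false;                  // floating-point number
  if (/^\d{1,2}\.\d{1,2}\.\d{2,4}$/.test(candidate)) return false; // date such as 12.05.2014
  return true;
}

function extractPhones(text) {
  return findCandidates(text).filter(isLikelyPhone);
}

const sample =
  "Call +1 234 345 6789 or 34.76.35.05.39. Pi is 3.14159265, ordered on 12.05.2014.";
const candidates = findCandidates(sample);
const phones = extractPhones(sample);
console.log(candidates.length, phones);
```

Here the first stage finds four candidates, and the second stage rejects the floating-point number and the date.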

Because some of the rules are strict, some genuine phone numbers are not recognized, but this is corrected with exceptions or an additional look at the context. Today the phone parser is checked against a set of 385 test strings that includes both positive and negative cases. A build is considered ready for production if at least 95% of the tests in the set pass.
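
The readiness check amounts to a pass rate compared against a threshold. In the sketch below only the 95% threshold comes from the text; evaluate, naiveRecognizer and the four test cases are invented for the illustration and have nothing to do with the real set of 385 strings.

```javascript
// Runs a recognizer over labeled cases and compares the pass rate
// with the readiness threshold.
function evaluate(isPhone, cases, threshold) {
  const passed = cases.filter((c) => isPhone(c.text) === c.expected).length;
  const rate = passed / cases.length;
  return { passed, total: cases.length, rate, ready: rate >= threshold };
}

// A deliberately naive stand-in recognizer: enough digits and not a float.
const naiveRecognizer = (text) =>
  text.replace(/\D/g, "").length >= 7 && !/^\d+\.\d+$/.test(text);

const cases = [
  { text: "+1 234 345 6789", expected: true },
  { text: "67 2354 9548", expected: true },
  { text: "3.14159265", expected: false },
  { text: "12.05.2014", expected: false }, // a date: the naive recognizer gets it wrong
];

const report = evaluate(naiveRecognizer, cases, 0.95);
console.log(report); // { passed: 3, total: 4, rate: 0.75, ready: false }
```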

Phone numbers are searched for in the contents of every text node of the document's DOM during a simple iterative breadth-first traversal of the tree. Step by step it looks roughly like this: a phone number found in the text is highlighted → when the cursor hovers over it, company information is requested from external services by phone number → the data received is shown in a popup → if nothing is found, a "Call" link is shown.
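
An iterative breadth-first traversal that collects text nodes can be sketched as follows. The code relies only on the standard nodeType, nodeName, nodeValue and childNodes properties, so the demo can use plain objects in place of a real DOM; skipping SCRIPT and STYLE elements is an assumption, not something the text states.

```javascript
const TEXT_NODE = 3;    // same value as Node.TEXT_NODE in the DOM
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE
const SKIPPED = ["SCRIPT", "STYLE"]; // assumption: never parse code or styles

function collectTextNodes(root) {
  const queue = [root];
  const found = [];
  while (queue.length > 0) {
    const node = queue.shift(); // first in, first out: breadth-first order
    if (node.nodeType === TEXT_NODE) {
      if (node.nodeValue.trim() !== "") found.push(node);
      continue;
    }
    if (SKIPPED.includes(node.nodeName)) continue;
    for (const child of Array.from(node.childNodes || [])) queue.push(child);
  }
  return found;
}

// Demo tree built from plain objects shaped like DOM nodes.
const text = (value) => ({ nodeType: TEXT_NODE, nodeName: "#text", nodeValue: value, childNodes: [] });
const el = (name, ...children) => ({ nodeType: ELEMENT_NODE, nodeName: name, childNodes: children });

const body = el("BODY",
  el("P", text("Sales: +1 234 345 6789"), el("SPAN", text("ext. 12"))),
  text("Fax: 67 2354 9548"),
  el("SCRIPT", text("var x = 1234567;"))
);

const order = collectTextNodes(body).map((node) => node.nodeValue);
console.log(order);
```

The shallow "Fax" node is visited before the deeper "Sales" and "ext." nodes, which is what distinguishes breadth-first from depth-first order.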

The data is fetched in the background, so it does not interfere with the page, and caching rules out repeated requests for the same phone number from other tabs and windows.

What if the data arrives via Ajax?

Parsing the page at load time is more or less clear, but what about phone numbers that arrive dynamically via Ajax or are simply generated by client-side scripts?

For this case we process new text data on the page using the MutationObserver interface, which is supported by all modern browsers (with the exception of Safari 5 for Windows). To subscribe to the events that occur when the DOM changes, we use the mutation-summary library, which makes it easy to track added DOM nodes. To reduce the amount of data processed, we subscribe only to new text nodes and try to parse their contents. It is interesting to watch the extension's behavior when a phone number is typed into a field and simultaneously displayed in another text node.
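
The sketch below shows the same idea with the raw MutationObserver record format instead of the mutation-summary library, which is a deliberate simplification. textNodesFromMutations and parse are hypothetical names, and the records in the demo are plain objects shaped like MutationRecord so that the snippet runs outside a browser.

```javascript
const TEXT_NODE = 3; // same value as Node.TEXT_NODE in the DOM

// Collects text nodes from a list of MutationRecord-like objects: nodes added
// to the tree (including text inside added elements) and changed text nodes.
function textNodesFromMutations(records) {
  const found = [];
  const visit = (node) => {
    if (node.nodeType === TEXT_NODE) found.push(node);
    else Array.from(node.childNodes || []).forEach(visit);
  };
  for (const record of records) {
    if (record.type === "childList") Array.from(record.addedNodes).forEach(visit);
    else if (record.type === "characterData") visit(record.target);
  }
  return found;
}

// In a content script the records would come from the browser, for example:
//   new MutationObserver((records) => parse(textNodesFromMutations(records)))
//     .observe(document.body, { childList: true, subtree: true, characterData: true });
// where parse is a hypothetical phone parser.

// Demo with plain objects shaped like DOM nodes and mutation records.
const text = (value) => ({ nodeType: TEXT_NODE, nodeValue: value, childNodes: [] });
const element = (...children) => ({ nodeType: 1, childNodes: children });

const records = [
  { type: "childList", addedNodes: [element(text("Tel: 67 2354 9548"))] },
  { type: "characterData", target: text("+1 234 345 6789") },
  { type: "attributes", target: element() }, // ignored: no new text
];
const changedTexts = textNodesFromMutations(records).map((node) => node.nodeValue);
console.log(changedTexts);
```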

Internationalization. We speak English

At the moment the extension is released in Russian and English; the language is chosen based on the current browser locale and its settings. While translating, we found that content scripts and additional scripts can reach the i18n component only through an asynchronous call to the background. This did not fit our idea of an internationalization interface, so we implemented a small module whose tasks include:

detecting the current locale;
loading the localization strings for that locale, either through a direct call to kango.io.getExtensionFileContents or through an asynchronous call if the module is instantiated in content or additional scripts;
providing a simple interface for getting localized messages with simple string interpolation.

Because of the potential asynchrony, we had to make it possible to initialize locales before other objects, since their constructors may contain calls that fetch localized texts. At the same time, we kept the locale format used in Kango, because in addition to kango.i18n, the locale files are partly used when building the extension packages.
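
The synchronous part of such a module can be sketched as follows. The message format (a flat key-to-string map per locale), the {name} placeholder syntax and all function names are assumptions made for the example; loading the messages, which may be asynchronous as described above, is left out, and the messages are passed in already loaded.

```javascript
// "ru-RU" -> "ru", "en_US" -> "en"
function baseLocale(tag) {
  return String(tag).toLowerCase().split(/[-_]/)[0];
}

// Returns a translate function bound to the messages of one locale.
function createI18n(messagesByLocale, localeTag, fallbackLocale) {
  const messages =
    messagesByLocale[baseLocale(localeTag)] || messagesByLocale[fallbackLocale] || {};
  return function t(key, params) {
    const known = Object.prototype.hasOwnProperty.call(messages, key);
    const template = known ? messages[key] : key; // unknown keys are returned as-is
    return template.replace(/\{(\w+)\}/g, (placeholder, name) =>
      params && name in params ? String(params[name]) : placeholder
    );
  };
}

const messagesByLocale = {
  en: { distance: "{km} km from you", call: "Call" },
  ru: { distance: "{km} км от вас", call: "Позвонить" },
};

const tEn = createI18n(messagesByLocale, "en-US", "en");
const tRu = createI18n(messagesByLocale, "ru-RU", "en");
const tDe = createI18n(messagesByLocale, "de-DE", "en"); // no German: falls back
console.log(tEn("distance", { km: 3 })); // 3 km from you
console.log(tRu("call"));                // Позвонить
console.log(tDe("call"));                // Call
```

Creating the translate function first and passing it to other objects is one way to guarantee that locales are initialized before any constructor asks for a localized string.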

Package assembly

As for the build itself, the framework provides an easy way to build the extension with its Python-based builder. To build, just run the kango.py script with the build parameter and the directory containing the extension files. The builder then creates the necessary unpacked builds and even packs the ones for Chrome and Firefox. Unfortunately, because of the complicated extension signing mechanism, the builder cannot package the extension for Safari out of the box. You can work around this by building the extension in Safari's Extension Builder after going through the whole certification process in the Safari developer center. Alternatively, you can use a patched xar archiver and your own certificates to build the extension automatically on any *nix system.

In addition to the Kango builder and a simple bash script that automates the Safari build, we use gulp to check the code style, run the tests, and perform a number of auxiliary steps. First, the version number of the current build has to be substituted into several files. Second, it is good to make sure that all debug output is disabled. Finally, we need to build special package variants. For example, addons.mozilla.org requires that the extension have no URL for automatic updates, on the grounds that it updates automatically from their site. On the other hand, we also publish the Firefox extension from our own source, so a variant with the update URL present is required as well. A number of localization changes are also required to publish the extension in the Opera store.

What's next

The world does not stand still, and 2GIS for browsers keeps growing and developing. Here is some of what we are going to do in the near future:

iterative improvement of phone recognition: we add more test cases and fix the parser, and life gets better and more fun;
embedding an interactive 2GIS map in the popup instead of a static image;
cosmetic changes so that our elements look good even on very poorly designed sites;
and, of course, various fixes and improvements; we look forward to your suggestions at extension@2gis.ru.

Thanks for your attention. Build browser extensions: it is not only fun, but also useful!
