MicVit December 30, 2011 at 11:13

Want to know what other Internet users are currently reading?

Then I propose to try the extension for Google Chrome, which allows you to fulfill this desire. It is called the “ Eye of the People ” and, by design, should show the most interesting of the network users currently reading.
In short, the essence of his work is as follows. The extension monitors browser user activity, logs information about where and when he “went”, and periodically sends data to the server. The server in real time integrates the received information, generates and maintains an ever-changing rating of popular content. This rating is returned to all users of the extension in the form of a short list of links. In fact, there is an exchange of visited Internet pages, allowing you to find out what is interesting to other people now.
The advantages of this method of sharing visited resources include the following.

Efficiency Everything happens almost in real time. Your visit to the web page may affect the current ranking after 10-15 minutes.
Objectivity. The attractiveness of content is evaluated by the system, excluding the subjective desires of the user.
Ease. The user himself does not need to do absolutely any special actions. Only get the result.

There are still potential advantages that may appear in the future, as well as certain disadvantages, but about them a little lower.
And before you try to install the extension or decide that you do not need it, let me tell you a little about why it is safe, how it works and what it was actually created for.

The idea behind this system is pretty simple. Therefore, when she came to my mind, I was sure that something similar probably already exists. However, after searching, I did not find anything close and decided to try to do it myself. Most likely I was just looking badly. Therefore, if someone shares a link to something similar, I will look with interest. In any case, the possible development of this system, which I have in my head, probably differs from anything existing, and this is also a little lower.

Security

Probably one of the first fair questions that may arise is as follows. But will it not turn out that this extension will take and begin to share confidential information that may be on the pages viewed by the user? I will try to answer.
First, the extension does not send any content from user-viewed content. Only the URLs of the pages visited are sent to the server, accompanied by additional information about user behavior. At the moment, this consists solely of the time spent on the corresponding resource. You can verify this by looking at the background.html file. Even if the extension wanted to do this, it would not be able to, because it does not contain content scripts without proper permissions. That is why during the installation of the extension you will not see a warning that it “may gain access to your personal data on all websites”.
Secondly, any address before being ranked is tested by the server for availability. Therefore, if a page is protected from free access, then even its address will definitely remain “between us”.
Thirdly, the rating construction algorithm is designed to minimize the likelihood of “random” pages appearing in the rating. I mean those that are formally open, but you would not want them to appear before the eyes of a wide range of viewers. The main thing that hinders the possibility of such an event is the decisive advantage of those pages that have been visited by many different people.
example of the 'Eye of the people' expansion window

example of the 'Eye of the people' expansion window

Nonetheless. At the time of writing this text, the extension has not yet been published in the public domain. The purpose of this story is precisely to attract a noticeable number of users immediately after its publication, providing it with a minimally meaningful performance. Therefore, one of you will be among the very first real users. The relative weight of user links, when their number is small, is understandably quite high. To mitigate the effect of distortions associated with the peculiarities of the initial period of use, I “tightened” some parameters that govern the likelihood of “highlighting” user links on the one hand, and on the other, so as not to be boring, I added the possibility of semi-random links not related to user activity. They can be observed in the list for the first time (example in the picture). The list itself has so far been made shorter. When a certain number of users is gathered, I will disconnect this external source, and the list will become more authentic.
Despite all these circumstances, if you want a 100% guarantee that certain pages will not appear in the list, you can first open them in incognito mode (in this mode, extensions are blocked by default).

How it works

A bit more about how the system works.
Two characteristics are used as a basis for assessing the popularity of a resource: the number of users who have visited it for a certain time elapsed since the current moment, as well as indicators of individual interest in it shown by each user. The index of interest is calculated as a nonlinear function of the time spent viewing a specific page. To consider only time is rather rude. But for the first version, I find it quite acceptable. In the future, this can be improved if we take into account additional characteristics of behavior, as well as features of the page itself.
how does the 'Eye of the people' work

The extension measures the times for each web page and periodically transfers the accumulated data to the server, receiving in response the current rating list. It is clear that the transferred data will be reflected in the rating at best for the next request. The interval between requests now is usually equal to five minutes, but this value is a variable that can vary under certain circumstances.
The server adds the data received from the user into his individual “piggy bank”. With a certain periodicity, the server “prints out” piggy banks of active users, calculates interest indicators, brings to its time scale and integrates data from different users, turning them into estimates of the popularity of each available page for each predetermined point in time. With another certain frequency, the server builds the rating. To do this, we analyze the array of accumulated “instant” estimates of popularity at a certain depth in the past. In the process of analysis, these ratings are summarized already in the final. This takes into account the number of users who visited the page, as well as the relevance of each “instant” rating (more recent - more relevant). The existing page list is sorted, clipped, checked, filtered and becomes current. Depth of view - a variable that depends on the current user activity. The range is from about a couple of hours to half a day.

Development

I hope that even in the form as it is now, this system will turn out to be interesting to someone.
But for me, its creation is only a small step, which will probably allow us to get closer to implementing an idea that interests me in fact.
Many of us read news daily, look for something fresh interesting for ourselves, based on our personal preferences. To do this, we use search, rss readers, friends' feeds, just run through your favorite sites, and also use a bunch of different ways. The flow of fresh content is rather big today and sometimes it takes a lot of time to filter it. Yes, there are many ways to configure this stream based on your preferences, but this does not always work, in itself still requiring time and effort.
In thinking about this problem, I somehow had the idea of a system that would study the behavior of a person on the network and, on the basis of the data obtained, would learn for its interests. Based on these interests, such a system will automatically build and modify, if necessary, a set of personal filters, which it will apply to the flow of constantly incoming fresh content. And the user will be offered already “selective” content, without requiring any special actions from him.
user group allocation

Now back to our system with a browser extension. In the process of her work, she constantly has a “hot” list of visited web pages. We can say that many elements of this list define the coordinate system in a multidimensional space, and the values of the corresponding indicators of interest determine the position of a point associated with a user-defined one in this space. If we succeed in distinguishing groups of points close to each other that are at the same time removed from the others, then we will obtain a separation of users into groups with similar interests in a certain sense. Then, if we build independent ratings within the framework of the selected user groups, we will already get a lot of lists focused on certain interests of people. And these lists will not be determined by some artificial categories such as “sport” or “politics”,
Then you can connect the analysis of the content of the pages directly. If you learn to classify content, it will become real to solve the problem of connecting the entire information stream, including that part that is not covered by users of the system.
Of course, you should not take the given model too literally with user-points in a multidimensional space. This is an illustration of the feasibility of the approach, and the implementation may vary slightly. Since this is still in my thoughts, I will not go into further details. The specifics should be based on data research, which I hope to receive through the development of the current system implementation.
It is quite possible that the results of this very current implementation will turn out to be banal and not interesting for you as an extension user. But maybe an understanding of the idea will encourage us to nevertheless shelter a peeping green eye for some time (by the way, if you right-click on the extension button, then there is the option “Hide button”).

About the disadvantages

The main drawback today is the lack of users. It is clear that the results of such a system will not be very reliable. For example, since this publication appeared on Habré, in the beginning the generated rating will turn out to be a kind of rating of articles from Habré. It is hoped that over time it will correct itself.
It is also obvious that since now the system does not preliminarily classify the URLs arriving to it, rather quickly the top lines of the rating will be occupied by large services such as GMail or the main pages of popular resources (for example, habrahabr again). I think that these cases will need to be gradually and purposefully filtered, perhaps organizing a separate rating. But for now, I want to see with you what happens.
It is clear that, by design, we would like to see in the generated list something interesting for us from what we might have missed for some reason. There is no guarantee that the proposed method will meet these expectations. And I propose not to consider this system as an alternative to anything, but only as another way to look at the flow of incoming information.
I have no idea how many people want to install an extension, what the load will be and how it will be distributed. Of course I tested the system, but it was an “artificial” mode. Therefore, I apologize in advance for possible “misunderstandings” if something goes wrong.
Now, before publishing the text, I can publish the extension. And I wish all readers good luck in the upcoming new year!

Tags:

Want to know what other Internet users are currently reading?

Security

How it works

Development

About the disadvantages

Also popular now: