matiouchkine March 31, 2011 at 10:44

Hello doctor, I have problems with karma

"90-60-90", or driving through the city by a traffic cop

I know quite a few services that have a notion of “karma”, and all of them compute the rating with some murky algorithm. Introducing even empirical nonlinear dependencies of this indicator on phases of the moon and the goodwill of other users leads to a noticeable number of problems. Those problems (and possible ways to solve them) are what I would like to speculate about today.

Traffic cop with a radar gun

First I will outline all the problems I can see, and then I will try to draw conclusions for all of them at once.

Problem 1. Weighting factors

Karma votes cannot simply be added and subtracted, as is done, for example, on the hub. This is fairly obvious: let's compare two users with the same karma (say, 50), Vasya and Petya. Vasya got his fifty from fifty-one pluses and one minus (a liked ÷ disliked ratio of roughly 50 ÷ 1). The veteran Petya has five hundred pluses and 450 minuses (that is, his fans and his opponents are split about equally). It seems quite clear to me that Vasya's karma should be higher than Petya's, despite his youth and recent registration.
But if you just bluntly normalize karma by the total number of votes, you don't get quite what we expect either: Vasya turns out about 50 times cooler than Petya, even though roughly 20 times more people have voted on Petya and he still managed to stay in the black. Besides, what if there are some three hundred evil, stupid trolls who downvote everyone indiscriminately, and they found Petya long ago but have not reached Vasya yet?
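The arithmetic behind the Vasya and Petya example can be checked with a minimal sketch. The vote counts come from the text; the function names (`raw_karma`, `share_normalized`, `like_ratio`) are made up for illustration and are not anything the service actually exposes.

```python
# Vote counts taken from the Vasya/Petya example above.
users = {
    "Vasya": {"plus": 51, "minus": 1},
    "Petya": {"plus": 500, "minus": 450},
}


def raw_karma(plus, minus):
    """Plain sum: every plus counts +1, every minus counts -1."""
    return plus - minus


def share_normalized(plus, minus):
    """Karma divided by the total number of votes cast on the user."""
    return (plus - minus) / (plus + minus)


def like_ratio(plus, minus):
    """Liked / disliked ratio."""
    return plus / minus


for name, votes in users.items():
    print(
        name,
        raw_karma(**votes),
        round(share_normalized(**votes), 3),
        round(like_ratio(**votes), 2),
    )
```

Both users end up with a raw karma of 50, while either normalization puts Vasya far ahead of Petya, even though Petya has collected about 18 times as many votes (950 against 52).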

Problem 2. Objective and subjective indicators

Indicators can be purely objective (98% of those who saw the teaser went under the cut) or purely subjective (Vasya went to Petya's profile and dumped a minus into his karma), but the overwhelming majority are something in between. For example, the number of users who bookmarked a post is a statistically more or less objective indicator, while the number of ± votes on an article is closer to subjective even with a large sample of voters. In addition, when calculating subjective parameters it makes sense to account for the voter's own tendency: if someone spends their entire allotted resource on minuses, their minuses should lose value and their pluses, on the contrary, should gain value, and vice versa.
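One possible way to account for the voter's tendency is sketched below. The formula is an assumption chosen purely for illustration (an inverse proportion with add-one smoothing so that new voters do not cause a division by zero); the text does not prescribe any particular one, and `vote_weight` is a hypothetical name.

```python
def vote_weight(sign, pluses_cast, minuses_cast):
    """Weight of a single vote, given the voter's own voting history.

    sign         -- +1 for a plus, -1 for a minus
    pluses_cast  -- how many pluses this voter has given so far
    minuses_cast -- how many minuses this voter has given so far

    Assumed formula: the more votes of the same kind the voter has
    already cast, the less one more such vote is worth. A voter with a
    balanced (or empty) history gets a weight of exactly 1.
    """
    same_kind = pluses_cast if sign > 0 else minuses_cast
    total = pluses_cast + minuses_cast
    return (total + 2) / (2 * (same_kind + 1))


# A habitual downvoter: 99 minuses and a single plus so far.
print(vote_weight(-1, pluses_cast=1, minuses_cast=99))  # below 1: the minus is devalued
print(vote_weight(+1, pluses_cast=1, minuses_cast=99))  # above 1: the rare plus is worth more
```

With this choice a voter who gives pluses and minuses equally often leaves every vote at weight 1, so only a skewed history changes anything.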

Problem 3. Obvious and non-obvious indicators

I would call the number of times a note has been bookmarked an obvious indicator. It is an important parameter, but it cannot be used directly: some people bookmark every second note, others bookmark one a year. So it makes sense to use a parameter normalized by the user's tendency to bookmark, something like a normalized inverse bookmarking frequency. Then Vasya, who has been on the service for a week and is bookmarking for the first time, and Petya, who has been on the service for a year and is making his fiftieth bookmark, will contribute equally to karma, while Kolya (a month on the service, 30 bookmarks, “Total Recall Syndrome”) will contribute 7 times less.
A non-obvious indicator would be, say, the “atmosphere” in the comments (the total number of pluses and minuses). For example, a note whose comments contain not a single minus (or negligibly few) is probably more useful than a holy war. The total number of comments, however, cannot serve as a parameter at all, it seems to me.
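The bookmark example can be reproduced with a minimal sketch, assuming the “inverse frequency” is simply the number of days on the service per bookmark made. The function name and the choice of days as the unit are assumptions.

```python
def bookmark_weight(days_on_service, bookmarks_made):
    """Inverse bookmarking frequency: days on the service per bookmark.

    bookmarks_made includes the bookmark being counted right now,
    so it is always at least 1.
    """
    return days_on_service / bookmarks_made


vasya = bookmark_weight(days_on_service=7, bookmarks_made=1)     # a week, first bookmark
petya = bookmark_weight(days_on_service=365, bookmarks_made=50)  # a year, fiftieth bookmark
kolya = bookmark_weight(days_on_service=30, bookmarks_made=30)   # a month, thirtieth bookmark

print(vasya, petya, kolya)
print(vasya / kolya)  # Kolya's bookmark weighs 7 times less than Vasya's
```

Vasya and Petya come out nearly equal (7 and 7.3 days per bookmark), while Kolya's one bookmark a day weighs 7 times less, as in the text.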

Problem 4. Karma vs. Rating

There are two orthogonal indicators: karma and rating. Karma is a function of subjective, non-obvious indicators; rating, on the contrary, is a function of objective, obvious ones. It seems clear to me that the former is only good for contests in measuring the length of one's reproductive organ, while only objective, obvious indicators can be used to determine a user's “authority”.

Entities

I was able to come up with the following entities by which to evaluate:

post
comment
action (± and similar)
total activity

All subjective ratings should be normalized both per user (the value of a rating is inversely proportional to that user's balance of such ratings) and by their total number.
Now let's try to bring it all together.

Posts and comments

Posts and comments can be rated and/or bookmarked; in addition, they can be linked to both from outside and from inside the hub. A comment (and the ratings attached to it) weighs less than a post.

Actions and other activities

These two parameters can only be measured.

Summing up

Actually, I don't have a silver bullet. The coefficients need empirical fitting. But I believe the formulas for calculating the two parameters I mentioned should look something like this:

Karma = NORM_all(∑ NORM_this(±_{to karma}))

Rating = NORM_all(∑ NORM_this(±_{for topics})) + ⅓ × NORM_all(∑ NORM_this(±_{for comments})) + F(number of external links) + ⅓ × F(number of internal links) + ∑ NORM_this(in bookmarks)

The weight of each term in this formula has to be tuned by hand.
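To make the shape of the two formulas concrete, here is a rough sketch. Everything the text leaves open is filled in with placeholder assumptions: `norm_all` divides by the largest absolute total on the site, `f_links` is a logarithm, and the per-vote NORM_this weights are expected to be computed beforehand. None of these choices come from the text; only the structure and the ⅓ factors do.

```python
import math


def norm_all(value, site_totals):
    """Placeholder NORM_all: scale by the largest absolute total on the site."""
    peak = max((abs(v) for v in site_totals), default=0)
    return value / peak if peak else 0.0


def f_links(count):
    """Placeholder F: dampen the link count logarithmically."""
    return math.log1p(count)


def karma(weighted_karma_votes, site_karma_totals):
    """Karma = NORM_all(sum of NORM_this-weighted karma votes)."""
    return norm_all(sum(weighted_karma_votes), site_karma_totals)


def rating(topic_votes, comment_votes, external_links, internal_links,
           bookmark_weights, site_topic_totals, site_comment_totals):
    """Rating as a sum of the five terms from the formula.

    topic_votes, comment_votes and bookmark_weights are lists of values
    that are already NORM_this-weighted for each voter.
    """
    return (
        norm_all(sum(topic_votes), site_topic_totals)
        + norm_all(sum(comment_votes), site_comment_totals) / 3
        + f_links(external_links)
        + f_links(internal_links) / 3
        + sum(bookmark_weights)
    )


print(rating([1.0, 0.5, -0.2], [0.3], 2, 1, [7.0, 1.0],
             site_topic_totals=[1.3, 10.0], site_comment_totals=[0.3, 4.0]))
```

As the text notes, the individual coefficients would still have to be fitted empirically; the sketch only fixes how the terms are combined.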

Constructive as well as destructive criticism is very welcome. I am sure I have missed something important, and I hope to improve the method with your help. Unfortunately, I am not able to go through all the posts on the hub and compute these numbers by hand (whereas with direct access to the database it is a couple of queries). If you could persuade the administration (no, not to modify the existing system!) to compute these two values for all users, that would be cool. Well, I think so.
