Expert Assessment Methods

    We often have to choose among many alternatives, each with its own advantages. But how do you pick the best one when you have the opinions of dozens, or even hundreds, of experts?


    Calculating the rating of a computer game from critics' scores for graphics, gameplay and plot, and collectively choosing a priority task before the customer shows up, are both applications of expert assessment methods.

    A brief primer


    Expert assessment methods are part of the broad field of decision theory. Expert assessment itself is a procedure for obtaining an evaluation of a problem from the opinions of specialists (experts) in order to then make a decision (a choice).
    When a problem is extremely complex or novel, when the available information is insufficient, or when the solution process cannot be formalized mathematically, one has to turn to the recommendations of competent specialists who know the problem well: the experts. Their work on the problem, their argumentation, the formation of quantitative estimates and the processing of those estimates by formal methods are together called the method of expert assessments.

    There are two groups of expert assessments:
    1. Individual assessments use the opinions of individual experts who are independent of each other.
    2. Collective assessments use the collective opinion of a group of experts.

    Roughly speaking, the first group includes rating articles on Habr, voting in polls and the like, where each expert decides on his own. Experts are selected (or screened out) through karma. It is the first group that prevails on the Web 2.0 internet, because it can reach more experts.

    Methods for measuring objects
    1. Ranking is the arrangement of objects in ascending or descending order of some property they share. Ranking makes it possible to pick the most significant factors from the set under study.
    2. Pairwise comparison establishes a preference between objects by comparing all possible pairs. Unlike ranking, it does not require ordering all the objects; it only requires identifying the more significant object in each pair, or establishing that the two are equal.
    3. Direct assessment. It is often desirable not only to order (rank) the objects of analysis, but also to determine how much more significant one factor is than the others. In this case the range over which the object's characteristic varies is divided into intervals, each of which is assigned a certain score, for example from 0 to 10. That is why the direct assessment method is sometimes called the point method.

    In the simple ranking method, each expert is asked to arrange the attributes in order of preference.

    Here $a_{ij}$ is the score given to the i-th attribute by the j-th expert, n is the number of attributes and m is the number of experts.
    Then $S_i$, the average importance of the i-th attribute, is calculated.
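
    The averaging step can be sketched as follows. This is a minimal illustration that assumes $S_i$ is the plain arithmetic mean of the experts' scores, $S_i = \frac{1}{m}\sum_{j=1}^{m} a_{ij}$; the data, the `scores[j][i]` layout (expert first, attribute second) and the function name are hypothetical.

```python
from statistics import mean

# Hypothetical data: scores[j][i] is the rank given by expert j to attribute i
# (1 = most important). Three experts, four attributes.
scores = [
    [1, 2, 3, 4],
    [2, 1, 3, 4],
    [1, 3, 2, 4],
]

def average_importance(scores):
    """Return S_i, the mean score of each attribute over all experts."""
    n = len(scores[0])  # number of attributes
    return [mean(expert[i] for expert in scores) for i in range(n)]

S = average_importance(scores)
# Attributes ordered from most to least important (lowest mean rank first).
order = sorted(range(len(S)), key=lambda i: S[i])
print(S, order)
```

    With ranks as scores, a lower $S_i$ means a more important attribute; with point scores the sort order would be reversed.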

    Weighting method: the weight coefficients $a_{ij}$ are assigned in one of two ways:
    1. all attributes are assigned weight coefficients so that the coefficients sum to some fixed number (for example, one, ten or one hundred);
    2. the most important attribute is given a weight coefficient equal to some fixed number, and all the others get coefficients equal to fractions of that number.
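
    The two ways of assigning weights can be illustrated with a small sketch. It assumes the expert starts from arbitrary positive raw importance scores; the function names and the sample numbers are hypothetical.

```python
def normalize_to_sum(raw, total=1.0):
    """Way 1: rescale raw weights so that they add up to `total`."""
    s = sum(raw)
    return [total * w / s for w in raw]

def relative_to_top(raw, top=1.0):
    """Way 2: the most important attribute gets `top`, the rest get fractions of it."""
    biggest = max(raw)
    return [top * w / biggest for w in raw]

raw = [5, 3, 2]  # hypothetical raw importance scores from one expert
w1 = normalize_to_sum(raw, total=100)  # weights that sum to one hundred
w2 = relative_to_top(raw, top=1.0)     # weights as fractions of the top one
print(w1, w2)
```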

    The method of successive comparisons works as follows:
    1. the expert arranges all the attributes in decreasing order of significance: A1 > A2 > ... > An;
    2. the expert assigns the value one to the first attribute, A1 = 1, and assigns weight coefficients in fractions of one to the remaining attributes;
    3. the expert compares the value of the first attribute with the sum of all subsequent ones.


    In pairwise comparison there is no need to order all the objects, as in ranking; it is enough to identify the more significant object in each pair, or to establish that they are equal. Pairwise comparison can be used with a large number of objects, and also when the differences between objects are so small that ranking them is practically impossible.
    When the method is used, an n x n matrix is most often compiled, where n is the number of objects being compared.

    When comparing objects, the matrix is filled with elements $a_{ij}$ as follows (other filling schemes are possible):
    • 2 if object i is preferable to object j (i > j),
    • 1 if objects i and j are judged equal (i = j),
    • 0 if object j is preferable to object i (i < j).
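
    The 2 / 1 / 0 filling scheme can be sketched as follows. The `prefer` callback and the hidden `quality` list stand in for a real expert's judgments and are purely hypothetical; summing each row to get an overall score is one common way to use the matrix, not something prescribed here.

```python
def pairwise_matrix(prefer, n):
    """Build the n x n matrix with the 2 / 1 / 0 scheme.

    prefer(i, j) returns a positive number if object i is preferred to j,
    a negative number if j is preferred to i, and 0 if they are equal.
    """
    a = [[1] * n for _ in range(n)]  # diagonal: an object equals itself
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            p = prefer(i, j)
            a[i][j] = 2 if p > 0 else 0 if p < 0 else 1
    return a

# Hypothetical expert: a hidden "quality" per object; objects 1 and 2 tie.
quality = [3, 2, 2, 1]
a = pairwise_matrix(lambda i, j: quality[i] - quality[j], len(quality))
row_sums = [sum(row) for row in a]  # a higher sum means a more preferred object
print(a, row_sums)
```

    Note that with this scheme $a_{ij} + a_{ji} = 2$ for every pair, so only the upper triangle actually has to be elicited from the expert.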

    Direct assessment assigns each object a score, for example from 0 to 10, according to the interval into which its characteristic falls. It therefore shows not only the order of the objects but also how much more significant one factor is than the others. This is why it is also called the point method.
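
    Dividing a range into scored intervals, as the point method does, might look like the sketch below. It assumes equal-width intervals and clamps out-of-range values; the function name `to_score` and its parameters are hypothetical.

```python
def to_score(value, low, high, max_score=10):
    """Map a value in [low, high] to an integer score 0..max_score.

    The range is split into max_score + 1 equal intervals; values outside
    the range are clamped to the nearest end.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    value = min(max(value, low), high)
    width = (high - low) / (max_score + 1)
    return min(int((value - low) / width), max_score)

print(to_score(50, 0, 100))
```

    In practice the intervals are often set by the experts themselves and need not be of equal width.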



    And now for the tastiest part...

    Analysis of the results of expert assessments


    Various methods of mathematical statistics are used to analyze the results. They can be combined, and they vary depending on the type of task and the desired result.

    Formation of a generalized assessment


    So, let a group of m experts evaluate an object, and let $x_j$ be the estimate of the j-th expert, j = 1, ..., m.
    To form the group's generalized assessment, averages are most often used. One example is the median: the estimate for which the number of larger estimates equals the number of smaller ones.
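
    A short sketch of forming a generalized assessment with Python's standard `statistics` module; the list of estimates is hypothetical. It also shows why the median is popular here: one expert's outlying score barely moves it, while the mean shifts noticeably.

```python
from statistics import mean, median

estimates = [7, 8, 8, 9, 2]  # hypothetical scores x_j from m = 5 experts

group_median = median(estimates)  # middle value of the sorted estimates
group_mean = mean(estimates)      # pulled down by the single low score
print(group_median, group_mean)
```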

    Determining the relative weights of objects

    Sometimes it is necessary to determine how important (significant) a particular factor (object) is with respect to some criterion. In this case one says that the weight of each factor must be determined. This differs from forming a generalized assessment in that what is determined is not an overall assessment of the object but an assessment of each of its attributes.
    There are also a huge number of other possible methods for processing estimates, for example:

    • Analytic hierarchy process
    • Condorcet paradox
    • Borda rule
    • ELECTRE

    For the pairwise comparison method, the Elo rating system is another option.
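
    As an illustration of using the Elo rating system with pairwise comparisons, each expert judgment "A is better than B" can be treated as a game won by A. The sketch below uses the standard Elo update; the K-factor of 32 and the starting rating of 1500 are conventional choices assumed here, not values taken from this article.

```python
def elo_update(r_a, r_b, score_a, k=32):
    """Return the new ratings of A and B after one comparison.

    score_a is 1 if A was preferred, 0 if B was preferred, 0.5 for a tie.
    """
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

# Two objects start equal; one expert prefers A.
new_a, new_b = elo_update(1500, 1500, 1)
print(new_a, new_b)
```

    The total rating is conserved, and beating a higher-rated object earns more than beating a lower-rated one.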

    Moreover, the final result may come from several algorithms intertwined with one another. For example, an algorithm that computes an expert's competence coefficient may affect how that expert's assessment enters the average.

    Establishing a degree of agreement among experts


    If several experts take part in a survey, discrepancies in their estimates are inevitable; what matters is the size of the discrepancy. A group assessment can be considered sufficiently reliable only if the answers of the individual specialists agree well.
    To analyze the scatter and consistency of estimates, statistical characteristics called measures of spread (statistical variation) are used.
    The measures of spread are the following:
    • Variation range
    • Average linear deviation
    • Standard (root-mean-square) deviation
    • Variance
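
    These four measures can be computed with the standard library, as in the sketch below. It assumes the population forms (dividing by m rather than m - 1) for the standard deviation and variance; if the sample forms are wanted, `statistics.stdev` and `statistics.variance` can be used instead. The function name and data are hypothetical.

```python
from statistics import mean, pstdev, pvariance

def spread_measures(x):
    """Return the four spread measures for a list of expert estimates."""
    x_bar = mean(x)
    return {
        "range": max(x) - min(x),                          # x_max - x_min
        "mean_abs_dev": mean(abs(v - x_bar) for v in x),   # average |x_j - mean|
        "std": pstdev(x),                                  # sqrt of the variance
        "variance": pvariance(x),                          # average (x_j - mean)^2
    }

estimates = [6, 7, 8, 9, 10]  # hypothetical scores from five experts
result = spread_measures(estimates)
print(result)
```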

    Spearman's rank correlation coefficient
    $$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}, \qquad d_i = x_{ij} - x_{ik}$$
    The coefficient $\rho$ can vary from -1 to +1. If the estimates coincide completely, the coefficient equals one. A value of minus one is observed when the experts' opinions diverge as much as possible.
    Here $x_{ij}$ is the rank (importance) assigned to the i-th object by the j-th expert, $x_{ik}$ is the rank assigned to the i-th object by the k-th expert, $d_i$ is the difference between the ranks assigned to the i-th object, and n is the number of objects.
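
    A minimal sketch of the Spearman coefficient for two experts, assuming the usual formula for rankings without ties, $\rho = 1 - 6\sum d_i^2 / (n(n^2 - 1))$. The function name and the sample rankings are hypothetical.

```python
def spearman_rho(ranks_j, ranks_k):
    """Spearman's rank correlation between two experts' rankings (no ties)."""
    n = len(ranks_j)
    if n != len(ranks_k) or n < 2:
        raise ValueError("need two rankings of the same length, n >= 2")
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_j, ranks_k))  # sum of d_i^2
    return 1 - 6 * d2 / (n * (n * n - 1))

same = spearman_rho([1, 2, 3, 4], [1, 2, 3, 4])      # full agreement
opposite = spearman_rho([1, 2, 3, 4], [4, 3, 2, 1])  # opposite rankings
close = spearman_rho([1, 2, 3, 4], [2, 1, 3, 4])     # one swapped pair
print(same, opposite, close)
```

    When rankings contain ties, this shortcut formula is no longer exact, and the coefficient is usually computed as the Pearson correlation of the mid-ranks instead.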

    Kendall's coefficient of concordance
    The coefficient can range from 0 to 1. With full agreement of the experts' opinions the concordance coefficient equals one, and with complete disagreement it equals zero. In practice, partial agreement of expert opinions is the most realistic case.
    Calculation
    1. The average rank of the set of attributes is determined.
    2. The deviation $d_j$ of the average rank of the j-th attribute from the average rank of the set is calculated.
    3. The number of identical ranks assigned by the experts to the j-th attribute, $t_q$, is determined.
    4. The number of groups of identical ranks, Q, is determined.
    5. The coefficient of concordance is determined from these quantities.
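
    The calculation can be sketched in code. This is a minimal sketch assuming the common textbook form based on rank sums, $W = \frac{12 S}{m^2 (n^3 - n) - m \sum T}$, where S is the sum of squared deviations of the attributes' rank sums from their mean and $T = \sum_q (t_q^3 - t_q)$ is taken over the groups of tied ranks in each expert's ranking. The function name and the `ranks[i][j]` layout (expert i, attribute j) are hypothetical.

```python
from collections import Counter

def kendall_w(ranks):
    """Kendall's W; ranks[i][j] is the rank expert i gives attribute j.

    Tied objects are expected to carry mid-ranks (e.g. 2.5 and 2.5).
    """
    m, n = len(ranks), len(ranks[0])
    rank_sums = [sum(expert[j] for expert in ranks) for j in range(n)]
    mean_sum = m * (n + 1) / 2  # average rank sum over the attributes
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    # Tie correction: for each expert, sum t^3 - t over groups of equal ranks
    # (groups of size 1 contribute zero).
    t_total = sum(
        t ** 3 - t
        for expert in ranks
        for t in Counter(expert).values()
    )
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * t_total)

full = kendall_w([[1, 2, 3], [1, 2, 3], [1, 2, 3]])  # identical rankings
none = kendall_w([[1, 2, 3], [3, 2, 1]])             # opposite rankings
tied = kendall_w([[1, 2.5, 2.5], [1, 2.5, 2.5]])     # identical, with ties
print(full, none, tied)
```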


    Speaking of the consistency of expert opinions, it is worth noting that a ranking does not imply (or does not always imply) distance. That is, for one expert A > B > C may mean A >> B > C, and for another A > B >> C. No amount of correlation or average-rating calculation will help here. Alternatively, one can compute a consistency index: something like the ratio of the number of conflicting closed chains of expert opinions (the first expert believes that A is better than B, the second that B is better than C, and the third that C is better than A) to the number of all such chains.
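
    One possible reading of such a consistency index is sketched below: take the majority preference for every pair of objects and measure the share of object triples in which those preferences form a cycle. This interpretation, the function names and the `rankings[e][obj]` layout are assumptions made for illustration only.

```python
from itertools import combinations

def prefers(rankings, a, b):
    """Net number of experts ranking a above b (rank 1 = best)."""
    return sum((r[a] < r[b]) - (r[a] > r[b]) for r in rankings)

def cyclic_share(rankings):
    """Share of object triples whose majority preferences form a cycle."""
    n = len(rankings[0])
    triples = list(combinations(range(n), 3))
    cycles = 0
    for a, b, c in triples:
        ab = prefers(rankings, a, b)
        bc = prefers(rankings, b, c)
        ca = prefers(rankings, c, a)
        # A cycle is a > b > c > a, or the same loop in the other direction.
        if (ab > 0 and bc > 0 and ca > 0) or (ab < 0 and bc < 0 and ca < 0):
            cycles += 1
    return cycles / len(triples) if triples else 0.0

# Objects A, B, C are indices 0, 1, 2; each row holds one expert's ranks.
conflict = cyclic_share([[1, 2, 3], [3, 1, 2], [2, 3, 1]])  # A>B>C, B>C>A, C>A>B
agreed = cyclic_share([[1, 2, 3], [1, 2, 3], [1, 2, 3]])
print(conflict, agreed)
```

    A share of 0 means the group's pairwise preferences are transitive; the closer it is to 1, the more the experts contradict one another.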

    Ratings are usually based on some probabilistic model, so the area where they can be applied needs careful consideration.

    Conclusion


    The article does not claim to be a complete, multi-stage analysis of evaluation methods and algorithms; it is only a superficial overview. So if you know applicable methods and algorithms that I have not described, I will gladly add them to the article. The same goes for any useful literature on the subject.

    With that, I take my leave. Happy holidays to all, ramen.
