Analysis of the Quran with AI

    I apologize for the possibly clickbait title, so I will get straight to the point. At work I needed to test a system that performs qualitative text analysis using various classifiers, such as gender, sentiment (mood), age, and so on. As one of the test samples I decided to take a surah from the Quran, and then I went on to analyze the entire text of the book.

    Watson


    At first I wanted to run the Quran through the well-known Watson, especially since the service can analyze Arabic text. Watson rejected the first surah because it contains too little text, so I used the second one. The results were informative but not quite what I needed, since Watson does not produce estimates for sentiment, gender, or age.

    Having processed Surah Al-Baqarah, Watson returned numerical scores for the main personality characteristics of the text's author, which can be viewed in JSON format.

    The resulting portrait: an absolute introvert (almost withdrawn), with a narrow emotional range and pronounced impulsiveness. Ready to try new things. Disciplined and conscientious. Altruism, modesty, sympathy, and kindness are not detected. Highly trusting. Expresses a strong need for love, harmony, closeness, and structure in daily routine. Values tradition and achievement within generally accepted social standards. Not interested in helping others or in enjoying life.

    JSON
    (The warning about 285 words is a Watson bug, possibly related to how it handles Arabic script. There are actually 285 verses, and far more words.)
    {
      "id": "*UNKNOWN*",
      "source": "*UNKNOWN*",
      "tree": {
        "id": "r",
        "name": "root",
        "children": [
          {
            "id": "personality",
            "name": "Big 5",
            "children": [
              {
                "id": "Extraversion_parent",
                "name": "Extraversion",
                "category": "personality",
                "percentage": 0,
                "children": [
                  {
                    "id": "Openness",
                    "name": "Openness",
                    "category": "personality",
                    "percentage": 0.7401994172490934,
                    "sampling_error": 0.0632961745,
                    "children": [
                      {
                        "id": "Adventurousness",
                        "name": "Adventurousness",
                        "category": "personality",
                        "percentage": 0.99,
                        "sampling_error": 0.0531619985
                      },
                      {
                        "id": "Artistic interests",
                        "name": "Artistic interests",
                        "category": "personality",
                        "percentage": 0.01376496058079709,
                        "sampling_error": 0.1084097325
                      },
                      {
                        "id": "Emotionality",
                        "name": "Emotionality",
                        "category": "personality",
                        "percentage": 0,
                        "sampling_error": 0.049707709
                      },
                      {
                        "id": "Imagination",
                        "name": "Imagination",
                        "category": "personality",
                        "percentage": 0.99,
                        "sampling_error": 0.0672327285
                      },
                      {
                        "id": "Intellect",
                        "name": "Intellect",
                        "category": "personality",
                        "percentage": 0.99,
                        "sampling_error": 0.0588966455
                      },
                      {
                        "id": "Liberalism",
                        "name": "Authority-challenging",
                        "category": "personality",
                        "percentage": 0.99,
                        "sampling_error": 0.0869470685
                      }
                    ]
                  },
                  {
                    "id": "Conscientiousness",
                    "name": "Conscientiousness",
                    "category": "personality",
                    "percentage": 0.9123376288115044,
                    "sampling_error": 0.079482993,
                    "children": [
                      {
                        "id": "Achievement striving",
                        "name": "Achievement striving",
                        "category": "personality",
                        "percentage": 0.9259149497833751,
                        "sampling_error": 0.102722867
                      },
                      {
                        "id": "Cautiousness",
                        "name": "Cautiousness",
                        "category": "personality",
                        "percentage": 0.99,
                        "sampling_error": 0.09507552899999999
                      },
                      {
                        "id": "Dutifulness",
                        "name": "Dutifulness",
                        "category": "personality",
                        "percentage": 0.009847553196768501,
                        "sampling_error": 0.063380138
                      },
                      {
                        "id": "Orderliness",
                        "name": "Orderliness",
                        "category": "personality",
                        "percentage": 0.00951030345654535,
                        "sampling_error": 0.0730250425
                      },
                      {
                        "id": "Self-discipline",
                        "name": "Self-discipline",
                        "category": "personality",
                        "percentage": 0.7957300993092047,
                        "sampling_error": 0.048516363
                      },
                      {
                        "id": "Self-efficacy",
                        "name": "Self-efficacy",
                        "category": "personality",
                        "percentage": 0.031706902665228645,
                        "sampling_error": 0.096044686
                      }
                    ]
                  },
                  {
                    "id": "Extraversion",
                    "name": "Extraversion",
                    "category": "personality",
                    "percentage": 0,
                    "sampling_error": 0.059340869500000004,
                    "children": [
                      {
                        "id": "Activity level",
                        "name": "Activity level",
                        "category": "personality",
                        "percentage": 0,
                        "sampling_error": 0.0810636685
                      },
                      {
                        "id": "Assertiveness",
                        "name": "Assertiveness",
                        "category": "personality",
                        "percentage": 0,
                        "sampling_error": 0.0866103315
                      },
                      {
                        "id": "Cheerfulness",
                        "name": "Cheerfulness",
                        "category": "personality",
                        "percentage": 0,
                        "sampling_error": 0.10896341150000001
                      },
                      {
                        "id": "Excitement-seeking",
                        "name": "Excitement-seeking",
                        "category": "personality",
                        "percentage": 0,
                        "sampling_error": 0.083409996
                      },
                      {
                        "id": "Friendliness",
                        "name": "Outgoing",
                        "category": "personality",
                        "percentage": 0.004154940124177926,
                        "sampling_error": 0.078376899
                      },
                      {
                        "id": "Gregariousness",
                        "name": "Gregariousness",
                        "category": "personality",
                        "percentage": 0.006610468323581309,
                        "sampling_error": 0.059563857
                      }
                    ]
                  },
                  {
                    "id": "Agreeableness",
                    "name": "Agreeableness",
                    "category": "personality",
                    "percentage": 0.99,
                    "sampling_error": 0.100387345,
                    "children": [
                      {
                        "id": "Altruism",
                        "name": "Altruism",
                        "category": "personality",
                        "percentage": 0.00835173524008939,
                        "sampling_error": 0.073512979
                      },
                      {
                        "id": "Cooperation",
                        "name": "Cooperation",
                        "category": "personality",
                        "percentage": 0.99,
                        "sampling_error": 0.0826257435
                      },
                      {
                        "id": "Modesty",
                        "name": "Modesty",
                        "category": "personality",
                        "percentage": 0,
                        "sampling_error": 0.058549201499999995
                      },
                      {
                        "id": "Morality",
                        "name": "Uncompromising",
                        "category": "personality",
                        "percentage": 0.99,
                        "sampling_error": 0.06559944549999999
                      },
                      {
                        "id": "Sympathy",
                        "name": "Sympathy",
                        "category": "personality",
                        "percentage": 0.00899654205884867,
                        "sampling_error": 0.101299643
                      },
                      {
                        "id": "Trust",
                        "name": "Trust",
                        "category": "personality",
                        "percentage": 0.99,
                        "sampling_error": 0.059132582
                      }
                    ]
                  },
                  {
                    "id": "Neuroticism",
                    "name": "Emotional range",
                    "category": "personality",
                    "percentage": 0.12553186654101073,
                    "sampling_error": 0.094767615,
                    "children": [
                      {
                        "id": "Anger",
                        "name": "Fiery",
                        "category": "personality",
                        "percentage": 0.009857823702785023,
                        "sampling_error": 0.0976305695
                      },
                      {
                        "id": "Anxiety",
                        "name": "Prone to worry",
                        "category": "personality",
                        "percentage": 0.10522549628333466,
                        "sampling_error": 0.0574906605
                      },
                      {
                        "id": "Depression",
                        "name": "Melancholy",
                        "category": "personality",
                        "percentage": 0.0012238948047045572,
                        "sampling_error": 0.061626443999999996
                      },
                      {
                        "id": "Immoderation",
                        "name": "Immoderation",
                        "category": "personality",
                        "percentage": 0.25656958950189773,
                        "sampling_error": 0.0550395485
                      },
                      {
                        "id": "Self-consciousness",
                        "name": "Self-consciousness",
                        "category": "personality",
                        "percentage": 0.06392969963372698,
                        "sampling_error": 0.0593781605
                      },
                      {
                        "id": "Vulnerability",
                        "name": "Susceptible to stress",
                        "category": "personality",
                        "percentage": 0.10113758876238299,
                        "sampling_error": 0.088768721
                      }
                    ]
                  }
                ]
              }
            ]
          },
          {
            "id": "needs",
            "name": "Needs",
            "children": [
              {
                "id": "Ideal_parent",
                "name": "Ideal",
                "category": "needs",
                "percentage": 0.003832960708229936,
                "children": [
                  {
                    "id": "Challenge",
                    "name": "Challenge",
                    "category": "needs",
                    "percentage": 0.6100166548928185,
                    "sampling_error": 0.086264993
                  },
                  {
                    "id": "Closeness",
                    "name": "Closeness",
                    "category": "needs",
                    "percentage": 0.8251348807632928,
                    "sampling_error": 0.08506778699999999
                  },
                  {
                    "id": "Curiosity",
                    "name": "Curiosity",
                    "category": "needs",
                    "percentage": 0.6427034487726155,
                    "sampling_error": 0.1232055355
                  },
                  {
                    "id": "Excitement",
                    "name": "Excitement",
                    "category": "needs",
                    "percentage": 0.005544228138235261,
                    "sampling_error": 0.11254523300000001
                  },
                  {
                    "id": "Harmony",
                    "name": "Harmony",
                    "category": "needs",
                    "percentage": 0.99,
                    "sampling_error": 0.112534116
                  },
                  {
                    "id": "Ideal",
                    "name": "Ideal",
                    "category": "needs",
                    "percentage": 0.003832960708229936,
                    "sampling_error": 0.10201695250000001
                  },
                  {
                    "id": "Liberty",
                    "name": "Liberty",
                    "category": "needs",
                    "percentage": 0.5752122746131392,
                    "sampling_error": 0.1490213055
                  },
                  {
                    "id": "Love",
                    "name": "Love",
                    "category": "needs",
                    "percentage": 0.99,
                    "sampling_error": 0.103592588
                  },
                  {
                    "id": "Practicality",
                    "name": "Practicality",
                    "category": "needs",
                    "percentage": 0.99,
                    "sampling_error": 0.089956072
                  },
                  {
                    "id": "Self-expression",
                    "name": "Self-expression",
                    "category": "needs",
                    "percentage": 0.009886632263973901,
                    "sampling_error": 0.083656981
                  },
                  {
                    "id": "Stability",
                    "name": "Stability",
                    "category": "needs",
                    "percentage": 0.011545403965898251,
                    "sampling_error": 0.109521769
                  },
                  {
                    "id": "Structure",
                    "name": "Structure",
                    "category": "needs",
                    "percentage": 0.99,
                    "sampling_error": 0.0821582255
                  }
                ]
              }
            ]
          },
          {
            "id": "values",
            "name": "Values",
            "children": [
              {
                "id": "Self-transcendence_parent",
                "name": "Self-transcendence",
                "category": "values",
                "percentage": 0,
                "children": [
                  {
                    "id": "Conservation",
                    "name": "Conservation",
                    "category": "values",
                    "percentage": 0.99,
                    "sampling_error": 0.069950964
                  },
                  {
                    "id": "Openness to change",
                    "name": "Openness to change",
                    "category": "values",
                    "percentage": 0.008825493504679734,
                    "sampling_error": 0.0660268375
                  },
                  {
                    "id": "Hedonism",
                    "name": "Hedonism",
                    "category": "values",
                    "percentage": 0.008326985786020414,
                    "sampling_error": 0.140913567
                  },
                  {
                    "id": "Self-enhancement",
                    "name": "Self-enhancement",
                    "category": "values",
                    "percentage": 0.765277368499976,
                    "sampling_error": 0.10627466249999999
                  },
                  {
                    "id": "Self-transcendence",
                    "name": "Self-transcendence",
                    "category": "values",
                    "percentage": 0,
                    "sampling_error": 0.0846075525
                  }
                ]
              }
            ]
          }
        ]
      },
      "warnings": [
        {
          "id": "WORD_COUNT_MESSAGE",
          "message": "There were 285 words in the input. We need a minimum of 3,500, preferably 6,000 or more, to compute statistically significant estimates"
        }
      ]
    }

    Curiously, almost all of the characteristics Watson returned are pushed to the extremes (0 or 0.99), which I rarely saw when analyzing other texts. In other words, Watson rarely shows such a high degree of confidence in its results.
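
    To check how many scores really sit at the extremes, the response tree can be walked programmatically. The following is a minimal sketch, not part of the original analysis: `collectTraits` and `isSaturated` are hypothetical helper names, the "saturated" rule (exactly 0, or 0.99 and above) is an assumption, and the sample data is a trimmed excerpt of the Openness branch from the JSON above.

```javascript
// Walk a Watson-style profile tree and collect every node that carries both a
// score ("percentage") and a "sampling_error". Grouping nodes such as
// "Extraversion_parent" have no sampling_error and are skipped.
function collectTraits(node, out = []) {
  if (typeof node.percentage === 'number' && typeof node.sampling_error === 'number') {
    out.push({ id: node.id, name: node.name, percentage: node.percentage });
  }
  for (const child of node.children || []) {
    collectTraits(child, out);
  }
  return out;
}

// Assumption: a score counts as "saturated" when it is 0 or at least 0.99.
function isSaturated(trait) {
  return trait.percentage === 0 || trait.percentage >= 0.99;
}

// Trimmed excerpt of the response shown above (Openness and three of its facets).
const sampleTree = {
  id: 'r',
  name: 'root',
  children: [
    {
      id: 'Openness',
      name: 'Openness',
      percentage: 0.7401994172490934,
      sampling_error: 0.0632961745,
      children: [
        { id: 'Adventurousness', name: 'Adventurousness', percentage: 0.99, sampling_error: 0.0531619985 },
        { id: 'Artistic interests', name: 'Artistic interests', percentage: 0.01376496058079709, sampling_error: 0.1084097325 },
        { id: 'Emotionality', name: 'Emotionality', percentage: 0, sampling_error: 0.049707709 },
      ],
    },
  ],
};

const traits = collectTraits(sampleTree);
const saturated = traits.filter(isSaturated);
console.log(`${saturated.length} of ${traits.length} scores are saturated`);
console.log(saturated.map((t) => t.name).join(', '));
```

    To run this on the full response, parse the JSON and pass its `tree` property to `collectTraits` instead of `sampleTree`.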


    In addition to the numerical values, Watson also provides a short textual description of the results in a so-called "human-readable format". The existence of such a description is convenient but not especially interesting in itself; how it is generated is somewhat unexpected. The code that creates the text runs on the client side, in JavaScript: the most pronounced traits are selected, ranked, and each assigned an identifier. Sentences are then assembled from templates like this:

    switch (intervalFor(valuesList[0].percentage)) {
        case 0:
            sentence = format(tphrase('You are relatively unconcerned with both %s and %s'), term1, term2) + '.';
            break;
        case 1:
            sentence = format(tphrase("You don't find either %s or %s to be particularly motivating for you"), term1, term2) + '.';
            break;
    }
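
    The excerpt above relies on helpers that are not shown. The sketch below is a self-contained illustration of the same select, rank, and fill-a-template idea; it is not Watson's actual code. The bodies of `intervalFor`, `format`, and `tphrase` are hypothetical stand-ins, the bucket boundaries and the "distance from 0.5" ranking rule are assumptions, and the scores are rounded from the "values" branch of the JSON above.

```javascript
// Hypothetical stand-in: map a 0..1 score to one of four buckets (0..3).
// The real cut points used by Watson are not shown in the excerpt.
function intervalFor(p) {
  return Math.min(Math.floor(p * 4), 3);
}

// Minimal printf-style formatter: each %s is replaced by the next argument.
function format(template, ...args) {
  let i = 0;
  return template.replace(/%s/g, () => String(args[i++]));
}

// Hypothetical stand-in for a phrase (translation) lookup: returns its input.
function tphrase(s) {
  return s;
}

// Assumption: "most pronounced" means farthest from the midpoint 0.5.
function rankByExtremity(list) {
  return [...list].sort(
    (a, b) => Math.abs(b.percentage - 0.5) - Math.abs(a.percentage - 0.5)
  );
}

function describeValues(values) {
  const valuesList = rankByExtremity(values);
  const term1 = valuesList[0].name.toLowerCase();
  const term2 = valuesList[1].name.toLowerCase();
  switch (intervalFor(valuesList[0].percentage)) {
    case 0:
      return format(tphrase('You are relatively unconcerned with both %s and %s'), term1, term2) + '.';
    case 1:
      return format(tphrase("You don't find either %s or %s to be particularly motivating for you"), term1, term2) + '.';
    default:
      // The excerpt shows only buckets 0 and 1, so other buckets are left unhandled.
      return null;
  }
}

// Scores rounded from the "values" branch of the response above.
const values = [
  { name: 'Conservation', percentage: 0.99 },
  { name: 'Openness to change', percentage: 0.0088 },
  { name: 'Hedonism', percentage: 0.0083 },
  { name: 'Self-enhancement', percentage: 0.765 },
  { name: 'Self-transcendence', percentage: 0 },
];

const sentence = describeValues(values);
console.log(sentence);
```

    With these assumptions the two top-ranked values are Self-transcendence and Hedonism, so the first template is chosen.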


    uClassify


    Another freemium service that uses machine learning to classify text with a chosen classifier. I picked it for two reasons: good results on test samples and the availability of the classifiers I needed. Moreover, it imposes no minimum word count, which makes it possible to analyze each surah individually.

    Unfortunately, uClassify only works with English text, so I analyzed an English translation of the Quran. Based on information from various sources, I chose the translation regarded as the most accurate, widely recognized, and frequently used.

    I started with the sentiment classifier, which estimates the general mood of the narrative as negative or positive.
    Surah        1     2     3     4     5     6     7
    Negative     9%   76%   71%   76%   60%   60%   54%
    Positive    91%   24%   29%   24%   40%   40%   46%

    The graph below clearly shows that the text of the Quran begins extremely positively and then swings sharply toward the negative. The negative mood persists throughout the text, weakening only slightly closer to the end, where the narrative approaches neutral (54% negative).

    sentiment graph

    It is quite natural that the negative mood prevails across the text as a whole, especially considering that the first surah is incomparably shorter than each subsequent one.

    sentiment graph

    If we assume that the text of the Quran was written not by one person but by several, and apply the gender classifier, we get an equally interesting picture:

    Surah        1     2     3     4     5     6     7
    Male        20%   49%   48%   60%   69%   54%   47%
    Female      80%   51%   52%   40%   31%   46%   53%

    The correlation between the two data sets is visible to the naked eye. The Pearson correlation coefficient between the "Positive" and "Female" rows is 0.7839422223, which indicates a direct relationship between the gender classifier's output and the mood of the text.
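
    The coefficient quoted above can be reproduced from the two tables. The sketch below assumes it is the Pearson correlation between the Positive and Female percentages for the seven surahs listed; the `pearson` helper is written here for illustration and is not taken from either service.

```javascript
// Pearson correlation coefficient of two equally long numeric arrays.
function pearson(x, y) {
  if (x.length !== y.length || x.length < 2) {
    throw new Error('pearson: arrays must have the same length (at least 2)');
  }
  const n = x.length;
  const meanX = x.reduce((sum, v) => sum + v, 0) / n;
  const meanY = y.reduce((sum, v) => sum + v, 0) / n;

  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return cov / Math.sqrt(varX * varY);
}

// Percentages for surahs 1..7, copied from the sentiment and gender tables.
const positive = [91, 24, 29, 24, 40, 40, 46];
const female = [80, 51, 52, 40, 31, 46, 53];

const r = pearson(positive, female);
console.log(r.toFixed(4)); // 0.7839
```

    Because each Negative and Male value is simply 100 minus its counterpart, correlating the Negative row with the Male row gives the same coefficient.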

    Epilogue


    Of course, these are just numbers, graphs and formulas.
