The average word length for different authors

    Before anyone asks "Why reinvent the wheel?" and starts throwing tomatoes at that wheel, let me say right away that the average length of a Russian word was calculated long ago and comes to 5.28 characters. Here is the link to the source. What prompted me to write this post is the following. In the discussion of my previous post, Habr users stetzen and alienator suggested that the average word length differs from author to author depending on their style of presentation, and perhaps on some anatomical differences as well, I don't know. By the way, try to guess the average length of what people search for most on Google. In general, I decided to check whether this is really so.

    Below is the source of the program, which counts the total number of words in a text and computes the average word length. The program is written in Perl. Almost all the texts I used were taken from the Moshkov library. Here is what I got. Draw your own conclusions about how much the average word length differs between authors.

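    use strict;
    use warnings;
    use locale;
    use POSIX qw(locale_h);
    # The texts are in CP1251; the locale makes /i work for Cyrillic letters.
    setlocale(LC_ALL, 'ru_RU.CP1251');
    # 'in.txt' is a placeholder: the input file name is missing from this listing.
    open(TEXT, '<', 'in.txt') or die "Cannot open input file: $!";
    undef $/;                          # slurp mode: read the whole file at once
    my $text = <TEXT>;
    close(TEXT);
    # A word is a run of letters А-Я in either case (ё lies outside this range).
    my @words = $text =~ m/[А-Я]+/ig;
    open(OUT, '>', 'out.txt') or die "Cannot open out.txt: $!";
    my ($count, $sum) = (0, 0);
    foreach (@words) {
        $count++;
        $sum += length($_);
    }
    die "No words found\n" unless $count;
    # Writes "Total words: N" and "Average word length: X" in Russian.
    print OUT "Всего слов: $count\nСредняя длина слова: ".($sum/$count);
    close(OUT);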
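
    To compare several authors in one run, the same calculation can be wrapped in a function and applied to each file in turn. The following is only a minimal sketch under a few assumptions, not the program used for this post: `word_stats` is a hypothetical helper, the file names are taken from the command line, and the texts are assumed to be encoded in CP1251. It decodes each text into characters on reading, so no locale setup is needed, and it counts ё as a letter.

```perl
use strict;
use warnings;
use utf8;    # this source file is assumed to be saved as UTF-8

# Hypothetical helper: takes a decoded (character) string and returns
# the number of words and the average word length.
# A word is a run of Russian letters, including ё.
sub word_stats {
    my ($text) = @_;
    my @words = $text =~ /[А-Яа-яЁё]+/g;
    my $count = @words;
    return (0, 0) unless $count;       # avoid division by zero
    my $sum = 0;
    $sum += length($_) for @words;
    return ($count, $sum / $count);
}

# Each command-line argument is treated as a CP1251-encoded text file.
for my $file (@ARGV) {
    open(my $fh, '<:encoding(cp1251)', $file)
        or die "Cannot open $file: $!";
    my $text = do { local $/; <$fh> }; # slurp the whole file
    close($fh);
    my ($count, $avg) = word_stats($text);
    printf "%s: %d words, average length %.2f\n", $file, $count, $avg;
}
```

    It could be run as `perl avg.pl tolstoy.txt chekhov.txt` (the script and file names here are just examples), printing one line per author's text.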

