How prone are you to brain fog, plus a little holy war

    I happened to overhear one colleague telling another how cool and convenient scripting languages are. The narrator was promoting Perl as something very cool, simple, and straightforward.
    I could not resist and cut in, saying that Python would be more readable and offers no fewer capabilities.
    Then I was asked how Python handles regular expressions, and as a result we arrived at the following task:

    Given a string, print all the words in it that occur exactly N times.
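
    To make the task concrete, here is a minimal sketch of the expected behaviour on a tiny input. The function name words_occurring_n_times is made up for this illustration, and treating a "word" as a \w+ match is an assumption borrowed from the regex-based solutions in this post.

```python
import re
from collections import Counter


def words_occurring_n_times(text, n):
    """Return the distinct words that occur exactly n times in text."""
    # Count every \w+ match once, then keep the words whose count equals n.
    counts = Counter(re.findall(r"\w+", text))
    return sorted(word for word, count in counts.items() if count == n)


# "a" occurs 3 times, "b" twice, "c" once.
print(words_occurring_n_times("a b a c b a", 2))  # ['b']
```

    Sorting the result is only there to make the output deterministic; the task itself does not ask for any particular order.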

    The task is, of course, completely trivial, and writing a solution takes a matter of minutes, but this brain fog consumed everyone for at least a few hours. If you are prone to brain fog too, read on.


    Here are some solutions:

    test1.py: written in 3 min. This solution could not cope with "War and Peace"
    #!/usr/bin/env python
    import re
    n = 5
    with open("./input_file.txt", "r") as f:
    	s = f.read()
    l = re.findall(r'\w+',s)
    print(repr([x for x in l if l.count(x) == n]))
    


    test2.pl: written in 30 min (for some reason the author took a long time over it). Running time on “War and Peace”: 0m0.282s
    #!/usr/bin/perl
    open FH, "<./input_file.txt";
    local $/;       # slurp mode: read the whole file in one go
    $str = <FH>;
    $n = 5;
    %h = ();
    $h{$1}++ while $str =~ /(\w+)/g;
    print '['.(join ", ", grep {$h{$_} == $n} keys %h).']';
    


    The running time of the first solution was not satisfactory at all, so this was written:
    test3.py: written in 3 min. Running time on “War and Peace”: 0m0.285s
    #!/usr/bin/env python
    import re
    n = 5
    with open("./input_file.txt", "r") as f:
    	s = f.read()
    d={}
    for x in re.findall(r'\w+',s):
    	if x in d:
    		d[x] += 1
    	else:
    		d[x] = 1
    print(repr([k for k,v in d.items() if v == n]))
    


    test4.pl: written in 30 min (again a very long time). Running time on “War and Peace”: 0m0.221s
    But alas, the result is incorrect
    #!/usr/bin/perl
    open FH, "<./input_file.txt";
    local $/;
    $str = <FH>;
    $n = 5;
    chomp($str);
    foreach(split(/ /, $str)){$h{$_}++;}
    my $res ="[";
    foreach(keys(%h)) {if($h{$_} == $n){$res .= "$_, ";}}
    $res .=  "]";
    print $res;
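
    One plausible reason for the wrong result in test4.pl is that splitting on a single space leaves punctuation and newlines glued to the words, so "war," and "war" are counted separately. The sketch below is a rough Python analogue, not a translation of the Perl code; both helper names are made up for this illustration.

```python
import re
from collections import Counter


def count_by_space_split(text):
    # Rough analogue of split(/ /, $str): only the space character separates words.
    return Counter(text.split(" "))


def count_by_word_regex(text):
    # Analogue of the \w+ matching used in the other solutions.
    return Counter(re.findall(r"\w+", text))


sample = "war, war and\npeace war"
print(count_by_space_split(sample)["war"])  # 2, because "war," is a different key
print(count_by_word_regex(sample)["war"])   # 3
```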
    


    test5.hs: written in 10 min. Running time on “War and Peace”: 0m2.948s
    Someone put this together to say “I can do it in Haskell too,” but the result was not impressive
    import Data.List
    import Data.Char
    main = interact                          -- IO
      $ unlines                              -- combine meta-data into string
      . map (\(n, w) -> if (n == 5) then show w else "")
      . sort                                 -- sort meta-data by occurrences
      . map (\s -> (length s, head s))       -- transform to sublist meta-data
      . group                                -- break into sublists of unique words
      . sort                                 -- sort words
      . words                                -- break into words
      . map (\c ->                           -- simplify chars in input
        if isAlpha c
          then toLower c
          else ' ')
    


    Well, in that case, let's implement it in C++ too
    test6.cpp: written in 5 min. Running time on “War and Peace”: 0m0.640s
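    #include <stdio.h>
    #include <stdlib.h>
    #include <errno.h>
    #include <set>
    #include <string>
    #include <iostream>
    std::multiset<std::string> words;
    void prepare() {
        char *p = 0;
        // %ms (POSIX.1-2008) makes scanf allocate the buffer itself;
        // the original listing used the older glibc-only spelling %as
        while ( 1 == scanf("%ms", &p)) {
            words.insert(p);
            free(p);
        }
        if (errno) {
            perror("scanf");
            abort();
        }
    }
    void output(int n) {
        std::multiset<std::string>::iterator it(words.begin());
        std::string out;
        // upper_bound jumps straight to the next distinct word
        for(;it!=words.end(); it = words.upper_bound(*it))  {
            if ( words.count(*it) == (size_t)n )
                out.append(" ," + *it);
        }
        if (out.find(" ,") == 0 )
            out.erase(0, 2);
        // The source listing is cut off from this point on; the rest of this
        // line and main() are a reconstruction, not the author's original text.
        std::cout<<"["<<out<<"]"<<std::endl;
    }
    int main() {
        prepare();   // reads the words from standard input
        output(5);
        return 0;
    }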


    Suggest your own solutions in the comments if you are prone to brain fog and can do better in a reasonable amount of time ;)
    This file was used as input_file.txt; it is “War and Peace” from the Project Gutenberg library
