Programmer graphophilia

    For the last couple of days, I have been immersed in the task of drawing beautiful graphs of texts. I got the idea when I read a post about the keyword graph for articles from alexwolf's website.
    But I wanted to create such graphs for arbitrary texts and see how beautiful and interesting they can be. I’m not sure that everyone reading this shares my programmer's understanding of beauty, but in my opinion the results turned out beautiful and fun.


    To avoid reinventing the wheel for drawing graphs, I used the GraphViz package (you can also download it here). Doxygen uses the same utilities to draw its graphs.
    To prepare the data for the graph and simplify the work, I wrote a program in C++ (CodeGraphMFC.rar, CodeGraphMFC_Sources.rar).
    Using the links, you can download both the program and its source code (for Visual Studio 2005).
    The program interface looks like this:
    [Screenshot of the program interface, from the textgraphs web album]

    You can insert any text into it, click "Create graph" and it will create a graph of dependent words of this text.
    Dependent words are words that are close to each other in the text (the maximum range can be selected in the interface). You can also choose one of several graph creation methods.
    The basic rule: the closer the vertices of the resulting graph are to each other, the more often the words at these vertices occur together in the text.
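    The idea of "dependent words" can be sketched in a few lines of C++. This is only a minimal illustration, not the code from CodeGraphMFC: the names tokenize, count_close_pairs and max_distance are hypothetical, and the tokenizer is assumed to handle ASCII text only (Russian text would need a Unicode-aware one).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Split text into lowercase words. Any character that is not an ASCII
// letter or digit is treated as a separator (assumption: ASCII input).
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalnum(ch)) {
            current += static_cast<char>(std::tolower(ch));
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

using WordPair = std::pair<std::string, std::string>;

// Count how often two different words occur within max_distance
// positions of each other. max_distance plays the role of the
// "maximum range" setting (hypothetical parameter name).
std::map<WordPair, int> count_close_pairs(const std::vector<std::string>& words,
                                          std::size_t max_distance) {
    assert(max_distance >= 1);
    std::map<WordPair, int> counts;
    for (std::size_t i = 0; i < words.size(); ++i) {
        for (std::size_t j = i + 1; j < words.size() && j - i <= max_distance; ++j) {
            if (words[i] == words[j]) continue;  // a word is not its own neighbour
            // Store the pair in sorted order so (a, b) and (b, a) share one key.
            WordPair key = words[i] < words[j] ? WordPair(words[i], words[j])
                                               : WordPair(words[j], words[i]);
            ++counts[key];
        }
    }
    return counts;
}
```

    Calling count_close_pairs(tokenize(text), 2) gives a map from word pairs to the number of times they appeared at most two positions apart; the larger the count, the closer the two vertices should end up in the drawing.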
    The program only works if GraphViz is installed and the path to it is specified correctly. A default program for opening GIF files must also be registered, because the output is a GIF image.
    Limitations: for some reason, GraphViz is terribly slow and crashes when the graph has a large number of vertices. As a result, the program works with at most about 1000 distinct close words.
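    For readers who have not used GraphViz: it reads a plain-text graph description in the DOT language. The sketch below shows one way pair counts could be turned into such a description. It is an assumption about a possible approach, not what my program necessarily does: the function name to_dot and the mapping len = 1 / count are hypothetical, and words are assumed to contain no quote characters.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>

using WordPair = std::pair<std::string, std::string>;

// Build an undirected graph in GraphViz DOT syntax from pair counts.
// The "len" edge attribute is the preferred edge length used by the
// neato and fdp layouts; here frequent pairs get shorter edges
// (hypothetical mapping: len = 1 / count).
std::string to_dot(const std::map<WordPair, int>& counts) {
    std::ostringstream out;
    out << "graph words {\n";
    for (const auto& entry : counts) {
        assert(entry.second > 0);  // every stored pair was seen at least once
        double len = 1.0 / entry.second;
        out << "  \"" << entry.first.first << "\" -- \""
            << entry.first.second << "\" [len=" << len << "];\n";
    }
    out << "}\n";
    return out.str();
}
```

    If the result is saved as words.dot (a hypothetical file name), a GIF can be produced from the command line with neato -Tgif words.dot -o words.gif.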

    The program was finished today, and I have played with it a little. The first results, in the form of pictures, can be seen here: Web album.

    For example, a beautiful graph of Pushkin’s poem “Anchar”:
    Actually, I must say that Russian poems produce beautiful graphs. English ones come out less beautiful (although the author here is far from Pushkin):
    And here is how the graph of a small class declaration looks (C++):
    And a couple of rather complex functions, a couple of screens long (C++):
    And here is a beautiful graph of one of my posts about communism:

    Some interesting observations: beautiful graphs are obtained for texts with very few repeated words or, conversely, with a lot of repetition. Moreover, for poetry, the most beautiful graphs came from the most beautiful verses. Articles behave similarly: the graph of a piece of a user agreement looked intimidating, while light and simple articles produce beautiful graphs.

    Initially, I wanted to compare the graphs of the same program written in several different programming languages, but so far I haven’t found anything more complex than “Hello, World”, and I am reluctant to write one myself.
    Does anyone have the same program in several languages, at least 30 lines long?

    I would be grateful if someone downloaded the program and generated their own beautiful graphs,
    or suggested how to develop this idea further and what else could be graphed and compared.
