Introducing Coarray Fortran: Shall We Be Parallel?

    For a long time I have wanted to write about where one of the "progenitors" of today's popular programming languages now stands. Yes, I'm talking about Fortran. I will try to break the stereotype held by many developers: that Fortran is an ancient language with no prospects, and that certainly no one writes in it anymore. On the contrary, it is evolving very actively and has long offered a rich feature set enshrined in its standards, quite comparable, by the way, to that of C/C++.
    Not much is left of the “old” Fortran 77. Already in the 95 standard we could create our own data types, dynamically allocate and free memory, work with pointers, overload functions and operators, and much more. In fact, its toolset differs little from that of C. However, I do not want to compare languages; that is a matter of philosophy. I will only say that the Intel Fortran compiler is in great demand and, in fact, is purchased even more actively than its C++ counterpart.
    The purpose of this post is to talk about the current state of affairs. Fortran today is a parallel language, and it became one with the adoption of the Fortran 2008 standard, which introduced coarrays.

    So, first things first. The SPMD (Single Program, Multiple Data) programming model was taken as the basis. If you are familiar with MPI, the essence is the same: we write one application, and a certain number of copies of it are executed in parallel. Each copy has its own local data. Data that must be accessible from other copies is declared with a special syntax, called a coarray.

    To get a feel for it, here is a simple Hello World example:

    program hello
    write(*,*) "Hello world"
    end program hello
    


    Perfectly ordinary code. But after compiling it with the -coarray option (Intel compiler), we will see “greetings” from several different copies of the program or, in coarray terms, from different images. Their number can be controlled, for example, with the -coarray-num-images=x option or the FOR_COARRAY_NUM_IMAGES environment variable. Naturally, there is a way to find out which image the code is currently running in. Let's make our example a little more complicated:

    program hello_image
    write(*,*) "Hello from image ", this_image(), "out of ", num_images()," total images"
    end program hello_image


    After launch, we will see something similar to this:

    Hello from image            1 out of            4  total images
    Hello from image            4 out of            4  total images
    Hello from image            2 out of            4  total images
    Hello from image            3 out of            4  total images
    


    Evidently, our application was run as 4 copies (images). Knowing just this much about coarrays, we are, in principle, already able to create parallel applications.

    Only very dumb ones, though, because the main question remains unanswered: what about the data? For that, a very simple and clear syntax is introduced:

    real, codimension[*] :: x
    real :: y[*]


    The square brackets tell us that we are dealing with a coarray.
    In this example these are just scalars, and each copy of the program still has its own instance. But now we can refer to the value of such a scalar in whichever copy (image) we need.

    For example, by writing y[2] we access the value of y in image 2. This opens up possibilities for “real” parallel work with data.
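
    As a minimal sketch of such a remote read (the program name and the values are made up for illustration, and the sync all barrier it relies on is described later in this post), each image fills in its own y, and image 1 then looks at the copy that lives on image 2:

```fortran
program remote_read
  implicit none
  real :: y[*]                     ! one instance of y per image

  y = 10.0 * this_image()          ! each image assigns its own local copy
  sync all                         ! every copy is assigned before anyone reads remotely

  ! Only image 1 reports, and only if image 2 actually exists
  if (this_image() == 1 .and. num_images() >= 2) then
     write(*,*) "local y = ", y, " y on image 2 = ", y[2]
  end if
end program remote_read
```

    The check on num_images() is there only so that the sketch does not reference a non-existent image when it is run with a single image.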

    Naturally, coarrays are subject to a number of logical restrictions: for example, you cannot associate a coarray object with another object through pointers, or pass coarray objects to C code.
    Let's look at a few more examples, assuming that we have already declared the variable x as a coarray:

    x = 42.0


    In this case, we are working with the image's own local variable x.
    As soon as square brackets appear in the code, it is an explicit sign that the variable is being accessed in another image of the program:

    x[3] = 42.0  ! sets x to 42 in image 3
    x = x[1]  ! assigns the value of x from image 1 to the current image's local x
    x[i] = x[j]  ! assigns the value of x from image j to x in image i


    What's good about coarrays is that, unlike with pure MPI, we don't have to care about sending or receiving messages. All of that rests on the shoulders of the implementation (which itself uses that same MPI). We are “above” it. In addition, the code will work both on distributed-memory systems and on shared-memory systems. Just change the option to -coarray=shared or -coarray=distributed.
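
    To make the contrast with message passing more concrete, here is a hedged sketch of a ring shift, a pattern that in MPI would need matching send and receive calls. The names ring_shift, inbox and right are purely illustrative. Each image writes its own number straight into its right-hand neighbour's variable, and the sync all barrier (described just below) guarantees that every write has landed before anyone reads:

```fortran
program ring_shift
  implicit none
  integer :: inbox[*]              ! each image owns one inbox
  integer :: me, np, right

  me = this_image()
  np = num_images()
  right = merge(1, me + 1, me == np)   ! right-hand neighbour, wrapping around

  inbox[right] = me                ! the "send" is an ordinary assignment
  sync all                         ! wait until every inbox has been written
  write(*,*) "image ", me, " received ", inbox
end program ring_shift
```

    Every image writes to a different neighbour, so no two images ever assign the same inbox.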

    Since we have data in different copies of our program, it is logical to assume there must be a means of synchronizing them. Of course there is. For example, the SYNC ALL statement synchronizes all images. There is also SYNC IMAGES(), which lets you synchronize only with specific images.
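
    A small sketch of how SYNC IMAGES might be used, with a made-up variable seed and an arbitrary value: image 1 prepares a value and synchronizes with everyone, while each of the other images synchronizes only with image 1 before copying that value. The other images never wait for each other.

```fortran
program sync_images_demo
  implicit none
  integer :: seed[*]               ! one instance of seed per image

  if (this_image() == 1) then
     seed = 2008                   ! image 1 prepares the value (arbitrary number)
     sync images(*)                ! image 1 synchronizes with every other image
  else
     sync images(1)                ! the rest synchronize with image 1 only
     seed = seed[1]                ! safe: image 1 assigned seed before its sync
  end if

  write(*,*) "image ", this_image(), " has seed ", seed
end program sync_images_demo
```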

    One more example:

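    program image_factorial
    implicit none
    integer, codimension[*] :: fact
    integer :: i, factorial
    fact = this_image()   ! each image stores its own number
    SYNC ALL              ! wait until every image has assigned fact
    if ( this_image() == 1 ) then
       factorial = 1
       do i = 1, num_images()
         factorial = factorial * fact[i]   ! read fact from image i
       end do
       write(*, *) num_images(), 'factorial is ', factorial
    end if
    end program image_factorial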


    Naturally, this is not the fastest way to calculate a factorial, but it illustrates the essence of working with coarrays well.

    First we declare fact as a coarray, and then in each image we assign it a value equal to that image's number. Before multiplying all the values together, we need to make sure that they have already been assigned, so we use SYNC ALL. Then in image number 1, which acts as a kind of “master image”, we calculate the factorial.
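
    The same declare, assign, synchronize, collect pattern carries over to a slightly more practical division of work. The sketch below is only an illustration (the names partial and total and the problem size n are assumptions, not part of the original example): each image adds up its share of the numbers from 1 to n, and image 1 combines the partial sums.

```fortran
program partial_sums
  implicit none
  integer, parameter :: n = 1000   ! assumed problem size
  integer :: partial[*]            ! one partial sum per image
  integer :: i, total

  ! Cyclic distribution: image k handles k, k + num_images(), ...
  partial = 0
  do i = this_image(), n, num_images()
     partial = partial + i
  end do

  sync all                         ! all partial sums are assigned from here on

  if (this_image() == 1) then
     total = 0
     do i = 1, num_images()
        total = total + partial[i] ! collect the result from image i
     end do
     write(*,*) "sum of 1 ..", n, " is ", total
  end if
end program partial_sums
```

    Because every number from 1 to n is handled by exactly one image, the total is 500500 regardless of how many images are run.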

    As a result, we have a very effective tool, part of the language itself, that lets you create parallel applications for systems with different memory organization. Naturally, the main challenge for compilers in supporting and implementing coarrays is performance. At the moment it is still not their strongest point, but there is great potential here for the various compilers.

    This concludes my short description of the “new goodies” from the most recently adopted standard. I hope looking at Fortran code wasn't too boring. If the feedback shows otherwise and the topic sparks lively interest, I will not deny you the pleasure and will continue it. And now, thank you all for your attention.
