Some OpenMP Tips

    OpenMP is a standard that defines a set of compiler directives, library routines, and environment variables for creating multi-threaded programs.

    Many articles have been written about OpenMP. This one collects a few tips that help you avoid some common mistakes and that rarely come up in lectures or books.

    1. Name critical sections

    "In line, you sons of bitches, in line!" // M. A. Bulgakov, "Heart of a Dog"

    The critical directive marks a section of code that only one thread may execute at a time. If one thread is executing a critical section with a given name, any other thread that reaches a section with the same name is blocked and waits in line. As soon as the first thread leaves the section, one of the blocked threads enters it. The standard does not specify which blocked thread goes next, so do not rely on any particular order.

    #pragma omp critical [(name)] new-line
    structured-block
    


    Critical sections can be named or unnamed. Naming them can improve performance in some situations. According to the standard, all unnamed critical sections are treated as having the same (unspecified) name, so they all exclude one another. Giving sections different names allows two or more of them to execute at the same time.

    Example:

    #pragma omp critical (first) 
    {
        workA();
    }
    #pragma omp critical (second) 
    {
        workB();
    }
    // the workA() and workB() sections may be executed at the same time by different threads
    #pragma omp critical  
    {
        workC();
    }
    #pragma omp critical 
    {
        workD();
    }
    // the workC() section must finish before any thread can enter the workD() section
    


    When choosing a name, take care not to reuse the names of system functions or names that are already in use. If several critical sections work with the same resource (output to a single file, output to the screen), give them all the same name or leave them all unnamed.
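
    As a minimal sketch of this tip, the function below guards two independent counters with two differently named critical sections. The counters hits_a and hits_b and the section names counter_a and counter_b are hypothetical, chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical shared counters, used only for this illustration. */
static long hits_a = 0;
static long hits_b = 0;

/* Each counter has its own named critical section, so a thread updating
   hits_a does not block a thread that is updating hits_b. */
long count_with_named_sections(int n) {
    hits_a = 0;
    hits_b = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        #pragma omp critical (counter_a)
        {
            hits_a += 1;
        }
        #pragma omp critical (counter_b)
        {
            hits_b += 2;
        }
    }
    return hits_a + hits_b;
}
```

    If both updates touched the same resource, for example the same output file, they should share one name instead.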

    2. Do not use != in loop control

    The for directive restricts the structure of the associated loop. Specifically, the loop must be in canonical form, which in OpenMP 4.0 does not allow != as the comparison operator.

    A developer on the OpenMP Architecture Review Board explained it this way:

    If we allowed !=, programmers could write loops whose number of iterations cannot be determined. The problem lies in the compiler, which has to generate code that computes the number of iterations.

    For a simple loop like:

    for( i = 0; i < n; ++i )

    the number of iterations can be determined: n if n >= 0, and zero if n < 0.

    for( i = 0; i != n; ++i )

    this again gives n iterations if n >= 0; but if n < 0, we do not know the number of iterations.

    for( i = 0; i < n; i += 2 )

    the number of iterations is the integer part of ((n + 1) / 2) if n >= 0, and 0 if n < 0.

    for( i = 0; i != n; i += 2 )

    we cannot determine whether i will ever equal n. What if n is an odd number?

    for( i = 0; i < n; i += k )

    the number of iterations is the integer part of ((n + k - 1) / k) if n >= 0, and 0 if n < 0; if k < 0, this is not a valid OpenMP program.

    for( i = 0; i != n; i += k )

    is i increasing or decreasing? Will it ever be equal to n? All of this can lead to an infinite loop.
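
    The arithmetic in that answer can be checked with a short sketch. Both helper functions below are hypothetical names introduced for illustration, and both assume a positive step k and a loop that starts at i = 0.

```c
#include <assert.h>

/* Number of iterations of: for (i = 0; i < n; i += k), assuming k > 0.
   This is the integer part of (n + k - 1) / k for positive n. */
long trip_count_lt(long n, long k) {
    if (n <= 0) return 0;
    return (n + k - 1) / k;
}

/* Returns 1 if: for (i = 0; i != n; i += k), with k > 0, reaches i == n
   while counting up, and 0 if i would step over n or n is negative. */
int ne_loop_terminates(long n, long k) {
    return n >= 0 && n % k == 0;
}
```

    For example, with n = 7 and k = 2 the < loop runs for i = 0, 2, 4, 6, while the != loop steps from 6 to 8 and never sees i equal to 7.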

    3. Use nowait carefully

    If Hachiko wants to wait, he must wait. // Hachiko: The most faithful friend

    If the nowait clause is not specified, the for construct ends with an implicit barrier: the threads running the parallel loop continue only once all of them have reached the end of the loop. If that delay is not needed, the nowait clause lets threads that have finished their share of the loop carry on without synchronizing with the others.
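
    A minimal sketch of a case where nowait is safe: the two loops below touch disjoint arrays, so a thread that finishes its part of the first loop can start on the second one immediately. The function name add_pairs is hypothetical, and the sketch assumes that the four arrays do not overlap.

```c
#include <assert.h>

/* Adds b into a, and d into c. The loops are independent of each other,
   so the barrier after the first loop can be dropped with nowait. */
void add_pairs(int n, double *a, const double *b, double *c, const double *d) {
    #pragma omp parallel
    {
        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            a[i] += b[i];

        /* The barrier at the end of this loop is kept. */
        #pragma omp for
        for (int i = 0; i < n; i++)
            c[i] += d[i];
    }
}
```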

    Example:
    #pragma omp parallel shared(n,a,b,c,d,sum) private(i)
    {
       #pragma omp for nowait schedule(dynamic)
       for (i = 0; i < n; i++)
          a[i] += b[i];
       #pragma omp for nowait schedule(dynamic)
       for (i = 0; i < n; i++)
          c[i] += d[i];
       #pragma omp for nowait schedule(dynamic) reduction(+:sum)
       for (i = 0; i < n; i++)
          sum += a[i] + c[i];
    }
    


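    There is an error in this example, and it is schedule(dynamic). Loops with nowait that depend on each other's data are valid only with schedule(static). Only with that scheduling does the standard guarantee that the same iterations go to the same threads in each loop, provided the loops have the same number of iterations and belong to the same parallel region, and that guarantee is what makes nowait correct here. In our case, removing schedule(dynamic) is usually enough, because most implementations use schedule(static) by default. The default is implementation defined, though, so writing schedule(static) explicitly is safer.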
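
    A corrected version might look like the sketch below. Wrapping the loops in a function called sum_updated is an assumption made for illustration; the essential change is the explicit schedule(static) on every loop, with the same iteration count and no chunk size in each.

```c
#include <assert.h>

/* Updates a and c in place, then returns the sum of a[i] + c[i].
   With schedule(static) and identical iteration counts, each thread gets
   the same iterations in all three loops, so it only reads elements of
   a and c that it has already updated itself. */
double sum_updated(int n, double *a, const double *b, double *c, const double *d) {
    double sum = 0.0;
    #pragma omp parallel shared(n, a, b, c, d, sum)
    {
        #pragma omp for schedule(static) nowait
        for (int i = 0; i < n; i++)
            a[i] += b[i];

        #pragma omp for schedule(static) nowait
        for (int i = 0; i < n; i++)
            c[i] += d[i];

        /* The barrier at the end of the parallel region completes the
           reduction before sum is read below. */
        #pragma omp for schedule(static) nowait reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += a[i] + c[i];
    }
    return sum;
}
```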

    4. Check the code carefully before using task untied

    int dummy;
    #pragma omp threadprivate(dummy)
    void foo() {dummy = …; }
    void bar() {… = dummy; }
    #pragma omp task untied
    {
        foo();
        bar();
    }
    

    The untied clause specifies that the task is not tied to the thread that started it: after the task is suspended, a different thread may resume it. This example uses task untied incorrectly. The programmer assumes that both functions in the task will be executed by the same thread. However, if the task is suspended between the two calls, bar() may be executed by a different thread. Because dummy is threadprivate, each thread has its own copy of it, so bar() may read a copy that foo() never wrote.
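
    One possible fix, sketched below, is simply to leave the task tied (the default), so that foo() and bar() are executed by the same thread. The wrapper run_tied_task and the concrete bodies of foo() and bar() are assumptions made for illustration, since the original snippet elides them.

```c
#include <assert.h>

static int dummy;
#pragma omp threadprivate(dummy)

/* Illustrative bodies: foo() stores a value, bar() reads it back. */
static void foo(int v) { dummy = v; }
static int bar(void) { return dummy; }

int run_tied_task(int v) {
    int result = -1;
    #pragma omp parallel shared(result)
    #pragma omp single
    {
        /* No untied clause: the task stays on the thread that starts it,
           so both calls see the same threadprivate copy of dummy. */
        #pragma omp task shared(result) firstprivate(v)
        {
            foo(v);
            result = bar();
        }
    }
    /* The implicit barriers above guarantee that the task has finished. */
    return result;
}
```

    Another option is to avoid the threadprivate variable altogether and pass the value from foo() to bar() through a local variable or a return value.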

    I hope these tips will help beginners.

