Following Week 4’s implementation of the sections, single, and master constructs, Week 5 shifted focus to the TASK construct and resolving issues with shared and private variables in parallel regions. Initially, I planned to work on the teams construct, but addressing these foundational issues took precedence. This week, I successfully merged PR #7760 and made significant progress on PR #7832, which now passes all tests. Approximately 37 hours were invested in these efforts to enhance LFortran’s OpenMP capabilities.

Implementing the TASK Construct

In PR #7760, I implemented the TASK construct, contributing to Issue #7365 and Issue #7332. The work covered compiling and running two minimal reproducible examples (MREs), openmp_50.f90 and openmp_51.f90. Additional changes included fixing a bug in ASR generation, adding a utility to visit the body of an OMPRegion, and extending visit_If_t and visit_DoLoop_t to handle nested OMPRegion statements. The PR was merged, enabling basic task parallelism in LFortran.

Discrepancies in TASK and PARALLEL Constructs

After merging PR #7760, I noticed that the TASK construct worked only for MREs without shared variables. More complex examples involving shared variables failed, suggesting an issue with passing the thread_data struct to GOMP calls. To investigate further, I examined the parallel construct independently. I found that only arrays were correctly treated as shared, while non-array variables were incorrectly handled as private. This revealed a deeper flaw in the existing OpenMP logic that needed immediate attention.

Addressing Faulty Logic in shared and private Clauses

The root cause was tricky to track down, but it came down to one outdated assumption in the OpenMP pass: all variables within a parallel region were treated as private except for arrays. This logic was insufficient for robust OpenMP support. In PR #7832, I tackled the handling of shared and private variables. My first attempt was simple: keep the variables private and copy their values back to the original variables at the end of the region. This turned out to be completely wrong, because it breaks whenever one thread's update must be visible to another, such as inside a critical section; in those scenarios it failed entirely. Hence, I adopted a pointer-based approach for shared variables: lcompilers_parallel_func creates pointers to them, and the thread_data struct holds corresponding CPtr-typed members, so all threads access the same memory location. Private variables retain their original types in the struct, giving each thread its own copy. Additionally, I implemented the barrier and critical constructs to support synchronization in these scenarios. With these changes, LFortran compiles and runs three complex MREs, validating the fixes.

Examples: TASK Construct and shared/private Fixes

Below are the MREs showcasing the Week 5 contributions. The first two demonstrate the TASK construct from PR #7760, while the subsequent three illustrate the fixes for shared and private variables from PR #7832.

View MRE for TASK Construct (openmp_51.f90)
program openmp_51
    use omp_lib
    implicit none
    call omp_set_num_threads(10)

    !$omp parallel
        if(omp_get_thread_num() == 0) then
            !$omp task
                print *, "Task 0 done by TID:-",omp_get_thread_num()
            !$omp end task
        end if

        if(omp_get_thread_num() == 1) then
            !$omp task
                print *, "Task 1 done by TID:-",omp_get_thread_num()
            !$omp end task
        end if
    !$omp end parallel
end program openmp_51
View MRE for TASK Construct (openmp_50.f90)
program openmp_50
    use omp_lib
    implicit none
    integer :: i=0,n=10
    call omp_set_num_threads(8)

  !$OMP PARALLEL
    !$OMP MASTER
      do i = 1, n
          !$OMP TASK private(i)
              print *, "Task ",i,"done by TID:-",omp_get_thread_num()
          !$OMP END TASK
      end do
    !$OMP END MASTER
  !$OMP END PARALLEL

end program openmp_50
View MRE for shared/private Fix (openmp_52.f90)
program openmp_52
  use omp_lib
  implicit none
  integer, parameter :: N = 100, init=0
  integer :: a(N), i, total
  a = 1  ! Initialize all elements to 1

  !$omp parallel shared(a, total) private(i)
    total = init  ! Initialize total to 0
    !$omp barrier

    !$omp do
        do i = 1, N
            !$omp critical
            total = total + a(i)
            !$omp end critical
        end do
    !$omp end do
  !$omp end parallel

  print *, "Total sum:", total
  if (total /= N) error stop "Incorrect sum"
end program openmp_52
View MRE for shared/private Fix (openmp_53.f90)
program openmp_53
  use omp_lib
  implicit none
  integer :: x
  integer, parameter:: N = 0

  !$omp parallel shared(x)
  x = N
  !$omp barrier
    !$omp critical
    x = x + 1
    !$omp end critical
  !$omp end parallel

  print *, "Final x:", x
  if (x /= omp_get_max_threads()) error stop "x is not equal to number of threads"
end program openmp_53
View MRE for shared/private Fix (openmp_54.f90)
program openmp_54
  use omp_lib
  implicit none

  integer, parameter :: N = 1000
  integer :: i, tid
  integer :: total_sum
  integer :: partial_sum

  !$omp parallel shared(total_sum) private(i, partial_sum, tid)
    tid = omp_get_thread_num()
    partial_sum = 0
    total_sum = 0
    !$omp barrier

    !$omp do
        do i = 1, N
            partial_sum = partial_sum + i
        end do
    !$omp end do

    ! Critical update to the shared total_sum
    !$omp critical
        total_sum = total_sum + partial_sum
        print *, "Thread ", tid, " added partial_sum ", partial_sum
    !$omp end critical

    !$omp barrier

    !$omp single
        if (total_sum /= N*(N+1)/2) then
            print *, "ERROR: total_sum = ", total_sum, " expected = ", N*(N+1)/2
            error stop
        else
            print *, "Success! total_sum = ", total_sum
        end if
    !$omp end single

  !$omp end parallel

end program openmp_54

Next Steps

For Week 6, I plan to:

  • Enhance the TASK construct to support shared and private variables, building on the fixes from PR #7832.
  • Implement the teams construct using the OMPRegion node (Issue #7363).

I express my gratitude to my mentors, Ondrej Certik, Pranav Goswami, and Gaurav Dhingra, for their guidance and thorough reviews, which were instrumental in resolving these issues. I also thank the LFortran community for their continued support.