Following Week 4’s implementation of the sections, single, and master constructs, Week 5 shifted focus to the TASK construct and to resolving issues with shared and private variables in parallel regions. Initially, I planned to work on the teams construct, but addressing these foundational issues took precedence. This week, I successfully merged PR #7760 and made significant progress on PR #7832, which now passes all tests. Approximately 37 hours were invested in these efforts to enhance LFortran’s OpenMP capabilities.
Implementing the TASK Construct
In PR #7760, I implemented the TASK construct, contributing to Issue #7365 and Issue #7332. This involved compiling and running two simple minimal reproducible examples (MREs), openmp_50.f90 and openmp_51.f90. Additional changes included fixing a bug in ASR generation, adding a utility to visit the body of an OMPRegion, and extending visit_If_t and visit_DoLoop_t to handle nested OMPRegion statements. The PR was successfully merged, enabling basic task parallelism in LFortran.
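To make the nested-region handling concrete, here is a small illustrative combination of the patterns those visitor changes target: a task region nested inside an if block, which itself sits inside a do loop within a parallel region. This is only a sketch in the spirit of openmp_50.f90 and openmp_51.f90 shown below, not one of the MREs merged in the PR.

program task_nesting_sketch
    use omp_lib
    implicit none
    integer :: i
    call omp_set_num_threads(4)

    !$omp parallel
    !$omp master
    do i = 1, 4                      ! DoLoop body containing an OMPRegion
        if (mod(i, 2) == 0) then     ! If body containing an OMPRegion
            !$omp task firstprivate(i)
            print *, "Even task", i, "run by TID:-", omp_get_thread_num()
            !$omp end task
        end if
    end do
    !$omp end master
    !$omp end parallel
end program task_nesting_sketch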
Discrepancies in TASK and PARALLEL Constructs
After merging PR #7760, I noticed that the TASK construct worked only for MREs without shared variables. More complex examples involving shared variables failed, suggesting an issue with passing the thread_data struct to GOMP calls. To investigate further, I examined the parallel construct independently and found that only arrays were correctly treated as shared, while non-array variables were incorrectly handled as private. This revealed a deeper flaw in the existing OpenMP logic that needed immediate attention.
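A condensed illustration of the symptom (this is my own simplified example of the assumed pre-fix behavior; openmp_52.f90 below is the actual MRE that validates the corrected handling): the array a was shared correctly, but the scalar total was privatized, so each thread accumulated into its own copy and the original variable never received the result.

program shared_scalar_symptom
    use omp_lib
    implicit none
    integer :: a(4), i, total
    a = 1
    total = 0

    !$omp parallel shared(a, total) private(i)
    !$omp do
    do i = 1, 4
        !$omp critical
        total = total + a(i)   ! with the old logic, updates landed in a private copy
        !$omp end critical
    end do
    !$omp end do
    !$omp end parallel

    print *, "total =", total  ! expected 4; the faulty privatization left it unchanged
end program shared_scalar_symptom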
Addressing Faulty Logic in shared and private Clauses
The root cause was tricky to pin down, but it came down to one outdated assumption in the OpenMP pass: all variables within a parallel region were treated as private except for arrays. This logic was insufficient for robust OpenMP support. In PR #7832, I tackled the handling of shared and private variables. My first attempt was to simply copy each private copy back into the original variable at the end of the region. That approach turned out to be completely wrong: if one thread's changes are needed by another thread, or if the region contains a critical section, copy-back fails entirely. Hence, I adopted a pointer-based approach for shared variables, creating pointers in the lcompilers_parallel_func and defining corresponding CPtr-typed members in the thread_data struct, which ensures all threads access the same memory location. For private variables, I retained their original types in the struct. Additionally, I implemented the barrier and critical constructs to support synchronization in these scenarios. This approach enabled LFortran to compile and run three complex MREs, validating the fixes.
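The sketch below gives a rough Fortran-level picture of this lowering. The member names, the use of iso_c_binding, and the exact shape of lcompilers_parallel_func are my own simplifications for illustration; LFortran performs this transformation at the ASR level, so the generated code differs in detail.

module thread_data_sketch
    use iso_c_binding
    implicit none

    ! One struct per parallel region: shared variables become C pointers,
    ! private variables keep their original types (hypothetical members x and i).
    type, bind(C) :: thread_data
        type(c_ptr)    :: x   ! shared: points at the original variable's memory
        integer(c_int) :: i   ! private: each thread works on its own copy
    end type

contains

    ! Simplified stand-in for the outlined function each thread executes.
    subroutine lcompilers_parallel_func(data) bind(C)
        type(c_ptr), value :: data
        type(thread_data), pointer :: tdata
        integer(c_int), pointer :: x
        call c_f_pointer(data, tdata)
        call c_f_pointer(tdata%x, x)   ! every thread dereferences the same address
        ! ... region body reads/writes x (shared) and tdata%i (private) ...
    end subroutine

end module thread_data_sketch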
Examples: TASK Construct and shared/private Fixes
Below are the MREs showcasing the Week 5 contributions. The first two demonstrate the TASK construct from PR #7760, while the subsequent three illustrate the fixes for shared and private variables from PR #7832.
MRE for the TASK construct (openmp_51.f90):
program openmp_51
    use omp_lib
    implicit none
    call omp_set_num_threads(10)

    !$omp parallel
    if(omp_get_thread_num() == 0) then
        !$omp task
        print *, "Task 0 done by TID:-",omp_get_thread_num()
        !$omp end task
    end if

    if(omp_get_thread_num() == 1) then
        !$omp task
        print *, "Task 1 done by TID:-",omp_get_thread_num()
        !$omp end task
    end if
    !$omp end parallel
end program openmp_51
MRE for the TASK construct (openmp_50.f90):
program openmp_50
    use omp_lib
    implicit none
    integer :: i=0,n=10
    call omp_set_num_threads(8)

    !$OMP PARALLEL
    !$OMP MASTER
    do i = 1, n
        !$OMP TASK private(i)
        print *, "Task ",i,"done by TID:-",omp_get_thread_num()
        !$OMP END TASK
    end do
    !$OMP END MASTER
    !$OMP END PARALLEL

end program openmp_50
MRE for the shared/private fix (openmp_52.f90):
program openmp_52
    use omp_lib
    implicit none
    integer, parameter :: N = 100, init=0
    integer :: a(N), i, total
    a = 1 ! Initialize all elements to 1

    !$omp parallel shared(a, total) private(i)
    total = init ! Initialize total to 0
    !$omp barrier

    !$omp do
    do i = 1, N
        !$omp critical
        total = total + a(i)
        !$omp end critical
    end do
    !$omp end do
    !$omp end parallel

    print *, "Total sum:", total
    if (total /= N) error stop "Incorrect sum"
end program openmp_52
MRE for the shared/private fix (openmp_53.f90):
program openmp_53
    use omp_lib
    implicit none
    integer :: x
    integer, parameter:: N = 0

    !$omp parallel shared(x)
    x=N
    !$omp barrier
    !$omp critical
    x = x + 1
    !$omp end critical
    !$omp end parallel

    print *, "Final x:", x
    if (x /= omp_get_max_threads()) error stop "x is not equal to number of threads"
end program openmp_53
MRE for the shared/private fix (openmp_54.f90):
program openmp_54
    use omp_lib
    implicit none

    integer, parameter :: N = 1000
    integer :: i, tid
    integer :: total_sum
    integer :: partial_sum

    !$omp parallel shared(total_sum) private(i, partial_sum, tid)
    tid = omp_get_thread_num()
    partial_sum = 0
    total_sum = 0
    !$omp barrier

    !$omp do
    do i = 1, N
        partial_sum = partial_sum + i
    end do
    !$omp end do

    ! Critical update to the shared total_sum
    !$omp critical
    total_sum = total_sum + partial_sum
    print *, "Thread ", tid, " added partial_sum ", partial_sum
    !$omp end critical

    !$omp barrier

    !$omp single
    if (total_sum /= N*(N+1)/2) then
        print *, "ERROR: total_sum = ", total_sum, " expected = ", N*(N+1)/2
        error stop
    else
        print *, "Success! total_sum = ", total_sum
    end if
    !$omp end single

    !$omp end parallel

end program openmp_54
Next Steps
For Week 6, I plan to:
- Enhance the TASK construct to support shared and private variables, building upon the fixes from PR #7832.
- Implement the teams construct using the OMPRegion node (Issue #7363).
I express my gratitude to my mentors, Ondrej Certik, Pranav Goswami, and Gaurav Dhingra, for their guidance and thorough reviews, which were instrumental in resolving these issues. I also thank the LFortran community for their continued support.