Clean up pass for AWS tutorial (#25)

* first clean up pass
* fix source file
* clean up pass
* clean up 4
* clean up pass
* clean up pass
* clean up 7
* clean up 8
* clean up 09
* clean up 10
* update 10 file
* clean up 11
* clean up pass
* fix twelve
* fix eight
* fixes for 08
* clean up pass
* clean up 5 and 6
* Update 06_raja_umpire_uvm_solution.cpp
* use global thread policies in examples
LLNL · Aug 12, 2024 · 484c8d2
1 parent dc4866e
commit 484c8d2
Showing 44 changed files with 67 additions and 68 deletions.
diff --git a/Intro_Tutorial/lessons/01_blt_cmake/one.cpp b/...ial/lessons/01_blt_cmake/01_blt_cmake.cpp
diff --git a/Intro_Tutorial/lessons/01_blt_cmake/CMakeLists.txt b/Intro_Tutorial/lessons/01_blt_cmake/CMakeLists.txt
@@ -1,3 +1,3 @@
 blt_add_executable(
-  NAME one
-  SOURCES one.cpp)
+  NAME 01_blt_cmake
+  SOURCES 01_blt_cmake.cpp)
diff --git a/Intro_Tutorial/lessons/01_blt_cmake/README.md b/Intro_Tutorial/lessons/01_blt_cmake/README.md
@@ -27,8 +27,8 @@ all the source code files that make up your application:
 
 ```
 blt_add_executable(
-    NAME one
-    SOURCES one.cpp)
+    NAME 01_blt_cmake
+    SOURCES 01_blt_cmake.cpp)
 ```
 
 For now, we have filled these out for you, but in later lessons you will need to
@@ -53,14 +53,14 @@ practice when using CMake.  Once you are in the build directory, you can use the
 `make` command to compile the executable:
 
 ```
-$ make one
+$ make 01_blt_cmake
 ```
 
 You will see some output as the code is compiled. You can then run the
 executable:
 
 ```
-$ ./bin/one
+$ ./bin/01_blt_cmake
 Hello, world!
 ```
 

diff --git a/...o_Tutorial/lessons/02_raja_umpire/two.cpp b/...lessons/02_raja_umpire/02_raja_umpire.cpp
diff --git a/Intro_Tutorial/lessons/02_raja_umpire/CMakeLists.txt b/Intro_Tutorial/lessons/02_raja_umpire/CMakeLists.txt
@@ -1,5 +1,5 @@
   # TODO: Add the RAJA, umpire and cuda targets as DEPENDS_ON arguments
 blt_add_executable(
-  NAME two
-  SOURCES two.cpp
+  NAME 02_raja_umpire
+  SOURCES 02_raja_umpire.cpp
   DEPENDS_ON )
diff --git a/Intro_Tutorial/lessons/02_raja_umpire/README.md b/Intro_Tutorial/lessons/02_raja_umpire/README.md
@@ -16,8 +16,8 @@ use to list dependencies.
 
 ```
 blt_add_executable(
-    NAME one
-    SOURCES one.cpp
+    NAME 01_blt_cmake
+    SOURCES 01_blt_cmake.cpp
     DEPENDS_ON )
 ```
 
@@ -33,8 +33,8 @@ $ cd build
 You can then compile and run the lesson:
 
 ```
-$ make two
-$ ./bin/two
+$ make 02_raja_umpire
+$ ./bin/02_raja_umpire
 Hello, world (with RAJA and Umpire)!
 ```
 

diff --git a/...ial/lessons/03_umpire_allocator/three.cpp b/..._umpire_allocator/03_umpire_allocator.cpp
diff --git a/Intro_Tutorial/lessons/03_umpire_allocator/CMakeLists.txt b/Intro_Tutorial/lessons/03_umpire_allocator/CMakeLists.txt
@@ -1,4 +1,4 @@
 blt_add_executable(
-  NAME three
-  SOURCES three.cpp
+  NAME 03_umpire_allocator
+  SOURCES 03_umpire_allocator.cpp
   DEPENDS_ON RAJA umpire cuda)
diff --git a/Intro_Tutorial/lessons/03_umpire_allocator/README.md b/Intro_Tutorial/lessons/03_umpire_allocator/README.md
@@ -1,7 +1,7 @@
 # Lesson 3
 
 In this lesson, you will learn how to use Umpire to allocate memory. The file
-`three.cpp` contains some `TODO:` comments where you can add code to allocate and
+`03_umpire_allocator.cpp` contains some `TODO:` comments where you can add code to allocate and
 deallocate memory.
 
 The fundamental concept for accessing memory through Umpire is the
@@ -38,7 +38,7 @@ https://umpire.readthedocs.io/en/develop/sphinx/tutorial/allocators.html
 Once you have made your changes, you can compile and run the lesson:
 
 ```
-$ make three
-$ ./bin/three
+$ make 03_umpire_allocator
+$ ./bin/03_umpire_allocator
 Address of data: 0x?????
 ```
diff --git a/...ire_allocator/solution/three_solution.cpp b/...solution/03_umpire_allocator_solution.cpp
diff --git a/..._Tutorial/lessons/04_raja_forall/four.cpp b/...lessons/04_raja_forall/04_raja_forall.cpp
diff --git a/Intro_Tutorial/lessons/04_raja_forall/CMakeLists.txt b/Intro_Tutorial/lessons/04_raja_forall/CMakeLists.txt
@@ -1,4 +1,4 @@
 blt_add_executable(
-  NAME four
-  SOURCES four.cpp
+  NAME 04_raja_forall
+  SOURCES 04_raja_forall.cpp
   DEPENDS_ON RAJA umpire cuda)
diff --git a/Intro_Tutorial/lessons/04_raja_forall/README.md b/Intro_Tutorial/lessons/04_raja_forall/README.md
@@ -32,17 +32,16 @@ this example, we will use the `RAJA::seq_exec` policy to execute this loop on
 the CPU. In later lessons, we will learn about other policies that allow us to
 run code on a GPU.
 
-In the file four.cpp, you will see a `TODO` comment where you can add a 
-`RAJA::forall` loop to initialize the array you allocated in the previous 
+In the file 04_raja_forall.cpp, you will see a `TODO` comment where you can add a
+`RAJA::forall` loop to initialize the array you allocated in the previous
 lesson.
 
 When you have made your changes, compile and run the code in the same way as the
 other lessons:
 
 ```
-$ make four
-$ ./bin/four
-Address of data: 
+$ make 04_raja_forall
+$ ./bin/04_raja_forall
+Address of data:
 data[50] = 50
 ```
-
diff --git a/...04_raja_forall/solution/four_solution.cpp b/...rall/solution/04_raja_forall_solution.cpp
diff --git a/..._Tutorial/lessons/05_raja_reduce/five.cpp b/...lessons/05_raja_reduce/05_raja_reduce.cpp
diff --git a/Intro_Tutorial/lessons/05_raja_reduce/CMakeLists.txt b/Intro_Tutorial/lessons/05_raja_reduce/CMakeLists.txt
@@ -1,4 +1,4 @@
 blt_add_executable(
-  NAME five
-  SOURCES five.cpp
+  NAME 05_raja_reduce
+  SOURCES 05_raja_reduce.cpp
   DEPENDS_ON RAJA umpire)
diff --git a/Intro_Tutorial/lessons/05_raja_reduce/README.md b/Intro_Tutorial/lessons/05_raja_reduce/README.md
@@ -29,13 +29,13 @@ https://raja.readthedocs.io/en/develop/sphinx/user_guide/feature/policies.html#r
 The second parameter, the `TYPE` parameter, is just the data type of the 
 variable, such as `int`.
 
-In the file `five.cpp`, follow the instruction in the `TODO` comment to create
+In the file `05_raja_reduce.cpp`, follow the instruction in the `TODO` comment to create
 a RAJA Reduction using `seq_exec`. 
 
 
 Once you have filled in the correct reduction statement, compile and run:
 
 ```
-$ make five
-$ ./bin/five
+$ make 05_raja_reduce
+$ ./bin/05_raja_reduce
 ```
diff --git a/...05_raja_reduce/solution/five_solution.cpp b/...duce/solution/05_raja_reduce_solution.cpp
@@ -25,7 +25,7 @@ int main()
     }
   );
 
-  std::cout << "dot product is "<< dot << std::endl;
+  std::cout << "dot product is "<< dot.get() << std::endl;
 
   allocator.deallocate(a);
   allocator.deallocate(b);

diff --git a/...torial/lessons/06_raja_umpire_uvm/six.cpp b/...06_raja_umpire_uvm/06_raja_umpire_uvm.cpp
diff --git a/Intro_Tutorial/lessons/06_raja_umpire_uvm/CMakeLists.txt b/Intro_Tutorial/lessons/06_raja_umpire_uvm/CMakeLists.txt
@@ -1,7 +1,7 @@
 if (ENABLE_CUDA)
   blt_add_executable(
-    NAME six
-    SOURCES six.cpp
+    NAME 06_raja_umpire_uvm
+    SOURCES 06_raja_umpire_uvm.cpp
     DEPENDS_ON RAJA umpire cuda)
 endif()
 
diff --git a/Intro_Tutorial/lessons/06_raja_umpire_uvm/README.md b/Intro_Tutorial/lessons/06_raja_umpire_uvm/README.md
@@ -36,13 +36,13 @@ as a template parameter. Finally, as we are filling in the lambda portion of
 the `RAJA::forall`, we need to specify where it will reside in GPU memory. 
 This can be done directly or by using the `RAJA_DEVICE` macro. 
 
-There are several `TODO` comments in the `six.cpp` exercise file where you 
+There are several `TODO` comments in the `06_raja_umpire_uvm.cpp` exercise file where you 
 can modify the code to work on a GPU. When you are done, build 
 and run the example:
 
 ```
-$ make six
-$ ./bin/six
+$ make 06_raja_umpire_uvm
+$ ./bin/06_raja_umpire_uvm
 ```
 
 For more information on Umpire's resources, see our documentation:

diff --git a/...raja_umpire_uvm/solution/six_solution.cpp b/.../solution/06_raja_umpire_uvm_solution.cpp
diff --git a/...sons/07_raja_umpire_host_device/seven.cpp b/...ost_device/07_raja_umpire_host_device.cpp
diff --git a/Intro_Tutorial/lessons/07_raja_umpire_host_device/CMakeLists.txt b/Intro_Tutorial/lessons/07_raja_umpire_host_device/CMakeLists.txt
@@ -1,6 +1,6 @@
 if (ENABLE_CUDA)
   blt_add_executable(
-    NAME seven
-    SOURCES seven.cpp
+    NAME 07_raja_umpire_host_device
+    SOURCES 07_raja_umpire_host_device.cpp
     DEPENDS_ON RAJA umpire cuda)
 endif()
diff --git a/Intro_Tutorial/lessons/07_raja_umpire_host_device/README.md b/Intro_Tutorial/lessons/07_raja_umpire_host_device/README.md
@@ -3,7 +3,7 @@
 In this lesson, you will learn how to use Umpire's operations to copy data
 between CPU and GPU memory in a portable way.
 
-In `seven.cpp`, we create an allocator for the GPU with:
+In `07_raja_umpire_host_device.cpp`, we create an allocator for the GPU with:
 ```  
 auto allocator = rm.getAllocator("DEVICE");
 ```
@@ -30,12 +30,12 @@ void umpire::ResourceManager::copy (void* dst_ptr, void * src_ptr, std::size_t s
 
 *Note:* The destination is the first argument.
 
-In the file `seven.cpp`, there is a `TODO` comment where you should insert two copy
+In the file `07_raja_umpire_host_device.cpp`, there is a `TODO` comment where you should insert two copy
 calls to copy data from the CPU memory to the DEVICE memory.
 
 When you are done editing the file, compile and run it:
 
 ```
-$ make seven
-$ ./bin/seven
+$ make 07_raja_umpire_host_device
+$ ./bin/07_raja_umpire_host_device
 ```
diff --git a/...e_host_device/solution/seven_solution.cpp b/...n/07_raja_umpire_host_device_solution.cpp
diff --git a/...ssons/08_raja_umpire_quick_pool/eight.cpp b/..._quick_pool/08_raja_umpire_quick_pool.cpp
diff --git a/Intro_Tutorial/lessons/08_raja_umpire_quick_pool/CMakeLists.txt b/Intro_Tutorial/lessons/08_raja_umpire_quick_pool/CMakeLists.txt
@@ -1,6 +1,6 @@
 if (ENABLE_CUDA)
   blt_add_executable(
-    NAME eight
-    SOURCES eight.cpp
+    NAME 08_raja_umpire_quick_pool
+    SOURCES 08_raja_umpire_quick_pool.cpp
     DEPENDS_ON RAJA umpire cuda)
 endif()
diff --git a/Intro_Tutorial/lessons/08_raja_umpire_quick_pool/README.md b/Intro_Tutorial/lessons/08_raja_umpire_quick_pool/README.md
@@ -21,13 +21,13 @@ To create a new memory pool allocator using the `QuickPool` strategy, we can use
 
 This newly created `pool` is an `umpire::Allocator` using the `QuickPool` strategy. As you can see above, we can use the `ResourceManager::makeAllocator` function to create the pool allocator. We just need to pass 
 in: (1) the name we would like the pool to have, and (2) the allocator we previously created with the `ResourceManager` (see line 17 in the
-file `eight.cpp`). Remember that you will also need to include the `umpire/strategy/QuickPool.hpp` header file.
+file `08_raja_umpire_quick_pool.cpp`). Remember that you will also need to include the `umpire/strategy/QuickPool.hpp` header file.
 
 There are other arguments that could be passed to the pool constructor if needed. These additional option arguments are a bit advanced and are beyond the scope of this tutorial. However, you can visit the documentation page for more: https://umpire.readthedocs.io/en/develop/doxygen/html/index.html
 
 When you have created your QuickPool allocator, uncomment the COMPILE define on line 7;
 then compile and run the code:
 ```
-$ make eight
-$ ./bin/eight
+$ make 08_raja_umpire_quick_pool
+$ ./bin/08_raja_umpire_quick_pool
 ```
diff --git a/...re_quick_pool/solution/eight_solution.cpp b/...on/08_raja_umpire_quick_pool_solution.cpp
diff --git a/Intro_Tutorial/lessons/09_raja_view/nine.cpp b/...ial/lessons/09_raja_view/09_raja_view.cpp
diff --git a/Intro_Tutorial/lessons/09_raja_view/CMakeLists.txt b/Intro_Tutorial/lessons/09_raja_view/CMakeLists.txt
@@ -1,4 +1,4 @@
 blt_add_executable(
-  NAME nine
-  SOURCES nine.cpp
+  NAME 09_raja_view
+  SOURCES 09_raja_view.cpp
   DEPENDS_ON cuda RAJA umpire)
diff --git a/Intro_Tutorial/lessons/09_raja_view/README.md b/Intro_Tutorial/lessons/09_raja_view/README.md
@@ -33,15 +33,15 @@ RAJA::View<double, RAJA::Layout<2, int>> view(data, N, N);
 where `data` is a `double*`, and `N` is the size of each dimension. The size of
 `data` should be at least `N*N`.
 
-In the file `nine.cpp`, there is a `TODO` comment where you should create three
+In the file `09_raja_view.cpp`, there is a `TODO` comment where you should create three
 views, A, B, and C. You will notice that we are doing the same dot product 
 calculation, but this time for matrices. Thus, we are now doing a matrix
 multiplication. When you are ready, uncomment the COMPILE define on line 7;
 then you can compile and run the code:
 
 ```
-$ make nine
-$ ./bin/nine
+$ make 09_raja_view
+$ ./bin/09_raja_view
 ```
 
 For more information on Views and Layouts, see the RAJA

diff --git a/...s/09_raja_view/solution/nine_solution.cpp b/...ns/09_raja_view/solution/09_raja_view.cpp
diff --git a/...o_Tutorial/lessons/10_raja_kernel/ten.cpp b/...lessons/10_raja_kernel/10_raja_kernel.cpp
diff --git a/Intro_Tutorial/lessons/10_raja_kernel/CMakeLists.txt b/Intro_Tutorial/lessons/10_raja_kernel/CMakeLists.txt
@@ -1,4 +1,4 @@
 blt_add_executable(
-  NAME ten
-  SOURCES ten.cpp
+  NAME 10_raja_kernel
+  SOURCES 10_raja_kernel.cpp
   DEPENDS_ON cuda RAJA umpire)
diff --git a/Intro_Tutorial/lessons/10_raja_kernel/README.md b/Intro_Tutorial/lessons/10_raja_kernel/README.md
@@ -68,6 +68,6 @@ When you have finished making your changes, uncomment the COMPILE define on line
 then compile and run the code:
 
 ```
-$ make ten
-$ ./bin/ten
+$ make 10_raja_kernel
+$ ./bin/10_raja_kernel
 ```
diff --git a/.../10_raja_kernel/solution/ten_solution.cpp b/...0_raja_kernel/solution/10_raja_kernel.cpp
diff --git a/.../lessons/11_raja_device_kernel/eleven.cpp b/...a_device_kernel/11_raja_device_kernel.cpp
diff --git a/Intro_Tutorial/lessons/11_raja_device_kernel/CMakeLists.txt b/Intro_Tutorial/lessons/11_raja_device_kernel/CMakeLists.txt
@@ -1,6 +1,6 @@
 if (ENABLE_CUDA)
   blt_add_executable(
-    NAME eleven
-    SOURCES eleven.cpp
+    NAME 11_raja_device_kernel
+    SOURCES 11_raja_device_kernel.cpp
     DEPENDS_ON RAJA umpire cuda)
 endif()
diff --git a/Intro_Tutorial/lessons/11_raja_device_kernel/README.md b/Intro_Tutorial/lessons/11_raja_device_kernel/README.md
@@ -17,8 +17,8 @@ before this will work!
 Once you are ready, uncomment the COMPILE define on line 7; then you can build and run the example:
 
 ```
-$ make eleven
-$ ./bin/eleven
+$ make 11_raja_device_kernel
+$ ./bin/11_raja_device_kernel
 ```
 
 For reference, lesson 12 contains the solution, so don't worry if you get stuck!
diff --git a/...evice_kernel/solution/eleven_solution.cpp b/...lution/11_raja_device_kernel_solution.cpp
@@ -32,9 +32,9 @@ int main()
  // TODO: convert EXEC_POL to use CUDA
   using EXEC_POL =
       RAJA::KernelPolicy<
-        RAJA::statement::CudaKernel<
-          RAJA::statement::For<1, RAJA::cuda_block_x_loop,
-            RAJA::statement::For<0, RAJA::cuda_thread_x_loop,
+        RAJA::statement::CudaKernelFixed<256,
+          RAJA::statement::For<1, RAJA::cuda_global_size_y_direct<16>,
+	    RAJA::statement::For<0, RAJA::cuda_global_size_x_direct<16>,
               RAJA::statement::Lambda<0>
             >
           >

diff --git a/...12_raja_device_kernel_complete/twelve.cpp b/...mplete/12_raja_device_kernel_complete.cpp
@@ -31,9 +31,9 @@ int main()
 
   using EXEC_POL =
       RAJA::KernelPolicy<
-        RAJA::statement::CudaKernel<
-          RAJA::statement::For<1, RAJA::cuda_block_x_loop,
-            RAJA::statement::For<0, RAJA::cuda_thread_x_loop,
+        RAJA::statement::CudaKernelFixed<256,
+          RAJA::statement::For<1, RAJA::cuda_global_size_y_direct<16>,
+	    RAJA::statement::For<0, RAJA::cuda_global_size_x_direct<16>,
               RAJA::statement::Lambda<0>
             >
           >

diff --git a/Intro_Tutorial/lessons/12_raja_device_kernel_complete/CMakeLists.txt b/Intro_Tutorial/lessons/12_raja_device_kernel_complete/CMakeLists.txt
@@ -1,6 +1,6 @@
 if (ENABLE_CUDA)
   blt_add_executable(
-    NAME twelve
-    SOURCES twelve.cpp
+    NAME 12_raja_device_kernel_complete
+    SOURCES 12_raja_device_kernel_complete.cpp
     DEPENDS_ON RAJA umpire cuda)
 endif()