foo@bar:~$ make main
foo@bar:~$ ./main option_type -B barrier_price -K strike_price -N number_of_paths
foo@bar:~$ make main_omp
foo@bar:~$ ./main_omp option_type -B barrier_price -K strike_price -N number_of_paths -threads thread_count
Multiple Nodes, Multiple CPUs, Multiple GPUs running on multiple CUDA threads and multiple CPU threads
foo@bar:~$ make main_mpi
foo@bar:~$ mpirun -bind-to none -n number_nodes ./main_mpi option_type -B barrier_price -K strike_price -N number_of_paths -threads thread_count
This project simulates barrier options, whose payoff depends not only on the underlying asset's price at maturity but also on whether the underlying hits a price known as the barrier.
We have implemented the following types of barrier options, each of which can be passed as a command-line option:
- "daoc" - Down and Out Call Options
- "uaop" - Up and Out Put Options
- "uaic" - Up and In Call Options
- "daip" - Down and In Put Options
We have set the rebate price to 0, but an option to allow a different rebate price can be added in the future.
We run our program on the CARC High-Performance Computing cluster. The architecture looks like this:
The computation is divided between nodes, and each node runs one process. Nodes communicate with each other using the Message Passing Interface (MPI).
Each node has multiple CPU cores, and these cores can run multiple threads within the node's process. These threads use OpenMP for interaction and parallelization.
Each node also has GPU accelerators attached, and each GPU can run many CUDA threads.
We use Geometric Brownian Motion (GBM) to simulate the underlying price. The discretized Euler-method version comes down to S_{t+Δt} = S_t · exp((r − σ²/2)Δt + σ√Δt · Z), where r is the risk-free rate, σ the volatility, and Z ~ N(0, 1).
The code first generates an array of normally distributed random numbers, then applies this recurrence step by step.
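This step can be sketched in single-threaded C++ as follows (an illustrative sketch, not the repo's CUDA kernel; in the GPU versions each CUDA thread applies the same recurrence to its own path):

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Illustrative sketch of the discretized GBM recurrence:
//   S_{t+dt} = S_t * exp((r - sigma^2/2) * dt + sigma * sqrt(dt) * Z)
// with Z drawn from a standard normal distribution.
std::vector<double> simulate_path(double S0, double r, double sigma,
                                  double T, int steps, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> Z(0.0, 1.0);
    double dt = T / steps;
    std::vector<double> path{S0};
    for (int i = 0; i < steps; ++i) {
        double S = path.back();
        path.push_back(S * std::exp((r - 0.5 * sigma * sigma) * dt
                                    + sigma * std::sqrt(dt) * Z(gen)));
    }
    return path;  // steps + 1 prices, from S0 to the price at maturity
}
```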
Simple single-threaded version
Single CPU, Single GPU running on multiple CUDA threads and a single CPU thread
Multiple CPUs, Multiple GPUs running on multiple CUDA threads and multiple CPU threads
Each CPU runs a single CPU thread. Here, we allocate a GPU to every CPU thread; the results from the multiple CUDA threads running on a single GPU are reduced on that single CPU thread.
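What each of these reductions ultimately computes is the discounted average payoff over the paths handled by that GPU. A hypothetical single-threaded sketch of the host-side step (names are ours, not the repo's):

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Illustrative sketch: reduce per-path payoffs (e.g. copied back from one
// GPU) into a Monte Carlo price estimate by averaging and discounting at
// the risk-free rate r over maturity T.
double mc_price(const std::vector<double>& payoffs, double r, double T) {
    double sum = std::accumulate(payoffs.begin(), payoffs.end(), 0.0);
    return std::exp(-r * T) * sum / payoffs.size();
}
```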
Multiple Nodes, Multiple CPUs, Multiple GPUs running on multiple CUDA threads and multiple CPU threads
Each node runs the "Multiple CPUs, Multiple GPUs" version described above, with multiple CUDA threads and multiple CPU threads per node.
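One way to divide the paths across the MPI ranks is to give each rank an almost-equal share; a hypothetical sketch of that arithmetic (our own helper, not code from this repo):

```cpp
#include <cassert>

// Hypothetical sketch: split N Monte Carlo paths across `size` MPI ranks.
// The remainder goes to the lowest ranks, so counts differ by at most one
// and all N paths are covered exactly once.
long paths_for_rank(long N, int rank, int size) {
    long base = N / size;
    return base + (rank < N % size ? 1 : 0);
}
```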
We perform all the tests on down-and-in put options ("daip"), but the other option types are implemented as well.
Weak scaling on the number of MPI nodes, if threads per node = 1
Weak scaling on the number of MPI nodes, if threads per node = 2
Weak scaling on the number of MPI nodes, if threads per node = 4
Weak scaling on the number of threads, if nodes = 1
Strong scaling on the number of MPI nodes, if threads per node = 1
Strong scaling on the number of MPI nodes, if threads per node = 2
Strong scaling on the number of MPI nodes, if threads per node = 4
Strong scaling on the number of threads, if nodes = 1
Strong scaling on the number of threads, if nodes = 2
Strong scaling on the number of threads, if nodes = 4
Scaling of CUDA speed with respect to the size of the input
We also tested the program on larger inputs; beyond these sizes we ran into memory constraints. Although we have an idea of how to get around them, it wasn't possible before the submission deadline, as we would have to test those versions as well. Below are some runtimes on larger inputs that we didn't plot, to keep this README brief.
As the charts above show, the scaling depends on the configuration. We believe that if we get around the memory constraints for larger inputs, we would see much better scaling at larger input sizes. Right now, the program is fastest when the input fits entirely in GPU memory, since adding nodes and threads adds communication overhead. But once the input size is much larger than the GPUs' memory capacity, parallel nodes and threads should improve the program's scaling.
- "Monte Carlo Simulations In CUDA - Barrier Option Pricing", QuantStart, Link