Auto Tuning HPL on Qiming and Taiyi Cluster Using Bayesian Optimization

SUSTech CCSE

Haibin Lai, Student Assistant [email protected]

Junjie Qiu, Student Assistant [email protected]

Introduction

The High Performance Linpack (HPL) benchmark[4] is utilized to assess the floating-point performance of clusters and serves as the standard for evaluating supercomputer performance in the Top 500 benchmark[5]. Successfully running and optimizing HPL is fundamental for effective cluster management. Achieving peak performance involves tuning approximately fifteen parameters, such as matrix block sizes and algorithmic settings.

In this project, we applied Bayesian optimization (BO) to tune a CPU node inside a cluster system for the benchmark. Bayesian optimization is a fast and rapid optimization method on tuning the hyperparameter of a software.

Framework

The following figure displays the main framework of the Tuning System.

We determine the starting point using empirical formula by detecting the size of memory and the number and performance of cores. Then we run bayesian optimization on one node with N iteration. On each iteration, we run HPL on the client node of the cluster and take the PFLOPS as the target function, then the optimizer will
try to maximize it.

At the end of the optimization, the system output the best group of solution with the best result.

The communication framework works like RPC(Remote Procedure Call). The program that run optimization on process A at server send process B which run HPL on client. Then process A wait until process B is finished. After that, A continues to optimize the parameter.

Parallel Working

The optimization system can also run with several thread, where it take a signal node as mission. So by sending a list to the system, it will submit it and solve them automatically.

Future Work

Since the component of HPL can be changed, we can apply HPL's GPU version like [1] to the system. Also, HPCG has been a rapid growing testing software in recent years. These can also be applied to the system.

Reference

[1] iyazaki, T., Sato, I., Shimizu, N. (2018). Bayesian Optimization of HPC Systems for Energy Efficiency. In: Yokota, R., Weiland, M., Keyes, D., Trinitis, C. (eds) High Performance Computing. ISC High Performance 2018. Lecture Notes in Computer Science(), vol 10876. Springer, Cham. https://doi.org/10.1007/978-3-319-92040-5_3

[2] Willemsen, F., van Nieuwpoort, R., & van Werkhoven, B. (2021). Bayesian Optimization for auto-tuning GPU kernels. 2021 International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), 106-117.https://www.semanticscholar.org/paper/Bayesian-Optimization-for-auto-tuning-GPU-kernels-Willemsen-Nieuwpoort/b6f70b9a25b49c4dbe1e0cec03060835f358bba9

[3] Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O'Reilly, and Saman Amarasinghe. 2014. OpenTuner: an extensible framework for program autotuning. In Proceedings of the 23rd international conference on Parallel architectures and compilation (PACT '14). Association for Computing Machinery, New York, NY, USA, 303–316. https://doi.org/10.1145/2628071.2628092

[4] J. J. Dongarra, P. Luszczek, and A. Petitet, The LINPACK Benchmark: past, present and future, Concurrency and Computation: Practice and Experience, vol. 15, no. 9, pp. 803820, 2003.

[5] Top500, Top 500 supercomputer sites,http://www.top500.org/, 2010.

Name		Name	Last commit message	Last commit date
Latest commit History 24 Commits
.idea		.idea
picture		picture
test		test
ClusterInfo.py		ClusterInfo.py
HPL.dat		HPL.dat
HPL_Parameter.txt		HPL_Parameter.txt
README.md		README.md
RunHPL.py		RunHPL.py
SECURITY.md		SECURITY.md
Scheduler.py		Scheduler.py
bayes.txt		bayes.txt
bayes_optM.py		bayes_optM.py
hpl.lsf		hpl.lsf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Auto Tuning HPL on Qiming and Taiyi Cluster Using Bayesian Optimization

SUSTech CCSE

Haibin Lai, Student Assistant [email protected]

Junjie Qiu, Student Assistant [email protected]

Introduction

Framework

Parallel Working

Future Work

Reference

About

Releases

Packages

Languages

Handicraft-Computing-Team/HPLAutoTune

Folders and files

Latest commit

History

Repository files navigation

Auto Tuning HPL on Qiming and Taiyi Cluster Using Bayesian Optimization

SUSTech CCSE

Haibin Lai, Student Assistant [email protected]

Junjie Qiu, Student Assistant [email protected]

Introduction

Framework

Parallel Working

Future Work

Reference

About

Resources

Security policy

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages