A Language-Based Tuning Mechanism for Task and Pipeline Parallelism
-
Type:
Conference paper
-
Author:
Frank Otto, Christoph A. Schaefer, Matthias Dempe, Walter F. Tichy
-
Abstract:
Current multicore computers differ in many hardware aspects. Tuning parallel applications is indispensable to achieve best performance on a particular hardware platform. Auto-tuners represent a promising approach to systematically optimize a program's tuning parameters, such as the number of threads, the size of data partitions, or the number of pipeline stages. However, an auto-tuner requires several tuning runs to find optimal values for all parameters. In addition, a program optimized for execution on one machine usually has to be re-tuned on other machines. Our approach tackles this problem by introducing a language-based tuning mechanism. The key idea is the inference of essential tuning parameters from high-level parallel language constructs. Instead of identifying and adjusting tuning parameters manually, we exploit the compiler's context knowledge about the program's parallel structure to configure the tuning parameters at runtime. Consequently, our approach significantly reduces the need for platform-specific tuning runs. We implemented the approach as an integral part of XJava, a Java language extension to express task and pipeline parallelism. Several benchmark programs executed on different hardware platforms demonstrate the effectiveness of our approach. On average, our mechanism sets 91% of the relevant tuning parameters automatically and achieves 93% of the optimum performance.
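The paper targets XJava, whose actual language constructs are not reproduced here. As a rough, plain-Java sketch of the underlying idea only, the code below replaces a hand-tuned constant with a value chosen at runtime: the replication degree of a pipeline's worker stage is derived from the number of available cores instead of being fixed by the programmer. The class and method names (TunedPipelineSketch, inferWorkerCount, parallelStage) are hypothetical, and the ExecutorService-based stage merely stands in for the language-level pipeline constructs described in the abstract; this is an illustration, not the authors' implementation.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical illustration (not XJava syntax): a pipeline stage whose degree of
// parallelism is inferred from the execution environment rather than hard-coded.
public class TunedPipelineSketch {

    // Derive a worker count from the available cores -- standing in for a tuning
    // parameter that would otherwise be set by hand or by repeated tuning runs.
    static int inferWorkerCount() {
        return Math.max(1, Runtime.getRuntime().availableProcessors() - 1);
    }

    // A middle pipeline stage replicated over the inferred number of worker threads.
    static <T, R> List<R> parallelStage(List<T> items, Function<T, R> work)
            throws InterruptedException, ExecutionException {
        int workers = inferWorkerCount();                  // inferred, not hand-tuned
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<R>> futures = items.stream()
                    .map(item -> pool.submit(() -> work.apply(item)))
                    .collect(Collectors.toList());
            List<R> results = new ArrayList<>();
            for (Future<R> f : futures) {
                results.add(f.get());                      // collect stage results in order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Producer -> replicated worker stage -> consumer, with the worker count
        // chosen at runtime instead of being a manually adjusted parameter.
        List<Integer> input = List.of(1, 2, 3, 4, 5, 6, 7, 8);
        List<Integer> squared = parallelStage(input, x -> x * x);
        squared.forEach(System.out::println);
    }
}

In the paper's approach, values such as the worker count, the size of data partitions, or the number of pipeline stages would be inferred by the compiler and runtime from the XJava constructs themselves, so application code would not contain even this explicit derivation.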
-
Year:
2010
-
Links:
BibTeX
@inproceedings{Otto2010LanguageBasedTuning,
author={Frank Otto and Christoph A. Schaefer and Matthias Dempe and Walter F. Tichy},
title={A Language-Based Tuning Mechanism for Task and Pipeline Parallelism},
year=2010,
month=sep,
booktitle={Proceedings of the 16th International Euro-Par Conference on Parallel Processing},
url={https://ps.ipd.kit.edu/downloads/ka_2010_language_based_tuning_mechanism.pdf},
pages={328--340},
}