From 044a4173899942682beec27e6604144f4261b774 Mon Sep 17 00:00:00 2001
From: Augustin Degomme
Date: Tue, 22 Jun 2021 10:09:33 +0200
Subject: [PATCH] document _TAG versions of sampling macros

---
 docs/source/app_smpi.rst | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/docs/source/app_smpi.rst b/docs/source/app_smpi.rst
index 0462def07c..b6febf7c3e 100644
--- a/docs/source/app_smpi.rst
+++ b/docs/source/app_smpi.rst
@@ -687,7 +687,13 @@ their duration, and this duration will be used for the subsequent iterations.
 These samples are done per processor with
 SMPI_SAMPLE_LOCAL, and shared between all processors with
 SMPI_SAMPLE_GLOBAL. Of course, none of this will work if the execution
-time of your loop iteration are not stable.
+time of your loop iteration are not stable. If some parameters have an
+incidence on the timing of a kernel, and if they are reused often
+(same kernel launched with a few different sizes during the run, for example),
+SMPI_SAMPLE_LOCAL_TAG and SMPI_SAMPLE_GLOBAL_TAG can be used, with a tag
+as last parameter, to differentiate between calls. The tag is a character
+chain crafted by the user, with a maximum size of 128, and should include
+what is necessary to group calls of a given size together.
 
 This feature is demoed by the example file
 `examples/smpi/NAS/ep.c `_
-- 
2.20.1
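
A minimal sketch of how the _TAG macros documented by this patch might be used. The patch only states that the tag comes last; the first two arguments (iteration count and threshold) are assumed here to match SMPI_SAMPLE_LOCAL. The helper make_sample_tag, the kernel dummy_kernel, the "name_n=size" tag layout and the decision to count the terminating NUL within the 128 limit are all illustrative assumptions, not part of SMPI. A fallback macro lets the file compile outside SMPI, where the block simply runs on every iteration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: treat the documented 128 limit as including the final NUL. */
#define SAMPLE_TAG_MAX 128

/* Fallback for builds without SMPI: the macro vanishes and the block below
 * runs every time. With smpicc, include <smpi/smpi.h> before this point so
 * the real macro is used instead (argument order assumed: iters, thres, tag). */
#ifndef SMPI_SAMPLE_LOCAL_TAG
#define SMPI_SAMPLE_LOCAL_TAG(iters, thres, tag)
#endif

/* Hypothetical helper: writes "<kernel>_n=<size>" into buf so that calls
 * with the same problem size share one set of samples.
 * Returns 0 on success, -1 if the tag does not fit in buflen bytes. */
int make_sample_tag(char *buf, size_t buflen, const char *kernel, int size)
{
  assert(buf != NULL && kernel != NULL);
  int written = snprintf(buf, buflen, "%s_n=%d", kernel, size);
  return (written >= 0 && (size_t)written < buflen) ? 0 : -1;
}

/* Hypothetical kernel whose running time depends on n. */
double dummy_kernel(int n)
{
  double acc = 0.0;
  for (int i = 0; i < n; i++)
    acc += (double)i * 0.5;
  return acc;
}

/* Launches the kernel `repeats` times for one size, with the sampled region
 * tagged by that size. Returns the last computed value, or -1.0 if the tag
 * could not be built. */
double run_sampled(int n, int repeats)
{
  char tag[SAMPLE_TAG_MAX];
  double last = 0.0;

  if (make_sample_tag(tag, sizeof tag, "dummy_kernel", n) != 0)
    return -1.0;

  for (int r = 0; r < repeats; r++)
    SMPI_SAMPLE_LOCAL_TAG(10, 0.01, tag) {
      last = dummy_kernel(n);
    }
  return last;
}
```

Calling run_sampled(512, 100) and run_sampled(4096, 100) in the same run would then keep the timings of the two sizes apart, since each size yields a different tag. Under real SMPI the body may be skipped once enough samples exist, so its results should not be relied upon there.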