
Added a utility class for Relative Error compute. #1446

Merged
14 commits merged into master on Nov 4, 2024

Conversation

sarathnandu (Contributor)

Description

Added a utility class for Relative Error compute.
Added relative error compute for seismic example.

Fixes # - issue number(s) if exists

Type of change

Choose one or multiple, leave empty if none of the other choices apply

Add a respective label(s) to PR if you have permissions

  • bug fix - change that fixes an issue
  • [x] new feature - change that adds functionality
  • tests - change in tests
  • infrastructure - change in infrastructure and CI
  • documentation - documentation update

Tests

  • added - required for new features and some bug fixes
  • [x] not needed

Documentation

  • updated in # - add PR number
  • needs to be updated
  • [x] not needed

Breaks backward compatibility

  • Yes
  • [x] No
  • Unknown

Notify the following users

List users with @ to send notifications

Other information

@sarathnandu sarathnandu marked this pull request as draft July 13, 2024 02:07
@sarathnandu sarathnandu marked this pull request as ready for review August 1, 2024 19:19
@aleksei-fedotov (Contributor) left a comment

I believe we need to continue measurements until one of the following conditions is met:

  1. Relative error is within the acceptable limit
  2. Overall limit on running time is reached.

examples/parallel_for/seismic/main.cpp (outdated, resolved)
Comment on lines 375 to 377
std::chrono::duration<double> duration = _endTime - _startTime;
// store the duration in seconds
_secPerFrame.push_back(duration.count());
Contributor:

Usually, it makes sense to introduce as little overhead as possible during the measurements, to avoid influencing the computations being measured. So, consider pushing the post-processing step to a later stage: here, only store _startTime and _endTime, and post-process them in a part that is independent from the main computations, e.g., computeRelError.
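The suggestion above (store only raw time points inside the loop, defer all arithmetic) could look roughly like this; the class and member names are hypothetical, loosely modeled on the _startTime/_endTime/computeRelError names mentioned in this thread:

```cpp
#include <cassert>
#include <chrono>
#include <utility>
#include <vector>

// Hypothetical sketch: the measurement loop stores only raw time points;
// all conversion and statistics happen later, outside the measured region.
class IntervalRecorder {
    using clock = std::chrono::steady_clock;
    std::vector<std::pair<clock::time_point, clock::time_point>> _intervals;
    clock::time_point _start;
public:
    void startFrame() { _start = clock::now(); }
    void endFrame() { _intervals.emplace_back(_start, clock::now()); } // store only
    // Post-processing step (e.g. called from computeRelError): convert here.
    std::vector<double> seconds() const {
        std::vector<double> out;
        out.reserve(_intervals.size());
        for (const auto& iv : _intervals)
            out.push_back(std::chrono::duration<double>(iv.second - iv.first).count());
        return out;
    }
};
```

This keeps the per-frame work inside the loop to two clock reads and one vector store.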

Contributor Author (sarathnandu), Oct 14, 2024:

Thanks, Aleksei, for bringing up this point. I am not sure which approach is less intrusive.
Currently I have only one store with a subtract instruction, while in the approach you suggested we need two vector stores in the measurement loop. I was under the impression that a subtract plus one store is less intrusive; please correct me if I am wrong.

Contributor Author (sarathnandu):

In any case, I think I should reserve some memory for the vector stores based on the iteration count. I will update the constructor accordingly.

Contributor:

That's correct! We want the impact from the measurement system itself to be as small as possible. Therefore, we need to consider implementing at least the following:

  • Have the memory pre-allocated. I guess we just need to add a parameter to the constructor.
    • Are there cases when the number of iterations is not known beforehand?
  • Do all the calculations only after the measurement loop is completed. This means the measurement loop can contain only stores of the retrieved timings.

Contributor Author (sarathnandu):

Modified the constructor to accept a parameter for the number of iterations and reserve the memory accordingly. All calculations are done after the measurement loop is completed.
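A minimal sketch of the constructor change being discussed, with hypothetical names (the actual utility class in examples/common/utility/utility.hpp may differ):

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: pass the expected iteration count to the constructor
// so the vector of timings never reallocates inside the measurement loop.
class RelativeErrorMeter {
    using clock = std::chrono::steady_clock;
    std::vector<std::pair<clock::time_point, clock::time_point>> _time_intervals;
public:
    explicit RelativeErrorMeter(std::size_t expected_iterations) {
        _time_intervals.reserve(expected_iterations); // pre-allocate up front
    }
    std::size_t capacity() const { return _time_intervals.capacity(); }
};
```

std::vector::reserve guarantees that no reallocation happens until the stored element count exceeds the reserved capacity, so push_back inside the loop stays cheap and allocation-free.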

@sarathnandu (Contributor Author):

> I believe we need to continue measurements until one of the following conditions is met:
>
>   1. Relative error is within the acceptable limit
>   2. Overall limit on running time is reached.

Yes, that's the intent. The relative error may also vary from platform to platform for the same iteration count. In my experiments, for certain benchmarks, I have seen the relative error be more stable on client systems, whereas on high-core-count NUMA systems it sometimes increases with the iteration count.

The Performance CI scripts will utilize the "Relative_Err" string to adjust the iteration count to minimize the relative error on the respective systems.


examples/common/utility/utility.hpp (outdated, resolved)
examples/parallel_for/seismic/main.cpp (outdated, resolved)
examples/common/utility/utility.hpp (outdated, resolved)
@aleksei-fedotov (Contributor):

> > I believe we need to continue measurements until one of the following conditions is met:
> >
> >   1. Relative error is within the acceptable limit
> >   2. Overall limit on running time is reached.
>
> Yes, that's the intent. The relative error may also vary from platform to platform for the same iteration count. In my experiments, for certain benchmarks, I have seen the relative error be more stable on client systems, whereas on high-core-count NUMA systems it sometimes increases with the iteration count.
>
> The Performance CI scripts will utilize the "Relative_Err" string to adjust the iteration count to minimize the relative error on the respective systems.

Do I understand correctly that the CI scripts will keep re-starting the measurements, each time specifying an increased number of iterations to run? So that they work as if following these steps:

  1. Set num_iterations to the previously saved value (see step 3).
  2. Run the benchmark and get the relative error computed.
  3. If it is within the per-benchmark pre-defined limit, save that number of iterations. The system will start from that value in the next measurement session (see step 1).
  4. If it is not within the per-benchmark pre-defined limit, then:
    • If the overall per-benchmark measurement session time is acceptable, increase num_iterations (for example, double it) and go to step 2.
    • If the overall time is not acceptable, stop and report an error saying that the measurement session is unstable, the results are not reliable, and further analysis or reconsideration of either the system or the benchmark is required.

We could also add a step 3.1 that, before saving num_iterations, tries decreasing it a little and checks whether the relative error is still acceptable. This way, we could make a system that automatically and dynamically adjusts num_iterations to the smallest value that gives a stable relative error within the defined range.

@sarathnandu (Contributor Author):

sarathnandu commented Oct 28, 2024

> Do I understand correctly that the CI scripts will keep re-starting the measurements, each time specifying an increased number of iterations to run?

Yes, this workflow needs to be implemented in the CI perf runner script to adjust the number of iterations until the relative error is within acceptable limits for the example benchmark. The intent is to improve the stability of measurements from run to run so that the perf report geomean results are more reliable.
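The workflow above can be sketched as plain C++ for illustration (the real logic would live in the CI perf runner script; run_benchmark is a hypothetical stand-in for one measurement session that returns the computed relative error, and session time is modeled crudely as the number of iterations executed):

```cpp
#include <cassert>
#include <functional>

// Hypothetical sketch of the adjustment loop described in the steps above.
int find_stable_iterations(int start_iters,
                           double max_rel_err,
                           int time_budget,
                           const std::function<double(int)>& run_benchmark) {
    int iters = start_iters;
    int spent = 0;
    while (true) {
        spent += iters;                       // step 2: run one session
        if (run_benchmark(iters) <= max_rel_err)
            return iters;                     // step 3: stable, save this count
        if (spent > time_budget)
            return -1;                        // step 4b: unstable, report error
        iters *= 2;                           // step 4a: double and retry
    }
}
```

With a benchmark whose relative error shrinks as 1/n, for example, starting from one iteration with a 10% target converges after a few doublings, while an unreachable target exhausts the time budget and returns the error sentinel.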

@aleksei-fedotov (Contributor) left a comment

More remarks to consider.

examples/parallel_for/seismic/main.cpp (outdated, resolved)
Comment on lines 380 to 383
if (0 == _time_intervals.size()) {
std::cout << "No time samples collected \n";
return 0;
}
Contributor:

This is related to reporting and processing of an error. I would decouple these two parts, as the measurements class should mostly be responsible for taking measurements rather than detecting and reporting errors. I think at most we could throw an std::domain_error exception here, or just leave an assert that fires in debug mode checking the vector's size but results in a NaN computation in release, which is okay, I think.

Contributor Author (sarathnandu):

Makes sense; modified to use an assert that will fire in debug mode.
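A sketch of the agreed behaviour, using a hypothetical helper name: the assert fires in debug builds when no samples were collected, while a release (NDEBUG) build of the same code simply produces NaN for an empty sample set, which callers can detect with std::isnan:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: assert in debug, NaN (via 0.0 / 0) in release.
double meanSeconds(const std::vector<double>& samples) {
    assert(!samples.empty() && "No time samples collected");
    double sum = 0.0;
    for (double s : samples)
        sum += s;
    return sum / static_cast<double>(samples.size()); // NaN for empty input in release
}
```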

examples/common/utility/utility.hpp (outdated, resolved)
Comment on lines +390 to +392
return total + std::chrono::duration_cast<std::chrono::microseconds>(
interval.second - interval.first)
.count();
Contributor:

I wonder if it would be better to optimize by computing the std::chrono::duration_cast<std::chrono::microseconds>(interval.second - interval.first).count() only once and reusing the result below. What do you think?

Contributor Author (sarathnandu):

Storing the duration at each time point in a vector and reading it back for accumulation and standard deviation calculations increases memory usage. In contrast, calculating the duration on the fly for these functions is more computationally complex but avoids additional memory overhead. Since this code is not on the critical path, I’m fine with the current approach. However, if you think storing durations in a vector for later access is preferable, I can switch to that.

Contributor:

I don't see any preference myself right now; that's why I asked for your opinion. It seems you are also not sure what the best approach would be here, so I don't think we need to change this part, at least for now. Let's keep it as it is.
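For reference, the on-the-fly approach that was kept reads roughly like this (names are hypothetical; each statistics pass repeats the duration_cast instead of caching microsecond values in a second vector, trading a little recomputation for lower memory use):

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <numeric>
#include <utility>
#include <vector>

using SteadyClock = std::chrono::steady_clock;
using Interval = std::pair<SteadyClock::time_point, SteadyClock::time_point>;

// Hypothetical sketch: relative error as the coefficient of variation
// (standard deviation divided by mean) of the per-frame durations.
double relativeError(const std::vector<Interval>& intervals) {
    auto micros = [](const Interval& iv) {
        return static_cast<double>(std::chrono::duration_cast<std::chrono::microseconds>(
                                       iv.second - iv.first).count());
    };
    const double n = static_cast<double>(intervals.size());
    const double mean = std::accumulate(intervals.begin(), intervals.end(), 0.0,
        [&](double total, const Interval& iv) { return total + micros(iv); }) / n;
    const double variance = std::accumulate(intervals.begin(), intervals.end(), 0.0,
        [&](double total, const Interval& iv) {
            const double d = micros(iv) - mean;
            return total + d * d;
        }) / n;
    return std::sqrt(variance) / mean; // stddev relative to the mean
}
```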

@aleksei-fedotov (Contributor):

> this workflow needs to be implemented in the CI perf runner script to adjust the number of iterations until the relative error is within acceptable limits for the example benchmark.

Don't you think this would be better implemented here instead? At least it won't require restarting the whole benchmark for each new trial of measurements, which might be a relatively complex process for some of them.

@sarathnandu (Contributor Author):

> Don't you think that this better be implemented here instead? At least, it won't require restarting the whole benchmark for each new trial of measurements, which might be relatively complex process for some of them.

We would like this feature to be available as part of the performance CI and to extend it to all benchmarks that are part of the geomean calculation. Integrating this feature into the performance CI keeps benchmark modifications minimal (specify iteration counts and output the relative error directly). For benchmarks where source code access is unavailable, the script can leverage the application runtime to compute the relative error.

@aleksei-fedotov (Contributor) left a comment

I think the patch is mostly done. Consider the last batch of comments, which target improving the reliability of the patch.

examples/common/utility/utility.hpp (resolved)
examples/common/utility/utility.hpp (outdated, resolved)

examples/parallel_for/seismic/main.cpp (outdated, resolved)
examples/parallel_for/seismic/main.cpp (outdated, resolved)
examples/parallel_for/seismic/main.cpp (outdated, resolved)
@aleksei-fedotov (Contributor) left a comment

Looks good to me!

@sarathnandu sarathnandu merged commit 73ccfb5 into master Nov 4, 2024
25 checks passed
@sarathnandu sarathnandu deleted the dev/sarathna/relativeerror branch November 4, 2024 14:10