-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: CoreMark benchmark sample #55461
RFC: CoreMark benchmark sample #55461
Conversation
Coremark is a benchmark that evaluates CPU performance. Signed-off-by: Volodymyr Bondarchuk <[email protected]>
The following west manifest projects have been modified in this Pull Request:
Note: This message is automatically posted and updated by the Manifest GitHub Action. |
a2fa3d9
to
b4233f8
Compare
Coremark porting layer and integration to zephyr build system added. Signed-off-by: Volodymyr Bondarchuk <[email protected]>
b4233f8
to
a196bb3
Compare
@@ -0,0 +1,10 @@ | |||
.. _benchmarks-sample: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you are introducing samples/benchmarks
directory -- this can be potentially misleading because we already have tests/benchmarks
where we have various benchmark tests, and benchmarks are often considered "tests."
My suggestion would be to place the coremark
benchmark application directory under tests/benchmarks
or samples/modules
(since this technically a "sample" exercising the coremark module).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephanosio, I assume, that in future other benchmarks could be added to zephyr.
It to check what eembc has - benchmarks that test different use cases for targets in the IoT topic. Sooner or later some more similar tests could appear here. Better to start group it from the beginning.
And the main purpose of this benchmark to evaluate and test target MCU and system performance.
Is the purpose of tests/benchmarks
is the same? To not confuse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the purpose of
tests/benchmarks
is the same?
More or less, yes.
samples/benchmarks/coremark/boards/nrf5340dk_nrf5340_cpuapp.overlay
Outdated
Show resolved
Hide resolved
samples/benchmarks/coremark/boards/nrf5340dk_nrf5340_cpunet.overlay
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As previously mentioned in the Discord thread, this is arguably better off as a standalone application (like the example-application) since coremark is not really a component that can be reused by applications (i.e. not a library).
But, I can also see the value of having this as an upstream test/sample in order to take advantage of the upstream CI and testing process to ensure that coremark can (ideally) run on all upstream Zephyr boards, since coremark is a quite popular benchmark suite that can be used for various purposes (e.g. Zephyr performance sanity check, performance comparisons among supported Zephyr boards and configurations, etc.).
CoreMark sample added. Sample is used to evaluate CPU performance. Multicore targets supported. Signed-off-by: Volodymyr Bondarchuk <[email protected]>
a196bb3
to
f824411
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! As a matter of fact, I was thinking about adding CoreMark in Zephyr in one or another way myself for a long time and now it's almost there - which is very nice.
But please generalize it so that it becomes completely platform-agnostic and so:
- Usable by anybody on any platform of choice but what's even more important...
- It will be functionally tested in the Zephyr CI, so that we may make sure it works all the time!
config APP_MODE_FLASH_AND_RUN | ||
bool "Run CoreMark benchmark on start up" | ||
help | ||
If enabled, CoreMark will start execution immidietly after CPU start up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
immidietly -> immediately
bool "Run CoreMark benchmark on start up" | ||
help | ||
If enabled, CoreMark will start execution immidietly after CPU start up. | ||
Otherwise, it will wait button to be pressed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a questionable option, especially given by default it will wait for a button which it's not easy to press in remote and automated setups. I might not be well aware of the current state of tests and samples in Zephyr these days, but I would imagine most if not all of them are meant to be run immediately after loading binary onto the target.
And why messing with a button at all? For measuring time? ;)
Let's make is as platform neutral as possible.
|
||
source "share/sysbuild/Kconfig" | ||
|
||
config IMAGE_2_BOARD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is all that complexity needed? Zephyr images are meant to be built for a very particular platform and samples & tests are as platforms agnostic as it could be. I.e. user may select whatever combo of board+test he or she prefers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main idea here to build in one run for all cores in case if target is multicore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That I don't understand really. IMHO all the platform-specific things need to belong to that platform. I guess the same could be done for any other test, right? In that sense for your users it will be beneficial to get that functionality from your Zephyr platform/board. At the same time users of that sample on another platform don't want that complexity. Not to mention that simpler code is much easier to maintain.
The sample supports the following development kits: | ||
|
||
+--------------------------------+-----------+------------------------------------------------+------------------------------------+ | ||
|Hardware platforms |PCA |Board name |Build target | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, given the nature of the CoreMark (it only utilizes CPU core) I would strongly suggest removing all the references to particular boards. At best I would accept a comment "Tested on XXX", but given it is a platform-agnostic benchmark we should have it functionally tested in QEMU on all supported architectures. And then interested users will be able to run it on their platform of choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was planning to write the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will remove, agree
User interface | ||
************** | ||
|
||
Each target CPU has an assigned button and LED that are responsible for triggering the benchmark's start and ``test in progress``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
* ``CONFIG_COREMARK_RUN_TYPE_VALIDATION`` - defines ``CONFIG_COREMARK_DATA_SIZE`` to 500 bytes. | ||
|
||
However, you can specify a custom ``CONFIG_COREMARK_DATA_SIZE`` value. | ||
If you want to register your results, see the Submitting Results chapter in the `CoreMark GitHub`_ repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I doubt it's even in theory possible to register results of CoreMark executed on top of Zephyr RTOS as it's quite unlikely is one of officially supported platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any requirements for the platforms to be registered there.
Platform is not a part of the report.
In the end, coremark results with zephyr toolchain are the same as with regular gcc.
I considered zephyr only as an environment easily to run the test. In single thread configuration there is no difference with baremetal execution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, there is only one requirement: coremark sources shouldn't be changed. The only files allowed to edit/change are core_portme.c/.h. And this requirement is met.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks for the clarification. And I don't argue that running CoreMark on Zephyr might be very useful for different reasons, but due to the fact that Zephyr is very configurable and potentially the thread running CoreMark might be preempted by at least an interrupt handler, it doesn't look very robust environment for HW performance measurements.
Especially if we're talking about such a sensitive benchmark, which is being widely used for marketing purposes, thus must be really the best of the best :)
|
||
CoreMark can be executed in a multithread mode. | ||
To specify a number of threads, use the ``CONFIG_COREMARK_THREADS_NUMBER`` option. | ||
You can configure up to four threads. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only 4 max?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mostly due to a memory usage.
You can setup more if needed and you have enough or RAM.
But each thread requires ~2+K of main stack. And in my setup around 17K RAM is used for configuration with 4 threads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, but there might be a completely different board with more memory and multiple cores as well.
Moreover, why do we need that "multithreaded" mode here especially in such a limited (up to 4 threads max) implementation? I'd suggest we keep that sample as simple and minimalistic as possible and have all the multithreading fun with CoreMark Pro ported to Zephyr RTOS with help of POSIX layer. That way it will be possible to spawn even 100 threads and get them automatically scheduled on all cores of the SMP cluster. That requires though finalization of dynamic stack support for pthread_create()
, see #44727.
Naturally removal of multithreading here will considerably simplify the code here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as i played with this multithreading, there no significant performance drop when executing 4 threads.
Probably, if to increase number of threads, we will see it.
Coremark-pro is aimed to a more advanced MCUs\CPUs. this basic coremark is for simpler targets. How many threads in average MCU app for IoT executes? Not much, really.
And considering this, the sample can demonstrate that there no performance drop when running in multithreading mode for simple apps and motivate user to use zephyr as a base even for simple apps.
|
||
.. zephyr-app-commands:: | ||
:zephyr-app: samples/benchmark/coremark | ||
:board: nrf52840dk_nrf52840 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use QEMU something here so anybody may try it on itw own.
CONFIG_RTT_CONSOLE=y | ||
CONFIG_USE_SEGGER_RTT=y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good not to add RTT as default as a few SoCs do not use it. Can you move this to board specific instead?
#define BUTTON_NODE DT_ALIAS(button) | ||
#define LED_NODE DT_ALIAS(led) | ||
|
||
#if !DT_NODE_HAS_STATUS(BUTTON_NODE, okay) | ||
#error "Unsupported board: /"button/" alias is not defined." | ||
#endif | ||
|
||
#if !DT_NODE_HAS_STATUS(LED_NODE, okay) | ||
#error "Unsupported board: /"led/" alias is not defined." | ||
#endif | ||
|
||
#define BUTTON_LABEL DT_PROP(DT_ALIAS(button), label) | ||
|
||
static K_SEM_DEFINE(start_coremark, 0, 1); | ||
static bool coremark_in_progress; | ||
|
||
static const struct gpio_dt_spec start_button = GPIO_DT_SPEC_GET(BUTTON_NODE, gpios); | ||
static const struct gpio_dt_spec status_led = GPIO_DT_SPEC_GET(LED_NODE, gpios); | ||
|
||
static struct gpio_callback button_cb_data; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of boards do not have button or led embedded. Is there a way to remove all this in case CONFIG_APP_MODE_FLASH_AND_RUN
is used? Then any board would be able to run this benchmark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. Thanks for advice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frankly, I'd suggest to remove that code at all for the sake of simplicity. The point is CoreMark is a benchmark for a CPU and not really dependent on any peripherals. That's why it's so popular - it's super portable. But adding all these bells and whistles we complicate code and limit its usability.
Moreover, why someone really needs LED indication of the test execution? And if there's really some use of that, I'd suggest to move that code to the wrapper code so that all tests could use that functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Buttons and LED is a simple way to control test externally.
There is CM (iterations/sec) and CM/MHz values, and no buttons/leds etc required to evaluate it.
But there is another one value - CM/mA. And to provide this test you must setup some external equipment (amperemeter etc.) and measure current consumption only for test execution time.
In this case you can trigger test start with button and do measurements only while led indicates test in progress
.
From human point of view - button and led. From automation point of view - that is simple gpio signals for triggering the test and tracking test in progress
state.
So, I would prefer to leave it and provide this external control. We can use flash & run as a default configuration and do not use any leds and gpios - so it would be possible to execute it on any board.
while (true) { | ||
k_sem_take(&start_coremark, K_FOREVER); | ||
LOG_INF("Coremark started!"); | ||
LOG_INF("CPU FREQ: %d Hz", SystemCoreClock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SystemCoreClock
is available for ARM devices. Would be good if changed to common Zephyr API to retrieve that information.
@volodymyr-bondarchuk, thanks for bringing this! |
Closing for now, since we do not plan to follow-up. |
Hi guys, |
I guess that's because nobody had plans to keep working on that sample. If you want to revive that work, please do and contribute your changes addressing above comments. |
Hi all. The question left actually do zephyr needs this sample in this form with multicore targets support and triggering by button, or simplified to flash&run mode and single core only. |
I think the simpler the better. Flash&Run makes more sense. |
As I explain in one of the comments above - there shoul be no platform-specific parts in that sample. If you feel that there needs to be some way of indication of test start & stop - that needs to be done in a more generic way so that:
If and when that generalization is done and that sample becomes clean and platform indeendent I'd be happy to have it merged. |
Pull request brings to zephyr the sample, that runs CoreMark benchmark for MCU performance evaluation.
It has: