{"id":3933,"date":"2018-05-16T11:50:08","date_gmt":"2018-05-16T18:50:08","guid":{"rendered":"https:\/\/visualgdb.com\/w\/?p=3933"},"modified":"2018-05-16T11:50:08","modified_gmt":"2018-05-16T18:50:08","slug":"measuring-the-relative-performance-of-the-stm32h7-devices","status":"publish","type":"post","link":"https:\/\/visualgdb.com\/tutorials\/arm\/stm32\/stm32h7\/","title":{"rendered":"Measuring the Relative Performance of the STM32H7 Devices"},"content":{"rendered":"<p>In this tutorial we will create a basic FreeRTOS-based project for the ultra high-speed STM32H7-Nucleo board and will then\u00a0measure the performance of several critical\u00a0paths comparing it to the STM32F4-Discovery and the STM32F7-Nucleo boards. We will\u00a0measure and compare the performance of 4 different\u00a0actions:<\/p>\n<ul>\n<li>The time to compute a sine value of a hardcoded argument (using the hardware floating point).<\/li>\n<li>The time between\u00a0one thread releasing a FreeRTOS semaphore and a higher-priority thread that was waiting for it executing some meaningful code.<\/li>\n<li>The time to sort a list of 100 elements using the std::list::sort() function. 
Although this is not the optimal way to sort small lists on embedded devices, comparing the relative sorting performance gives a basic idea of how each device handles memory-intensive algorithms.<\/li>\n<li>Finally, we will measure the run time of an empty function in order to estimate the overhead of profiling the embedded code.<\/li>\n<\/ul>\n<p>Before you begin, install VisualGDB 5.3 or later and ensure you have the latest version of the STM32 package and OpenOCD.<\/p>\n<ol>\n<li>Start Visual Studio and launch the VisualGDB Embedded Project Wizard:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/01-prjname.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3934\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/01-prjname.png\" alt=\"01-prjname\" width=\"885\" height=\"550\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/01-prjname.png 885w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/01-prjname-300x186.png 300w\" sizes=\"(max-width: 885px) 100vw, 885px\" \/><\/a><\/li>\n<li>Select &#8220;New Project -&gt; Embedded Binary&#8221; and click &#8220;Next&#8221;:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/02-newprj.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3935\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/02-newprj.png\" alt=\"02-newprj\" width=\"856\" height=\"693\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/02-newprj.png 856w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/02-newprj-300x243.png 300w\" sizes=\"(max-width: 856px) 100vw, 856px\" \/><\/a><\/li>\n<li>On the next page, choose an ARM toolchain and select the device you want to target. 
In this tutorial we will use the STM32H7-Nucleo board that features the STM32H743ZI microcontroller:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/03-device.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3936\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/03-device.png\" alt=\"03-device\" width=\"856\" height=\"693\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/03-device.png 856w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/03-device-300x243.png 300w\" sizes=\"(max-width: 856px) 100vw, 856px\" \/><\/a><\/li>\n<li>On the next page select &#8220;STM32CubeMX Samples -&gt; FreeRTOS_ThreadCreation&#8221;:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/04-threads.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3937\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/04-threads.png\" alt=\"04-threads\" width=\"856\" height=\"693\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/04-threads.png 856w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/04-threads-300x243.png 300w\" sizes=\"(max-width: 856px) 100vw, 856px\" \/><\/a><\/li>\n<li>Connect your STM32H7-Nucleo board to your computer via USB:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/board.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3949\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/board.jpg\" alt=\"board\" width=\"800\" height=\"321\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/board.jpg 800w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/board-300x120.jpg 300w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><\/a><\/li>\n<li>VisualGDB should automatically recognize the on-board ST-Link programmer and display it in the &#8220;Debug using&#8221; 
selector:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/05-stlink.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3938\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/05-stlink.png\" alt=\"05-stlink\" width=\"856\" height=\"693\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/05-stlink.png 856w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/05-stlink-300x243.png 300w\" sizes=\"(max-width: 856px) 100vw, 856px\" \/><\/a>Once\u00a0the ST-Link is selected,\u00a0press &#8220;Finish&#8221; to\u00a0create the project.<\/li>\n<li>As of v1.2.0, the STM32H7 SDK from ST contains incorrect clock initialization code in the FreeRTOS example. To fix it, replace the SystemClock_Config() function with the following one taken from the HTTP Server example:\n<pre class=\"\">static void SystemClock_Config(void)\r\n{\r\n    RCC_ClkInitTypeDef RCC_ClkInitStruct;\r\n    RCC_OscInitTypeDef RCC_OscInitStruct;\r\n    HAL_StatusTypeDef ret = HAL_OK;\r\n \r\n    \/*!&lt; Supply configuration update enable *\/\r\n    MODIFY_REG(PWR-&gt;CR3, PWR_CR3_SCUEN, 0);\r\n \r\n    \/* The voltage scaling allows optimizing the power consumption when the device is \r\n    clocked below the maximum system frequency, to update the voltage scaling value \r\n    regarding system frequency refer to product datasheet. 
*\/\r\n    __HAL_PWR_VOLTAGESCALING_CONFIG(PWR_REGULATOR_VOLTAGE_SCALE1);\r\n \r\n    while (!__HAL_PWR_GET_FLAG(PWR_FLAG_VOSRDY)) {}\r\n \r\n    \/* Enable HSE Oscillator and activate PLL with HSE as source *\/\r\n    RCC_OscInitStruct.OscillatorType = RCC_OSCILLATORTYPE_HSE;\r\n    RCC_OscInitStruct.HSEState = RCC_HSE_BYPASS;\r\n    RCC_OscInitStruct.HSIState = RCC_HSI_OFF;\r\n    RCC_OscInitStruct.CSIState = RCC_CSI_OFF;\r\n    RCC_OscInitStruct.PLL.PLLState = RCC_PLL_ON;\r\n    RCC_OscInitStruct.PLL.PLLSource = RCC_PLLSOURCE_HSE;\r\n \r\n    RCC_OscInitStruct.PLL.PLLM = 4;\r\n    RCC_OscInitStruct.PLL.PLLN = 400;\r\n    RCC_OscInitStruct.PLL.PLLP = 2;\r\n    RCC_OscInitStruct.PLL.PLLR = 2;\r\n    RCC_OscInitStruct.PLL.PLLQ = 4;\r\n \r\n    RCC_OscInitStruct.PLL.PLLVCOSEL = RCC_PLL1VCOWIDE;\r\n    RCC_OscInitStruct.PLL.PLLRGE = RCC_PLL1VCIRANGE_2;\r\n    ret = HAL_RCC_OscConfig(&amp;RCC_OscInitStruct);\r\n    if (ret != HAL_OK)\r\n    {\r\n         asm(\"bkpt 255\");\r\n    }\r\n \r\n    \/* Select PLL as system clock source and configure bus clocks dividers *\/\r\n    RCC_ClkInitStruct.ClockType = (RCC_CLOCKTYPE_SYSCLK | RCC_CLOCKTYPE_HCLK | RCC_CLOCKTYPE_D1PCLK1 | RCC_CLOCKTYPE_PCLK1 | \\\r\n    RCC_CLOCKTYPE_PCLK2 | RCC_CLOCKTYPE_D3PCLK1);\r\n \r\n    RCC_ClkInitStruct.SYSCLKSource = RCC_SYSCLKSOURCE_PLLCLK;\r\n    RCC_ClkInitStruct.SYSCLKDivider = RCC_SYSCLK_DIV1;\r\n    RCC_ClkInitStruct.AHBCLKDivider = RCC_HCLK_DIV2;\r\n    RCC_ClkInitStruct.APB3CLKDivider = RCC_APB3_DIV2; \r\n    RCC_ClkInitStruct.APB1CLKDivider = RCC_APB1_DIV2; \r\n    RCC_ClkInitStruct.APB2CLKDivider = RCC_APB2_DIV2; \r\n    RCC_ClkInitStruct.APB4CLKDivider = RCC_APB4_DIV2; \r\n    ret = HAL_RCC_ClockConfig(&amp;RCC_ClkInitStruct, FLASH_LATENCY_4);\r\n    if (ret != HAL_OK)\r\n    {\r\n        asm(\"bkpt 255\");\r\n    }\r\n \r\n    \/* Activate CSI clock, mandatory for I\/O Compensation Cell *\/ \r\n    __HAL_RCC_CSI_ENABLE();\r\n \r\n    \/* Enable SYSCFG clock, mandatory for I\/O 
Compensation Cell *\/\r\n    __HAL_RCC_SYSCFG_CLK_ENABLE();\r\n \r\n    \/* Enables the I\/O Compensation Cell *\/ \r\n    HAL_EnableCompensationCell(); \r\n}<\/pre>\n<\/li>\n<li>Switch the project to the Release configuration and enable both RTOS tracing and function tracing via VisualGDB Project\u00a0Properties -&gt; Dynamic\u00a0Analysis:<br \/>\n<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/06-addref.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3939\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/06-addref.png\" alt=\"06-addref\" width=\"1175\" height=\"753\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/06-addref.png 1175w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/06-addref-300x192.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/06-addref-1024x656.png 1024w\" sizes=\"(max-width: 1175px) 100vw, 1175px\" \/><\/a>Proceed with referencing the profiler framework as VisualGDB suggests.<\/li>\n<li>If you try building the project now, the profiler framework will report\u00a0a missing USE_FREERTOS macro.\u00a0This happens because the sample project we used in this tutorial comes directly from the ST samples and does not contain VisualGDB-specific macros:<br \/>\n<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/07-mising.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3940\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/07-mising.png\" alt=\"07-mising\" width=\"1187\" height=\"724\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/07-mising.png 1187w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/07-mising-300x183.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/07-mising-1024x625.png 1024w\" sizes=\"(max-width: 1187px) 100vw, 1187px\" \/><\/a><\/li>\n<li>Add the USE_FREERTOS macro via VisualGDB 
Project Properties -&gt; MSBuild and the project will build successfully:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/08-macro.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3941\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/08-macro.png\" alt=\"08-macro\" width=\"816\" height=\"499\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/08-macro.png 816w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/08-macro-300x183.png 300w\" sizes=\"(max-width: 816px) 100vw, 816px\" \/><\/a><\/li>\n<li>Finally replace the main() function\u00a0and the 2 thread\u00a0functions with the following code:\n<pre class=\"\">#include \"main.h\"\r\n#include \"cmsis_os.h\"\r\n#include &lt;math.h&gt;\r\n\r\nosThreadId LEDThread1Handle, LEDThread2Handle;\r\nstatic void SenderThread(void const *argument);\r\nstatic void ReceiverThread(void const *argument);\r\nstatic void SystemClock_Config(void);\r\nstatic void CPU_CACHE_Enable(void);\r\n\r\nosSemaphoreDef(s_Semaphore);\r\nosSemaphoreId(s_SemaphoreId);\r\n\r\nint main(void)\r\n{\r\n    \/* Enable the CPU Cache *\/\r\n    CPU_CACHE_Enable();\r\n\r\n    HAL_Init();\r\n    SystemClock_Config();\r\n\r\n    osThreadDef(LED1, SenderThread, osPriorityNormal, 0, configMINIMAL_STACK_SIZE);\r\n    osThreadDef(LED2, ReceiverThread, osPriorityHigh, 0, configMINIMAL_STACK_SIZE);\r\n\r\n    LEDThread1Handle = osThreadCreate(osThread(LED1), NULL);\r\n    LEDThread2Handle = osThreadCreate(osThread(LED2), NULL);\r\n\r\n    s_SemaphoreId = osSemaphoreCreate(osSemaphore(s_Semaphore), 32);\r\n    osKernelStart();\r\n}\r\n\r\nvoid __attribute__((noinline)) TestSinf()\r\n{\r\n    volatile float in = 0.1234F, out;\r\n    for (int i = 0; i &lt; 100; i++)\r\n        out = sinf(in);\r\n}\r\n\r\nvoid __attribute__((noinline)) EmptyFunction()\r\n{\r\n    asm(\"nop\");\r\n}\r\n\r\nstatic void SenderThread(void const *argument)\r\n{\r\n    uint32_t count = 
0;\r\n\r\n    for (;;)\r\n    {\r\n        TestSinf();\r\n \r\n        osSemaphoreRelease(s_SemaphoreId);\r\n        osDelay(100);\r\n    }\r\n}\r\n\r\nstatic void ReceiverThread(void const *argument)\r\n{\r\n    for (;;)\r\n    {\r\n        osSemaphoreWait(s_SemaphoreId, osWaitForever);\r\n        EmptyFunction();\r\n    }\r\n}<\/pre>\n<p>The code above creates 2 threads:<\/p>\n<ul>\n<li>The sender thread calls the TestSinf() function, which calls sinf() 100 times and returns. It then releases a semaphore and sleeps for 100 milliseconds. We will measure the time taken by the TestSinf() function on different devices to do a basic comparison of their floating-point performance.<\/li>\n<li>The higher-priority receiver thread waits on the semaphore released by the sender thread and then immediately calls EmptyFunction(). We will measure the time between the sender&#8217;s call to osSemaphoreRelease() and the subsequent invocation of EmptyFunction() to estimate the FreeRTOS thread switching latency.<\/li>\n<\/ul>\n<\/li>\n<li>Build the code, set a breakpoint inside SenderThread() and start debugging by pressing F5. 
Once the breakpoint hits, open the Debug-&gt;Windows-&gt;Real-time Watch window:<br \/>\n<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/09-led.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3942\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/09-led.png\" alt=\"09-led\" width=\"1187\" height=\"724\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/09-led.png 1187w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/09-led-300x183.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/09-led-1024x625.png 1024w\" sizes=\"(max-width: 1187px) 100vw, 1187px\" \/><\/a><\/li>\n<li>Add both threads (LED1 and LED2) and the TestSinf(), EmptyFunction() and osSemaphoreRelease() functions to real-time watch, then press F5 to continue debugging. VisualGDB will display short bursts of activity every 100\u00a0milliseconds corresponding\u00a0to the thread activity:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/10-run.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3943\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/10-run.png\" alt=\"10-run\" width=\"1187\" height=\"724\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/10-run.png 1187w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/10-run-300x183.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/10-run-1024x625.png 1024w\" sizes=\"(max-width: 1187px) 100vw, 1187px\" \/><\/a><\/li>\n<li>Zoom\u00a0into one of the bursts.\u00a0Hover the mouse over the TestSinf() call to see its run time:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/11-sinf.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3944\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/11-sinf.png\" alt=\"11-sinf\" width=\"1187\" height=\"724\" 
srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/11-sinf.png 1187w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/11-sinf-300x183.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/11-sinf-1024x625.png 1024w\" sizes=\"(max-width: 1187px) 100vw, 1187px\" \/><\/a><\/li>\n<li>Similarly check the runtime of EmptyFunction():<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/12-func.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3945\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/12-func.png\" alt=\"12-func\" width=\"1187\" height=\"724\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/12-func.png 1187w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/12-func-300x183.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/12-func-1024x625.png 1024w\" sizes=\"(max-width: 1187px) 100vw, 1187px\" \/><\/a><\/li>\n<li>Select the time between the call to osSemaphoreRelease and the invocation of EmptyFunction() in the second thread. 
This\u00a0represents the time required for the\u00a0threads to switch (that includes changing the thread state, selecting the next thread to run, etc):<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/13-latency.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3946\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/13-latency.png\" alt=\"13-latency\" width=\"1187\" height=\"724\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/13-latency.png 1187w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/13-latency-300x183.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/13-latency-1024x625.png 1024w\" sizes=\"(max-width: 1187px) 100vw, 1187px\" \/><\/a><\/li>\n<li>Finally add\u00a0a new C++ source file to the project\u00a0with the following contents:\n<pre class=\"\">#include &lt;list&gt;\r\n#include &lt;stdlib.h&gt;\r\n\r\nextern \"C\" void TestList()\r\n{\r\n    std::list&lt;int&gt; lst;\r\n    for (int i = 0; i &lt; 100; i++)\r\n        lst.push_back(rand());\r\n\r\n    lst.sort();\r\n}<\/pre>\n<p>Then call TestList() from main().<\/li>\n<li>Set a breakpoint in TestList() and once it hits, add the list::sort() method to real-time watch:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/14-sort.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3947\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/14-sort.png\" alt=\"14-sort\" width=\"1187\" height=\"724\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/14-sort.png 1187w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/14-sort-300x183.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/14-sort-1024x625.png 1024w\" sizes=\"(max-width: 1187px) 100vw, 1187px\" \/><\/a><\/li>\n<li>Once the sort method\u00a0finishes running, check its runtime in Real-time watch:<a 
href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/14-sorttime.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3948\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/14-sorttime.png\" alt=\"14-sorttime\" width=\"1187\" height=\"724\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/14-sorttime.png 1187w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/14-sorttime-300x183.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2018\/05\/14-sorttime-1024x625.png 1024w\" sizes=\"(max-width: 1187px) 100vw, 1187px\" \/><\/a><\/li>\n<li>You can repeat those measurements on other boards to compare their relative performance. Below are the figures we measured:<br \/>\n<table>\n<tbody>\n<tr>\n<td>Device<\/td>\n<td>SystemCoreClock<\/td>\n<td>100x sinf()<\/td>\n<td>Empty function<\/td>\n<td>Thread switching latency<\/td>\n<td>list&lt;int&gt;::sort() of 100 elements<\/td>\n<\/tr>\n<tr>\n<td>STM32H743ZI<\/td>\n<td>400 MHz<\/td>\n<td>32 \u00b5s (12.6K cycles)<\/td>\n<td>232 ns (93 cycles)<\/td>\n<td>5.9 \u00b5s (2359 cycles)<\/td>\n<td>104 \u00b5s (41.5K cycles)<\/td>\n<\/tr>\n<tr>\n<td>STM32F746NG<\/td>\n<td>216 MHz<\/td>\n<td>69 \u00b5s (13K cycles)<\/td>\n<td>741 ns (160 cycles)<\/td>\n<td>9.3 \u00b5s (2010 cycles)<\/td>\n<td>192 \u00b5s (41.5K cycles)<\/td>\n<\/tr>\n<tr>\n<td>STM32F407VG<\/td>\n<td>168 MHz<\/td>\n<td>105 \u00b5s (17.6K cycles)<\/td>\n<td>756 ns (127 cycles)<\/td>\n<td>17 \u00b5s (2876 cycles)<\/td>\n<td>431 \u00b5s (72.8K cycles)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The high-speed STM32H743ZI device shows a significant improvement over the older STM32F407VG device in terms of clock cycles due to its much more capable ARM Cortex-M7 core (compared to Cortex-M4). 
Combined with a 2.4x increase of the system clock, this yields a 3x-4x speedup.<br \/>\nCompared to the same-core STM32F746NG device, the performance in terms of cycles stays roughly the same, although the 1.85x clock frequency increase still provides a consistent performance boost in all tests.<\/li>\n<li>If you would like to compare the performance of your code on different devices, you can either use the real-time watch mechanism shown in this tutorial, or try the zero-overhead <a href=\"https:\/\/visualgdb.com\/tutorials\/arm\/chronometer\/\">Chronometer<\/a> feature that measures the clock cycle counts between breakpoints, steps and other debug events.<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>In this tutorial we will create a basic FreeRTOS-based project for the ultra high-speed STM32H7-Nucleo board and will then\u00a0measure the<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[27],"tags":[53,61,163],"_links":{"self":[{"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/posts\/3933"}],"collection":[{"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/comments?post=3933"}],"version-history":[{"count":4,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/posts\/3933\/revisions"}],"predecessor-version":[{"id":3988,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/posts\/3933\/revisions\/3988"}],"wp:attachment":[{"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/media?parent=3933"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v
2\/categories?post=3933"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/tags?post=3933"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}