{"id":2518,"date":"2017-02-01T21:27:18","date_gmt":"2017-02-02T05:27:18","guid":{"rendered":"https:\/\/visualgdb.com\/w\/?p=2518"},"modified":"2017-02-01T21:27:18","modified_gmt":"2017-02-02T05:27:18","slug":"using-embedded-profiler-on-platforms-with-no-cycle-counter","status":"publish","type":"post","link":"https:\/\/visualgdb.com\/tutorials\/profiler\/realtime\/cyclecnt\/","title":{"rendered":"Using Embedded Profiler on Platforms with no Cycle Counter"},"content":{"rendered":"<p>This tutorial shows how to use the VisualGDB Instrumenting Profiler and Real-time Watch on devices that do not support debug instruction count registers. We will create a basic project for the STM32F7Discovery board, try to measure function run time using the default real-time watch configuration and show how to fix the problems that will arise.<\/p>\n<p>Before you begin, install VisualGDB 5.2R8 or later.<\/p>\n<ol>\n<li>Start Visual Studio and open the VisualGDB Embedded Project Wizard:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/01-prjname.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2519\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/01-prjname.png\" alt=\"01-prjname\" width=\"801\" height=\"553\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/01-prjname.png 801w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/01-prjname-300x207.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/01-prjname-392x272.png 392w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/01-prjname-130x90.png 130w\" sizes=\"(max-width: 801px) 100vw, 801px\" \/><\/a><\/li>\n<li>Proceed with creating the normal application project:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/02-defaultprj.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2520\" 
src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/02-defaultprj.png\" alt=\"02-defaultprj\" width=\"822\" height=\"642\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/02-defaultprj.png 822w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/02-defaultprj-300x234.png 300w\" sizes=\"(max-width: 822px) 100vw, 822px\" \/><\/a><\/li>\n<li>Select the ARM toolchain and your device. In this tutorial we will use the STM32F7-Discovery board that comes with the STM32F746NG chip:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/03-device.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2521\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/03-device.png\" alt=\"03-device\" width=\"822\" height=\"642\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/03-device.png 822w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/03-device-300x234.png 300w\" sizes=\"(max-width: 822px) 100vw, 822px\" \/><\/a><\/li>\n<li>Select the regular LEDBlink sample:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/04-sample.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2522\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/04-sample.png\" alt=\"04-sample\" width=\"822\" height=\"642\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/04-sample.png 822w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/04-sample-300x234.png 300w\" sizes=\"(max-width: 822px) 100vw, 822px\" \/><\/a><\/li>\n<li>Finally select the debug method that works with your board. 
In this example we will use OpenOCD with the on-board ST-Link:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/05-debug.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2523\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/05-debug.png\" alt=\"05-debug\" width=\"822\" height=\"642\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/05-debug.png 822w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/05-debug-300x234.png 300w\" sizes=\"(max-width: 822px) 100vw, 822px\" \/><\/a><\/li>\n<li>Press &#8220;Finish&#8221; to create the project. Then open VisualGDB Project Properties, go to the Dynamic Analysis page and enable tracing of function calls:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/06-realtime.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2524\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/06-realtime.png\" alt=\"06-realtime\" width=\"786\" height=\"594\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/06-realtime.png 786w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/06-realtime-300x227.png 300w\" sizes=\"(max-width: 786px) 100vw, 786px\" \/><\/a>Don&#8217;t forget to click &#8220;Add reference automatically&#8221; to automatically add and configure the profiler framework.<\/li>\n<li>Start debugging your program and add &#8220;HAL_Delay&#8221; to real-time watch:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/07-badtimings.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2525\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/07-badtimings.png\" alt=\"07-badtimings\" width=\"1019\" height=\"700\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/07-badtimings.png 1019w, 
https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/07-badtimings-300x206.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/07-badtimings-130x90.png 130w\" sizes=\"(max-width: 1019px) 100vw, 1019px\" \/><\/a><\/li>\n<li>You will see that the real-time watch window is empty. This happens because the default implementation of the instrumenting profiler and real-time watch relies on the debug cycle count register (DWT_CYCCNT), which is not supported on STM32F7 and hence always reads as zero:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/zerotime.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2532\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/zerotime.png\" alt=\"zerotime\" width=\"1019\" height=\"700\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/zerotime.png 1019w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/zerotime-300x206.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/zerotime-130x90.png 130w\" sizes=\"(max-width: 1019px) 100vw, 1019px\" \/><\/a><\/li>\n<li>We will now replace the original function used to query the cycle counter with a custom one that will use the STM32 timers. 
First, open VisualGDB Project Properties and enable the &#8220;Use custom performance counter function&#8221; checkbox on the Embedded Frameworks page:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/08-driver.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2526\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/08-driver.png\" alt=\"08-driver\" width=\"961\" height=\"743\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/08-driver.png 961w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/08-driver-300x232.png 300w\" sizes=\"(max-width: 961px) 100vw, 961px\" \/><\/a><\/li>\n<li>If you try building your project now, the build will report that the SysprogsInstrumentingProfiler_QueryAndResetPerformanceCounter() function is missing:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/09-missing.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2527\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/09-missing.png\" alt=\"09-missing\" width=\"1019\" height=\"700\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/09-missing.png 1019w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/09-missing-300x206.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/09-missing-130x90.png 130w\" sizes=\"(max-width: 1019px) 100vw, 1019px\" \/><\/a><\/li>\n<li>Add the following code to your main source file and call StartDelayCountingTimer() after InitializeInstrumentingProfiler():\n<pre class=\"\">static TIM_HandleTypeDef s_TimerInstance = { \r\n\u00a0\u00a0\u00a0 .Instance = TIM3\r\n};\r\n\r\n\/\/ Overflow count; volatile because it is modified in the interrupt handler\r\nstatic volatile unsigned g_TimerCounter;\r\n\r\nextern \"C\" void TIM3_IRQHandler()\r\n{\r\n\u00a0\u00a0\u00a0 HAL_TIM_IRQHandler(&amp;s_TimerInstance);\r\n\u00a0\u00a0\u00a0 g_TimerCounter++;\r\n}\r\n\r\nvoid StartDelayCountingTimer()\r\n{\r\n\u00a0\u00a0\u00a0 
__TIM3_CLK_ENABLE();\r\n\u00a0\u00a0\u00a0 s_TimerInstance.Init.Prescaler = 1;\r\n\u00a0\u00a0\u00a0 s_TimerInstance.Init.CounterMode = TIM_COUNTERMODE_UP;\r\n\u00a0\u00a0\u00a0 s_TimerInstance.Init.Period = 0xFFFF;\r\n\u00a0\u00a0\u00a0 s_TimerInstance.Init.ClockDivision = TIM_CLOCKDIVISION_DIV1;\r\n\u00a0\u00a0\u00a0 s_TimerInstance.Init.RepetitionCounter = 0;\r\n\u00a0\u00a0\u00a0 HAL_TIM_Base_Init(&amp;s_TimerInstance);\r\n\u00a0\u00a0\u00a0 HAL_TIM_Base_Start_IT(&amp;s_TimerInstance);\r\n\u00a0\u00a0\u00a0 HAL_NVIC_EnableIRQ(TIM3_IRQn);\r\n\u00a0\u00a0\u00a0 HAL_NVIC_SetPriority(TIM3_IRQn, 7, 0);\r\n}\r\n\r\nunsigned long long SysprogsInstrumentingProfiler_ReadTimerValue()\r\n{\r\n\u00a0\u00a0\u00a0 int primask = __get_PRIMASK();\r\n\u00a0\u00a0\u00a0 __set_PRIMASK(1);\r\n\u00a0\u00a0\u00a0 unsigned lowWord = __HAL_TIM_GET_COUNTER(&amp;s_TimerInstance);\r\n\u00a0\u00a0\u00a0 unsigned highWord;\r\n\u00a0\u00a0\u00a0 if (lowWord &lt; 1024)\r\n\u00a0\u00a0\u00a0 {\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 highWord = g_TimerCounter;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 if (HAL_NVIC_GetPendingIRQ(TIM3_IRQn))\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 highWord++;\u00a0\u00a0 \u00a0\r\n\u00a0\u00a0\u00a0 }\r\n\u00a0\u00a0\u00a0 else\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 highWord = g_TimerCounter;\r\n\u00a0\u00a0\u00a0 __set_PRIMASK(primask);\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \u00a0\r\n\u00a0\u00a0\u00a0 return (((unsigned long long)highWord) &lt;&lt; 16) | lowWord;\r\n}\r\n\r\nextern \"C\" unsigned SysprogsInstrumentingProfiler_QueryAndResetPerformanceCounter()\r\n{\r\n\u00a0\u00a0\u00a0 static unsigned long long s_PrevValue;\r\n\u00a0\u00a0\u00a0 unsigned long long value = SysprogsInstrumentingProfiler_ReadTimerValue();\r\n\u00a0\u00a0\u00a0 unsigned long long elapsed = value - s_PrevValue;\r\n\u00a0\u00a0\u00a0 s_PrevValue = value;\r\n\u00a0\u00a0\u00a0 if (elapsed &gt; 
UINT32_MAX)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 return UINT32_MAX;\r\n\u00a0\u00a0\u00a0 else\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 return (unsigned)elapsed;\r\n}<\/pre>\n<p>The StartDelayCountingTimer() function configures TIM3 to run at half the system clock speed. As the hardware counter is only 16 bits wide, the TIM3_IRQHandler() function increments g_TimerCounter each time the counter overflows in order to keep track of the global time. The SysprogsInstrumentingProfiler_ReadTimerValue() function reads the TIM3 counter value and combines it with g_TimerCounter to get a 48-bit value. Finally, SysprogsInstrumentingProfiler_QueryAndResetPerformanceCounter(), which is called by the profiler, returns the number of ticks that have passed since its previous call.<\/li>\n<li>Now you can press F5 to start debugging. The Real-time watch should now work:<a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/10-period.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2528\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/10-period.png\" alt=\"10-period\" width=\"1019\" height=\"700\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/10-period.png 1019w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/10-period-300x206.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/10-period-130x90.png 130w\" sizes=\"(max-width: 1019px) 100vw, 1019px\" \/><\/a>If it does not work, ensure you call StartDelayCountingTimer() after InitializeInstrumentingProfiler().<\/li>\n<li>In this example the time intervals shown in real-time watch will be half as long as the actual intervals (e.g. 250ms for a 500ms wait). This happens because the timer runs at half of the system clock. 
You can report the correct number of ticks per second to VisualGDB by calling the ReportTicksPerSecond() function after starting the timer:\n<pre class=\"\">ReportTicksPerSecond(SystemCoreClock \/ 2);<\/pre>\n<p><a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/11-fixperiod.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2529\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/11-fixperiod.png\" alt=\"11-fixperiod\" width=\"1019\" height=\"700\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/11-fixperiod.png 1019w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/11-fixperiod-300x206.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/11-fixperiod-130x90.png 130w\" sizes=\"(max-width: 1019px) 100vw, 1019px\" \/><\/a><\/li>\n<li>To reduce the overhead caused by the TIM3_IRQHandler() function, you can change it to clear the interrupt flag without calling the HAL functions:\n<pre class=\"\">extern \"C\" void TIM3_IRQHandler()\r\n{\r\n\u00a0\u00a0\u00a0 __HAL_TIM_CLEAR_IT(&amp;s_TimerInstance, TIM_IT_UPDATE);\r\n\u00a0\u00a0\u00a0 g_TimerCounter++;\r\n}<\/pre>\n<\/li>\n<li>To further reduce the overhead, you can mark it as non-instrumentable via VisualGDB Project Properties: <a href=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/12-noprofile.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2530\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/12-noprofile.png\" alt=\"12-noprofile\" width=\"1009\" height=\"788\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/12-noprofile.png 1009w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2017\/02\/12-noprofile-300x234.png 300w\" sizes=\"(max-width: 1009px) 100vw, 1009px\" \/><\/a><\/li>\n<li>If the overhead caused by the new SysprogsInstrumentingProfiler_QueryAndResetPerformanceCounter() function is still too high, you can 
limit it to only reading the 16-bit counter and disable the timer interrupt:\n<pre class=\"\">extern \"C\" unsigned SysprogsInstrumentingProfiler_QueryAndResetPerformanceCounter()\r\n{\r\n\u00a0\u00a0\u00a0 unsigned value = __HAL_TIM_GET_COUNTER(&amp;s_TimerInstance);\r\n\u00a0\u00a0\u00a0 __HAL_TIM_SET_COUNTER(&amp;s_TimerInstance, 0);\r\n\u00a0\u00a0\u00a0 return value;\r\n}<\/pre>\n<p>Note that this will break the reported timings if the 16-bit timer overflows between two consecutive events. You can counteract this by raising the prescaler value so that the timer counts more slowly, and adjusting the value passed to ReportTicksPerSecond() to match:<\/p>\n<pre class=\"\">\u00a0\u00a0\u00a0 s_TimerInstance.Init.Prescaler = 15; \/\/ example value only: divides the timer clock by 16<\/pre>\n<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>This tutorial shows how to use the VisualGDB Instrumenting Profiler and Real-time Watch on devices that do not support debug<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[107],"tags":[],"_links":{"self":[{"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/posts\/2518"}],"collection":[{"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/comments?post=2518"}],"version-history":[{"count":1,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/posts\/2518\/revisions"}],"predecessor-version":[{"id":2533,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/posts\/2518\/revisions\/2533"}],"wp:attachment":[{"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/media?parent=2518"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/categories?post=2518"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/visualgdb.com\/w\
/wp-json\/wp\/v2\/tags?post=2518"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}