{"id":2209,"date":"2017-01-31T20:56:55","date_gmt":"2017-02-01T04:56:55","guid":{"rendered":"http:\/\/visualgdb.com\/w\/?p=2209"},"modified":"2017-01-31T20:56:55","modified_gmt":"2017-02-01T04:56:55","slug":"using-visualgdb-freertos-tracing-to-optimize-real-time-code","status":"publish","type":"post","link":"https:\/\/visualgdb.com\/tutorials\/profiler\/realtime\/freertos\/","title":{"rendered":"Using VisualGDB FreeRTOS Tracing to Optimize Real-time Code"},"content":{"rendered":"<p>This tutorial shows how to use the FreeRTOS tracing feature of VisualGDB to optimize a simple FreeRTOS UART driver based on queues. We will create a basic project using queues to buffer the incoming and outgoing UART data, measure the delays in various components of our setup and show how to optimize them.<\/p>\n<p>Before you begin, install VisualGDB 5.2 or later.<\/p>\n<ol>\n<li>Ensure that you are using a board that can communicate with your computer via the COM port. We will use the STM32F410-Nucleo board that has an on-board ST-Link 2.1 that provides a virtual COM port connected to UART2 on the device, however the steps below will work for any other USB-to-UART bridge as long as it is connected to the device. 
Plug in the board and take a note of the COM port number in Device Manager:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/01-devmgr.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2210\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/01-devmgr.png\" alt=\"01-devmgr\" width=\"657\" height=\"441\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/01-devmgr.png 657w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/01-devmgr-300x201.png 300w\" sizes=\"(max-width: 657px) 100vw, 657px\" \/><\/a><\/li>\n<li>Start Visual Studio and open the VisualGDB Embedded Project Wizard:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/02-prjname.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2211\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/02-prjname.png\" alt=\"02-prjname\" width=\"935\" height=\"506\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/02-prjname.png 935w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/02-prjname-300x162.png 300w\" sizes=\"(max-width: 935px) 100vw, 935px\" \/><\/a><\/li>\n<li>Select &#8220;Create a new project -&gt; Embedded Binary&#8221;:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/03-device3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2213\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/03-msbuild.png\" alt=\"03-msbuild\" width=\"738\" height=\"565\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/03-msbuild.png 738w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/03-msbuild-300x230.png 300w\" sizes=\"(max-width: 738px) 100vw, 738px\" \/><\/a><\/li>\n<li>Select the ARM toolchain and your device. 
We will use the STM32F410RB device installed on the STM32F410-Nucleo board:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/03-device3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2212\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/03-device3.png\" alt=\"03-device\" width=\"738\" height=\"601\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/03-device3.png 738w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/03-device3-300x244.png 300w\" sizes=\"(max-width: 738px) 100vw, 738px\" \/><\/a><\/li>\n<li>Select the &#8220;LEDBlink (FreeRTOS)&#8221; sample and ensure that the FreeRTOS CPU core matches your CPU core and the floating point setting on the previous page: <a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/04-rtos.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2214\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/04-rtos.png\" alt=\"04-rtos\" width=\"738\" height=\"601\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/04-rtos.png 738w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/04-rtos-300x244.png 300w\" sizes=\"(max-width: 738px) 100vw, 738px\" \/><\/a><\/li>\n<li>Select OpenOCD as the debug method and click &#8220;Detect&#8221; to automatically detect the rest of the settings:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/05-debug3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2215\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/05-debug3.png\" alt=\"05-debug\" width=\"738\" height=\"601\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/05-debug3.png 738w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/05-debug3-300x244.png 300w\" sizes=\"(max-width: 738px) 100vw, 738px\" \/><\/a><\/li>\n<li>Press &#8220;Finish&#8221; to generate 
the project. The first thing we will do is test that the UART connection is working. Change the main source file extension to <strong>.cpp<\/strong> and replace its contents with the following code:\n<pre class=\"\">#include &lt;stm32f4xx_hal.h&gt;\r\n\r\n#ifdef __cplusplus\r\nextern \"C\"\r\n#endif\r\nvoid SysTick_Handler(void)\r\n{\r\n\u00a0\u00a0\u00a0 HAL_IncTick();\r\n\u00a0\u00a0\u00a0 HAL_SYSTICK_IRQHandler();\r\n}\r\n\r\nUART_HandleTypeDef g_UART;\r\n\r\nextern \"C\" void USART2_IRQHandler()\r\n{\r\n\u00a0\u00a0\u00a0 HAL_UART_IRQHandler(&amp;g_UART);\r\n}\r\n\r\nint main(void)\r\n{\r\n\u00a0\u00a0\u00a0 HAL_Init();\r\n\u00a0\u00a0 \u00a0\r\n\u00a0\u00a0\u00a0 __USART2_CLK_ENABLE();\r\n\u00a0\u00a0\u00a0 __GPIOA_CLK_ENABLE();\r\n\u00a0\u00a0 \u00a0\r\n\u00a0\u00a0\u00a0 g_UART.Instance\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 = USART2;\r\n\u00a0\u00a0\u00a0 g_UART.Init.BaudRate\u00a0\u00a0 = 115200;\r\n\u00a0\u00a0\u00a0 g_UART.Init.WordLength = UART_WORDLENGTH_8B;\r\n\u00a0\u00a0\u00a0 g_UART.Init.StopBits\u00a0\u00a0 = UART_STOPBITS_1;\r\n\u00a0\u00a0\u00a0 g_UART.Init.Parity\u00a0\u00a0\u00a0\u00a0 = UART_PARITY_NONE;\r\n\u00a0\u00a0\u00a0 g_UART.Init.HwFlowCtl\u00a0 = UART_HWCONTROL_NONE;\r\n\u00a0\u00a0\u00a0 g_UART.Init.Mode\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 = UART_MODE_TX_RX;\r\n\u00a0\u00a0\u00a0 if (HAL_UART_Init(&amp;g_UART) != HAL_OK)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 asm(\"bkpt 255\");\r\n\u00a0\u00a0 \u00a0\r\n\u00a0\u00a0\u00a0 GPIO_InitTypeDef\u00a0 GPIO_InitStruct;\r\n\u00a0 \r\n\u00a0\u00a0\u00a0 GPIO_InitStruct.Pin\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 = GPIO_PIN_3;\r\n\u00a0\u00a0\u00a0 GPIO_InitStruct.Mode\u00a0\u00a0\u00a0\u00a0\u00a0 = GPIO_MODE_AF_PP;\r\n\u00a0\u00a0\u00a0 GPIO_InitStruct.Pull\u00a0\u00a0\u00a0\u00a0\u00a0 = GPIO_PULLUP;\r\n\u00a0\u00a0\u00a0 GPIO_InitStruct.Speed\u00a0\u00a0\u00a0\u00a0 = GPIO_SPEED_HIGH;\r\n\u00a0\u00a0\u00a0 GPIO_InitStruct.Alternate = GPIO_AF7_USART2;\r\n\r\n\u00a0\u00a0\u00a0 
HAL_GPIO_Init(GPIOA, &amp;GPIO_InitStruct);\r\n\r\n\u00a0\u00a0\u00a0 GPIO_InitStruct.Pin = GPIO_PIN_2;\r\n\u00a0\u00a0\u00a0 GPIO_InitStruct.Alternate = GPIO_AF7_USART2;\r\n\r\n\u00a0\u00a0\u00a0 HAL_GPIO_Init(GPIOA, &amp;GPIO_InitStruct); \r\n\r\n\u00a0\u00a0\u00a0 for (;;)\r\n\u00a0\u00a0\u00a0 {\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 uint8_t tmp;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 HAL_UART_Receive(&amp;g_UART, &amp;tmp, 1, HAL_MAX_DELAY);\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 tmp++;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 HAL_UART_Transmit(&amp;g_UART, &amp;tmp, 1, HAL_MAX_DELAY);\r\n\u00a0\u00a0\u00a0 }\r\n}<\/pre>\n<p>If you are using a different board, you will need to find out which UART is connected to the COM port you are using and change the following parts of the code:<\/p>\n<table>\n<tbody>\n<tr>\n<td>Original value<\/td>\n<td>New value<\/td>\n<\/tr>\n<tr>\n<td>__USART2_CLK_ENABLE()<\/td>\n<td>__USARTx_CLK_ENABLE() where x is your USART number<\/td>\n<\/tr>\n<tr>\n<td>USART2<\/td>\n<td>USARTx where x is your USART number<\/td>\n<\/tr>\n<tr>\n<td>__GPIOA_CLK_ENABLE()<\/td>\n<td>__GPIOx_CLK_ENABLE() where x is the GPIO port that has your USART pins<\/td>\n<\/tr>\n<tr>\n<td>GPIO_PIN_3 and GPIO_PIN_2<\/td>\n<td>GPIO pins that correspond to the RX and TX signals for your USART<\/td>\n<\/tr>\n<tr>\n<td>GPIO_AF7_USART2<\/td>\n<td>Alternate function numbers that switch the RX and TX pins to USART mode<\/td>\n<\/tr>\n<tr>\n<td>GPIOA<\/td>\n<td>GPIOx where x is the GPIO port that has your USART pins<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/li>\n<li>Build the project and ensure it succeeds:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/06-loop.png\"> <img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2216\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/06-loop.png\" alt=\"06-loop\" width=\"954\" height=\"644\" 
srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/06-loop.png 954w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/06-loop-300x203.png 300w\" sizes=\"(max-width: 954px) 100vw, 954px\" \/><\/a><\/li>\n<li>Open VisualGDB Project Properties and enable the raw terminal on the COM port connected to your device:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/07-terminal.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2217\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/07-terminal.png\" alt=\"07-terminal\" width=\"995\" height=\"755\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/07-terminal.png 995w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/07-terminal-300x228.png 300w\" sizes=\"(max-width: 995px) 100vw, 995px\" \/><\/a><\/li>\n<li>Press F5 to start the program. Try typing some characters in the COM port window in Visual Studio to see that they are echoed back with 1 added to them:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/08-echo.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2218\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/08-echo.png\" alt=\"08-echo\" width=\"954\" height=\"644\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/08-echo.png 954w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/08-echo-300x203.png 300w\" sizes=\"(max-width: 954px) 100vw, 954px\" \/><\/a><\/li>\n<li>Now we will modify our code to use FreeRTOS. 
We will create 2 threads:\n<ol>\n<li>The first thread will read numbers from the COM port and sleep for the number of milliseconds specified by each number<\/li>\n<li>The second thread will send the messages generated by the first thread character-by-character while the first one is sleeping<\/li>\n<\/ol>\n<p>First, modify SysTick_Handler() to call the handler from FreeRTOS when the scheduler is active:<\/p>\n<pre class=\"\">#include &lt;FreeRTOS.h&gt;\r\n#include &lt;task.h&gt;\r\n#include &lt;queue.h&gt;\r\n#include &lt;semphr.h&gt;\r\n#include &lt;stdarg.h&gt;\r\n#include &lt;stdio.h&gt;\r\n\r\nextern \"C\" void xPortSysTickHandler(void);\r\n\r\nextern \"C\" void SysTick_Handler(void)\r\n{\r\n\u00a0\u00a0\u00a0 HAL_IncTick();\r\n\u00a0\u00a0\u00a0 HAL_SYSTICK_IRQHandler();\r\n\u00a0\u00a0\u00a0 if (xTaskGetSchedulerState() != taskSCHEDULER_NOT_STARTED)\r\n\u00a0\u00a0\u00a0 {\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 xPortSysTickHandler();\r\n\u00a0\u00a0\u00a0 }\r\n}<\/pre>\n<p>Then declare 2 queues that will be used to store sent and received data, together with a one-byte receive buffer and the semaphore that signals when the UART is ready to send:<\/p>\n<pre class=\"\">QueueHandle_t g_InQueue, g_OutQueue;\r\nstatic uint8_t s_ReceivedByte;\r\nSemaphoreHandle_t g_SendReadySemaphore;<\/pre>\n<p>Then add a main thread that will read characters from g_InQueue, interpret them as numbers, wait and print status messages via UART_Printf():<\/p>\n<pre class=\"\">void MainThread(void *)\r\n{\r\n\u00a0\u00a0\u00a0 HAL_UART_Receive_IT(&amp;g_UART, &amp;s_ReceivedByte, 1);\r\n\u00a0\u00a0\u00a0 NVIC_SetPriority(USART2_IRQn, configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY);\r\n\u00a0\u00a0\u00a0 NVIC_EnableIRQ(USART2_IRQn);\r\n\u00a0\u00a0 \u00a0\r\n\u00a0\u00a0\u00a0 for (;;)\r\n\u00a0\u00a0\u00a0 {\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 int value = 0;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 for (;;)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 {\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 char 
buf;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 asm(\"nop\");\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 while (xQueueReceive(g_InQueue, &amp;buf, portMAX_DELAY) != pdPASS)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 {\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 asm(\"nop\");\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 }\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \u00a0\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 if (buf &gt;= '0' &amp;&amp; buf &lt;= '9')\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 {\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 value *= 10;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 value += (buf - '0');\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 }\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 else\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 break;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 }\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \u00a0\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 if (!value)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 continue;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \u00a0\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 UART_Printf(\"WAIT %d\\r\\n\", value);\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 vTaskDelay(value \/ portTICK_PERIOD_MS);\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 UART_Printf(\"done\\r\\n\");\r\n\u00a0\u00a0\u00a0 }\r\n}<\/pre>\n<p>Add a UART_Printf() function that will send the output to g_OutQueue:<\/p>\n<pre class=\"\">void UART_Printf(const char *pFormat, ...)\r\n{\r\n\u00a0\u00a0\u00a0 char buffer[128];\r\n\u00a0\u00a0\u00a0 va_list lst;\r\n\u00a0\u00a0\u00a0 
va_start(lst, pFormat);\r\n\u00a0\u00a0\u00a0 vsnprintf(buffer, sizeof(buffer), pFormat, lst);\r\n\u00a0\u00a0\u00a0 va_end(lst);\r\n\u00a0\u00a0\u00a0 for (int i = 0; i &lt; sizeof(buffer) &amp;&amp; buffer[i]; i++)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 xQueueSend(g_OutQueue, &amp;buffer[i], portMAX_DELAY);\r\n}<\/pre>\n<p>Add a handler for the &#8220;UART character received&#8221; event that will put the received byte to g_InQueue:<\/p>\n<pre class=\"\">extern \"C\" void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)\r\n{\r\n\u00a0\u00a0\u00a0 BaseType_t higherPriorityTaskWoken = pdFALSE;\r\n\u00a0\u00a0\u00a0 if (xQueueSendFromISR(g_InQueue, &amp;s_ReceivedByte, &amp;higherPriorityTaskWoken) != pdPASS)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 asm(\"bkpt 255\");\r\n\u00a0\u00a0\u00a0 if (HAL_UART_Receive_IT(&amp;g_UART, &amp;s_ReceivedByte, 1) != HAL_OK)\r\n\u00a0\u00a0\u00a0 {\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 asm(\"bkpt 255\");\r\n\u00a0\u00a0\u00a0 }\r\n}<\/pre>\n<p>Add a similar handler to the &#8220;UART character sent&#8221; event that will notify the sending thread by posting an event to g_SendReadySemaphore:<\/p>\n<pre class=\"\">extern \"C\" void HAL_UART_TxCpltCallback(UART_HandleTypeDef *huart)\r\n{\r\n\u00a0\u00a0\u00a0 BaseType_t higherPriorityTaskWoken = pdFALSE;\r\n\u00a0\u00a0\u00a0 xSemaphoreGiveFromISR(g_SendReadySemaphore, &amp;higherPriorityTaskWoken);\r\n}<\/pre>\n<p>Add the sending thread that will wait until the UART hardware is ready (signaled through g_SendReadySemaphore) and then send the next byte queued in g_OutQueue:<\/p>\n<pre class=\"\">void SendThread(void *)\r\n{\r\n\u00a0\u00a0\u00a0 for (;;)\r\n\u00a0\u00a0\u00a0 {\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 uint8_t buf;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 xSemaphoreTake(g_SendReadySemaphore, portMAX_DELAY);\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 xQueueReceive(g_OutQueue, &amp;buf, 
portMAX_DELAY);\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 HAL_UART_Transmit_IT(&amp;g_UART, &amp;buf, 1);\r\n\u00a0\u00a0\u00a0 }\r\n}<\/pre>\n<p>Finally, put this all together by creating the threads, queues and the semaphore in the main() function after the last HAL_GPIO_Init():<\/p>\n<pre class=\"\">\u00a0\u00a0\u00a0 TaskHandle_t mainTask, sendTask;\r\n\u00a0\u00a0\u00a0 xTaskCreate(MainThread, \"main\", 1024, 0, tskIDLE_PRIORITY + 1, &amp;mainTask);\r\n\u00a0\u00a0\u00a0 xTaskCreate(SendThread, \"send\", 1024, 0, tskIDLE_PRIORITY + 1, &amp;sendTask);\r\n\u00a0\u00a0\u00a0 g_InQueue = xQueueCreate(1024, 1);\r\n\u00a0\u00a0\u00a0 g_OutQueue = xQueueCreate(1024, 1);\r\n\u00a0\u00a0\u00a0 g_SendReadySemaphore = xSemaphoreCreateCounting(1000, 1);\r\n\u00a0\u00a0\u00a0 vTaskStartScheduler();<\/pre>\n<\/li>\n<li>Build your program, run it and try entering small numbers to get the main thread to sleep for some milliseconds. Ensure that the program responds properly:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/09-run.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2219\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/09-run.png\" alt=\"09-run\" width=\"954\" height=\"644\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/09-run.png 954w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/09-run-300x203.png 300w\" sizes=\"(max-width: 954px) 100vw, 954px\" \/><\/a><\/li>\n<li>Now we will use the VisualGDB FreeRTOS tracing to see whether the driver could be optimized. Open the Dynamic Analysis page of VisualGDB Project Properties and enable the &#8216;Allow tracing various RTOS events&#8217; checkbox. 
Then click on the &#8216;add reference&#8217; link to automatically reference the profiling framework:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/10-profiler.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2220\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/10-profiler.png\" alt=\"10-profiler\" width=\"996\" height=\"756\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/10-profiler.png 996w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/10-profiler-300x228.png 300w\" sizes=\"(max-width: 996px) 100vw, 996px\" \/><\/a><\/li>\n<li>On the STM32F410 device the profiler framework will not build because the device is missing the TIM2 timer used by the sampling profiler:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/11-timer.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2221\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/11-timer.png\" alt=\"11-timer\" width=\"954\" height=\"644\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/11-timer.png 954w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/11-timer-300x203.png 300w\" sizes=\"(max-width: 954px) 100vw, 954px\" \/><\/a><\/li>\n<li>You can easily fix this by adding SAMPLING_PROFILER_TIMER_INSTANCE=5 to the preprocessor macros on the MSBuild Settings page:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/12-inst.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2222\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/12-inst.png\" alt=\"12-inst\" width=\"795\" height=\"592\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/12-inst.png 795w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/12-inst-300x223.png 300w\" sizes=\"(max-width: 795px) 100vw, 795px\" \/><\/a><\/li>\n<li>Build and start your program; wait until it 
loads and press &#8216;break all&#8217;. Use the Threads window to display the list of active RTOS threads:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/13-threads.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2223\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/13-threads.png\" alt=\"13-threads\" width=\"1096\" height=\"766\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/13-threads.png 1096w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/13-threads-300x210.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/13-threads-1024x716.png 1024w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/13-threads-130x90.png 130w\" sizes=\"(max-width: 1096px) 100vw, 1096px\" \/><\/a>Note that having the window open will cause VisualGDB to analyze the threads after each stop and will reduce the performance. It is recommended to close the Threads window when not actively using it.<\/li>\n<li>Open the Real-Time watch window and add the threads to the watch. VisualGDB will automatically suggest known thread names:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/14-thrnames.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2224\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/14-thrnames.png\" alt=\"14-thrnames\" width=\"1096\" height=\"766\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/14-thrnames.png 1096w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/14-thrnames-300x210.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/14-thrnames-1024x716.png 1024w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/14-thrnames-130x90.png 130w\" sizes=\"(max-width: 1096px) 100vw, 1096px\" \/><\/a><\/li>\n<li>Next add the <strong>g_InQueue<\/strong> and <strong>g_OutQueue<\/strong> queues. 
You can either type in their names, or right-click on them in the source code and select &#8216;Add to real-time watch&#8217;:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/15-queues.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2225\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/15-queues.png\" alt=\"15-queues\" width=\"1098\" height=\"768\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/15-queues.png 1098w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/15-queues-300x210.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/15-queues-1024x716.png 1024w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/15-queues-130x90.png 130w\" sizes=\"(max-width: 1098px) 100vw, 1098px\" \/><\/a><\/li>\n<li>Resume the program and type &#8216;100&lt;ENTER&gt;&#8217; in the COM port window. Note how the real-time watch will show several bursts of activity on the input queue (one for each received character) followed by a longer burst on the output queue (when the &#8216;WAIT&#8217; message is printed):<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/16-out.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2226\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/16-out.png\" alt=\"16-out\" width=\"1096\" height=\"766\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/16-out.png 1096w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/16-out-300x210.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/16-out-1024x716.png 1024w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/16-out-130x90.png 130w\" sizes=\"(max-width: 1096px) 100vw, 1096px\" \/><\/a><\/li>\n<li>Zoom in on the &#8216;out&#8217; queue activity. 
You will see how the queue is quickly filled while the <strong>main<\/strong> thread is running and then is slowly emptied by the <strong>send<\/strong> thread. Measure the time between the maximum queue size (start of transmission) and the time when it is fully emptied (end of transmission):<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/17-queue.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2228\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/17-queue.png\" alt=\"17-queue\" width=\"1096\" height=\"766\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/17-queue.png 1096w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/17-queue-300x210.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/17-queue-1024x716.png 1024w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/17-queue-130x90.png 130w\" sizes=\"(max-width: 1096px) 100vw, 1096px\" \/><\/a><\/li>\n<li>To make things clearer we will use a custom real-time watch to plot the times of the &#8216;Receive complete&#8217; and &#8216;Send complete&#8217; events on the same scale. 
Include the &lt;CustomRealTimeWatches.h&gt; file, declare an instance of <strong>EventStreamWatch<\/strong> and post events there from both <strong>HAL_UART_RxCpltCallback() <\/strong>and<strong> HAL_UART_TxCpltCallback():<\/strong><a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/18-events.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2227\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/18-events.png\" alt=\"18-events\" width=\"1096\" height=\"766\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/18-events.png 1096w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/18-events-300x210.png 300w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/18-events-1024x716.png 1024w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/18-events-130x90.png 130w\" sizes=\"(max-width: 1096px) 100vw, 1096px\" \/><\/a><\/li>\n<li>Run the program again, add <strong>g_UARTInterrupts<\/strong> and <strong>g_SendReadySemaphore<\/strong> to real-time watch, then resume it, type &#8216;100&lt;ENTER&gt;&#8217;\u00a0 in the COM port window and observe the timing:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/19-tx.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2229\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/19-tx.png\" alt=\"19-tx\" width=\"954\" height=\"644\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/19-tx.png 954w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/19-tx-300x203.png 300w\" sizes=\"(max-width: 954px) 100vw, 954px\" \/><\/a>The graph clearly shows that out of almost 1 ms spent between subsequent timer interrupts most of the time is spent by the <strong>IDLE<\/strong> thread while the <strong>send<\/strong> thread is queued despite the <strong>g_SendReadySemaphore<\/strong> being signaled.<\/li>\n<li>This can be fixed by adding the 
following line to the end of <strong>HAL_UART_RxCpltCallback() <\/strong>and<strong> HAL_UART_TxCpltCallback()<\/strong>:\n<pre class=\"\">\u00a0\u00a0\u00a0 portYIELD_FROM_ISR(higherPriorityTaskWoken);<\/pre>\n<p><a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/21-yield.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2230\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/21-yield.png\" alt=\"21-yield\" width=\"954\" height=\"644\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/21-yield.png 954w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/21-yield-300x203.png 300w\" sizes=\"(max-width: 954px) 100vw, 954px\" \/><\/a><\/li>\n<li>This forces the thread switch immediately after the interrupt handler returns, makes the <strong>send<\/strong> thread get activated sooner and considerably reduces the delay between the interrupt and reading the output queue:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/22-shorter.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2232\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/22-shorter.png\" alt=\"22-shorter\" width=\"954\" height=\"644\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/22-shorter.png 954w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/22-shorter-300x203.png 300w\" sizes=\"(max-width: 954px) 100vw, 954px\" \/><\/a><\/li>\n<li>The 140 microsecond delay caused by the context switch is still relatively large. 
We can reduce it further by reading the next byte from the queue directly in the &#8216;send complete&#8217; callback and sending it immediately:\n<pre class=\"\">extern \"C\" void HAL_UART_TxCpltCallback(UART_HandleTypeDef *huart)\r\n{\r\n\u00a0\u00a0\u00a0 g_UARTInterrupts.ReportEvent(\"TX\");\r\n\u00a0\u00a0\u00a0 BaseType_t higherPriorityTaskWoken = pdFALSE;\r\n\u00a0\u00a0\u00a0 \/\/ static: HAL_UART_Transmit_IT() keeps the pointer after this callback returns\r\n\u00a0\u00a0\u00a0 static uint8_t tmp;\r\n\u00a0\u00a0\u00a0 if (xQueueReceiveFromISR(g_OutQueue, &amp;tmp, &amp;higherPriorityTaskWoken))\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 HAL_UART_Transmit_IT(&amp;g_UART, &amp;tmp, 1);\r\n\u00a0\u00a0\u00a0 else\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 xSemaphoreGiveFromISR(g_SendReadySemaphore, &amp;higherPriorityTaskWoken);\r\n\u00a0\u00a0\u00a0 portYIELD_FROM_ISR(higherPriorityTaskWoken);\r\n}<\/pre>\n<\/li>\n<li>This eliminates the extra context switches: <strong>g_OutQueue<\/strong> is now read by the interrupt handler from the context of the <strong>IDLE<\/strong> thread and once all data is sent, the <strong>send<\/strong> thread is activated:<a href=\"http:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/23-mintime.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2233\" src=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/23-mintime.png\" alt=\"23-mintime\" width=\"954\" height=\"644\" srcset=\"https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/23-mintime.png 954w, https:\/\/visualgdb.com\/w\/wp-content\/uploads\/2016\/10\/23-mintime-300x203.png 300w\" sizes=\"(max-width: 954px) 100vw, 954px\" \/><\/a>Congratulations! 
By analyzing the timings and reducing the unnecessary delays you have managed to make the UART driver almost 5x faster.<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>This tutorial shows how to use the FreeRTOS tracing feature of VisualGDB to optimize a simple FreeRTOS UART driver based<\/p>\n","protected":false},"author":1,"featured_media":2234,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[135],"tags":[53,67,109,61],"_links":{"self":[{"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/posts\/2209"}],"collection":[{"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/comments?post=2209"}],"version-history":[{"count":1,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/posts\/2209\/revisions"}],"predecessor-version":[{"id":2235,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/posts\/2209\/revisions\/2235"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/media\/2234"}],"wp:attachment":[{"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/media?parent=2209"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/categories?post=2209"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/visualgdb.com\/w\/wp-json\/wp\/v2\/tags?post=2209"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}