I wanted to expand an item from my previous post into a separate, more detailed post about how the order of operations can greatly influence the run-time performance of code.
There is an app that reads a large text file and counts how many unique words it contains. It also reports the top 100 most frequent words and their counts.
One test file contained 2 735 307 words, of which 99 130 were unique.
The programmer was clever: a binary search tree does the bookkeeping of the unique words. The words are then transferred to a table, which is sorted by frequency to print the top-100 list.

When measuring how long the app took to process this one file, it reported 2.550396 seconds*).
Looking closely at the code that inserts a word into the tree, one can see that a node is allocated, and then freed, even when the word is already in the tree; nothing useful is ever done with that newnode. The node is needed only when the word is not already in the tree.
So, why not move the allocation of the node to after the loop, where the node is actually needed:

This small change, some might call it pedantic, cut 30% off the execution time of the whole app when handling that same file. After the change, the execution time dropped to 1.867050 seconds. Repeated runs produced similar timings for both versions.
Allocating from the heap, done repeatedly, is slow.
*) Measurements done on an Apple Mac mini M1 (2020), 16 GB RAM, 8 cores, 1 TB SSD.