Too busy and stuff

Apparently I am too busy to write anything here. Or … priorities… or workload… Work, life, etc.

During summer 2020 I decided that I wanted an app in the App Store. So I did it.

Slippery Cities (Liukkaat kadut in Finnish, Hala Gator in Swedish) warns pedestrians of slippery weather conditions in six Finnish cities. That photo above of my Apple Watch says that in Oulu there was a slippery alert two hours ago. So wear your spiked shoes. Or be careful out there.

I learned a lot about:

  • Swift programming
  • SwiftUI programming
  • WatchKit development with Xcode
  • Localisation for three languages
  • Accessibility programming in Apple platforms
  • App Store: submitting apps for review (no issues by the way, the app went through without any hassle)
  • Programming in watchOS, including watch faces, complications, background processing, networking,… with the beta version of watchOS 7.

Buy it in the App Store, I want to get rich. Mind you, it may not work in the city you live in, so consider that. I won’t refund you unless you twist my arm hard.

Then other things keeping me busy. Created a new course on Software Platforms and Ecosystems with two colleagues. That was hard. Luckily also behind us for a while, since the course ended in December.

I also took responsibility for a course on data structures and algorithms. The corona pandemic made that harder than usual, since the course material and arrangements were based on the assumption of face-to-face classroom teaching. Didn’t happen.

Now I am working on a course on server programming, which is a new course for me. I’ve been implementing an HTTPS chat server with Java and writing instructions for students to do the same. Topics include Java, HTTPS, certificates, authentication, JSON, SQLite database, password hashing and salting, UTF, encoding, decoding, HTTP headers,… Maven, Visual Studio Code, git. Lots of work, since none of this existed in late December and the course started in early January.

Then another course where I assist in exercises on advanced software testing and security issues started this week. Lots to learn for me there too.

So I’ve been very, very busy. Maybe in Spring 2021 I’ll be sharing more here.

Oh yes, and some nice things too: I got myself a new Mac Mini M1, an Apple Silicon computer. So enjoying doing these things with the new device! My previous iMac 27″ was already five years old, luckily in good condition. I got a decent price for it when I sold it to fund the Mini.

Teaching season started

The fall teaching season has started with one old course, Devices and networks, already ongoing. My part is the networks, so currently I can focus on two new courses starting in October. Data structures and algorithms is an old course but I’ll take responsibility for it this year.

The other one, Platforms and ecosystems, is a new course. Meaning I have my hands full creating material for it together with two other teachers. And at the same time, familiarising myself with the data structures course.

For the data structures course, I’ve worked earlier with the demo sorting app implemented using Swift. No time to add additional sorting methods there, but should build it with new Xcode, Swift and iOS 14 to see if there are any (breaking) changes.

What I did recently is that I implemented simple (and stupid) array and linked list classes, both in C++ and Swift, to demonstrate how efficient creating and accessing arrays is in comparison to linked lists. Will be using that when discussing why different data structures have different (preferred) usage situations. And that there may be conflicting requirements for the data structure in an app. Then you just have to make compromises.

Another important thing to show to the students is that you need to build the release version before comparing or measuring performance.
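To make the comparison concrete, here is a minimal C++ sketch of that kind of measurement. It is not the course code: it uses std::vector and std::list as stand-ins for the simple array and linked list classes, and the names elapsedMs, fillAndSum and compareContainers are made up for this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <list>
#include <numeric>
#include <vector>

// Runs the given callable once and returns the elapsed wall-clock time in milliseconds.
template <typename Work>
double elapsedMs(Work work) {
   const auto start = std::chrono::steady_clock::now();
   work();
   const auto stop = std::chrono::steady_clock::now();
   return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Fills a container with the values 0..count-1 and returns their sum.
// Works for both std::vector<long long> and std::list<long long>.
template <typename Container>
long long fillAndSum(std::size_t count) {
   Container items;
   for (std::size_t i = 0; i < count; ++i) {
      items.push_back(static_cast<long long>(i));
   }
   return std::accumulate(items.begin(), items.end(), 0LL);
}

// Timings of the same workload for the two containers (hypothetical helper type).
struct Timings {
   double vectorMs;
   double listMs;
};

// Measures creating and traversing both containers with the same element count.
Timings compareContainers(std::size_t count) {
   long long vectorSum = 0;
   long long listSum = 0;
   Timings timings{};
   timings.vectorMs = elapsedMs([&] { vectorSum = fillAndSum<std::vector<long long>>(count); });
   timings.listMs = elapsedMs([&] { listSum = fillAndSum<std::list<long long>>(count); });
   // Sanity check: both containers must have done the same work.
   assert(vectorSum == listSum);
   return timings;
}
```

Calling compareContainers(1000000) and printing the two values gives a first impression. The numbers only mean something when the code is compiled with optimisations on (for example -O2, or the Release configuration in an IDE), and a single run is noisy, so repeating the measurement a few times is a good idea.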

Betas

I wanted to install Apple developer betas on some of my devices to try out something. But then I saw that the bank ID software didn’t work on the beta. Bummer.

Today, a week later, I realized that I also have an iPad. And since the bank ID app also works on the iPad, I installed the public beta of iOS 14 on my phone, as well as developer beta 2 on my Apple Watch. 🙂

New Swift tools

I’ve been working on my sorting methods demonstration app recently when I’ve had time for it. Most of my time currently is spent evaluating student exercise projects submitted at the end of June for the Software Architectures course.

So, not much new functionality in the sorting app now. I am now mostly trying to make sure the code so far is good and documented. Since my plan is to use it as an example and demonstration in the new course I am teaching in Fall, Data structures and Algorithms, I want the demo well documented and of good quality.

To make this happen, I’ve installed several Swift related tools and tried them out.

I am using SwiftLint to make sure the code is clean and follows the Swift coding conventions. Though I had to disable some rules that are too tight for this demo app, using the .swiftlint.yml configuration file.

disabled_rules:
 - multiple_closures_with_trailing_closure
 - line_length
 - todo

For generating HTML documentation from the comments in the code, I’ve experimented with Jazzy and Swift-doc. Currently Jazzy seems to be in better shape for my needs, since Swift-doc currently documents only the public declarations of the project.

Apparently Swift-doc has been used mainly for documenting the public APIs of libraries. There is a pull request in the GitHub repository, not yet accepted, that enables specifying the access level (open, public, fileprivate, private) to document. I tried out the fork which enables access level configuration, results of which can be seen here. In addition to the access level issues, there are issues with links. If you try to navigate using the class graphs on the page, the links are not correctly generated by the tool.

The Jazzy generated documentation does not have these issues, and I like the visual appearance of the Apple themed documentation page (screenshot below).

Well, back to evaluating the SWA exercise work projects, after the study program Zoom meeting I am currently listening to has finished.

Jazzy generated html documentation of the sorting demo app.

Fun with Swift operator overloading

There is this guy in one forum, writing all his sentences ending with at least two periods, mostly three… Always… This might help him, integrated into some text editor… Saving his keyboard strokes…

extension String {
   static postfix func ... (str: inout String) -> String {
      str = String(str.map {
         $0 == "." ? "…" : $0
      })
      return str
   }
}

var sentence = "Operator dotdotdot. For those. Who want to write. All sentences. With several periods."
let hisStyle = sentence...
print(hisStyle)
// Prints: "Operator dotdotdot… For those… Who want to write… All sentences… With several periods…"

What about adding a copyright notice at the end of the text the user enters, using this brand new copyright operator, <©>?

postfix operator <©>
extension String {
   static postfix func <©> (str: inout String) -> String {
      str += " © Antti J."
      return str
   }
}

var originalText = "Life is a b***h and then you die."
var copyrighted = originalText<©>
print(copyrighted)
// Prints: "Life is a b***h and then you die. © Antti J."

Well, back to some more serious work.

Pandemic isolation ramblings

Due to the coronavirus pandemic, I’ve been working remotely since the end of February. I felt like I had a cold and isolated myself before the University officially recommended that to the personnel. Probably not the virus though, only a common cold. Obviously I cannot be 100% sure since there are no tests available here unless you are critical workforce or seriously ill. So I am keeping myself isolated at home, just in case.

Luckily the grocery nearby delivers food, and their app and website for creating the orders work quite well. Today the second delivery is arriving, which should be enough for a week at least. They have a very high load of orders flooding the service. What I usually do is create an order with a couple of products, select a delivery date about one week later, and then keep filling the order until the day before the delivery. By that day, I already have the next delivery date reserved, with a new growing list of items to order. In this way, we are able to secure the deliveries so that there will not be too long gaps in between.

Since the isolation started, I have continued to offer video sessions via MS Teams, and Moodle discussion and chat support, to the students in the Software architectures course. Fortunately, the course lectures and exercises were mostly over by the time the isolation started. Students continue working on their exercise work projects until the end of May. Luckily I have a fast network at home, and even better hardware than at the campus. That large iMac screen has proven to be quite a good thing to have. Currently I am recording audio feedback to the students for the first phase of the exercise work projects. The amount of feedback to give could be quite extensive, and text feedback is inferior to audio, in my opinion. Video is not needed in this case since I can easily pinpoint the things I comment on by referring to the chapter titles and paragraph and page numbers. Let’s see how this works.

The study program and the research unit are using Zoom video sessions to keep in touch and organize during the pandemic. Probably this will last until summer, but I suspect there will be limitations and exceptional arrangements even in the Fall semester. Time will tell. University support staff has increased the online training of teachers, providing courses in Zoom on using Moodle, Teams and Zoom itself in teaching.

I’ve started to implement a small app with Swift to learn something new. The app is also something to use as a demonstration in Fall in the Data structures and algorithms course. I will take charge of that course after the summer break, so wanted to do something related to the topic.

Below is a demo video of the app in its early phases. I am planning to implement maybe a couple more sorting algorithms and improve the graphics and usability of the app. YouTube is full of these kinds of videos, so I will not put too much effort into this, like implementing tens of different algos.

Beauty from git logs

I didn’t know, until today, about the existence of Gource. In short, you can use it to create beautiful animations from the history of git repositories. After installing it, the last couple of hours were spent watching and recording videos of my projects in git.

The project in this video is called Keywords. It is a demo project I implemented for Software architectures course, implementing a Client/Server app with TCP, session management, client side API to the server and an Android client.

What is interesting to see on the video is how I started the implementation from the Server, then switched to Client, back to Server. Then I implemented a client API for the server as a library (.jar; this is all Java), made modifications to the Client and so on. Every time I needed new functionality:

  • I first implemented support for that on the Server side,
  • then modified the Client API to support that and
  • finally added the support for the feature on the client side.

Basically how incremental development works (may work), doesn’t it?

Installing Gource on macOS was quite simple, with Homebrew:

brew install gource

And then just run it (in some project directory under git):

gource --auto-skip-seconds 1

Most of my projects are ones I work on occasionally, with some quiet periods in between, so the --auto-skip-seconds option is useful for quickly skipping the times when mostly nothing happened in the project.

Since I had earlier installed ffmpeg, it was easy to save the generated animation into a video file:

gource --auto-skip-seconds 1 --title "EasyCrypto demo project (c) Antti Juustila" -1280x720 -o - | ffmpeg -y -r 60 -f image2pipe -vcodec ppm -i - -vcodec libx264 -preset ultrafast -pix_fmt yuv420p -crf 3 -threads 0 -bf 0 gource.mp4

The saved video files can be quite large, so raising the -crf value may be a good idea to make the files smaller. Though that also lowers the quality, making the videos not so cool.

I have one project I started in 2013 and have worked on until January this year. I already watched the evolution of that system with Gource, and it will make a great video. It would be great with subtitles or a voice explaining what happens and why. This would be a nostalgic thing to create: the course in which I have used that system is something I am leaving behind. This Spring is the last time I will teach the course. The video would be kind of a farewell to the system, since it is unlikely I will continue with it without a useful context like the course has been.

To thread or not to thread

There’s a distributed C++ system I made, used as a “patient” in a course on Software architectures. It includes a command line tool TestDataGenerator, which I implemented to test the performance and reliability of the system. The tool generates random data in memory buffers and then writes four test data files which are read and handled by the system’s distributed nodes. An earlier blog post discussed the tool’s implementation details.

The generator was single threaded, writing the four data files in sequence, in the main thread. But then this stupid idea popped into my head: what if the four test data files were written to disk in parallel? Would it be faster? How much, if at all?

Threading is absolutely not needed in this case: generating test data for 5000 students takes about 250 ms on my MacBook Pro (13-inch, 2018), 2.3 GHz four-core Intel Core i5, 1 TB SSD disk. On machines with HDDs this could be somewhat slower.

However, I wanted to see how much execution time (if any) I could squeeze off with the four threads, each writing to its own data file from the RAM buffers. Also an opportunity to learn more about threads. Those horrible, evil things everyone is saying nobody should use…

In my first implementation the threads were created and executed when the memory buffer was full, with the file saving done in a lambda function:

 if (bufferCounter >= bufSize) {
   std::thread thread1( [&isFirstWrite, &STUDENT_BASIC_INFO_FILE, &basicInfoBuffer] {
     saveBuffer(isFirstWrite, STUDENT_BASIC_INFO_FILE, basicInfoBuffer);
   });
// ...

But creating a thread takes time. Lots of time, thousands of processor cycles, depending on your setup (see e.g. this blog post). If the tool startup parameters are -s 50000 -b 500 (create 50000 records with buffer size of 500), this would mean 50000/500 = 100 thread creations per file, so 400 threads would be created during the execution of the tool. Not very good for performance.
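The arithmetic above, and a rough feel for what one thread costs, can be sketched in a few lines of C++. This is only an illustration, not part of the tool: the function names threadCreations and averageThreadCostMicros are made up here, and the second function measures creating and joining an empty thread, which is a crude approximation of the real overhead.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Number of threads created if every file gets a new thread each time the buffer fills up.
// With 50000 records, a buffer size of 500 and 4 files this is 100 * 4 = 400.
int threadCreations(int records, int bufferSize, int files) {
   assert(bufferSize > 0);
   // Round up so that a partially filled last buffer is counted too.
   const int flushes = (records + bufferSize - 1) / bufferSize;
   return flushes * files;
}

// Rough average cost, in microseconds, of creating and joining one thread that does nothing.
double averageThreadCostMicros(int rounds) {
   assert(rounds > 0);
   const auto start = std::chrono::steady_clock::now();
   for (int i = 0; i < rounds; ++i) {
      std::thread worker([] {});
      worker.join();
   }
   const auto stop = std::chrono::steady_clock::now();
   return std::chrono::duration<double, std::micro>(stop - start).count() / rounds;
}
```

Multiplying averageThreadCostMicros(1000) by threadCreations(50000, 500, 4) gives a ballpark for the overhead of the create-a-thread-per-flush approach on a given machine. The result varies a lot between operating systems and hardware, so treat it as an order of magnitude only.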

I changed the implementation to create the four threads only once, before filling and saving the memory buffers:

   // For coordination between main thread and writer threads
   std::atomic<int> threadsFinished{0};
   // Prepare four threads that save the data.
   std::vector<std::thread> savers;
   savers.push_back(std::thread(&threadFuncSavingData, std::ref(threadsFinished), std::cref(STUDENT_BASIC_INFO_FILE), std::ref(basicInfoBuffer)));
   savers.push_back(std::thread(&threadFuncSavingData, std::ref(threadsFinished), std::cref(EXAM_INFO_FILE), std::ref(examInfoBuffer)));
   // ... and same for the remaining two threads.

The threads are then woken up every time the data buffers are full:

if (bufferCounter >= bufSize) {
   if (verbose) std::cout << std::endl << "Activating buffer writing threads..." << std::endl;
   // Prepare variables for the file saving threads.
   startWriting = true;
   threadsFinished = 0;
   int currentlyFinished = 0;
   // And launch the file writing threads.
   launchWrite.notify_all();

And then the main thread waits for the writers to finish their job before filling the memory buffers again.

   // Wait for the writer threads to finish.
   while (threadsFinished < 4) {
      std::unique_lock<std::mutex> ulock(fillBufferMutex);
      writeFinished.wait(ulock, [&] {
         return currentlyFinished != threadsFinished;
      });
      currentlyFinished = threadsFinished;
   }

The file writing threads notify the main thread when they have finished their file operations, using a condition variable and a counter which the main thread uses to keep track of whether all the writer threads have finished:

// Thread function saving data in parallel when notified that buffers are full.
void threadFuncSavingData(std::atomic<int> & finishCount, const std::string & fileName, std::vector<std::string> & buffer) {
   bool firstRound = true;
   while (running) {
      // Wait for the main thread to notify the buffers are ready to be written to disk.
      std::unique_lock<std::mutex> ulock(writeMutex);
      launchWrite.wait(ulock, [&] {
         return startWriting || !running;
      });
      // We are still running and writing, so do it.
      if (buffer.size() > 0 && startWriting && running) {
         saveBuffer(firstRound, fileName, buffer);
         buffer.clear();
         firstRound = false;
         // Update the counter that this thread is now ready.
         // Main thread waits that four threads have finished (count is 4).
         finishCount++;
      }
      // Notify the main thread.
      writeFinished.notify_one();
   }
}
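For readers who want to experiment with this wake-up-and-wait pattern in isolation, below is a self-contained sketch of the same idea. It is a variation, not the tool’s code: it uses one mutex and a round counter instead of the startWriting flag, keeps the finished counter under the mutex instead of in an atomic, and only counts rounds where the real tool would call saveBuffer. The names WriterPool, writeRound and roundsWrittenBy are invented for this example.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// A fixed set of writer threads, created once and woken up for every round of writing.
class WriterPool {
public:
   explicit WriterPool(int writerCount) : written_(writerCount, 0) {
      assert(writerCount > 0);
      for (int i = 0; i < writerCount; ++i) {
         writers_.emplace_back(&WriterPool::run, this, i);
      }
   }

   // Tells the writers to stop and waits for them to exit.
   ~WriterPool() {
      {
         std::lock_guard<std::mutex> lock(mutex_);
         running_ = false;
      }
      startRound_.notify_all();
      for (auto & writer : writers_) writer.join();
   }

   // Wakes all writers for one round and blocks until every one of them has finished.
   void writeRound() {
      std::unique_lock<std::mutex> lock(mutex_);
      finished_ = 0;
      ++round_;
      startRound_.notify_all();
      roundDone_.wait(lock, [this] { return finished_ == static_cast<int>(writers_.size()); });
   }

   // How many rounds the given writer has completed so far.
   int roundsWrittenBy(int writer) {
      std::lock_guard<std::mutex> lock(mutex_);
      return written_[writer];
   }

private:
   void run(int index) {
      int lastSeenRound = 0;
      std::unique_lock<std::mutex> lock(mutex_);
      while (true) {
         // Sleep until a new round starts or the pool is shutting down.
         startRound_.wait(lock, [&] { return !running_ || round_ != lastSeenRound; });
         if (!running_) return;
         lastSeenRound = round_;
         lock.unlock();
         // The real tool would write its buffer to disk here, outside the lock.
         lock.lock();
         ++written_[index];
         ++finished_;
         roundDone_.notify_one();
      }
   }

   std::mutex mutex_;
   std::condition_variable startRound_;
   std::condition_variable roundDone_;
   bool running_ = true;
   int round_ = 0;
   int finished_ = 0;
   std::vector<int> written_;
   std::vector<std::thread> writers_;
};
```

With WriterPool pool(4); each call to pool.writeRound() corresponds to one "buffers are full" moment in the main loop. The round counter lets a writer tell a new round apart from one it has already handled, so it goes back to sleep after writing once.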

Then to measurements. I created a script which executes the tool 20 times, first using threads and then sequentially, without threads (the command line parameter -z disables the threading code and uses the sequential code):

echo "Run this in the build directory of TestDataGenerator."
echo "Removing output files..."
rm test-*.txt
echo "Running threaded tests..."
for ((i = 0; i < 20; i++)); do ./GenerateTestData -s 50000 -e 10 -b 500 >> test-par.txt; done
echo "Running sequential tests..."
for ((i = 0; i < 20; i++)); do ./GenerateTestData -zs 50000 -e 10 -b 500 >> test-seq.txt; done
echo "-- Tests done -- "
open test-*.txt

Just to compare, I executed the tests on two machines: a MacBook Pro 2.3 GHz Intel Core i5 with four cores and a 1 TB SSD, and an iMac 2015 with an HDD. Next, I took from the output files the number of milliseconds the tool took each time, put them into a Numbers file and generated these graphics from the test data:

Comparison of sequential and threaded execution in two machines
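The per-run milliseconds behind graphs like these can also be summarised in code instead of a spreadsheet. A minimal sketch, assuming the times of one series have already been parsed into a vector; the function names mean and standardDeviation are chosen for this example and no measured values are included here.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Arithmetic mean of the measured run times.
double mean(const std::vector<double> & samples) {
   assert(!samples.empty());
   double sum = 0.0;
   for (double sample : samples) sum += sample;
   return sum / static_cast<double>(samples.size());
}

// Sample standard deviation (n - 1 in the denominator); needs at least two samples.
double standardDeviation(const std::vector<double> & samples) {
   assert(samples.size() > 1);
   const double average = mean(samples);
   double squaredDifferences = 0.0;
   for (double sample : samples) {
      squaredDifferences += (sample - average) * (sample - average);
   }
   return std::sqrt(squaredDifferences / static_cast<double>(samples.size() - 1));
}
```

If the means of the threaded and sequential series differ by less than their standard deviations, the difference is most likely just noise between runs.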

As you can see, there is no real difference between writing in threads (in parallel) and writing sequentially. Here you can see how the threads take turns and execute in parallel in the cores of the processor of the MacBook Pro:

Profiler showing threads executing.
Blue areas show when the threads are active, executing.

Profiling the execution shows that having multiple threads doing the work won’t make a difference. In the trace below you can see that most of the time the threads are either waiting for their turn to flush the data to disk or actually flushing the data. Most of the time in the selected saveBuffer method is spent in flushing data.

Profiler screenshot shows where time was spent, flushing and waiting.
Selected lines show where the most of the time was spent.

Also, in the sequential execution, where the single main thread does everything, the time is spent in flushing to disk:

Single threaded execution profile.
Single threaded execution spent most of the time flushing data to disk.

Creating threads to speed up writing to disk is definitely not a good idea in this case. If this were an app with a GUI, then writing large amounts of data in a thread could very well be a good idea. If writing took more than a couple of hundred milliseconds, the user would notice the GUI lagging or not being responsive. So whether or not to use threads to write data to disk depends on your use case.
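For the GUI case, the point of the thread is responsiveness, not speed. A minimal sketch of that idea with std::async, assuming a hypothetical app that wants to save a list of text lines; writeLines and writeInBackground are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Writes the lines to the file. Returns the number of lines written, or 0 if writing failed.
std::size_t writeLines(const std::string & fileName, const std::vector<std::string> & lines) {
   std::ofstream file(fileName);
   if (!file) return 0;
   for (const auto & line : lines) file << line << '\n';
   file.flush();
   return file ? lines.size() : 0;
}

// Starts the write on another thread so that the caller (e.g. a GUI event loop) is not blocked.
// The arguments are taken by value, so the background thread works on its own copies.
std::future<std::size_t> writeInBackground(std::string fileName, std::vector<std::string> lines) {
   assert(!fileName.empty());
   return std::async(std::launch::async, writeLines, std::move(fileName), std::move(lines));
}
```

The caller keeps the returned future and asks for the result later, for example when the GUI is idle. Note that a future returned by std::async waits for the task in its destructor, so throwing the future away immediately would block the caller after all.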

This oldish article from DrDobbs is also an interesting read. It points out that writing several files in threads is not necessarily helpful (unless using RAID), and that one should make threading configurable (like the -z parameter in my implementation) because threads may in some situations even slow down the app. Also this discussion on when to apply threads is a good one:

Using multiple threads is most helpful when your program is CPU bound and you want to parallelise your program to use multiple CPU cores.

This is not the case for I/O bound problems, such as your scenario. Multiple threads will likely not speed up your system at all.

Using CMake

A while ago I was asking in Slack if anyone knew how to write CMake files for C++ projects that work across many platforms, including macOS, Ubuntu and Windows. One person said that he had done something multiplatform with CMake, but didn’t remember how, and added:

“I hate tooling in development.”

I guess he referred to using CMake. I sort of agree with him but also not. For example, tying yourself to some specific IDE is not very useful. Only knowing how to work with some specific tools not available on many platforms limits the applicability of your skills. Moving to a different ecosystem or platform that requires different tools then becomes more difficult and time consuming.

It is better to learn the lower level tools well, those tools which are shared across platforms and ecosystems. Then you can apply and use those wherever you are developing. That is why I prefer using git on command line — though I do have several git GUI tools installed.

Another thought that came into my mind is that software development without tools just doesn’t happen. We all use tools, whether it is vim, command line git and makefiles, or CMake, Xcode IDE and Fork for git. I prefer to use the tools that fit the problem. Like, if you are doing development for multiple platforms, with multiple compilers and IDEs, then for solving that problem, CMake is a good choice instead of makefiles. It liberates me from the lower level technical stuff of considering how to create makefiles for different OS’s and compilers, allowing me to focus on the actual problem to solve with those tools — creating some code and working systems.

I eventually found a way to create the CMake files so that the project I am working on can be built from them in Windows 10, Ubuntu and macOS. Also creating projects for Eclipse CDT, Xcode and Visual Studio is quite easy. I can also easily switch from using make to using Ninja as the build engine.

# Creating Ninja build files for a Qt app on macOS from CMake project
cmake -GNinja -DCMAKE_PREFIX_PATH=/Users/juustila/Qt/5.12.1/clang_64/lib/cmake ..
# Creating MSVC build files on Windows 10 from CMake project
cmake -G"Visual Studio 16" -DCMAKE_PREFIX_PATH=C:\Qt\Qt5.14.1\5.14.1\msvc2017_64\lib\cmake;C:\bin ..
# Creating Xcode project files on macOS from CMake project
cmake -GXcode -DCMAKE_PREFIX_PATH=/Users/juustila/Qt/5.12.1/clang_64/lib/cmake ..

What I learned recently is that it is possible to generate dependency graphs from the CMake project file with GraphViz:

# Create the Xcode project and also a dot file 
# for generating a dependency graph using GraphViz.
cmake -GXcode --graphviz=StudentPassing.dot ..
dot -Tpng -oStudentPassing.png StudentPassing.dot

CMake / GraphViz generated dependency graph of an app.

This results in a neat graphical representation of the dependencies of your project. Especially handy if you start working with a large system you do not yet know and want to study what libraries and other components it is built from. It is also handy when teaching software architecture in practice, as I am currently doing: letting students analyse the structure of a system using available tools.

With CMake, I am able to easily combine documentation generation with Doxygen into the CMake project:

# In your CMakeLists.txt file:
find_package(Doxygen)
if (DOXYGEN_FOUND)
   configure_file( ${CMAKE_CURRENT_SOURCE_DIR}/doxyfile.in ${CMAKE_CURRENT_BINARY_DIR}/Doxyfile @ONLY)
   add_custom_target(doc
      ${DOXYGEN_EXECUTABLE} ${CMAKE_CURRENT_BINARY_DIR}/Doxyfile
      WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
      COMMENT "Generating API documentation with Doxygen" VERBATIM
   )
endif(DOXYGEN_FOUND)
# And then generate build files (e.g. for Ninja)
cmake -GNinja ..
# Do the build
ninja
# And then generate documentation using Doxygen 
ninja doc

I began this post with the question about multi platform development, where my problem originally was how to make all this work in Windows, while Ubuntu and macOS were no problem.

For Windows, using CMake required some changes to the CMakeLists.txt project files: adding a preprocessor definition with add_definitions(), because the random generator engines used by the Boost uuid class work differently on Windows than on the other platforms:

if (WIN32)
   add_definitions(-DBOOST_UUID_RANDOM_PROVIDER_FORCE_WINCRYPT)
endif(WIN32)

Windows also requires some extra steps in the build process, mainly due to the fact that while *nixes have a “standard” location for headers and libs (/usr/local/include and /usr/local/lib), in Windows you should specify with --prefix (Boost) or -DCMAKE_INSTALL_PREFIX (CMake) where your libraries’ public interface (headers, libs, dlls, cmake config files) should be installed and where they can be found by other components.