In last week's post I mentioned that I have added new features to the tool I use in the Data Structures and Algorithms course to supervise, support, test and grade the students' programming course work. In this post I describe the new features and provide some (hopefully) interesting implementation details about them.
Some background about the course that helps in understanding the role of this tool:
- students each have their own GitHub or GitLab repository, into which they fork the initial state of their course work from the teacher's repository,
- the repository contains the task instructions, skeleton code to implement, unit test code and other supporting code provided ready for the students,
- the repository's tests enable testing the student's implementations of the data structures and algorithms that the course requires them to implement,
- teachers have access to the student repositories to assist students with any issues, and finally to grade the work.
Before going into details, a comment about the motivation for this tool to exist in the first place. The course the tool is used in has 300+ enrolled students and only two full-time teachers, who are simultaneously teaching 2-3 other 200-300 student courses. (Sometimes we do have 1-2 assistants with a small number of hours, like 50 hours in total for the whole Autumn.) That totals around 900+ active students taught simultaneously, so providing individual support and attention to all of them is practically impossible, at least without automation and other support tools.
So the critical thing in managing all this work is that we need automated tools to:
- keep track of how the students are progressing in the course, so we can help those who desperately need it, and, after the last deadline,
- evaluate and grade the students' programming submissions as fast as possible.
The grading must be done within three weeks after the last deadline, so it cannot be done manually in that time. The aim is to use this tool during the course to catch the students who are not doing OK, then contact them and help them get their coursework going, and to spot any possible mistakes in their work so that they can correct them before the final deadline.
Next Fall, students will receive the same unit tests and the same code analysis checks (as Java unit tests), so they can also run them themselves. However, this is not enough, especially if the students do not contact the teachers about their issues soon enough, or at all. A problem that is too common, unfortunately.
The new features added to the teachers’ tool during this Winter and Spring are:
- Automatic source code analysis:
  - the results of the source code analysis are stored in a database table; the repository table has a column where problems found in the code are flagged,
  - the tool checks for Java imports that are not allowed, notifies about this and sets an error flag in the database table,
  - the tool also uses the ANTLR parser generator to check whether method implementations in the source code violate any rules or fail to do the things they should do.
- Remote repo checks for both GitHub and GitLab:
  - checks that the remote is not public, a rule of the course,
  - checks that the remote has only allowed contributors.
- Testing and grading enhancements:
  - tests are now grouped into grade level tests,
  - when all tests in a grade group pass, the grade is stored in the repository table,
  - grade tests can be executed in any order, since each grade value is a bit flag,
  - tests for a certain student repository can be disabled if they cause issues, like tests never ending or taking far too much time,
  - if a git pull updates the code, the grade is set to zero and the test results are cleared.
- Status reporting:
  - the tool now generates a textual status report that can be sent to the student by email, listing which tests pass, whether any forbidden things are used or required things are left unused, git statistics, etc.
In this post, let's take a closer look at the code analysis done by the tool. The other features are left for possible future posts.
Analyzing whether the source code uses "forbidden" imports was implemented with simple text parsing techniques. The tool goes through the Java source files in the directory where the student code is located, reads the .java files and checks if the source uses imports that are not allowed.
For example, when implementing a Stack data structure, a student must not import java.util.Stack and simply use it, instead of implementing their own stack from scratch with an array as the internal data structure. Or, when sorting an array, java.util.Arrays must not be imported and Arrays.sort must not be used, because the student should use their own implementation of the sorting algorithms.
These kinds of mistakes can be found by analyzing the imports in the student's .java files: you cannot call Arrays.sort without having import java.util.Arrays in the .java file.
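Collecting the import statements from a source file is simple line-based text parsing. A minimal sketch of that step (function and variable names are mine, not necessarily the tool's):

```swift
import Foundation

// Sketch: collect the import statements from a Java source string with
// simple line-based text parsing. Illustrative naming, not the tool's.
func importLines(inJavaSource source: String) -> [String] {
    return source
        .split(separator: "\n")
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { $0.hasPrefix("import ") }
}
```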
The tool first reads a rules file that contains the import rules for the course:
[allowedImports]
*,import java.util.Comparator
*,import java.util.function.Predicate
*,import oy.interact.tira.
*,import java.lang.Math;
*,import java.util.concurrent.atomic.AtomicInteger;
CodeWordsCounter,import java.io.
CodeWordsCounter,import java.nio.
SetImplementation,import java.util.Iterator
For example, java.util.Comparator may be used by the students in all of their .java files (the * rule). The same goes for java.util.function.Predicate and for all the files in the package oy.interact.tira (the course source packages). The class SetImplementation may additionally import java.util.Iterator. Et cetera.
As you can see, this is an allow list, not a deny list. The imports that are allowed in the course are small in number, so it is easier to maintain an allow list than a deny list.
Reading the allow list and analyzing the source files was straightforward to implement with simple file handling and text parsing. First, the tool reads in the rules file and parses the import rules into this structure:
fileprivate struct SourceRequirementsSpecification {
    var allowedImports: [String: [String]] = [:]
}
The Swift dictionary allowedImports has the class name as the key (either * or the name of the class the rule is about), and the array of allowed imports that class may contain as the value.
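Reading the [allowedImports] section into that dictionary is plain line-by-line parsing. A minimal sketch, assuming lines of the form shown above (the function and variable names are mine, not necessarily the tool's):

```swift
import Foundation

// Sketch: parse [allowedImports] lines such as
//   "*,import java.util.Comparator"
// into a class-name -> allowed-imports dictionary.
// Illustrative naming, not the tool's actual implementation.
func parseAllowedImports(lines: [String]) -> [String: [String]] {
    var allowed: [String: [String]] = [:]
    for line in lines {
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        // Skip empty lines and comment lines.
        if trimmed.isEmpty || trimmed.hasPrefix("#") { continue }
        // Split only on the first comma: "<class or *>,<import rule>".
        guard let comma = trimmed.firstIndex(of: ",") else { continue }
        let className = String(trimmed[..<comma])
        let importRule = String(trimmed[trimmed.index(after: comma)...])
        allowed[className, default: []].append(importRule)
    }
    return allowed
}
```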
Then the tool starts reading the student's .java files and handles every line beginning with import, checking whether that class may import that specific thing or not. If a .java file imports something that is not allowed, a specific bit (a "flag") is set to 1 (true) in an integer column of the database table for the student's repository. This indicates that the student's code imported something it was not supposed to.
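The check itself can then be a simple lookup against the dictionary: an import line is acceptable if it starts with any rule listed for * or for that class. A sketch under that assumption (the prefix matching and the naming are mine; a rule like import oy.interact.tira. then covers the whole package):

```swift
import Foundation

// Sketch: an import line is acceptable if it starts with any rule listed
// for "*" or for this class. Prefix matching makes a rule such as
// "import oy.interact.tira." cover a whole package. My assumption about
// the matching; the tool's details may differ.
func isImportAllowed(importLine: String, className: String,
                     allowed: [String: [String]]) -> Bool {
    let candidates = (allowed["*"] ?? []) + (allowed[className] ?? [])
    let line = importLine.trimmingCharacters(in: .whitespaces)
    return candidates.contains { line.hasPrefix($0) }
}
```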
This is the data structure keeping track of these repository flags, whether they are set or not:
struct Flags: OptionSet {
    typealias RawValue = Int16
    var rawValue: Int16

    static let remoteFails = Flags(rawValue: 1 << 0)
    static let buildFails = Flags(rawValue: 1 << 1)
    static let usesForbiddenFeatures = Flags(rawValue: 1 << 2)
    static let repositoryIsPublic = Flags(rawValue: 1 << 3)
    static let repositoryHasOutsideCollaborators = Flags(rawValue: 1 << 4)
    static let repositoryTestingIsOnHold = Flags(rawValue: 1 << 5)
}
The Swift OptionSet is a type that presents a mathematical set interface to a bit set. Each flag is a single bit in the 16 bit integer, either off (0) or on (1). So instead of having six boolean columns in the database table, I can have one 16 bit integer column storing up to 16 different flags, most of them still available for future needs.
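For illustration, here is a small usage sketch of such an OptionSet (repeating a few of the flags above so the snippet is self-contained): setting bits, taking the raw Int16 for the database column, and reading it back.

```swift
// Usage sketch of an OptionSet-based flags value; a few of the flags
// above are repeated here so the snippet stands alone.
struct Flags: OptionSet {
    var rawValue: Int16
    static let remoteFails           = Flags(rawValue: 1 << 0)
    static let buildFails            = Flags(rawValue: 1 << 1)
    static let usesForbiddenFeatures = Flags(rawValue: 1 << 2)
}

var flags: Flags = []
flags.insert(.buildFails)                 // set single bits...
flags.insert(.usesForbiddenFeatures)
let stored = flags.rawValue               // Int16 for the DB column (here 0b110 == 6)
var restored = Flags(rawValue: stored)    // read back from the DB column
let broken = restored.contains(.usesForbiddenFeatures)  // true
restored.remove(.buildFails)              // clear a bit once the issue is fixed
```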
The flag usesForbiddenFeatures is set in this case, when student code uses forbidden imports (or other features, discussed below). The other flags are used to track whether:
- the remote repository cannot be accessed at all or is empty (remoteFails),
- compiling the code fails (buildFails),
- the repository is public even though it must be private (repositoryIsPublic), and finally
- the repository has collaborators other than the student owning the repo and the course teachers (repositoryHasOutsideCollaborators).
These are all issues that have come up in the course during previous years.
The final flag, repositoryTestingIsOnHold, can be set manually from the tool's user interface for repositories that mess up the testing, either because the tests get stuck or take far too much time; such repositories should not be tested before they are fixed.
The flag struct's raw value is a 16 bit integer, the datatype of the database column storing the flags for each student repository. If the code has issues related to these rules, the corresponding flag is set. When the git repository is updated from the remote, the user needs to run the code analysis again; if a rule violation was fixed, the flag is then reset to zero (false), provided no other violations are found.
The imports checking from source code is not enough, though.
Sometimes students make mistakes in method implementations that they are not supposed to make. For example, when asked to implement a recursive binary search, they implement an iterative one. Or, when sorting a table in descending order, they first sort in ascending order and then reverse the table. Since time efficiency is the major topic of the course, this is not what they are supposed to do; instead, they should use a comparator to sort the table into descending order in the first place, so that no reverse operation is needed at all, making the operation faster.
For checking these kinds of issues, import rules are not enough, since such mistakes can be made without importing anything. Therefore, the same rules file also contains rules like this:
[codeRules]
# Algorithms.binarySearchRecursive must contain word binarySearchRecursive
Algorithms.binarySearchRecursive,must,binarySearchRecursive
# Algorithms.binarySearchRecursive must not contain word while
Algorithms.binarySearchRecursive,mustnot,while
Algorithms.binarySearchIterative,must,while
CodeWordsCounter.topCodeWords,must,Algorithms.fastSort
CodeWordsCounter.topCodeWords,mustnot,Algorithms.reverse
Here, lines beginning with # are comments and are ignored by the tool. Illustrating the examples of mistakes above, the rule
Algorithms.binarySearchRecursive,mustnot,while
specifies that the method binarySearchRecursive in the class Algorithms must not contain the keyword while, since breaking this rule implies that the student has implemented the recursive algorithm using an iterative approach. The learning goal of implementing a recursive binary search has then not been met, and the code needs to be fixed.
These rules are also parsed from the rules file into the above-mentioned rules structure, which now looks like this:
enum Rule {
    case must
    case mustNot
}

struct CodeRule {
    let methodName: String
    let rule: Rule
    let stringToMatch: String
}

fileprivate struct SourceRequirementsSpecification {
    var allowedImports: [String: [String]] = [:]
    var rules: [String: [CodeRule]] = [:] // ClassName : CodeRules
}
In addition to the allowedImports rules, the structure now also contains rules for each class (the Swift dictionary key String holds the class name), where the CodeRule structure has the name of the method the rule is about, and whether that method must or must not (Rule) contain the string to match.
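Parsing the [codeRules] section into this structure is again simple text handling. A sketch, assuming the comma-separated format shown earlier (the types mirror the ones above; the parsing details are my guess, not necessarily the tool's):

```swift
import Foundation

// Sketch: parse [codeRules] lines such as
//   "Algorithms.binarySearchRecursive,mustnot,while"
// into a ClassName -> [CodeRule] dictionary. Types repeated from the
// article; parsing details are illustrative.
enum Rule {
    case must
    case mustNot
}

struct CodeRule {
    let methodName: String
    let rule: Rule
    let stringToMatch: String
}

func parseCodeRules(lines: [String]) -> [String: [CodeRule]] {
    var rules: [String: [CodeRule]] = [:]
    for line in lines {
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        if trimmed.isEmpty || trimmed.hasPrefix("#") { continue }
        // "<Class>.<method>,<must|mustnot>,<string to match>"
        let parts = trimmed.split(separator: ",", maxSplits: 2).map(String.init)
        guard parts.count == 3 else { continue }
        let name = parts[0].split(separator: ".", maxSplits: 1).map(String.init)
        guard name.count == 2 else { continue }
        let kind: Rule
        switch parts[1] {
        case "must": kind = .must
        case "mustnot": kind = .mustNot
        default: continue
        }
        rules[name[0], default: []].append(
            CodeRule(methodName: name[1], rule: kind, stringToMatch: parts[2]))
    }
    return rules
}
```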
The current implementation is very simple, and as I develop this further it may change into something different. For example, it should be possible to say that the recursive binary search algorithm must contain neither the while nor the for keyword.
The second example, realizing that descending order can be achieved with a specific comparator (instead of sorting in ascending order and then reversing, which is a more time-consuming operation), is checked by the rule that in the class CodeWordsCounter, the method topCodeWords (where the work in this coding task is done) must not call Algorithms.reverse. If a student uses the Java Collections.reverse or the Apache Commons ArrayUtils.reverse for this, that mistake is already caught by the import rules.
For implementing this feature, I used the ANTLR parser generator. ANTLR has existing grammars for many languages, including Java, and it also supports generating parsers for Swift. So all I needed to do was to get the ANTLR tool and the Java grammar files (Java20Lexer.g4 and Java20Parser.g4) and then generate the Java parser and lexer for Swift:
$ antlr -Dlanguage=Swift -message-format gnu -o Autogen Java20Lexer.g4
$ antlr -Dlanguage=Swift -message-format gnu -o Autogen Java20Parser.g4
Those commands place the generated Swift source code files in a subdirectory named Autogen.
Using the generated Swift code from Autogen, the tool can then parse Java code. All I needed was to implement a class that extends the ANTLR-generated Java20ParserBaseListener, give the rules to this class and start parsing the source string read from the student's .java file, containing the Java source code:
import Antlr4

class Java20MethodBodyChecker: Java20ParserBaseListener {
    private let file: String
    private let rules: [CodeRule]
    private var currentMethod: String = ""
    private var stream: ANTLRInputStream? = nil
    var violations: [String] = []

    func start(source: String) throws {
        stream = ANTLRInputStream(source)
        if let stream {
            let lexer = Java20Lexer(stream)
            let parser = try Java20Parser(CommonTokenStream(lexer))
            let parserTree = try parser.compilationUnit()
            let walker = ParseTreeWalker()
            try walker.walk(self, parserTree)
        }
    }
    // ...
Analysing the code happens in the overridden handler for parsed Java methods, enterMethodDeclaration. There, the implementation checks, using the rules, whether the Java method body contains any keywords that should not be there, or is missing keywords that should be there:
override func enterMethodDeclaration(_ ctx: Java20Parser.MethodDeclarationContext) {
    if let identifier = ctx.methodHeader()?.methodDeclarator()?.identifier() {
        currentMethod = identifier.getText()
        let ruleSet = rules.filter( { $0.methodName == currentMethod } )
        if !ruleSet.isEmpty {
            if let body = ctx.methodBody() {
                let bodyText = body.getText()
                for rule in ruleSet {
                    switch rule.rule {
                    case .must:
                        if !bodyText.contains(rule.stringToMatch) {
                            violations.append("\(file).\(rule.methodName) does not use \(rule.stringToMatch), although it should")
                        }
                    case .mustNot:
                        if bodyText.contains(rule.stringToMatch) {
                            violations.append("\(file).\(rule.methodName) uses \(rule.stringToMatch), although it should not")
                        }
                    }
                }
            }
        }
    }
}
The violations array then contains all the findings that are against the rules.
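Stripped of the ANTLR plumbing, the core rule check on a method body's text amounts to plain substring matching. A self-contained sketch of that logic (types repeated from above, messages in English; the function name and parameters are mine):

```swift
// Self-contained sketch of the core rule check: plain substring matching
// on a parsed method body's text. Illustrative naming, not the tool's.
enum Rule { case must, mustNot }

struct CodeRule {
    let methodName: String
    let rule: Rule
    let stringToMatch: String
}

func check(bodyText: String, file: String, methodName: String,
           rules: [CodeRule]) -> [String] {
    var violations: [String] = []
    for rule in rules where rule.methodName == methodName {
        switch rule.rule {
        case .must where !bodyText.contains(rule.stringToMatch):
            violations.append("\(file).\(methodName) does not use \(rule.stringToMatch), although it should")
        case .mustNot where bodyText.contains(rule.stringToMatch):
            violations.append("\(file).\(methodName) uses \(rule.stringToMatch), although it should not")
        default:
            break  // rule satisfied, nothing to report
        }
    }
    return violations
}
```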
This worked, except for one thing: if the code has comments that contain (or lack) the keywords specified in the rules, the analysis checked those comments too, even though comments do not influence the implementation and are therefore totally irrelevant.
So if the comments of the recursive binary search method contain the keyword while, the tool would report that as a violation of the rule! Or if a required keyword appears only in the comments, not in the actual implementation, the analysis would be satisfied. End result: not satisfactory at all.
At first I started to fix this by writing code myself to remove the comments from the source string before the analysis, but I soon realized that the ANTLR parsing tools should somehow already handle this.
What I did instead, after some searching on the internet, was to modify the ANTLR lexer file for Java so that it does not parse the comments at all, but skips them:
// AJU: COMMENT: '/*' .*? '*/' -> channel(HIDDEN);
COMMENT : '/*' .*? '*/' -> skip;
// AJU: LINE_COMMENT: '//' ~[\r\n]* -> channel(HIDDEN);
LINE_COMMENT : '//' ~[\r\n]* -> skip;
Then I regenerated the Swift parser and lexer source files and added them to the tool project.
Using the new lexer and parser, the parsed Java code no longer includes comments, so I do not have to consider them at all! This works perfectly.
All this without asking anything from any "AI" tools out there. As usual, I do not touch them even with a long stick. I prefer to discover and learn things myself, at least as long as the internet is not so polluted with "AI" generated crap that this becomes impossible.
Below you can see the tool reporting on imports in a student repository: imports that should not be in the code, and that the code in CodeWordsCounter.topCodeWords does not call Algorithms.fastSort even though it should, based on the rules in the rules file.

As you can see, the first violation is because the student has misnamed the class as setImplementation. In Java, class names (and therefore also class file names) should always begin with a capital letter, so the name should be SetImplementation instead. This is a mistake, but it does not actually violate the import rule; the Set data structure implementation may use the Iterator. Therefore, when rules are found to be violated, a human recheck is often needed to make sure the automation makes no unfair judgments.
Considering the 300-some students in the course, each of them perhaps making at least 3-5 mistakes the code analysis could find, this adds up to 900-1500 mistakes in total. For a teacher to comb through the hundreds of lines of code in each student project, repeatedly during the course, doing this "code quality control" manually would be very time-consuming and therefore not realistic, especially considering the teaching resources mentioned in the beginning.
Hopefully the new features of this tool, and the similar tests given to the students as Java test code, improve the support the teachers can provide during the course, leading the students onto the correct path of implementing the required data structures and algorithms.
The post is getting quite long, so the rest of the new features will perhaps be discussed in future posts. Unless I decide to focus on other more interesting topics. Have a nice day!