ARM11 Emulator and Assembler

Introduction

As part of my first-year undergraduate course, my group and I were tasked with implementing an emulator and assembler for a reduced ARM11 instruction set, as well as an extension of our choice.

In this post, I will go over some reflections on our group work and the effectiveness of our implementation.

To prevent plagiarism, this post will not contain any code from our submission.

Group Organisation

Before starting work on the emulator, we split the task into distinct “chunks” that could be worked on in parallel without interfering with each other, which reduces merge conflicts. These chunks can be roughly listed as:

The base code of the emulator (i.e. emulation loop and shared structs)
Fetching instructions from memory
Decoding instructions and parsing their contents
Executing each type of instruction

The order given here does not necessarily reflect the order in which tasks were completed. For instance, we found that it was far more logical to complete the separate fetch, decode, and execute sections before putting them together in the emulator loop. A similar approach has been taken for Part II.

Work is coordinated through regular group meetings, programming sessions (including pair programming for challenging sections), and frequent communication on our group chat. We also use the project management tool Trello, which lets us track what has been completed and assign members to upcoming tasks. This spreads the work as evenly as possible, prevents merge conflicts, and gives us greater control over the project as a whole.

Reflection

Despite most of the group members’ relative inexperience with project management, version control, and the C language in general, our group has adapted to all of these very quickly. At present, our group works very smoothly, with very few merge conflicts and effective partitioning of larger tasks. We overcame this learning curve through effective communication. For example, the creation of a README covering the basics of branching in git brought all of the members to a similar level very rapidly. Frequent and clear communication also plays a vital role in ensuring that consistent code style standards are maintained and code redundancy is minimised.

While we have become proficient in the basics of the tools we use, in future it would be in our favour to master their finer features, such as the GitLab issue board, in order to get the maximum value out of them.

Structure

As touched upon earlier, we structured our emulator into clear sections to allow easy collaboration. Fetching, decoding, and executing are all handled separately. Elements such as the state of the emulator (memory, registers etc.) and the instruction formats are modelled as structs, and utility functions (such as I/O) live in their own files so they can be used wherever needed. We also added header guards to all headers to prevent issues caused by circular or repeated includes.
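
As an illustration of this layout (a minimal sketch, not our submitted code), a shared header for the emulator state might look like the one below. The names emulator_state, MEMORY_SIZE, NUM_REGISTERS, PC_INDEX and state_init, along with the 64 KiB memory size and 17 registers, are assumptions made for the example.

```c
#ifndef EMULATOR_STATE_H   /* header guard: contents are included at most once */
#define EMULATOR_STATE_H

#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEMORY_SIZE 65536  /* assumed: 64 KiB of byte-addressable memory */
#define NUM_REGISTERS 17   /* assumed: r0-r12, SP, LR, PC, CPSR */
#define PC_INDEX 15

/* Everything the fetch, decode and execute code needs to share. */
typedef struct {
    uint8_t memory[MEMORY_SIZE];
    uint32_t registers[NUM_REGISTERS];
} emulator_state;

/* Zero all memory and registers, as on start-up. */
static void state_init(emulator_state *state) {
    assert(state != NULL);
    memset(state, 0, sizeof *state);
}

#endif /* EMULATOR_STATE_H */
```

Because every stage takes a pointer to the same struct, the fetch, decode and execute code can be written and tested separately while still agreeing on one representation of the machine.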

There are some similarities between the emulator and the assembler. For example, both require a structured format to handle the ARM instructions: the emulator takes the object code and converts it into this format, whereas the assembler goes the other way, turning assembly code into the format and then transforming that into object code. Therefore, we will try to share the instruction format and related code (such as debugging functions and endian conversion) between the two parts.

Future challenges

One particularly difficult part of building the emulator (the execution of Data Processing instructions) has highlighted the complexity of information that can be expressed within the instructions. Since it has been fairly laborious to extract this information from the instruction, it may be equally challenging to create the instructions in the first place.

Everybody in our group has more experience with higher level languages, such as Java and Python, compared to C. We’ve found that these higher level languages typically have much richer standard libraries. Thus we anticipate that tasks that involve the parsing of instructions using C (in particular converting assembly to the instruction format) and the writing of our own data structures (e.g. symbol table backend) may prove to be challenging.

Emulator

We split our emulator up into distinct chunks of work so they could be developed simultaneously using the common code defined in the base emulator. This reduced code duplication and minimised the number of merge conflicts we encountered during development.

The base code of the emulator (i.e. emulation loop and shared structs)
Fetching instructions from memory
Decoding instructions and parsing their contents
Executing each type of instruction

The instructions are fetched from the input file and converted into big-endian form, which makes them easier to process because they then match the diagrams in the spec. The big-endian instruction is then decoded by working out what kind of instruction it is and extracting its fields into a struct. Finally, the instruction is executed using the data inside the instruction struct.
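
The fetch and classification steps can be sketched roughly as follows. This is an illustrative example rather than our submission: the function names are hypothetical, and it assumes the reduced instruction set consists of the four standard ARM formats for data processing, multiply, single data transfer and branch, distinguished by the usual ARM bit positions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { DATA_PROCESSING, MULTIPLY, SINGLE_DATA_TRANSFER, BRANCH } instr_type;

/* Combine the four little-endian bytes at `address` into one word whose
   bit 31 is the most significant bit, matching the instruction diagrams. */
static uint32_t fetch_word(const uint8_t *memory, size_t mem_size, uint32_t address) {
    assert(memory != NULL && mem_size >= 4 && address <= mem_size - 4);
    return (uint32_t)memory[address]
         | ((uint32_t)memory[address + 1] << 8)
         | ((uint32_t)memory[address + 2] << 16)
         | ((uint32_t)memory[address + 3] << 24);
}

/* Extract `count` bits starting at bit `start` (bit 0 is least significant). */
static uint32_t bits(uint32_t word, unsigned start, unsigned count) {
    assert(count >= 1 && count <= 32 && start + count <= 32);
    uint32_t mask = (count == 32) ? 0xFFFFFFFFu : ((1u << count) - 1u);
    return (word >> start) & mask;
}

/* Work out the instruction format from its fixed bits.
   Encodings outside the assumed four formats are not handled here. */
static instr_type classify(uint32_t word) {
    switch (bits(word, 26, 2)) {
    case 1: return SINGLE_DATA_TRANSFER;   /* bits 27-26 == 01 */
    case 2: return BRANCH;                 /* bits 27-26 == 10 */
    default: break;
    }
    /* bits 27-26 == 00: multiply has bit 25 clear and bits 7-4 == 1001 */
    if (bits(word, 25, 1) == 0 && bits(word, 4, 4) == 0x9) {
        return MULTIPLY;
    }
    return DATA_PROCESSING;
}
```

For example, the bytes 01 10 A0 E3 in memory become the word 0xE3A01001 (mov r1, #1), which classify reports as a data processing instruction. A full decoder would go on to copy each field into the instruction struct with further calls to bits.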

The ARM Pipeline was easily implemented thanks to us separating the Fetch-Decode-Execute cycle into distinct functions that allowed us to call them whenever we wanted.
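
To show why separate functions make the pipeline straightforward, here is a heavily simplified three-stage loop. It is a sketch under several assumptions and not our actual code: the pipeline and slot types are invented for the example, decoding is reduced to a copy, execution only counts instructions, and an all-zero word is assumed to halt the machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t word; bool valid; } slot;

typedef struct {
    const uint32_t *program;  /* instruction words, already in natural bit order */
    size_t length;            /* number of words in the program */
    uint32_t pc;              /* byte address of the next fetch */
    slot fetched;             /* waiting to be decoded */
    slot decoded;             /* waiting to be executed */
    unsigned executed;        /* how many instructions have been executed */
} pipeline;

/* Fetch the word at pc and advance pc. Memory past the program reads as zero. */
static slot fetch(pipeline *p) {
    size_t index = p->pc / 4;
    slot s;
    s.word = (index < p->length) ? p->program[index] : 0;
    s.valid = true;
    p->pc += 4;
    return s;
}

/* One cycle: execute the oldest instruction, then shift the others along.
   Returns false once the (assumed) all-zero halt instruction is executed. */
static bool step(pipeline *p) {
    if (p->decoded.valid) {
        if (p->decoded.word == 0) {
            return false;          /* halt */
        }
        p->executed++;             /* a real emulator would execute here */
    }
    p->decoded = p->fetched;       /* a real emulator would decode here */
    p->fetched = fetch(p);
    return true;
}

static unsigned run(pipeline *p) {
    assert(p != NULL && p->program != NULL);
    while (step(p)) {
        /* keep cycling until halt */
    }
    return p->executed;
}
```

With this ordering, pc is always 8 bytes ahead of the instruction being executed, which is the pipeline effect an ARM emulator has to reproduce. A taken branch would additionally need to mark both slots as invalid so that stale instructions are not executed.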

For a more detailed overview of the Emulator implementation, please see the attached Interim Checkpoint.

Assembler

Like our Emulator, we split our 2-Pass Assembler into distinct parts so they can be worked on separately:

The base code of the assembler (opening the assembly code file, keeping track of information between Pass 1 and Pass 2)
The decoding of the different types of assembly language instructions in the input file
The encoding of the instruction data into the correct binary sequences
The writing out of said binary sequences into the output file

The assembly file is first opened and read line by line for Pass 1. This builds a symbol table, implemented as a hash table, that associates labels with addresses. It also keeps track of how much memory the assembled program will occupy, which is needed later to calculate offsets, for example for ldr instructions.
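
A first pass of this kind could be sketched as below. This is an illustration only: the chained hash table, the djb2 hash function, the fixed bucket count and all function names are choices made for the example, and it assumes that a label sits on its own line ending in a colon and that every other non-empty line is one 4-byte instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 64
#define MAX_LABEL 63

typedef struct entry {
    char label[MAX_LABEL + 1];
    uint32_t address;
    struct entry *next;          /* next entry in the same bucket */
} entry;

typedef struct { entry *buckets[BUCKETS]; } symbol_table;

/* djb2 string hash, reduced to a bucket index. */
static unsigned hash(const char *s) {
    unsigned h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % BUCKETS;
}

/* Returns 1 on success, 0 if the label is too long or allocation fails. */
static int table_put(symbol_table *t, const char *label, uint32_t address) {
    assert(t != NULL && label != NULL);
    if (strlen(label) > MAX_LABEL) return 0;
    entry *e = malloc(sizeof *e);
    if (e == NULL) return 0;
    strcpy(e->label, label);
    e->address = address;
    unsigned h = hash(label);
    e->next = t->buckets[h];     /* push onto the front of the bucket's chain */
    t->buckets[h] = e;
    return 1;
}

/* Returns 1 and stores the address if the label is known, otherwise 0. */
static int table_get(const symbol_table *t, const char *label, uint32_t *address) {
    for (const entry *e = t->buckets[hash(label)]; e != NULL; e = e->next) {
        if (strcmp(e->label, label) == 0) { *address = e->address; return 1; }
    }
    return 0;
}

/* Pass 1: record each label's address and return the program size in bytes. */
static uint32_t first_pass(symbol_table *t, const char *const *lines, size_t count) {
    uint32_t address = 0;
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(lines[i]);
        if (len == 0) continue;                      /* skip blank lines */
        if (lines[i][len - 1] == ':') {
            char label[MAX_LABEL + 1];
            if (len - 1 > MAX_LABEL) continue;       /* too long for this sketch */
            memcpy(label, lines[i], len - 1);
            label[len - 1] = '\0';
            table_put(t, label, address);
        } else {
            address += 4;                            /* one instruction */
        }
    }
    return address;
}

static void table_free(symbol_table *t) {
    for (size_t i = 0; i < BUCKETS; i++) {
        for (entry *e = t->buckets[i]; e != NULL; ) {
            entry *next = e->next;
            free(e);
            e = next;
        }
        t->buckets[i] = NULL;
    }
}
```

The size returned by first_pass marks the first free address after the last instruction, which is where an assembler could place any constants that ldr instructions need to load.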

Then, we go through the file again for Pass 2. We first convert each assembly language instruction into the same instruction structs used in the emulator. This is very useful, as it means we don’t need two different representations of instructions. After the instruction has been decoded, we use the information stored in the struct to build its binary representation. The instruction struct makes this very simple: it is just a matter of taking the data from the struct and masking and shifting it into the correct binary format. The binary instruction is then written out to the output file.
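
As an example of the masking and shifting involved, the sketch below encodes a data processing instruction using the standard ARM field positions. The struct and function names are hypothetical and do not reproduce our implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct holding the fields of a data processing instruction. */
typedef struct {
    uint8_t cond;        /* condition code, 4 bits */
    bool immediate;      /* I flag: operand2 is an immediate value */
    uint8_t opcode;      /* operation, 4 bits (e.g. 0xD for mov) */
    bool set_flags;      /* S flag: update the condition flags */
    uint8_t rn;          /* first operand register */
    uint8_t rd;          /* destination register */
    uint16_t operand2;   /* second operand, 12 bits */
} data_processing;

/* Mask each field to its width and shift it to its position in the word. */
static uint32_t encode_data_processing(const data_processing *i) {
    assert(i != NULL);
    return ((uint32_t)(i->cond & 0xF) << 28)
         | ((uint32_t)i->immediate << 25)
         | ((uint32_t)(i->opcode & 0xF) << 21)
         | ((uint32_t)i->set_flags << 20)
         | ((uint32_t)(i->rn & 0xF) << 16)
         | ((uint32_t)(i->rd & 0xF) << 12)
         | (uint32_t)(i->operand2 & 0xFFF);
}
```

Encoding mov r1, #1 (condition 0xE, immediate flag set, opcode 0xD, destination register 1, operand 1) produces the word 0xE3A01001. The same struct is what a decoder would fill in when going in the other direction.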

Extension

Overview

Our extension is the classic video game Pong, except instead of playing it using traditional methods like a keyboard or mouse, it is controlled by each player’s movement.

A camera is used, in conjunction with tracking markers worn by the players, to track each player’s movements. Each player’s movements correspond to the movement of their paddle within the game. To make sure the tracking works as well as it can, it runs on its own machine and sends movement data over a network to the game itself.

Design and Implementation

Just like the previous parts of our project, the extension needed to be split up so people could work on it without conflicting. The two main parts of the game are the core game itself, and the marker tracking.

The game is written in C using the SDL library. This library allows us to easily create windows, display graphics, and accept user input. We did not need a full collision detection algorithm: since the paddles only move up and down, we can simply compare the y positions of the ball and the paddle when the ball reaches the paddle’s x position to see whether they collide.
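
The simplified collision check can be sketched as follows. This is an illustrative version, not our game code: the ball and paddle structs, the integer coordinates and the handling of only a right-hand paddle are all assumptions made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types. y is the top edge and grows downwards, as in SDL. */
typedef struct { int x, y, height; } paddle;
typedef struct { int x, y, vx, vy; } ball;

/* The only test needed: is the ball within the paddle's vertical extent? */
static bool overlaps_vertically(const ball *b, const paddle *p) {
    return b->y >= p->y && b->y <= p->y + p->height;
}

/* Move the ball one step. If it reaches the right-hand paddle's x position
   while vertically overlapping the paddle, bounce it back. */
static void step_ball(ball *b, const paddle *right) {
    assert(b != NULL && right != NULL);
    int next_x = b->x + b->vx;
    if (b->vx > 0 && b->x < right->x && next_x >= right->x
            && overlaps_vertically(b, right)) {
        b->vx = -b->vx;        /* reverse horizontal direction */
        next_x = right->x;     /* stop at the paddle for this step */
    }
    b->x = next_x;
    b->y += b->vy;
}
```

Because the paddle never moves horizontally, the x comparison only detects the moment of contact and the actual hit test reduces to a single range check on y.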

The marker tracking is written in C++ using the OpenCV library. This library has a module built to track the ArUco markers used in our project. Each marker has a unique ID, allowing us to track the positions of the marker for Player 1 and the marker for Player 2 simultaneously.

The position differences are then sent from the computer running the marker tracking software to the machine running the game as UDP packets of the form (p1_diff, p2_diff).
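
One possible wire format for such a packet is sketched below. The post does not specify the exact layout, so this example assumes two signed 32-bit integers in network (big-endian) byte order; the movement struct and the pack and unpack function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PACKET_SIZE 8   /* assumed: two 32-bit values */

typedef struct { int32_t p1_diff; int32_t p2_diff; } movement;

/* Write a 32-bit value most significant byte first. */
static void put_be32(uint8_t *out, uint32_t v) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Read a 32-bit value stored most significant byte first. */
static uint32_t get_be32(const uint8_t *in) {
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)  | (uint32_t)in[3];
}

/* Serialise both differences into a fixed-size buffer ready for sending. */
static void pack_movement(const movement *m, uint8_t out[PACKET_SIZE]) {
    assert(m != NULL && out != NULL);
    put_be32(out, (uint32_t)m->p1_diff);
    put_be32(out + 4, (uint32_t)m->p2_diff);
}

/* Rebuild the differences from a received buffer. */
static movement unpack_movement(const uint8_t in[PACKET_SIZE]) {
    assert(in != NULL);
    uint32_t raw1 = get_be32(in);
    uint32_t raw2 = get_be32(in + 4);
    movement m;
    memcpy(&m.p1_diff, &raw1, sizeof raw1);   /* reinterpret the bits as signed */
    memcpy(&m.p2_diff, &raw2, sizeof raw2);
    return m;
}
```

On a POSIX system the packed buffer could then be handed to sendto on a UDP socket and read back with recvfrom on the game machine. Fixing the byte order explicitly means the two machines do not need to share the same architecture.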

Testing

To test our extension, we wrote a series of tests that ensure the major parts of our extension are working as intended. We then utilised the Continuous Integration feature built into GitLab to run these tests automatically when new commits get pushed to the repository. This allows us to easily see if our code is working so we don’t accidentally push broken code to master.

Group Reflection

We believe as a group we worked pretty well. Our group chat and group/pair programming sessions proved invaluable for effective communication, and allowed us to easily split work up whilst still making sure each person’s code would work with the others’.

Our method of tackling each larger task (e.g. the emulator) by breaking it into many smaller tasks (e.g. fetching, decoding, executing the different instruction types) meant each person had a clearly defined task with minimal overlap. This is definitely something we would do again: not only did it reduce merge conflicts, it also meant everybody knew what they had to do, so there was much less confusion than there otherwise could have been.

Personal Reflection

On reflection, it is clear to me that our group has worked exceptionally well together.

Each member has displayed an impressive ability to work both independently on their own task, and also in a cohesive manner when working with other team members. Everyone had a notion of how much work they were expected to complete in order to maintain the progress of the project as a whole and how their time must be managed accordingly.

Personally, I feel that I have developed enormously as a programmer throughout this project. Not only have I expanded my technical skill-set in terms of becoming more familiar with git, but I have also learned the importance of test-driven development and how better to coordinate myself within a group environment.

Interestingly, I felt that my greatest strength was closely related to my greatest weakness. More precisely, I found that I always had something to occupy myself with and work on, but at the same time I often found it difficult to delegate tasks to others. Thankfully this did not prove to be an issue due to the strengths of other team members, but I have highlighted it as something I could improve on in the future.

When we started this project my personal goal was simply to make a significant contribution to the project. In spite of my initial fear that my programming skills would not be up to par, I feel that I have rapidly overcome this and have achieved precisely that goal.