The Intel Edison platform is a great piece of hardware for wearables and other mobile computing projects. Intel manages two open source repositories, MRAA and UPM, that enable multi-language interfaces to hundreds of different sensors, shields, motors, servos, input devices and displays.

However, the Edison is also still a very new platform and most of this support is coming from the open source community. As a result, many devices have very minimal support and very little documentation. In my case, the Sparkfun OLED Block for Intel Edison happened to be one of these devices. Sparkfun provided an extremely barebones sample, only in C++. The UPM module similarly provided little functionality and an even more barebones sample. Fortunately, we can remedy this situation!

Setting up the environment

UPM is an open-source project and all the development tools needed to build and extend the Edison are already on the device.
This was such a foreign concept to me that I spent an entire day trying to create my own development environment to replicate the Edison. I failed miserably at this. It was only after a good night's sleep that I thought to myself, "I wonder if I can compile UPM on the Edison itself?"

The first thing to do is make sure the Yocto Linux image has the most recent package manifest. The firmware, OS and packages that come preinstalled can vary significantly depending on the manufacturing date. My blog post on out-of-box configuration for the Edison walks through this process: how to connect to your Edison, update the firmware, and the various options for opening an SSH terminal session.

From an SSH terminal, run these commands:

echo "src intel-iotdk http://iotdk.intel.com/repos/1.1/intelgalactic" > /etc/opkg/intel-iotdk.conf

(This writes the source entry for Intel's IoT devkit package repository to an opkg conf file, so opkg knows where to look for packages)

opkg update

(Downloads the latest package lists from the configured repositories, including the one just added)

opkg upgrade

(Upgrades any installed packages to the latest available version)

opkg install git

(Installs the git client on the Edison)

cd ~/
git clone https://github.com/intel-iot-devkit/upm.git

(Changes to the home directory on the Edison, then clones the UPM repository to the device)

cd upm
mkdir build
cd build
cmake .. -i

The -i option runs cmake in interactive mode, which walks you through the advanced options. The only option that needs to change is the install path, from the default (/usr/local/) to:

/usr/

With that path, each make install replaces the library the app actually loads, so you can quickly rebuild and rerun the app to see changes.

Building the library

You can see the code I contributed here:
https://github.com/intel-iot-devkit/upm/commit/14a3924fb29bbf2e5517636699f8726ec7013b81

Because I was tweaking low-level SPI bus routines, I routinely caused the Edison to completely lock up, forcing hard power cycles. So I went on the hunt for ways to speed things up. Here's what I found:

Partial build/deploy

One trick I only discovered by accident is how to isolate the module you're working on when running the build and deploy. There's some unclear language about this in the UPM build guide that didn't quite work for me in practice.

Normally you'd just go into your source root folder and run:

 make install

If you're working on a specific module, you can build just that module's directory with:

 cd src/lcd
 make install

Leveraging this shortcut cut my compile times from about a minute down to a few seconds.

Using remote filesystem mounting

It took real time after every restart to bring vim back up, open my files and get back to the place I left off. Fortunately I found another open-source project called win-sshfs https://github.com/dimov-cz/win-sshfs. It lets you easily map any scp/sftp-accessible filesystem as a drive on Windows! One tiny issue was that it was, at the time, broken on Windows 8/10. So I fixed it and now have easy filesystem mounting on Windows. I used Visual Studio Code as my editor, and gained the added benefit of doing all my git actions from Windows, rather than needing to suffer through CPU-intensive syncs and merges on the Edison directly.

Using remote mounting made my dev cycle much faster, but there's still one trick left.

Watchman for total automation

The final trick in the magic hat is Watchman: https://github.com/facebook/watchman, an awesome little utility from Facebook. It sits on the Edison and, every time I save a file, automatically runs the build-and-run script I want. In my case, that script simply kills node if it's already running, runs a make install from my module directory, and restarts the node example app.

Using these three speedups, my total time to test a change went from 3-5 minutes down to about 15 seconds.

Improving the OLED Support

The biggest known issue with the Edison in regards to the OLED screen is the poor quality SPI bus. Pushing more than 16 bits of data per message and/or setting the bus frequency over 10000000 (10 MHz) results in the board completely hard locking.

So I had to use a bunch of tricks to batch writes into bytes and batch updates to the screen over the SPI bus. With the refresh rate starting out at about 2 frames per second, there was a lot of room for improvement!

https://github.com/intel-iot-devkit/upm/blob/master/src/lcd/eboled.hpp
https://github.com/intel-iot-devkit/upm/blob/master/src/lcd/eboled.cxx

Changing the addressing mode

The Sparkfun OLED unit is a modified SSD1306 OLED display from Solomon Systech. Fortunately I was able to find the full technical data documentation for the display and driver.

I found that the display defaults to using Page Addressing Mode:
Page Addressing Mode Diagram

The issue with this is that you have to manually set the page and column start/end pointers every time you want to write to the screen! That's 8 extra calls, plus the latency induced by page switches.
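To make that overhead concrete, here is a minimal sketch of the positioning commands Page Addressing Mode requires, based on the command table in the SSD1306 datasheet. The helper names `pageModeSeek` and `seekBytesForFrame` are hypothetical and not part of UPM, and the driver's own sequence may issue more calls than this minimum.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not UPM code): the command bytes needed to position
// the write pointer in Page Addressing Mode, per the SSD1306 command table.
std::vector<uint8_t> pageModeSeek(uint8_t page, uint8_t column)
{
    assert(page < 8 && column < 128);  // SSD1306 limits: 8 pages, 128 columns
    return {
        static_cast<uint8_t>(0xB0 | page),            // set page start address
        static_cast<uint8_t>(0x00 | (column & 0x0F)), // lower nibble of column
        static_cast<uint8_t>(0x10 | (column >> 4)),   // higher nibble of column
    };
}

// Positioning bytes spent on one full redraw of `pages` pages. In page mode
// the page pointer never advances on its own, so every page needs a seek.
std::size_t seekBytesForFrame(uint8_t pages, uint8_t firstColumn)
{
    std::size_t total = 0;
    for (uint8_t p = 0; p < pages; ++p)
        total += pageModeSeek(p, firstColumn).size();
    return total;
}
```

For the 6 pages of this display that is a seek before every page of every frame, and far more if pixels are written one at a time.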

Instead I opted for Horizontal Addressing Mode:

Horizontal Addressing Mode Diagram

 setAddressingMode(HORIZONTAL);

 //Set Page Address range, required for horizontal addressing mode.
 command(CMD_SETPAGEADDRESS); // triple-byte cmd
 command(0x00); //Initial page address
 command(0x05); //Final page address

 //Set Column Address range, required for horizontal addressing mode.
 command(CMD_SETCOLUMNADDRESS); // triple-byte cmd
 command(0x20); // first column: this display has a horizontal offset of 0x20 (32) columns
 command(0x5f); // last column: 0x20 + 63, since the display is 64 columns wide

Both Horizontal and Vertical addressing modes simplify the calls. Both treat the configured window as a single virtual array: the controller advances the address pointer itself after every write, so there are no more interim page sets or column sets. You only specify the total range for pages and columns once! This change got me from 2fps to 5fps!
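The pointer behaviour can be modelled on the host with a small sketch. This follows my reading of the SSD1306 datasheet for Horizontal Addressing Mode; the `HorizontalPointer` type is a hypothetical illustration, not something the UPM driver contains.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model (not UPM code) of the controller's address pointer in
// Horizontal Addressing Mode, for an inclusive column and page window.
struct HorizontalPointer {
    uint8_t colStart, colEnd;
    uint8_t pageStart, pageEnd;
    uint8_t col, page;  // current position

    HorizontalPointer(uint8_t cs, uint8_t ce, uint8_t ps, uint8_t pe)
        : colStart(cs), colEnd(ce), pageStart(ps), pageEnd(pe), col(cs), page(ps)
    {
        assert(cs <= ce && ps <= pe);
    }

    // Advance the way the controller does after each data byte: step along
    // the columns, then wrap to the first column of the next page, and wrap
    // back to the first page after the last one.
    void advance()
    {
        if (col < colEnd) {
            ++col;
            return;
        }
        col = colStart;
        page = (page < pageEnd) ? static_cast<uint8_t>(page + 1) : pageStart;
    }
};
```

With the window from the listing above (columns 0x20 to 0x5f, pages 0 to 5), 64 bytes fill one page and 384 bytes bring the pointer back to where it started, with no positioning commands in between.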

No more single bit writing - bytes and buffers

The rest of the speed comes from the writing strategy. Originally, each pixel was being blitted directly to the display. That's 4 calls per pixel - set page, set column low/high and set bit.

Instead, I opted to store an entire screen buffer in memory. I'd planned to actually use double-buffering, but found that with the ceiling of the SPI bus frequency, it didn't provide any performance increase over the single buffer.

 static uint16_t screenBuffer[BUFFER_SIZE];

By storing the entire screen in a memory buffer and then writing it to the display in words (16-bit chunks), I was able to hit the absolute limits of the SPI bus.

 mraa::Result EBOLED::refresh()
 {
   mraa::Result error = mraa::SUCCESS;

   m_gpioCD.write(1);            // data mode
   for(int i=0; i<BUFFER_SIZE; i++)
   {
     error = data(screenBuffer[i]);
     if(error != mraa::SUCCESS)
       return error;
   }

   return error;
 }

 mraa::Result EBOLED::data(uint16_t data)
 {
   m_spi.writeWord(data);
   return mraa::SUCCESS;
 }
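The arithmetic behind the buffer is worth spelling out. The sketch below derives the sizes from the figures quoted in this post (a 64x48 display, 8-row pages, two columns per 16-bit word, 4 calls per pixel in the old approach). `PAGE_HEIGHT`, `PAGES` and `PER_PIXEL_CALLS` are hypothetical names, and the values are my derivation rather than a copy of the UPM header.

```cpp
#include <cstdint>

// Geometry as described in this post; treat the values as assumptions.
constexpr int OLED_WIDTH   = 64;
constexpr int OLED_HEIGHT  = 48;
constexpr int PAGE_HEIGHT  = 8;                          // rows per page
constexpr int PAGES        = OLED_HEIGHT / PAGE_HEIGHT;  // 6 pages
constexpr int VERT_COLUMNS = OLED_WIDTH / 2;             // 2 columns per word
constexpr int BUFFER_SIZE  = VERT_COLUMNS * PAGES;       // words per frame

// One bit per pixel: the 16-bit words must cover the screen exactly.
static_assert(BUFFER_SIZE * 16 == OLED_WIDTH * OLED_HEIGHT,
              "buffer must hold exactly one bit per pixel");

// Full-screen redraw cost of the original per-pixel approach, at 4 calls
// per pixel, versus one word write per buffer entry.
constexpr int PER_PIXEL_CALLS = OLED_WIDTH * OLED_HEIGHT * 4;
constexpr int BUFFERED_WRITES = BUFFER_SIZE;
```

Under these assumptions a full redraw drops from 12288 calls to 192 word writes.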

The devil in the details (columns not rows!)

Here's where I hit a roadblock for several days. No matter what I tried, I just couldn't get the display to look correct. Everything was off, but in a consistent pattern. After stepping away for a day, I came back and immediately saw the problem. I'd assumed that in Horizontal addressing mode each byte I wrote would cover 8 pixels along a row. In fact each byte is a column: 8 vertically stacked pixels within a page!

Once I realized this, it was just a matter of lots of bit shifting and math to get things aligned. With a 64x48 display, there are 6 pages, each containing 64 columns. Each column of course contains 8 rows.

Screen Layout

Now this would all be simple to manage if I was writing BYTES (uint8) to the display. But to maximize throughput, I'm writing WORDS (uint16). This means each word in my buffer array actually contains two columns. So a little extra math was needed to handle the extra wrinkle:

 void EBOLED::drawPixel(int8_t x, int8_t y, uint8_t color)
 {
   if(x<0 || x>=OLED_WIDTH || y<0 || y>=OLED_HEIGHT)
     return;

   /* Screenbuffer is uint16 array, but pages are 8bit high so each buffer
    * index is two columns.  This means the index is based on x/2 and
    * OLED_WIDTH/2 = VERT_COLUMNS.
    *
    * Then to set the appropriate bit, we need to shift based on the y
    * offset in relation to the page and then adjust for high/low based
    * on the x position.
   */

   switch(color)
   {
     case COLOR_XOR:
       screenBuffer[(x/2) + ((y/8) * VERT_COLUMNS)] ^= (1<<(y%8+(x%2 * 8)));
       return;
     case COLOR_WHITE:
       screenBuffer[(x/2) + ((y/8) * VERT_COLUMNS)] |= (1<<(y%8+(x%2 * 8)));
       return;
     case COLOR_BLACK:
       screenBuffer[(x/2) + ((y/8) * VERT_COLUMNS)] &= ~(1<<(y%8+(x%2 * 8)));
       return;
   }
 }
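The index and bit math is easy to check on a desktop before flashing anything. This is a minimal host-side sketch of the same calculation. The `BitLocation` struct and `locate` function are hypothetical helpers for illustration, and the constants restate the geometry described in this post.

```cpp
#include <cassert>
#include <cstdint>

constexpr int OLED_WIDTH   = 64;
constexpr int OLED_HEIGHT  = 48;
constexpr int VERT_COLUMNS = OLED_WIDTH / 2;  // two 8-bit columns per word

// Hypothetical helper: where a pixel lives in the uint16_t screen buffer.
struct BitLocation {
    int index;      // position in the buffer
    uint16_t mask;  // single bit to set, clear or toggle
};

BitLocation locate(int x, int y)
{
    assert(x >= 0 && x < OLED_WIDTH && y >= 0 && y < OLED_HEIGHT);
    const int page  = y / 8;                       // 8 rows per page
    const int index = (x / 2) + page * VERT_COLUMNS;
    const int bit   = (y % 8) + (x % 2) * 8;       // odd x uses the high byte
    return {index, static_cast<uint16_t>(1u << bit)};
}
```

For example, pixels (0,0) and (1,0) share buffer index 0 but use bits 0 and 8, while the bottom-right pixel (63,47) lands on bit 15 of the last word.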

Once the pixel blitting was working, it was time to start adding useful features. I leaned heavily on the awesome Adafruit GFX Library and implemented enough functions to enable the project we were working on (to create a menu GUI for the Edison!).

The library now supports lines, rectangles, circles, triangles, rounded corners, screen fill/blank, multi-size fonts, string writing, virtual cursor positioning, and filled rectangles, circles, triangles and corners.
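To show how primitives like these reduce to individual pixel writes, here is a minimal sketch of a line routine using the classic integer Bresenham algorithm. It is not the UPM or Adafruit GFX code, which may differ in detail, and the `setPixel` callback is a hypothetical stand-in for EBOLED::drawPixel.

```cpp
#include <cstdlib>
#include <functional>

// Integer Bresenham line covering all octants. setPixel is a hypothetical
// callback standing in for the driver's pixel routine.
void drawLine(int x0, int y0, int x1, int y1,
              const std::function<void(int, int)>& setPixel)
{
    const int dx = std::abs(x1 - x0);
    const int dy = -std::abs(y1 - y0);
    const int sx = (x0 < x1) ? 1 : -1;  // step direction along x
    const int sy = (y0 < y1) ? 1 : -1;  // step direction along y
    int err = dx + dy;                  // running error term

    while (true) {
        setPixel(x0, y0);
        if (x0 == x1 && y0 == y1)
            break;
        const int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  // step in x
        if (e2 <= dx) { err += dx; y0 += sy; }  // step in y
    }
}
```

Because every primitive only touches the in-memory buffer, nothing goes over the SPI bus until the next refresh.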

Of course, to verify that everything worked, I had to make a sweet demo for the examples to replace the barebones one that preceded it.

https://github.com/intel-iot-devkit/upm/blob/master/examples/javascript/eboled.js

Wrapping things up

With everything sorted, we could finally put together a non-hacky GUI system for the display.

In the end I was able to get the display to push ~28fps. The only big improvement space left would be implementing differential buffer swapping, as most of the time there's very little change from frame to frame and so doing full buffer swaps is overkill. I think this would get the common case up to 50+fps, but it would require quite a bit more logic to implement in a way that is passive for Edison developers.
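As a rough idea of what the first step of a differential update could look like, the sketch below compares the previous and current frames and reports the smallest contiguous run of words that changed. I have not implemented this in the driver. `DirtyRange` and `findDirtyRange` are hypothetical names, and sending only that run would also mean reprogramming the page and column window before the write.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: count == 0 means the frames are identical.
struct DirtyRange {
    std::size_t first;  // index of the first changed word
    std::size_t count;  // number of words up to and including the last change
};

// Find the smallest contiguous span of words that differ between two frames.
DirtyRange findDirtyRange(const uint16_t* previous, const uint16_t* current,
                          std::size_t size)
{
    assert(previous != nullptr && current != nullptr);

    std::size_t first = 0;
    while (first < size && previous[first] == current[first])
        ++first;
    if (first == size)
        return {0, 0};  // nothing changed, no SPI traffic needed

    std::size_t last = size - 1;
    while (last > first && previous[last] == current[last])
        --last;
    return {first, last - first + 1};
}
```

A single contiguous range keeps the logic simple, at the cost of resending unchanged words that sit between two distant changes.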

Hopefully one of these days I'll get around to cleaning up the GUI system and releasing it on a repo of its own. For now, here's a demo of it in action: