A Tutorial on OgmaNeo2

It’s about time we make a tutorial on OgmaNeo2. This tutorial will focus on the Python bindings, PyOgmaNeo2 – which more or less mirrors the C++ API. We will show you how to create a (very) simple “wavy line prediction” example, from which you will learn the key concepts of OgmaNeo2.

What is OgmaNeo2?

In short, it’s a library that implements Sparse Predictive Hierarchies (SPH), a very fast online/incremental sequence predictor. We will go over some concepts as needed, but for a more thorough explanation, we recommend taking a look at our whitepaper. As the name implies, this is the second major iteration of our system, and is the recommended one to use as of May 2020.

OgmaNeo2 learns online, meaning it uses data samples only once as they come in, and then doesn’t store them in a buffer but rather remembers them naturally (unlike most of Deep Learning). It is also extremely fast, and is able to run on things such as a Raspberry Pi Zero. It gets its speed from a combination of the online learning (which removes the need for expensive replay), an exponential memory system, and high degrees of sparsity.

We will now quickly go over some of the basic concepts of OgmaNeo2.

CSDRs

Columnar Sparse Distributed Representations (CSDRs) are the way information (in particular network state) is stored in OgmaNeo2. A CSDR can be thought of as a 3D grid of cells with dimensions sizeX x sizeY x columnSize (sizeZ). We can also equivalently think of it as a 2D grid of “columns”. Cells are binary, but with a restriction – only one cell may be “active” (on, 1) within a column (the last (Z) dimension). This is called a “one-hot” encoding. A CSDR is therefore a 2D grid of one-hot vectors.

Due to the one-hot nature of the columns of CSDRs, we can represent them efficiently as an array of integers. Each integer represents the index of the active cell in the column. This is how CSDRs are stored in both the Python bindings and the C++ API (list and vector).
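To make the compact storage concrete, here is a minimal sketch in plain Python that converts explicit one-hot columns into the integer form and back. The helper names are hypothetical and are not part of PyOgmaNeo2; the sketch only illustrates the idea that one integer per column is enough.

```python
# Hypothetical helpers for illustration; not part of the PyOgmaNeo2 API.

def one_hot_columns_to_indices(columns):
    """Convert a list of one-hot columns into a list of active-cell indices."""
    indices = []
    for column in columns:
        # A valid CSDR column has exactly one active cell.
        if sum(column) != 1:
            raise ValueError("each column must have exactly one active cell")
        indices.append(column.index(1))
    return indices


def indices_to_one_hot_columns(indices, column_size):
    """Expand active-cell indices back into explicit one-hot columns."""
    return [[1 if z == i else 0 for z in range(column_size)] for i in indices]


# A 2 x 2 CSDR with columnSize 4, flattened into 4 columns.
dense = [
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 0],
]
compact = one_hot_columns_to_indices(dense)
print(compact)  # [2, 0, 3, 1]
```

The compact form uses one integer per column instead of columnSize cells, and no information is lost in the conversion.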

Since CSDRs are not 1D, you need to be able to map between coordinates of columns (columnX, columnY) and their indices into arrays. OgmaNeo2 uses row-major indexing for this:

index_{array} = col_y + col_x * size_y

col_x = \lfloor index_{array} / size_y \rfloor

col_y = index_{array} \bmod size_y

Each integer in the list/vector represents the index into the column/Z dimension, so it is in the range [0, columnSize).
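The index mapping above can be sketched in a few lines of Python. The function names are ours (hypothetical, not part of the library); the arithmetic follows the formulas given above, using integer division for the x coordinate.

```python
# Hypothetical helpers for illustration; not part of the PyOgmaNeo2 API.

def column_to_index(col_x, col_y, size_y):
    """Map column coordinates to a position in the flat CSDR array."""
    return col_y + col_x * size_y


def index_to_column(index, size_y):
    """Map a position in the flat CSDR array back to (col_x, col_y)."""
    return index // size_y, index % size_y


size_x, size_y = 3, 4  # A CSDR with 3 x 4 columns

index = column_to_index(2, 1, size_y)
print(index)  # 9
print(index_to_column(index, size_y))  # (2, 1)
```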

Mapping to and from CSDRs

We know the basic structure of the CSDR, but how do we get data in arbitrary forms to a CSDR format? Well, there are two basic methods:

Map each element of each datapoint to an index in a column of a CSDR (binning to one hot)
Use a pre-encoder

Some pre-encoders are similar to the encoders used internally in the OgmaNeo2 hierarchy, but may work differently, and can be written to take data from any form to a CSDR. Some pre-encoders have corresponding pre-decoders that go with them, allowing the user to reverse the transformation. OgmaNeo2 by default includes one commonly used pre-encoder, the ImageEncoder (which, as the name implies, encodes images to CSDRs).

If you have particular insight into how your data can be converted to a CSDR, you can of course write a custom pre-encoder.

A commonly used and simple idea for pre-encoding scalars to CSDRs is to simply bin them into one-hot vectors (columns). We will go over this method in particular as it is a good quick-and-dirty way to get information into CSDR form. To convert a single scalar in [ 0, 1 ] into a single one-hot column, one can use:

colIndex = round(scalar * (sizeZ - 1))

And in reverse:

scalar = colIndex / (sizeZ - 1)

A squashing function or simple linear transformation can be used to get scalars into the [0, 1] range for this method.
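As a rough sketch, the binning method and the linear rescaling step might look like this in Python. The function names are hypothetical. Rounding is done with int(x + 0.5), since Python's built-in round() rounds exact halves to the nearest even integer; clamping out-of-range inputs is our own assumption about what is sensible here.

```python
# Hypothetical helpers for illustration; not part of the PyOgmaNeo2 API.

def rescale(value, low, high):
    """Linearly map a value from [low, high] to [0, 1]."""
    return (value - low) / (high - low)


def encode_scalar(scalar, size_z):
    """Bin a scalar in [0, 1] into a column index in [0, size_z)."""
    clamped = min(1.0, max(0.0, scalar))  # Assumption: clamp out-of-range inputs
    return int(clamped * (size_z - 1) + 0.5)  # Round half up


def decode_index(col_index, size_z):
    """Recover the (quantized) scalar from a column index."""
    return col_index / (size_z - 1)


size_z = 16
col_index = encode_scalar(rescale(0.0, -1.0, 1.0), size_z)
print(col_index)  # 8
print(decode_index(col_index, size_z))  # 8 / 15, roughly 0.533
```

The round trip is lossy: the decoded value can differ from the original by up to half a bin, that is 0.5 / (sizeZ - 1).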

Inputs and Prediction

OgmaNeo2 works by predicting the next timestep (t + 1) of any CSDRs you give it. It does so in a streaming fashion – you pass in one set of CSDRs per “frame” or “step”, and it will predict what it thinks the next set of CSDRs will be. In general, when we refer to “predictions”, we mean the (t + 1) prediction the hierarchy outputs. OgmaNeo2 can only perform such predictions, and everything (including reinforcement learning) is based on the idea that a (t + 1) prediction is all you need in a temporal setting.

In OgmaNeo2, “the hierarchy” refers to the layered encoder/decoder pairs. Each encoder tries to compress information from one or several timesteps, and the decoders reverse this compression but to the next timestep, (t + 1). Each layer (encoder/decoder pair) predicts the next timestep of the layer directly below it, or the input (lowest layer). When we say the “bottom” of a hierarchy, we are referring to where the inputs and predictions happen; when we say the “top” of the hierarchy, we refer to the highest level of compression.

OgmaNeo2 takes several CSDRs as inputs at the bottom of the hierarchy, and then produces predictions of those same CSDRs at the bottom of the hierarchy as well. The CSDRs can be any arbitrary shape.

The general flow of OgmaNeo2 usage is as follows:

1. Initialize hierarchy
2. Convert data sample to CSDR
3. Feed CSDR into hierarchy
4. Get the predicted CSDRs and use them as needed
5. Repeat steps 2 to 4

Wavy Line

We will now describe how to use OgmaNeo2 to implement the wavy line example.

The Wavy Line task consists of a simple artificial time series. In our case this will just be some sinusoidal curves put together. Our goal is to learn it well enough that we can keep producing the wavy line indefinitely (memorizing).

If you haven’t installed OgmaNeo2 and PyOgmaNeo2, follow their respective README file instructions.

Create a new Python script in your favorite editor, and start by importing pyogmaneo.


import pyogmaneo
import numpy as np

If all goes well, running this should work without errors. We also imported numpy for later.

Next, we need to create a ComputeSystem. This is an object that acts as a context for the computations being performed. It is passed in to many functions, in particular those that require parallelism.


cs = pyogmaneo.ComputeSystem() # Optionally specify seed in ctor

We should also set the number of threads we want to use. This is done globally, however:


pyogmaneo.ComputeSystem.setNumThreads(4) # 4 threads

Now we need to describe the hierarchy we wish to build. To do this, we need to fill out a list of LayerDesc(s). A LayerDesc simply describes the shape and functionality of a layer. It contains several parameters; we will only set some of them here and leave the others at their defaults.


lds = []

for i in range(5):
  ld = pyogmaneo.LayerDesc()
  ld.hiddenSize = pyogmaneo.Int3(4, 4, 16) # Size of the layer
  ld.ffRadius = 4 # Feed forward (encoder) radius
  ld.pRadius = 4 # Predictor (decoder) radius

  ld.ticksPerUpdate = 2 # How much slower this layer will be
  ld.temporalHorizon = 4 # Number of timesteps of memory

  lds.append(ld)

Here, we set the size and connectivity radii of the columns. OgmaNeo2 uses sparse matrices to represent connectivity patterns. Each encoder column is projected onto the input CSDR, and then connects to a square radius around that position. The diameter and area of the receptive field are therefore:

diam = 2 * rad + 1

area = diam * diam

Receptive fields always span across the entire column dimension.
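As a quick worked example, the sketch below computes the nominal receptive field for the radius used in this tutorial. The function name is hypothetical, and the cell count ignores whatever happens at the edges of the input, where the field may not fit entirely.

```python
# Hypothetical helper for illustration; not part of the PyOgmaNeo2 API.

def receptive_field(radius, column_size):
    """Return (diameter, area in columns, number of cells) of a receptive field."""
    diam = 2 * radius + 1
    area = diam * diam
    # The field spans the entire column dimension, so every covered
    # column contributes column_size cells.
    cells = area * column_size
    return diam, area, cells


# Radius 4 over a CSDR with columnSize 16, as in the LayerDesc settings above
print(receptive_field(4, 16))  # (9, 81, 1296)
```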

The ticksPerUpdate and temporalHorizon parameters describe how we want the memory to work. OgmaNeo2 uses something called “exponential memory” (EM) to create short-term memories. This is essentially a type of bidirectional clockwork RNN architecture – each layer runs some multiplier (typically 2) slower than the one directly below it. That multiplier is given by ticksPerUpdate. temporalHorizon, however, describes how many timesteps of memory an individual layer has. It must be greater than or equal to ticksPerUpdate. We find that setting it to 4 when ticksPerUpdate is 2 works for just about everything.

In EM, the number of layers dictates how long the hierarchy can remember. Ignoring temporalHorizon, the memory of a hierarchy with a ticksPerUpdate of 2 is:

memory = 2^{numLayers}

timesteps. This is why it is called “exponential memory” – each additional layer increases the effective memory horizon of the hierarchy by some (not necessarily constant) multiplier. So for our 5-layer hierarchy, we will have at least 2^5 = 32 timesteps of memory.
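The arithmetic can be checked with a short sketch. The function names are hypothetical, and the per-layer update periods assume that the bottom layer updates on every step and that every layer uses the same ticksPerUpdate, as in the LayerDesc list above.

```python
# Hypothetical helpers for illustration; not part of the PyOgmaNeo2 API.

def memory_horizon(num_layers, ticks_per_update=2):
    """Memory horizon in timesteps, ignoring temporalHorizon."""
    return ticks_per_update ** num_layers


def layer_update_periods(num_layers, ticks_per_update=2):
    """How many input steps pass between updates of each layer.

    Assumption: layer 0 (the bottom) updates every step, and each layer
    above it runs ticks_per_update times slower than the one below.
    """
    return [ticks_per_update ** i for i in range(num_layers)]


print(memory_horizon(5))  # 32
print(layer_update_periods(5))  # [1, 2, 4, 8, 16]
```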

Next, we create the hierarchy with the desired input CSDRs:


inputColumnSize = 32 # Resolution of the input/prediction



h = pyogmaneo.Hierarchy(cs, [ pyogmaneo.Int3(1, 1, inputColumnSize) ], [ pyogmaneo.inputTypePrediction ], lds)

Our hierarchy has 1 (1 x 1 x 32) CSDR that it receives as input, and as the pyogmaneo.inputTypePrediction flag specifies, predicts as well.

If we want to add more CSDRs to predict, we can simply add to both the CSDR size list and the flag list (they must be the same size). Possible flags are:

inputTypeNone
inputTypePrediction
inputTypeAction

inputTypeNone can be used to use the CSDR solely as input (no prediction will be generated for it). inputTypeAction is a special type of prediction used for reinforcement learning – something to cover at a later date.

Now let’s train on some data! First, we can define our time series:


bounds = (-1.0, 1.0)

def wavy(t):
  return np.sin(t * 0.02 * 2.0 * np.pi) * np.sin(t * 0.035 * 2.0 * np.pi + 0.45) # Some wavy line

We multiplied the sine waves together (rather than adding them) so that the result stays in the [-1, 1] range, which lets us use the binning method to pre-encode it.
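If you want to convince yourself of the range claim, a small check like the following can help. It is only a sketch: it re-creates the same curve with the standard math module instead of numpy so that it runs on its own, and it only samples the integer timesteps used in training.

```python
import math

bounds = (-1.0, 1.0)


def wavy(t):
    # Same curve as in the tutorial, written with the standard math module
    return math.sin(t * 0.02 * 2.0 * math.pi) * math.sin(t * 0.035 * 2.0 * math.pi + 0.45)


samples = [wavy(t) for t in range(1000)]
lowest, highest = min(samples), max(samples)

# Each sine factor lies in [-1, 1], so their product does as well
in_bounds = bounds[0] <= lowest and highest <= bounds[1]
print(lowest, highest, in_bounds)
```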

Iterating through some samples:


for t in range(1000):
  valueToEncode = wavy(t)
  valueToEncodeBinned = int((valueToEncode - bounds[0]) / (bounds[1] - bounds[0]) * (inputColumnSize - 1) + 0.5)

  h.step(cs, [ [ valueToEncodeBinned ] ], True) # True for enabling learning

The step function performs a simulation step of the hierarchy. It takes a compute system, a list of CSDRs, and a boolean flag to tell it whether learning should be enabled. Our CSDR is just 1 column in this case.
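Since each CSDR is passed as a plain list of integers, it can be handy to sanity-check the lists before handing them to the hierarchy. The helper below is a hypothetical sketch, not something the library provides; it only checks the two properties described earlier, namely one integer per column and each integer in [0, columnSize).

```python
# Hypothetical helper for illustration; not part of the PyOgmaNeo2 API.

def validate_csdrs(csdrs, sizes):
    """Raise ValueError if any CSDR does not match its (sizeX, sizeY, columnSize)."""
    if len(csdrs) != len(sizes):
        raise ValueError("expected %d CSDRs, got %d" % (len(sizes), len(csdrs)))

    for csdr, (size_x, size_y, column_size) in zip(csdrs, sizes):
        # One integer per column
        if len(csdr) != size_x * size_y:
            raise ValueError("expected %d columns, got %d" % (size_x * size_y, len(csdr)))

        # Each integer is the index of the active cell in its column
        for index in csdr:
            if not 0 <= index < column_size:
                raise ValueError("column index %d outside [0, %d)" % (index, column_size))


input_sizes = [(1, 1, 32)]  # Matches the single 1 x 1 x 32 input CSDR above

validate_csdrs([[17]], input_sizes)  # Passes silently
```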

Now let’s try reproducing the wavy line using the hierarchy! To do so, we simply need to loop the predictions produced by the hierarchy back in as input again, and also read out those predictions:


for t in range(500):
  h.step(cs, [ h.getPredictionCs(0) ], False) # Disable learning
  predIndex = h.getPredictionCs(0)[0] # Get the first (and only) prediction

  # Decode value (de-bin)
  value = predIndex / float(inputColumnSize - 1) * (bounds[1] - bounds[0]) + bounds[0]

  print(value)

Now if you run this, you should see it output the wavy line in the terminal. This isn’t very exciting, but will do for this tutorial. For a version that plots and compares the actual and predicted lines, see the WavyLineExample.py in PyOgmaNeo2’s examples directory.

This concludes the wavy line tutorial! In the next tutorial, we will check out reinforcement learning with OgmaNeo2!

2 thoughts on “A Tutorial on OgmaNeo2”

Eduardo says:

May 18, 2020 at 9:46 am

Dear Eric,
Thanks a lot for this tutorial. I am playing with ogmaneo2 for reinforcement learning, so I am very much looking forward to the next tutorial!

1. Eric Laukien says:
  
  June 1, 2020 at 1:30 am
  
  Hi Eduardo, the next tutorial on reinforcement learning is now available!
  https://ogma.ai/2020/06/tutorial-rl-with-ogmaneo2/
