The Oger Toolbox
The Oger toolbox is the most advanced library for Echo State Network-based Reservoir Computing. Unfortunately, it is no longer maintained, but it is mature enough to apply ESNs to most machine learning problems. I am currently developing an advanced framework for ESNs named EchoTorch, based on pyTorch with GPU acceleration. In this section we introduce the Oger toolbox to give readers a practical view of the field and make them able to apply ESNs themselves. Describing all of Oger's functionality would be beyond the scope of this article, so we limit ourselves to the installation, the creation of the reservoir, the learning phase and the creation of home-made nodes.
Please note that Oger is no longer supported. You can migrate to newer frameworks such as EchoTorch. You will find a tutorial in the third part of this series here.
Installation and requirements
The Oger toolbox is available from my GitHub profile in the Oger repository: https://github.com/nschaetti/Oger. The installation requires the following libraries:
- Python >= 2.6;
- Numpy >= 1.1;
- Scipy >= 0.7;
- Matplotlib >= 0.99;
- MDP >= 3.0;
- Parallel Python (optional).
Once these dependencies are installed, Oger can be installed with the conventional setup.py script.
sudo python setup.py install
To use Oger, you only need to import it at the beginning of your script.
The first step consists of creating a reservoir; for this, Oger supplies the ReservoirNode object.
reservoir = Oger.nodes.ReservoirNode(input_dim=1, output_dim=100)
This line creates a reservoir of 100 randomly connected neurons with a 1-dimensional input. Other parameters can be specified, as shown by the class constructor.
def __init__(self, input_dim=None, output_dim=None, spectral_radius=0.9, nonlin_func=np.tanh, reset_states=True, bias_scaling=0, input_scaling=1, dtype='float64', _instance=0, w_in=None, w=None, w_bias=None)
Some parameters are important:
- input_dim (integer): the input dimension;
- output_dim (integer): the reservoir size;
- nonlin_func (function): the neurons' non-linear function g;
- bias_scaling (double): the scaling of the bias, which is zero by default (no bias);
- input_scaling (double): the scaling of the inputs, one by default (the inputs stay the same);
- spectral_radius (double): the spectral radius of the W matrix (the reservoir's internal weights), 0.9 by default;
- reset_states (boolean): whether the reservoir's state should be reset at each call of the execute function;
- w_in (np.array): the matrix of input weights. If None, it is generated randomly with uniformly distributed values;
- w (np.array): the matrix of the reservoir's internal weights. If None, it is generated randomly with values uniformly distributed between -1 and 1;
- w_bias (np.array): the matrix of the reservoir's biases. If None, it is generated randomly with uniformly distributed values.
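To make the role of these parameters concrete, here is a minimal numpy sketch of the kind of state update a reservoir performs. This is an illustration of the general ESN update rule, not Oger's actual implementation; all names are ours.

```python
import numpy as np

# Illustrative ESN state update (not Oger's internal code).
rng = np.random.RandomState(42)

input_dim, reservoir_size = 1, 100
spectral_radius, input_scaling, bias_scaling = 0.9, 1.0, 0.0

# Internal weights, rescaled so their spectral radius is spectral_radius
w = rng.uniform(-1, 1, (reservoir_size, reservoir_size))
w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))

# Input and bias weights, scaled by input_scaling and bias_scaling
w_in = input_scaling * rng.uniform(-1, 1, (reservoir_size, input_dim))
w_bias = bias_scaling * rng.uniform(-1, 1, reservoir_size)

# One state update: x(t+1) = g(W x(t) + W_in u(t) + W_bias), with g = tanh
x = np.zeros(reservoir_size)
u = np.array([0.5])
x = np.tanh(w.dot(x) + w_in.dot(u) + w_bias)
```

The nonlin_func parameter corresponds to g (tanh here), and spectral_radius controls how strongly past states echo through the recurrence.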
The LeakyReservoirNode object supplies a reservoir with an additional parameter, leaky_rate, which allows modifying the reservoir's temporal dynamics.
reservoir = Oger.nodes.LeakyReservoirNode(input_dim=1, output_dim=100, leaky_rate=0.5)
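The effect of the leak rate can be sketched as follows: the new state is a blend of the previous state and the fresh activation, which slows the reservoir's dynamics. This is the standard leaky-integrator update, shown here with plain numpy rather than Oger's own code.

```python
import numpy as np

# Illustrative leaky-integrator update:
# x(t+1) = (1 - a) * x(t) + a * tanh(W x(t) + W_in u(t))
rng = np.random.RandomState(0)
reservoir_size, leak_rate = 100, 0.5

w = rng.uniform(-1, 1, (reservoir_size, reservoir_size))
w *= 0.9 / np.max(np.abs(np.linalg.eigvals(w)))
w_in = rng.uniform(-1, 1, (reservoir_size, 1))

x = np.zeros(reservoir_size)
for u in [0.1, 0.2, 0.3]:
    # The leak rate a interpolates between keeping the old state (a -> 0)
    # and the plain ESN update (a -> 1).
    x = (1.0 - leak_rate) * x + leak_rate * np.tanh(w.dot(x) + w_in.dot([u]))
```

With leaky_rate close to zero, the state changes slowly and retains a longer memory of past inputs.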
Once the reservoir is created, we have to add a linear output layer. Oger is based on the MDP library, which allows creating a flow between the inputs and outputs of layers and training the whole system by calling the train method. We first create a linear readout for our reservoir:
readout = Oger.nodes.RidgeRegressionNode()
Then, the following line creates a flow between the reservoir and the output layer. The output dimension of the reservoir must match the input dimension of the output layer.
flow = mdp.Flow([reservoir, readout], verbose=1)
Many output nodes are available in the Oger and MDP frameworks; the RidgeRegressionNode implements Tikhonov regularisation. Its constructor gives more information.
def __init__(self, ridge_param=0, eq_noise_var=0, with_bias=True, use_pinv=False, input_dim=None, output_dim=None, dtype=None)
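As a reminder of what such a ridge readout computes, here is a minimal sketch of the Tikhonov closed-form solution, W_out = (X^T X + lambda I)^(-1) X^T Y. The variable names are ours, not Oger's API.

```python
import numpy as np

# Illustrative ridge regression readout in closed form.
rng = np.random.RandomState(1)
n_samples, n_features = 200, 10
ridge_param = 0.1  # the regularisation factor lambda

X = rng.randn(n_samples, n_features)             # reservoir states
true_w = rng.randn(n_features)                   # ground-truth readout
Y = X.dot(true_w) + 0.01 * rng.randn(n_samples)  # noisy targets

# W_out = (X^T X + lambda I)^(-1) X^T Y
w_out = np.linalg.solve(X.T.dot(X) + ridge_param * np.eye(n_features),
                        X.T.dot(Y))
```

A larger ridge_param shrinks the weights and protects against over-fitting when the reservoir states are nearly collinear.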
Two parameters are important. The first is ridge_param, which specifies the regularisation factor. The node does not optimize this parameter; this optimisation can be done with a grid search, for example. The second is use_pinv, which specifies whether the algorithm uses the Moore-Penrose pseudo-inverse. Once the flow with a reservoir and a linear output is created, it is possible to train it on a dataset. Oger supplies a set of general functions to generate and prepare datasets.
def narma10(n_samples=10, sample_len=1000)
def narma30(n_samples=10, sample_len=1000)
def memtest(n_samples=10, sample_len=10, n_delays=10)
We will use these functions to create a training set composed of the output of the NARMA10 function with a length of 5000, and a test set of length 1000.
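For reference, NARMA10 sequences follow the standard tenth-order recurrence; the sketch below generates one such sequence with numpy (up to implementation details, this is the kind of data Oger's narma10 function produces; the function name is ours).

```python
import numpy as np

def narma10_sketch(sample_len=1000, seed=0):
    # Standard NARMA10 recurrence:
    # y(t+1) = 0.3 y(t) + 0.05 y(t) sum_{i=0..9} y(t-i)
    #          + 1.5 u(t-9) u(t) + 0.1
    rng = np.random.RandomState(seed)
    u = rng.uniform(0, 0.5, sample_len)  # inputs drawn uniformly from [0, 0.5]
    y = np.zeros(sample_len)
    for t in range(9, sample_len - 1):
        y[t + 1] = (0.3 * y[t]
                    + 0.05 * y[t] * np.sum(y[t - 9:t + 1])
                    + 1.5 * u[t - 9] * u[t]
                    + 0.1)
    return u.reshape(-1, 1), y.reshape(-1, 1)

u_tr, y_tr = narma10_sketch(5000)
```

The task is hard for memoryless models because y(t+1) depends on the last ten outputs and an input from ten steps back.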
u_train, y_train = Oger.datasets.narma10(n_samples=1, sample_len=5000)
u_test, y_test = Oger.datasets.narma10(n_samples=1, sample_len=1000)
The variables u_train and y_train are arrays containing the sequence of length 5000, and u_test and y_test are arrays containing the sequence of length 1000. With these data, we can create the training set.
data = [None, zip(u_train, y_train)]
The variable data is a list where each element represents the training data of the corresponding node in the flow. Here we can see that the reservoir will not be trained (None), but the 5000 NARMA10 values will be injected as input to the network, and the output layer readout will be trained to reach the target output y_train. Each trainable element of data must be a list of (input, target) tuples. The zip function returns a list of tuples, where the i-th tuple contains the i-th element of each sequence passed as argument. We can now call the train function to launch the training phase.
flow.train(data)
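A toy illustration of this layout, with made-up one-sequence data (no Oger needed): one entry per node, None for the untrained reservoir, and (input, target) pairs for the trainable readout.

```python
import numpy as np

# Toy training-data layout for a two-node flow [reservoir, readout].
u_train = [np.ones((5, 1))]   # one input sequence of length 5
y_train = [np.zeros((5, 1))]  # its target sequence

# None: nothing to train for the reservoir;
# list of (input, target) tuples for the readout.
data = [None, list(zip(u_train, y_train))]
```

With several sequences in u_train and y_train, zip pairs them one by one, so each sequence keeps its own target.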
We execute the flow on the test set.
y_hat = flow(u_test)
The variable y_hat contains the network's outputs; we compare these to the target values to evaluate the ESN's performance. For this, Oger supplies a set of error functions:
def nrmse(input_signal, target_signal)
def nmse(input_signal, target_signal)
def rmse(input_signal, target_signal)
def mse(input_signal, target_signal)
def mem_capacity(input_signal, target_signal)
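These measures have simple definitions, sketched below with numpy; Oger's implementations may differ in detail (e.g. the exact normalisation term), so treat this as an illustration of the usual conventions.

```python
import numpy as np

# Usual reservoir-computing error measures (illustrative definitions).
def mse(y, y_hat):
    # Mean squared error
    return np.mean((y - y_hat) ** 2)

def rmse(y, y_hat):
    # Root mean squared error
    return np.sqrt(mse(y, y_hat))

def nmse(y, y_hat):
    # MSE normalised by the variance of the target signal
    return mse(y, y_hat) / np.var(y)

def nrmse(y, y_hat):
    # Root of the normalised MSE
    return np.sqrt(nmse(y, y_hat))
```

The normalised measures are scale-free: an NRMSE of 1 means the model does no better than predicting the target's mean.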
We can then display the error measures resulting from the test phase.
print "NRMSE: " + str(Oger.utils.nrmse(y_test, y_hat))
print "NMSE: " + str(Oger.utils.nmse(y_test, y_hat))
print "RMSE: " + str(Oger.utils.rmse(y_test, y_hat))
print "MSE: " + str(Oger.utils.mse(y_test, y_hat))
The complete code is the following.
import Oger
import pylab
import mdp
import numpy as np

# Demonstration reservoir for the
# order-10 NARMA task.

# Generate the training and test sets
u_train, y_train = Oger.datasets.narma10(n_samples=1, sample_len=5000)
u_test, y_test = Oger.datasets.narma10(n_samples=1, sample_len=1000)

# Build the reservoir and the
# output layer.
reservoir = Oger.nodes.ReservoirNode(output_dim=100)
readout = Oger.nodes.RidgeRegressionNode()

# Build the flow
flow = mdp.Flow([reservoir, readout], verbose=1)

# The training set for each node of the network
data = [None, zip(u_train, y_train)]

# Train the flow
flow.train(data)

# Apply the network to the test set
y_hat = flow(u_test)

# Measure the network's error
print "NRMSE: " + str(Oger.utils.nrmse(y_test, y_hat))
print "NMSE: " + str(Oger.utils.nmse(y_test, y_hat))
print "RMSE: " + str(Oger.utils.rmse(y_test, y_hat))
print "MSE: " + str(Oger.utils.mse(y_test, y_hat))
This gives the following result.
Training node #0 (ReservoirNode)
Training finished
Training node #1 (RidgeRegressionNode)
Training finished
Close the training phase of the last node
NRMSE: 0.501576238778
NMSE: 0.251830553861
RMSE: 0.0529388033125
MSE: 0.00280251689616
The MDP framework gives you the possibility to create your own kinds of nodes, which can then be used as part of a flow. This can serve, for example, to create a specific output layer that classifies samples by choosing the greatest output corresponding to a class, or to create nodes joining multiple reservoir states across time. For this, we create a new class of node by defining a subclass of the Node class.
This node cannot be trained; we specify that by overriding the is_trainable function to return False.
def is_trainable(self):
    return False
# end is_trainable
We can then override the _execute function, which takes as argument the output of the preceding layer. Here we reshape this output to merge several consecutive states.
def _execute(self, x):
    # Number of elements at the output
    nbOut = int(x.shape[0] / self.entrySize)

    # Join the states of each input element
    x.shape = (nbOut, self.reservoirSize * self.entrySize)
    return x
# end _execute
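The reshape itself is easy to check in isolation with plain numpy: 40 reservoir states of size 100, grouped 20 at a time, become 2 joined vectors of size 2000. The sizes below are illustrative.

```python
import numpy as np

# Toy check of the state-joining reshape used by the node.
entry_size, reservoir_size = 20, 100

# 40 time steps of a size-100 reservoir state
x = np.arange(40 * reservoir_size, dtype='float64').reshape(40, reservoir_size)

# Group entry_size consecutive states into one long feature vector
nb_out = x.shape[0] // entry_size
joined = x.reshape(nb_out, reservoir_size * entry_size)
```

Each row of joined concatenates 20 consecutive reservoir states, which lets the readout see a whole input element (e.g. a whole digit) at once.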
The complete code of this new node is the following.
# CLASS JoinedStatesNode
class JoinedStatesNode(mdp.Node):

    # Constructor
    def __init__(self, nb_states=20, input_dim=100, dtype='float64'):
        super(JoinedStatesNode, self).__init__(input_dim=input_dim, dtype=dtype)
        # Variables
        self.entrySize = nb_states
        self.reservoirSize = input_dim
    # end __init__

    # Node training
    def is_trainable(self):
        return False
    # end is_trainable

    # Node execution
    def _execute(self, x):
        # Number of elements at the output
        nbOut = int(x.shape[0] / self.entrySize)

        # Join the states of each input element
        x.shape = (nbOut, self.reservoirSize * self.entrySize)
        return x
    # end _execute
# end JoinedStatesNode
We can then integrate this new node to a flow.
joiner = JoinedStatesNode(input_dim=100, nb_states=20)
flow = mdp.Flow([reservoir, joiner, readout])