Building an NPU With a Third Party Classifier

Depending on the goals of your model and structure of your data, you may wish to test different classifier types. Here we'll walk through using a logistic regression classifier from Sklearn.

If you create a NIML model using the model object, both an NPU and a classifier will be created through a single step. However, in some situations, you may want to create an NPU but pair it with a different classifier--either of your own design or through an existing package option, such as classifiers from Sklearn.

Step 1: Load the Data

Just as we have done in many of the other demonstrations, load the dataset you wish to use and split it into train and test splits.

import pandas as pd
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

#Load the dataset
data_file = datasets.load_iris()

#Extract the features and the outcome variable from the source dataset
X = pd.DataFrame(data_file.data)
Y = pd.DataFrame(data_file.target)

#Combine the features and label into one dataset to work with NIML
data = pd.concat([Y,X], axis=1, ignore_index=True)

#Create train and test splits of the data
train, test = train_test_split(data, test_size=0.20, random_state=451)

Step 2: Encode The Data

Next, set your encoder parameters, configure the encoder, and run the data through it to get your training and testing iSDRs.

from niml.encoder import encoder

# Create an encoder object
iris_encoder = encoder.Encoder(
    set_bits=3, 
    sparsity=0.10,
    field_types= ["N"]* 4, # 4 numeric features
    cyclic_flags=[False]* 4, # None of the fields are cyclic
    spans=       [    0]* 4, # Use simple/basic encoding bit-patterns
    cat_overlaps=[    0]* 4, # N/A, data is numeric, not categorical. Set all features to 0
    cat_values=  [ None]* 4, # N/A, data is numeric, not categorical. Set all features to None
    )

# Configure the encoder according to the training dataset's distribution
iris_encoder.config_encoder(input_data=train)

# Encode the training data
train_labels, train_isdrs, sdr_width = iris_encoder.encode(input_data=train, label_col=0)

# Encode the test data
test_labels, test_isdrs, sdr_width = iris_encoder.encode(input_data=test, label_col=0)

Step 3: Build The NPU

To build just an NPU, you will want to make use of the sp_pooler_c package within niml.model.nispooler (as opposed to importing the model package from niml.model).

Parameter selection for the NPU works the same way as when you initialize a model.Model object. However, you do not need to supply values for subclass_thresh, history_depth, or min_overlap, as these pertain only to the built-in classifier.

from niml.model.nispooler import sp_pooler_c # Import just the NPU

# Build an NPU object
my_npu = sp_pooler_c.SpatialPooler(
    # Encoder/Data parameters
    sdr_width=sdr_width,
    sdr_set_bits=13,
    # NPU
    neurons=125,
    active_neurons=10,
    input_pct=0.75,
    learning=True,
    synapse_inc=15,
    synapse_dec=3,
    decay_cnt_target=15,
    seed=123,
    # Boosting
    boost_frequency=3,
    boost_strength=0.1,
    boost_table_length=21,
    boost_bend_factor=0.175,
)

Step 4: Train The NPU

Now that we have successfully loaded and encoded our data, we can begin to train the NPU. Notice the different syntax in the code block below--we can't use fit and evaluate calls to complete training, as those require some sort of classifier to be invoked. Instead, we let the NPU neurons learn by looping through our desired number of epochs. At the end of this loop, the my_npu object will be trained and ready to feed a classifier.

epoch_cnt=15

my_npu._learning = True
for epoch in range(epoch_cnt):
    my_npu.compute(isdrs=train_isdrs)
my_npu._learning = False

Step 5: Create Third Party Classifier

For this demo, we will be using Sklearn's LogisticRegression classifier, but others can be used as well. Initialize your desired classifier with any specific values required for its internal operation. In our example, we simply set a random state and max number of iterations. 

from sklearn.linear_model import LogisticRegression
cls = LogisticRegression(random_state=52, max_iter=10000)
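Because Sklearn classifiers share the same fit/predict/score interface, swapping in a different one is mostly a matter of changing the constructor. The sketch below illustrates this on a handful of made-up binary vectors that merely stand in for pfvs (they are not NPU output); the names toy_pfvs, toy_labels, and candidates are hypothetical and chosen only for this example.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Made-up binary vectors standing in for pfvs (NOT real NPU output)
toy_pfvs = [
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
]
toy_labels = [0, 0, 1, 1]

# Any estimator with fit/predict/score can be dropped in here
candidates = {
    "logistic_regression": LogisticRegression(random_state=52, max_iter=10000),
    "decision_tree": DecisionTreeClassifier(random_state=52),
}

# Fit each candidate and record its accuracy on the toy data
toy_scores = {}
for name, clf in candidates.items():
    clf.fit(toy_pfvs, toy_labels)
    toy_scores[name] = clf.score(toy_pfvs, toy_labels)

print(toy_scores)
```

The scores here are training accuracy on four separable rows, so they say nothing about real performance; the point is only that the rest of the workflow stays the same whichever classifier you choose.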

Step 6: Perform Data Conversion

Recall that our model utilizes SDRs (sparse distributed representations), which consolidate sparsely entered information into a concise format. For any third party classifier such as LogisticRegression, we need to expand these representations before the NPU outputs will be compatible. The expanded representations are called positional feature vectors (pfvs).

Note: This conversion needs to be run on both training and test iSDRs so that both training and testing metrics can be calculated. 

If you don't convert your NPU outputs into expanded positional feature vectors, all relationships between outputs are lost and the classifier will only provide garbage output!
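One way to see why the conversion matters is to compare how "similar" two outputs look before and after expansion. The sketch below assumes, as the create_pfv helper in this step does, that each output SDR is a list of set-bit positions; the helper is repeated here so the example runs on its own, and the three SDRs and the overlap function are hypothetical illustrations.

```python
def create_pfv(setbit_list, width):
    """Expand a list of set-bit positions into a dense 0/1 vector."""
    pfv = [0] * width
    for bitpos in setbit_list:
        pfv[bitpos] = 1
    return pfv

def overlap(pfv_a, pfv_b):
    """Count positions that are active in both vectors."""
    return sum(a * b for a, b in zip(pfv_a, pfv_b))

def squared_distance(xs, ys):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys))

width = 10
sdr_a = [1, 4, 7]
sdr_b = [0, 1, 4]  # shares bits 1 and 4 with sdr_a
sdr_c = [2, 5, 8]  # shares no bits with sdr_a

# Treating the raw position lists as numeric features is misleading:
raw_dist_ab = squared_distance(sdr_a, sdr_b)  # 19: looks far apart
raw_dist_ac = squared_distance(sdr_a, sdr_c)  # 3: looks close

# After expansion, shared active bits line up in the same columns:
pfv_a, pfv_b, pfv_c = (create_pfv(s, width) for s in (sdr_a, sdr_b, sdr_c))
overlap_ab = overlap(pfv_a, pfv_b)  # 2 shared bits
overlap_ac = overlap(pfv_a, pfv_c)  # 0 shared bits
```

On the raw position lists, sdr_c appears closer to sdr_a than sdr_b does, even though it shares no active bits with it. In pfv form each column corresponds to one bit position, so a classifier sees the shared activity directly.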

# Use the function below to create positional feature vectors (pfvs)
def create_pfv(setbit_list, width):
    # Start with all zeros, then set a 1 at each active bit position
    pfv = [0] * width
    for bitpos in setbit_list:
        pfv[bitpos] = 1
    return pfv

# Get the train output sdrs, convert them to pfvs
train_osdrs = my_npu.compute(isdrs=train_isdrs)
train_pfvs = []
for osdr in train_osdrs:
    train_pfvs.append(create_pfv(osdr, my_npu.neurons))

# Get the test output sdrs, convert them to pfvs
test_osdrs = my_npu.compute(isdrs=test_isdrs)
test_pfvs = []
for osdr in test_osdrs:
    test_pfvs.append(create_pfv(osdr, my_npu.neurons))
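If you prefer to hand the classifier a single NumPy array rather than a list of lists, the same expansion can be written as below. This is a minimal sketch under the assumption that each output SDR is an iterable of integer bit positions; create_pfv_matrix is a hypothetical helper name, not part of NIML.

```python
import numpy as np

def create_pfv_matrix(setbit_lists, width):
    """Expand a sequence of set-bit position lists into a 2-D 0/1 array.

    Each row is one pfv; each column is one bit position.
    """
    pfvs = np.zeros((len(setbit_lists), width), dtype=np.uint8)
    for row, setbit_list in enumerate(setbit_lists):
        # Convert to an integer index array so empty lists are handled too
        positions = np.asarray(setbit_list, dtype=np.intp)
        pfvs[row, positions] = 1
    return pfvs

# Small stand-in for NPU output: two SDRs over a width of 4
example_pfvs = create_pfv_matrix([[0, 2], [1]], width=4)
print(example_pfvs)
```

In the tutorial's terms this would be called as create_pfv_matrix(train_osdrs, my_npu.neurons), and the resulting array can be passed to fit or predict in place of the list built above.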

Step 7: Predict

Finally, fit the classifier on the training pfvs and training labels, then pass the test pfvs to the fitted classifier to get predictions.

# Fit the classifier and predict
from sklearn import metrics

cls.fit(train_pfvs, train_labels)
preds = cls.predict(test_pfvs)
cls.score(test_pfvs, test_labels)
acc = metrics.accuracy_score(test_labels, preds)
print("accuracy", acc)
accuracy 0.9666666666666667
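A single accuracy number does not show which classes are being confused with each other. As an optional follow-up, Sklearn's confusion_matrix gives a per-class breakdown. The sketch below uses made-up labels and predictions (toy_true and toy_preds are hypothetical, not results from this demo) so that it runs on its own.

```python
from sklearn import metrics

# Made-up labels and predictions for three classes (NOT results from this demo)
toy_true = [0, 0, 1, 1, 2, 2]
toy_preds = [0, 0, 1, 2, 2, 2]

# Rows are true classes, columns are predicted classes
cm = metrics.confusion_matrix(toy_true, toy_preds)
toy_acc = metrics.accuracy_score(toy_true, toy_preds)

print(cm)
print("accuracy", toy_acc)
```

With the tutorial's own variables, the equivalent call would be metrics.confusion_matrix(test_labels, preds).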