Depending on the goals of your model and structure of your data, you may wish to test different classifier types. Here we'll walk through using a logistic regression classifier from Sklearn.
If you create a NIML model using the model object, both an NPU and a classifier are created in a single step. In some situations, however, you may want to create an NPU and pair it with a different classifier, either one of your own design or one from an existing package such as Sklearn.
Step 1: Load the Data
Just as we have done in many of the other demonstrations, load the dataset you wish to use and split it into training and test sets.
import pandas as pd
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Load the dataset
data_file = datasets.load_iris()

# Extract the data, outcome variable, and character labels from the source dataset
X = pd.DataFrame(data_file.data)
Y = pd.DataFrame(data_file.target)

# Combine the features and label into one dataset to work with NIML
data = pd.concat([Y, X], axis=1, ignore_index=True)

# Create train and test splits of the data
train, test = train_test_split(data, test_size=0.20, random_state=451)
Step 2: Encode The Data
Next, set your encoder parameters, configure the encoder, and run the data through it to get your training and testing iSDRs.
from niml.encoder import encoder

# Create an encoder object
iris_encoder = encoder.Encoder(
    set_bits=3,
    sparsity=0.10,
    field_types=["N"] * 4,      # 4 numeric features
    cyclic_flags=[False] * 4,   # None of the fields are cyclic
    spans=[0] * 4,              # Use simple/basic encoding bit-patterns
    cat_overlaps=[0] * 4,       # N/A, data is numeric, not categorical. Set all features to 0
    cat_values=[None] * 4,      # N/A, data is numeric, not categorical. Set all features to None
)

# Configure the encoder according to the training dataset's distribution
iris_encoder.config_encoder(input_data=train)

# Encode the training data
train_labels, train_isdrs, sdr_width = iris_encoder.encode(input_data=train, label_col=0)

# Encode the test data
test_labels, test_isdrs, sdr_width = iris_encoder.encode(input_data=test, label_col=0)
Step 3: Build The NPU
To build just an NPU, you will want to make use of the sp_pooler_c package within niml.model.nispooler (as opposed to importing the model package from niml.model).
You select parameters for the NPU the same way you would when initializing a model.Model object. However, you do not need to include values for subclass_thresh, history_depth, or min_overlap, as these only pertain to the built-in classifier.
from niml.model.nispooler import sp_pooler_c # Import just the NPU
# Build an NPU object
my_npu = sp_pooler_c.SpatialPooler(
    # Encoder/Data parameters
    sdr_width=sdr_width,
    sdr_set_bits=13,
    # NPU
    neurons=125,
    active_neurons=10,
    input_pct=0.75,
    learning=True,
    synapse_inc=15,
    synapse_dec=3,
    decay_cnt_target=15,
    seed=123,
    # Boosting
    boost_frequency=3,
    boost_strength=0.1,
    boost_table_length=21,
    boost_bend_factor=0.175,
)
Step 4: Train The NPU
Now that we have loaded and encoded our data and built the NPU, we can begin training. Notice the different syntax in the code block below: we can't use fit and evaluate calls to complete training, as those require a classifier to be invoked. Instead, we let the NPU neurons learn by looping through our desired number of epochs. At the end of this loop, learning is switched off and the my_npu object is trained and ready to produce outputs for classification.
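epoch_cnt = 15

# Enable learning, then pass the training iSDRs through the NPU once per epoch
my_npu._learning = True
for epoch in range(epoch_cnt):
    my_npu.compute(isdrs=train_isdrs)

# Disable learning so later compute calls do not change the trained NPU
my_npu._learning = False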
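The on/off toggling around the training loop can also be wrapped in a small context manager, so that learning is switched off even if an epoch raises an error. The sketch below is only an illustration: `learning_enabled` is a hypothetical helper, and `StubNPU` is a made-up stand-in (not the real NPU) whose only shared feature with the code above is the `_learning` attribute and a `compute(isdrs=...)` method.

```python
from contextlib import contextmanager


@contextmanager
def learning_enabled(npu):
    """Hypothetical helper: switch learning on for the block, off afterwards."""
    npu._learning = True
    try:
        yield npu
    finally:
        # Mirrors the loop above: learning ends up off once training is done.
        npu._learning = False


class StubNPU:
    """Made-up stand-in for the real NPU, used only to make this sketch runnable."""

    def __init__(self):
        self._learning = False
        self.learning_calls = 0

    def compute(self, isdrs):
        # Count the passes that happened while learning was switched on.
        if self._learning:
            self.learning_calls += 1
        return isdrs


npu = StubNPU()
with learning_enabled(npu):
    for epoch in range(15):
        npu.compute(isdrs=[[0, 3, 5]])

print(npu._learning, npu.learning_calls)
```

With a real NPU object, the same `with` block would presumably replace the two manual `_learning` assignments; that usage is an assumption and has not been checked against NIML itself.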
Step 5: Create Third Party Classifier
For this demo, we will be using Sklearn's LogisticRegression classifier, but others can be used as well. Initialize your desired classifier with any specific values required for its internal operation. In our example, we simply set a random state and max number of iterations.
from sklearn.linear_model import LogisticRegression

cls = LogisticRegression(random_state=52, max_iter=10000)
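If you would rather pair the NPU with a classifier of your own design, one option is to follow the same fit/predict/score naming convention that Sklearn classifiers use. The sketch below is a hypothetical example, not part of NIML or Sklearn: `OverlapClassifier` is a made-up nearest-neighbour classifier that labels each binary feature vector (of the kind built in Step 6) with the label of the training vector it shares the most set bits with. The toy vectors and labels are invented for illustration.

```python
class OverlapClassifier:
    """Hypothetical 1-nearest-neighbour classifier scored by shared set bits."""

    def fit(self, X, y):
        # Memorise the training vectors and their labels.
        self.X_ = [list(row) for row in X]
        self.y_ = list(y)
        return self

    def predict(self, X):
        preds = []
        for row in X:
            # Overlap = number of positions where both vectors hold a 1.
            overlaps = [sum(a & b for a, b in zip(row, seen)) for seen in self.X_]
            # Ties go to the earliest training vector.
            best = max(range(len(overlaps)), key=overlaps.__getitem__)
            preds.append(self.y_[best])
        return preds

    def score(self, X, y):
        # Fraction of predictions that match the true labels.
        preds = self.predict(X)
        return sum(p == t for p, t in zip(preds, y)) / len(preds)


# Toy data: two training vectors with disjoint set bits.
toy_X = [[1, 1, 0, 0], [0, 0, 1, 1]]
toy_y = ["a", "b"]

clf = OverlapClassifier().fit(toy_X, toy_y)
print(clf.predict([[1, 0, 0, 0], [0, 1, 1, 1]]))
```

Because the class exposes the same three method names, it could in principle be dropped in wherever `cls` is used later in this walkthrough, though it does not subclass any Sklearn base class and is far simpler than LogisticRegression.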
Step 6: Perform Data Conversion
Recall that our model utilizes SDRs (sparse distributed representations), which store sparse information in a concise format. A third-party classifier such as LogisticRegression cannot work with NPU outputs in this form, so we need to expand them first. The expanded representations are called positional feature vectors (pfvs): each pfv is a list of 0s and 1s with one entry per NPU neuron, holding a 1 at every position listed in the output SDR.
Note: This conversion needs to be run on both training and test iSDRs so that both training and testing metrics can be calculated.
If you don't convert your NPU outputs into expanded positional feature vectors, all relationships between outputs are lost and the classifier will only provide garbage output!
# Use the function below to create positional feature vectors (pfvs)
def create_pfv(setbit_list, width):
    pfv = [0] * width
    for bitpos in setbit_list:
        pfv[bitpos] = 1
    return pfv
# Get the train output sdrs, convert them to pfvs
train_osdrs = my_npu.compute(isdrs=train_isdrs)
train_pfvs = []
for osdr in train_osdrs:
    train_pfvs.append(create_pfv(osdr, my_npu.neurons))

# Get the test output sdrs, convert them to pfvs
test_osdrs = my_npu.compute(isdrs=test_isdrs)
test_pfvs = []
for osdr in test_osdrs:
    test_pfvs.append(create_pfv(osdr, my_npu.neurons))
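To make the expansion concrete, the sketch below runs create_pfv on three made-up output SDRs with a width of 10 (the real width here is my_npu.neurons). It also shows one plausible reading of the warning above: as raw index lists, `sdr_a` and `sdr_c` look numerically similar, yet they share no active bits, and only the expanded vectors make that visible. The SDR values and the `overlap` helper are hypothetical.

```python
def create_pfv(setbit_list, width):
    """Expand a list of set-bit positions into a 0/1 vector of length `width`."""
    pfv = [0] * width
    for bitpos in setbit_list:
        pfv[bitpos] = 1
    return pfv


def overlap(u, v):
    """Hypothetical helper: count positions where both vectors hold a 1."""
    return sum(x * y for x, y in zip(u, v))


# Made-up output SDRs: A and B share two active bits, A and C share none.
sdr_a = [1, 4, 7]
sdr_b = [1, 4, 9]
sdr_c = [0, 2, 5]

width = 10
pfv_a, pfv_b, pfv_c = (create_pfv(s, width) for s in (sdr_a, sdr_b, sdr_c))

print(pfv_a)                                         # [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
print(overlap(pfv_a, pfv_b), overlap(pfv_a, pfv_c))  # 2 0
```

Every pfv has the same length regardless of how many bits are set, which is what lets a classifier treat each position as its own feature.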
Step 7: Predict
Finally, we pass the training pfvs and the training labels to the classifier to fit it, then pass the test pfvs to get predictions and compute the accuracy.
# Fit the classifier and predict
from sklearn import metrics
cls.fit(train_pfvs, train_labels)
preds = cls.predict(test_pfvs)
cls.score(test_pfvs, test_labels)
acc = metrics.accuracy_score(test_labels, preds)
print("accuracy", acc)
accuracy 0.9666666666666667
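Accuracy is simply the fraction of test samples whose predicted label matches the true label. With a 20% test split of the 150-row iris dataset there are 30 test samples, so the score above is consistent with 29 correct predictions out of 30. The sketch below reproduces that arithmetic in plain Python; the `accuracy` helper and the label lists are made up for illustration and are not the actual predictions from this run.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels: 30 test samples with exactly one wrong prediction.
y_true = [0] * 10 + [1] * 10 + [2] * 10
y_pred = list(y_true)
y_pred[15] = 2  # one sample of class 1 mistaken for class 2

print("accuracy", accuracy(y_true, y_pred))
```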