Modeling the Iris Dataset

Let's walk through using the NIML model on one of the most basic datasets used for machine learning: the Iris dataset.

Step 1: Read in the data

First, load the Iris data from scikit-learn's datasets module and prepare it for the NIML software.

import pandas as pd
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

#Load the dataset
data_file = datasets.load_iris()

# Extract the features (X) and the class labels (Y) from the source dataset
X = pd.DataFrame(data_file.data)
Y = pd.DataFrame(data_file.target)

# Combine the label (column 0) and the features into one dataset to work with NIML
data = pd.concat([Y,X], axis=1, ignore_index=True)

#Create train and test splits of the data
train, test = train_test_split(data, test_size=0.20, random_state=451)
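
As a quick sanity check on the preparation above, the sketch below rebuilds the combined dataset and confirms its layout: because `ignore_index=True` renumbers the columns, the label ends up in column 0 and the four features in columns 1 to 4. It uses only the pandas and scikit-learn calls already shown; the shape checks are an illustration, not something NIML requires.

```python
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Rebuild the combined dataset exactly as in the step above
data_file = datasets.load_iris()
X = pd.DataFrame(data_file.data)
Y = pd.DataFrame(data_file.target)
data = pd.concat([Y, X], axis=1, ignore_index=True)

# ignore_index=True renumbers the columns 0..4; column 0 holds the label
print("columns:", list(data.columns))
print("label values:", sorted(data[0].unique().tolist()))

# An 80/20 split of the 150 Iris rows gives 120 training and 30 test rows
train, test = train_test_split(data, test_size=0.20, random_state=451)
print("train shape:", train.shape, "test shape:", test.shape)
```

Keeping the label in column 0 is what lets the later `encode` calls pass `label_col=0`.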

Step 2: Encode The Data

Create an instance of the encoder and configure it from the training data, then encode that training dataset.

from niml.encoder import encoder

# Calculate num_features from the number of columns of data, minus the label column
num_features = train.shape[1]-1

# Build the encoder
iris_encoder = encoder.Encoder(set_bits=3, sparsity=0.10,
    field_types= ["N"]*num_features, # 4 numeric features
    cyclic_flags=[False]*num_features, # None of the fields are cyclic
    spans=       [    0]*num_features, # Use simple/basic encoding bit-patterns
    cat_overlaps=[    0]*num_features, # N/A, data is numeric, not categorical. Set all features to 0
    cat_values=  [ None]*num_features, # N/A, data is numeric, not categorical. Set all features to None
    )

# Configure the encoder according to the training dataset's distribution
iris_encoder.config_encoder(input_data=train)

# Encode the training data -> produce encoded inputs to be sent to the NPU for learning
train_labels, train_isdrs, sdr_width = iris_encoder.encode(input_data=train, label_col=0)
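
To give a feel for what encoding a numeric field into a sparse bit pattern can look like, here is a generic scalar-encoder sketch in plain Python. It is not NIML's implementation: the function `encode_scalar`, the contiguous-run layout, and the assumption that the field width equals `set_bits / sparsity` are all hypothetical and chosen only for illustration. The range 4.3 to 7.9 is the span of sepal length in the Iris data.

```python
def encode_scalar(value, lo, hi, set_bits=3, sparsity=0.10):
    """Hypothetical scalar encoder: a contiguous run of `set_bits` ones.

    Illustrative only; this is not how NIML is known to encode fields.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    # Assumption: the field width is chosen so that set_bits / width == sparsity
    width = round(set_bits / sparsity)
    # Clamp so values outside the configured range still map to a valid pattern
    clipped = min(max(value, lo), hi)
    # Scale the value onto the possible start positions of the run of ones
    last_start = width - set_bits
    start = int((clipped - lo) / (hi - lo) * last_start)
    bits = [0] * width
    for i in range(start, start + set_bits):
        bits[i] = 1
    return bits


def overlap(a, b):
    """Count the positions where both patterns have a set bit."""
    return sum(x & y for x, y in zip(a, b))


low = encode_scalar(4.3, lo=4.3, hi=7.9)
near = encode_scalar(4.5, lo=4.3, hi=7.9)
high = encode_scalar(7.9, lo=4.3, hi=7.9)

print("width:", len(low), "set bits:", sum(low))
print("overlap(low, near):", overlap(low, near))
print("overlap(low, high):", overlap(low, high))
```

The point of this kind of encoding is that nearby values share set bits while distant values share none, which is the property a downstream model can learn from.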

Step 3: Create and Train The Model

Use the sdr_width returned by the encoder, and set the other model parameters as desired based on your understanding of the data. Then call the model's fit() method to train it.

from niml.model import model
my_model = model.Model(
    sdr_width=sdr_width,
    neurons=1024,
    active_neurons=20,
    input_pct=0.9, #Change this to 0.6 to get 100% accuracy
    synapse_inc=15,
    synapse_dec=3,
    seed=123,

    sdr_set_bits = 9,

    # Boosting
    boost_frequency=6,
    boost_strength=0.09,
    boost_bend_factor=0.175,
    boost_table_length=21,

    subclass_thresh=0.5,
    min_overlap=0.1,
)
# Fit the model to the training data (iSDRs)
my_model.fit(labels=train_labels, isdrs=train_isdrs, epochs=15, verbose=True)

Step 4: Evaluate The Model

Take the test split produced by train_test_split in Step 1 and encode it using the same encoder object that was configured on the training data.

# Encode the test data according to the same settings used with the training dataset
test_labels, test_isdrs, sdr_width = iris_encoder.encode(input_data=test, label_col=0)

# Pass the test data through the trained model and evaluate its performance
results = my_model.evaluate(labels=test_labels, isdrs=test_isdrs)

print("Model metrics on test dataset: ", results)

Model metrics on test dataset:  {'f1_score': 0.9675645342312009, 'accuracy_score': 0.9666666666666667, 'confusion_matrix': [[11, 0, 0], [0, 13, 1], [0, 0, 5]]}
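
The reported numbers can be cross-checked by hand from the confusion matrix. The sketch below assumes rows are true classes and columns are predicted classes (the scikit-learn convention; NIML's convention is not stated here). Under that reading, the reported f1_score is consistent with a support-weighted average of the per-class F1 scores. That is an inference from the numbers, not a documented NIML behavior.

```python
# Confusion matrix copied from the output above
# Assumption: rows are true classes, columns are predicted classes
confusion = [[11, 0, 0], [0, 13, 1], [0, 0, 5]]

n_classes = len(confusion)
total = sum(sum(row) for row in confusion)

# Accuracy is the diagonal (correct predictions) over all samples
accuracy = sum(confusion[k][k] for k in range(n_classes)) / total

weighted_f1 = 0.0
for k in range(n_classes):
    support = sum(confusion[k])                    # samples whose true class is k
    predicted = sum(row[k] for row in confusion)   # samples predicted as class k
    true_pos = confusion[k][k]
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / support if support else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # Weight each class's F1 by its share of the test samples
    weighted_f1 += f1 * support / total

print("accuracy:", accuracy)
print("weighted F1:", weighted_f1)
```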

Step 5: Predict

Running predict returns one prediction per input. Note that only the iSDRs (not the labels) are passed to the predict method.

After running predict, the predictions are printed out and compared to the ground-truth labels.

y_pred = my_model.predict(isdrs=test_isdrs)

# Display the predictions and ground truth labels for the test dataset
for ground_truth, prediction in zip(test_labels, y_pred):
    print("Truth: %6s prediction: %6s" % (ground_truth, prediction), end=" ")
    if ground_truth != prediction:
        print(" -- result: MISSED PREDICTION")
    else:
        print(" -- result: correct")

Truth:    0.0 prediction:    0.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    2.0 prediction:    2.0  -- result: correct
Truth:    1.0 prediction:    2.0  -- result: MISSED PREDICTION
Truth:    0.0 prediction:    0.0  -- result: correct
Truth:    0.0 prediction:    0.0  -- result: correct
Truth:    2.0 prediction:    2.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    0.0 prediction:    0.0  -- result: correct
Truth:    2.0 prediction:    2.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    0.0 prediction:    0.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    2.0 prediction:    2.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    0.0 prediction:    0.0  -- result: correct
Truth:    0.0 prediction:    0.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    0.0 prediction:    0.0  -- result: correct
Truth:    0.0 prediction:    0.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    2.0 prediction:    2.0  -- result: correct
Truth:    0.0 prediction:    0.0  -- result: correct
Truth:    1.0 prediction:    1.0  -- result: correct
Truth:    0.0 prediction:    0.0  -- result: correct
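
For longer test sets, scanning a printout like this for misses gets tedious. A minimal sketch of a summary helper follows; `summarize_misses` is a hypothetical name, not part of NIML, and the sample lists are just the first ten rows of the output above.

```python
def summarize_misses(labels, predictions):
    """Return (accuracy, misses), where misses lists (index, truth, prediction)."""
    pairs = list(zip(labels, predictions))
    if not pairs:
        raise ValueError("no predictions to summarize")
    misses = [(i, truth, pred)
              for i, (truth, pred) in enumerate(pairs)
              if truth != pred]
    accuracy = 1 - len(misses) / len(pairs)
    return accuracy, misses


# First ten rows of the printed output, copied by hand for illustration
sample_labels = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 2.0, 1.0, 1.0, 0.0]
sample_preds = [0.0, 1.0, 2.0, 2.0, 0.0, 0.0, 2.0, 1.0, 1.0, 0.0]

accuracy, misses = summarize_misses(sample_labels, sample_preds)
print("accuracy on sample:", accuracy)
for index, truth, pred in misses:
    print("row %d: truth %s, predicted %s" % (index, truth, pred))
```

In the walkthrough itself, the same helper could be called with test_labels and y_pred, assuming both are plain sequences of comparable values as the zip loop above suggests.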