Exploring how many neurons are in the NPU and how many of them are active neurons

The learning component of the NIML system, the NPU, is made up of digital neurons. These neurons compete with each other when encoded input observations are sent to the NPU. Only the active neurons (the highest-matching ones) go through the learning process and strengthen their connections to the current input observation. There are only a fixed number of active neurons... as specified by you, the user!
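As a rough mental model (not NIML's actual implementation or API), the competition can be pictured as a k-winners-take-all step: score every neuron against the encoded input, keep only the top k as the active neurons, and let only those winners adjust their synapses. The sketch below uses hypothetical names (overlap_scores, k_active) purely for illustration.

import numpy as np

# Illustrative k-winners-take-all sketch (hypothetical, not NIML code).
# Each neuron gets an overlap score against the encoded input observation;
# only the k highest-scoring ("active") neurons are selected to learn.
def pick_active_neurons(overlap_scores, k_active):
    return np.argsort(overlap_scores)[-k_active:]

rng = np.random.default_rng(0)
scores = rng.random(100)                      # pretend overlap scores for 100 neurons
winners = pick_active_neurons(scores, k_active=10)
print(sorted(winners))                        # only these 10 neurons would strengthen their synapses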

One may wonder... how many neurons should be in the NPU? If a digital neuron is the fundamental learning component of the system, aren't more neurons always better? Given a number of neurons, how many active neurons should there be? Again, isn't more better?

The challenge for this week is to select a dataset and then:

  1. Find a good setting for the number of neurons needed in the NPU
  2. Find the right number of active neurons needed to achieve high accuracy

Because each neuron represents a compute unit, there are power, efficiency, and performance reasons for wanting to find the lowest number of neurons and active neurons needed to create a model while still maintaining sufficient accuracy.

Using the scikit-learn wine dataset, this example will lay down a framework and show some initial results from an exploration of these questions.

We challenge you to take the concepts here and apply them to your own dataset, and then explore beyond that!

Step 1 - Data

First, we need to load in the dataset and convert it into a form appropriate for NIML.

import pandas as pd
from sklearn import datasets

data_file=datasets.load_wine()

X=pd.DataFrame(data_file.data)
Y=pd.DataFrame(data_file.target)
data = pd.concat([Y,X], axis=1, ignore_index=True)

# Convert the data into NIML-friendly format: 
# Just a list-of-lists, with the first column (the label) set to a string.
niml_data = []
for row in data.reset_index().values.tolist():
    row.pop(0) # remove the observation number
    row[0] = str(int(row[0])) # convert the label (they come in as floating point numbers) into a string
    niml_data.append(row)
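To sanity-check the conversion, it helps to peek at one row: each entry should start with the class label as a string, followed by the 13 numeric feature values.

# Quick sanity check on the NIML-friendly format
print(niml_data[0][:4])   # e.g. ['0', 14.23, 1.71, 2.43]
print(len(niml_data), "observations,", len(niml_data[0]) - 1, "features per observation")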

Next, we need to create test and train splits of the dataset.

#create train and test splits of the data
from sklearn.model_selection import train_test_split
train, test = train_test_split(niml_data, test_size=0.15, random_state=2010)
# encode the data by creating an encoder, configuring it, and then performing the encoding on train/test
from niml.encoder import encoder
nf = 13 # 13 features in the 'wine' dataset
my_encoder = encoder.Encoder(set_bits=15, sparsity=.05, 
    field_types= ["N"]  *nf, # all features are numeric
    cyclic_flags=[False]*nf, # none of the fields are cyclic
    spans=       [   0] *nf, # use simple/basic encoding bit-patterns
    cat_overlaps=[   0] *nf, # N/A as data is numeric, not categorical. Set all features to 0
    cat_values=  [None] *nf, # N/A as data is numeric, not categorical. Set all features to None
)
my_encoder.config_encoder(input_data=train, label_col=0)
train_labels, train_isdrs, sdr_width = my_encoder.encode(input_data=train, label_col=0)
test_labels, test_isdrs, sdr_width = my_encoder.encode(input_data=test, label_col=0)
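Before training, it is worth confirming what the encoder produced. Assuming encode() returns one label and one encoded iSDR per observation (as the fit/evaluate calls below expect), a quick check looks like this:

# Inspect the encoded output before training
print("SDR width:", sdr_width)
print("Training observations:", len(train_labels))
print("Test observations:", len(test_labels))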

Step 2 - How many neurons do we need?

Now we can turn our attention to the question at hand: How many neurons should there be in the NPU?

As a starting point, let's just set all the parameters to reasonable values, but pick different values for how many total neurons are in the system and see what happens.

#%%capture

# un-comment the "%%capture" line above to suppress STDOUT from the training runs
import time
from niml.model import model

#######################
# Note - this block takes a while to run (about 7 minutes in the run shown below)
###############

# We will go from 25 neurons up to 5000 neurons - starting with small steps and increasing to bigger steps
neuron_counts = list(range(25,250,10)) # number of neurons - starting with small step sizes at first
small_step_count = len(neuron_counts) # used when plotting the results
neuron_counts.extend( list(range(300,1500,50)) ) # adding more neuron options - larger step size
neuron_counts.extend( list(range(1750,5000,250)) ) # again adding more neuron options - even larger step size
start = time.time()
results = [] # holds accuracy scores for each number-of-neurons setting

for num_neurons in neuron_counts:
    # create the model with the current number of neurons
    my_model = model.Model(
        # Encoded Data parameters
        sdr_width=sdr_width, # received from encoding the data
        sdr_set_bits=13,
        # NPU
        neurons=num_neurons,
        active_neurons=int(num_neurons/10),
        input_pct=0.6,
        learning=True,
        synapse_inc=15,
        synapse_dec=3,
        # Boosting
        boost_frequency=9,
        boost_strength=0.09,
        boost_table_length=21,
        boost_bend_factor=0.8,
        # Classifier
        subclass_thresh=0.5,
        min_overlap=0.1,
        seed=632,
    )

    # Fit the model
    dummy = my_model.fit(labels=train_labels, isdrs=train_isdrs, epochs=6)

    # Measure accuracy by calling evaluate() and record the score
    res = my_model.evaluate(labels=test_labels, isdrs=test_isdrs)
    results.append(res["accuracy_score"])

end = time.time()
print("This step took", end-start, "seconds")
This step took 438.8482096195221 seconds

Step 2 results

Let's plot our accuracy as we set the number of neurons to different values.

import matplotlib.pyplot as plt
plt.figure(figsize=(20,8))
# plot the small increments with skinny bars
plt.bar(neuron_counts[:small_step_count], results[:small_step_count], width=5, label="barlabel", color="#AAAAFF")
# plot the larger increments with wider bars
plt.bar(neuron_counts[small_step_count:], results[small_step_count:], width=30, label="barlabel", color="#AAAAFF")
plt.ylim(0,1)
plt.xlim(0,5000)
plt.xticks(list(range(0,5000,200)))
plt.show()

[Figure: bar chart of test accuracy versus the total number of neurons in the NPU]

Step 3 - Dig Deeper

The resulting graph shows a "sweet spot": too few neurons yielded poor results, and so did too many. For the experiment above, the best number of neurons appears to be somewhere between about 800 and 1200.
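Rather than eyeballing the bar chart, the sweep results collected above can also be ranked directly; here is a quick sketch:

# Rank the neuron counts from the sweep above by their accuracy scores
ranked = sorted(zip(results, neuron_counts), reverse=True)
print("Top 5 settings (accuracy, neurons):", ranked[:5])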

But is that always the case? Could there be an interaction between the number of neurons and the number of active neurons?

Let's explore changing the number of active neurons, and see if that has an impact on how many total neurons should be in the NPU. To do this, we will use two nested loops: the outer loop iterates over the total number of neurons, while the inner loop iterates over the number of active neurons. We will graph this interaction as a 2D heatmap after data collection has completed.

#%%capture
# un-comment the "%%capture" line above to suppress STDOUT from the training runs
import time
from niml.model import model
#######################
# Note - this block takes a while to run (about 35 minutes in the run shown below)
# patience please
###############
# active neurons measurement points
an_points = list(range(10, 51, 5))
# total neurons measurement points
tn_points = list(range(10, 101, 5)) # by 5 up to 100
tn_points.extend(list(range(125, 501, 25))) # by 25 up to 500
tn_points.extend(list(range(750, 5001, 250))) # by 250 up to 5000
tn_points = tn_points[::-1] # reverse the list so the largest neuron counts are evaluated first
# holds accuracy scores for each (total neurons, active neurons) combination
dim2_results = []
start = time.time()
for tn in tn_points:
    temp_results=[]
    for an in an_points:
        # it is invalid to have more active neurons than total neurons
        if an > tn:
            temp_results.append(0) # result = 0% accuracy
        else:
            my_model = model.Model(
                sdr_width=sdr_width,
                sdr_set_bits=11,
                neurons=tn,
                active_neurons=an,
                input_pct=0.6,
                synapse_inc=15,
                synapse_dec=3,
                seed=632,
                subclass_thresh=0.5,
                min_overlap=0.1,
                boost_frequency=6,
                boost_strength=0.09,
                boost_bend_factor=0.175,
                boost_table_length=21,
            )
            # Fit the model to the training data (iSDRs)
            dummy = my_model.fit(labels=train_labels, isdrs=train_isdrs, epochs=1)
            res = my_model.evaluate(labels=test_labels, isdrs=test_isdrs)
            temp_results.append(res["accuracy_score"])
        print("tn=", tn, "an=", an, "acc=", temp_results[-1])
    dim2_results.append(temp_results)
end = time.time()
tn= 5000 an= 10 acc= 0.8148148148148148
tn= 5000 an= 15 acc= 0.4444444444444444
tn= 5000 an= 20 acc= 0.7037037037037037
.
.
.
tn= 10 an= 45 acc= 0
tn= 10 an= 50 acc= 0
print("This step took", end-start, "seconds")
This step took 2061.1380553245544 seconds

 

Step 3 results

Now to plot the interaction of the number of active neurons versus the total number of neurons in the NPU.

import numpy as np

fig = plt.figure(figsize=(9.0, 53.0))
ax = fig.add_subplot()

# Generate the heatmap
im = ax.imshow(dim2_results)

# We want to show all ticks...
ax.set_xticks(np.arange(len(an_points)))
ax.set_yticks(np.arange(len(tn_points)))
# ... and label them with the respective list entries
ax.set_xticklabels(an_points)
ax.set_yticklabels(tn_points)
# ... and rotate the X-axis tick labels and set their alignment.
plt.setp(ax.get_xticklabels(), rotation=45, ha="right", rotation_mode="anchor")
# annotate each cell with its accuracy score
for i in range(len(tn_points)):
    for j in range(len(an_points)):
        rounded = "%.3f" % dim2_results[i][j]
        text = ax.text(j, i, rounded, ha="center", va="center", color="w")
plt.show()

[Figure: heatmap of test accuracy for each combination of total neurons (rows) and active neurons (columns)]

Conclusions

There seems to be a relationship between the total number of neurons and the number of active neurons, at least for the dataset explored here and within the context of the other hyperparameters at their current settings. It appears that for a given number of total neurons, we should have between 5% and 10% of them active to achieve the best accuracy.
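A rough way to check that reading of the heatmap is to find, for each total-neuron row, which active-neuron setting scored best and what active-to-total ratio that implies. A short sketch using the sweep results collected above:

# For each total-neuron count, find the best-scoring active-neuron count
# and the active/total ratio it implies (based on the Step 3 sweep above)
for tn, row in zip(tn_points, dim2_results):
    best_idx = row.index(max(row))
    best_an = an_points[best_idx]
    if best_an <= tn:  # skip rows where the best entry is an invalid (an > tn) placeholder
        print("tn=", tn, "best an=", best_an, "ratio=", round(best_an / tn, 3), "acc=", round(row[best_idx], 3))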

It would be interesting to explore this relationship with other datasets, and to see whether other hyperparameters affect it. Some other questions you may want to explore are:

  • Do more complex datasets require more neurons overall?
    • Is there even a metric of difficulty for different datasets out there?
  • Does the 5% to 10% rule-of-thumb seen here (ratio of active neurons to total neurons) hold across other datasets?
  • Does this ratio change if other hyperparameters change?
    • What if input_pct were higher or lower? Does a more-connected or less-connected neuron lead to a different ratio of active-to-total neurons in the system? (A minimal sketch for this kind of sweep appears after this list.)
  • How about number of epochs? Could that have an effect on this experiment?
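As a starting point for the input_pct question above, here is a minimal sketch that reuses the Step 3 model setup and simply repeats a single configuration at a few input_pct values. The specific values (and the fixed neuron counts) are illustrative assumptions, not recommendations.

# Sketch: re-run one configuration at several input_pct values
# (parameter values below are illustrative only)
for pct in [0.4, 0.6, 0.8]:
    my_model = model.Model(
        sdr_width=sdr_width,
        sdr_set_bits=11,
        neurons=1000,
        active_neurons=75,
        input_pct=pct,
        synapse_inc=15,
        synapse_dec=3,
        seed=632,
        subclass_thresh=0.5,
        min_overlap=0.1,
        boost_frequency=6,
        boost_strength=0.09,
        boost_bend_factor=0.175,
        boost_table_length=21,
    )
    dummy = my_model.fit(labels=train_labels, isdrs=train_isdrs, epochs=1)
    res = my_model.evaluate(labels=test_labels, isdrs=test_isdrs)
    print("input_pct =", pct, "accuracy =", res["accuracy_score"])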