
References

Wikipedia: https://en.wikipedia.org/wiki/Neural_network
Python-Course.eu: https://www.python-course.eu/neural_networks.php
WildML: http://www.wildml.com/2015/09/implementing-a-neural-network-from-scratch/
iamtrask: https://iamtrask.github.io/2015/07/12/basic-python-network/

Intro

The idea behind a neural network is borrowed from biological neurons. In biology, a network is composed of groups of chemically connected neurons that affect one another. Data scientists implement this idea as an Artificial Neural Network (ANN): a structure composed of layers (inputs & outputs), an activation function (a way to move between layers), and a set of weights that alter the inputs to produce the desired output.

The middle layer is often hidden from us. This is the body of the cell, the neuron itself, which we refer to as the perceptron.
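
In its simplest form a perceptron just takes a weighted sum of its inputs and passes it through the activation function: output = f(w1*x1 + w2*x2 + ... + wn*xn). That is exactly what Ex1 below computes by hand.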

Unfortunately ANNs can be a difficult subject to visualize, so we'll jump right into an example.

Ex1 - Simple

We begin with our usual imports

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()  # for plot styling
%matplotlib inline
In [2]:
#let's create some dummy data
#    the following is simply a logical table illustrating an AND function
np.random.seed(7)
X = np.array([ [0,0],[0,1],[1,0],[1,1] ])    # Inputs  (layer 0)
y = np.array([[0,0,0,1]]).T                  # Outputs (target values)

#Our initial weighting  w_i = 1/N  where N = number of dimensions
init_weights = np.ones(2) * 0.5

#    and a simple activation function a modified step function
def step_activator(x):
    if x > 0.5:
        return 1
    return 0
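# With weights of 0.5 each, only the input [1,1] clears the 0.5 threshold, reproducing AND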

results = np.zeros(4)
#For each row in our input X we hit it with our weight vector then activate it
for idx, row in enumerate(X):
    l1_node = np.dot(row, init_weights)
    results[idx] = step_activator(l1_node)

error = results - y.ravel()    # flatten y so the shapes line up
print('error:')
print(error.sum())
print('Perfection! ... Sadly this will rarely happen with real life data')
error:
0.0
Perfection! ... Sadly this will rarely happen with real life data
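
As an aside: in Ex1 the weights were fixed at 1/N and happened to give the right answer. In a real network the weights are learned from the data. Below is a minimal sketch (not part of the original example) of the classic perceptron learning rule applied to the same AND table; the bias term, learning rate, and epoch count are arbitrary choices for illustration.

# Sketch: learn the AND weights with the perceptron rule (illustrative, not from the notebook)
import numpy as np

X = np.array([ [0,0],[0,1],[1,0],[1,1] ])
y_and = np.array([0,0,0,1])

w = np.zeros(2)   # start the weights at zero and let the rule find them
b = 0.0           # bias term (Ex1 had none)
lr = 0.1          # learning rate

for _ in range(10):                     # a handful of passes over the data is enough here
    for xi, target in zip(X, y_and):
        pred = 1 if np.dot(xi, w) + b > 0 else 0
        # nudge the weights and bias toward the target whenever we misclassify
        w += lr * (target - pred) * xi
        b += lr * (target - pred)

print('learned weights:', w, 'bias:', b)
print('predictions:', [1 if np.dot(xi, w) + b > 0 else 0 for xi in X])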

Ex2 - Using SKLearn

This is probably one of the simplest examples I could find.
Credit: https://rolisz.ro/2013/04/18/neural-networks-in-python/

In [3]:
#More imports
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_digits, make_moons
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.preprocessing import LabelBinarizer
import warnings
warnings.filterwarnings("ignore")
    
np.random.seed(0)
X, y = make_moons(1000, noise=0.10)
plt.scatter(X[:,0], X[:,1], s=40, c=y, cmap=plt.cm.Spectral);
In [4]:
# The data is no longer linearly separable but at least it's clearly delineated
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

#solver : {'lbfgs', 'sgd', 'adam' (default)}
#Solves for weight optimization.
#  'lbfgs' is an optimizer in the family of quasi-Newton methods.
#  'sgd' refers to stochastic gradient descent.
#  'adam' refers to a stochastic gradient-based optimizer proposed by Diederik Kingma and Jimmy Ba (not Hendrix)


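# hidden_layer_sizes=(25, 1) gives two hidden layers: 25 units, then a single unit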
nn = MLPClassifier(solver='lbfgs', alpha=1e-5, random_state=1, hidden_layer_sizes=(25, 1))


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25)

# print(nn) to see the default model params
# LabelBinarizer() turns our labels into binary columns
labels_train = LabelBinarizer().fit_transform(y_train)
labels_test = LabelBinarizer().fit_transform(y_test)

nn.fit(X_train,labels_train)

predictions = []
for i in range(X_test.shape[0]):
    o = nn.predict(X_test[i,None])    # predict one sample at a time
    predictions.append(o[0])

# Set min and max values and give it some padding
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
h = 0.01

xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
# Predict the function value for the whole grid

Z = nn.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

# Plot the contour and training examples
plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Spectral)
 
print(confusion_matrix(y_test,predictions))
print(classification_report(y_test,predictions))

# print(predictions)
[[133   0]
 [  1 116]]
             precision    recall  f1-score   support

          0       0.99      1.00      1.00       133
          1       1.00      0.99      1.00       117

avg / total       1.00      1.00      1.00       250
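
Reading the output: the confusion matrix is indexed [true label, predicted label], so 133 class-0 points and 116 class-1 points were classified correctly, and a single class-1 point was mislabelled as class 0. That gives a precision of 133/134 ≈ 0.99 for class 0 and a recall of 116/117 ≈ 0.99 for class 1, matching the classification report.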