Lab Six: CNNs¶

Prince Ndhlovu and Kirby Cravens¶

1. Data Preparation¶

import pandas as pd
import numpy as np
import seaborn as sns
%matplotlib inline 
from matplotlib import pyplot as plt
import os
import pickle
from PIL import Image
from matplotlib.pyplot import imshow
from IPython.display import display
import sklearn
from sklearn.metrics import roc_curve
from sklearn.metrics import auc
from sklearn.model_selection import train_test_split
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.model_selection import StratifiedKFold
import cv2
import random 
import keras
from keras import backend as K
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Reshape
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, Conv3D, MaxPooling2D, MaxPooling3D
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import EarlyStopping
from sklearn import metrics as mt
from keras.models import load_model
from tensorflow.keras.regularizers import l2

filepath = '/Users/princendhlovu/Downloads/brain_tumor_dataset/'

nontumor = os.listdir(filepath + 'no' )
tumor = os.listdir(filepath + 'yes' )
no_std_path = filepath + 'no/'
yes_std_path = filepath + 'yes/'

tumor_images = []
non_tumor_images = []
target = []

h,w = (80,75)
# read the images into the respective arrays
def addImageToArray(label,std_path,imageFilePath,imageArrays):
    for pic in imageFilePath:
        image_array = cv2.imread(std_path + pic, cv2.IMREAD_GRAYSCALE)
#         print(image_array.shape[0]/image_array.shape[1])
        image_array = cv2.resize(image_array, (w,h))
        imageArrays.append(image_array.flatten())
        if(label == 'yes'):
            target.append(1)
        if(label == 'no'):
            target.append(0)

non_tumor_images.clear()
tumor_images.clear()
target.clear()
addImageToArray('yes', yes_std_path,tumor,tumor_images)
addImageToArray('no', no_std_path,nontumor,non_tumor_images)

# target was filled tumor-first (1s, then 0s), so concatenate the images in the same order
X_data = tumor_images + non_tumor_images

# print('frequencies',target.astype(np.int).value_counts())

X_data = np.array(X_data)
target = np.array(target)

# shuffle images and labels with one shared permutation so each pair stays aligned
perm = np.random.permutation(len(X_data))
X_data = X_data[perm]
target = target[perm]


print("Images readIn:", len(X_data))

unique, counts = np.unique(target, return_counts=True)

print(np.asarray((unique, counts)).T)

Images readIn: 253
[[  0  98]
 [  1 155]]

# Class distributions
labels_df = pd.DataFrame(data=target)
labels_df.hist()
plt.title("Class Distribution")

Text(0.5, 1.0, 'Class Distribution')

Evaluating Performance¶

When radiologists diagnose patients who might have brain tumors, they must minimize the risk of missing a tumor. A missed tumor gives the patient false confidence and can lead them to abandon medication and other necessary therapy. An unnoticed tumor may also be malignant and spread to other parts of the brain or body, causing more damage. In this dataset we are therefore trying to minimize the number of false negatives, where the model declares an image tumor free when it is not, because this is very detrimental to a person's present and future health. We also do not want to diagnose tumor-free people with tumors that they do not have. Minimizing false positives matters because they can cause healthy people to receive therapy and medication that they do not need, leading to health complications, substantial stress, and even fatalities. False positives are undesirable, but false negatives have more far-reaching effects that cannot be mitigated and often shorten a patient's lifespan considerably. Therefore we are going to use recall as the evaluation metric for our CNN models. Recall measures how well the model finds the true positives, and so how few false negatives it produces. It is given by:
$$ \text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} $$
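As a small worked instance of the formula, the sketch below computes recall from raw counts. The helper name `recall` and the counts are hypothetical illustrations, not results from this lab.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of actual tumor cases that the model flags as tumor."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        # No positive cases at all: recall is undefined, so report 0.0 here
        return 0.0
    return true_positives / actual_positives


# Hypothetical counts: 90 tumors found, 10 tumors missed
tp, fn = 90, 10
example_recall = recall(tp, fn)
print(f"recall = {example_recall:.2f}")
```

Note that false positives do not appear in the formula at all, which is why recall fits a setting where missed tumors are the costlier error.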

Splitting the Data¶

We had a small dataset of 253 images, and we used image transformations to add 1,000 augmented images, for a total of 1,253. This data expansion exposes the model to a wider variety of image orientations, which should help it generalize to new data that it has not seen before, since real scans are unlikely to all be taken in one fixed orientation. We divide the data using StratifiedKFold with 5 folds. The images are shuffled and distributed among the 5 folds so that each fold keeps roughly the same class proportions, and each fold serves once as validation data. This helps reduce the variance of our performance estimates. We then evaluate each model with an 80/20 split of training and testing data, respectively.
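To make the idea of stratification concrete, here is a minimal NumPy sketch that builds 5 folds by hand and checks that each keeps about the same tumor share. It is only an illustration of the principle, not scikit-learn's `StratifiedKFold` implementation; the helper name `stratified_folds` is hypothetical, and the labels simply reuse the class counts of the expanded dataset (486 non-tumor, 767 tumor).

```python
import numpy as np


def stratified_folds(labels, n_splits=5, seed=42):
    """Split the shuffled indices of each class into n_splits near-equal chunks."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    folds = [[] for _ in range(n_splits)]
    for cls in np.unique(labels):
        # Shuffle the indices of this class, then hand one chunk to each fold
        cls_idx = rng.permutation(np.flatnonzero(labels == cls))
        for fold_id, chunk in enumerate(np.array_split(cls_idx, n_splits)):
            folds[fold_id].extend(chunk.tolist())
    return [np.array(sorted(f)) for f in folds]


# Same class counts as the expanded dataset: 486 non-tumor, 767 tumor
labels = np.array([0] * 486 + [1] * 767)
folds = stratified_folds(labels)
for i, test_idx in enumerate(folds):
    share = labels[test_idx].mean()
    print(f"fold {i}: {len(test_idx)} images, tumor share {share:.3f}")
```

Each fold ends up with roughly 250 images and a tumor share close to the overall 767/1253, so no validation fold is dominated by one class.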

2. Modeling¶

Data Expansion¶

#create new sample data
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(featurewise_center=False,
    samplewise_center=False,
    featurewise_std_normalization=False, # divide inputs by the std of the dataset, feature-wise
    samplewise_std_normalization=False,  # divide each input by its own std
    zca_whitening=False,
    rotation_range=5, # used, Int. Degree range for random rotations.                        
    width_shift_range=0.1, # used, Float (fraction of total width). Range for random horizontal shifts.
    height_shift_range=0.1, # used,  Float (fraction of total height). Range for random vertical shifts.
    shear_range=0., # Float. Shear Intensity (Shear angle in counter-clockwise direction in degrees)
    zoom_range=0.,
    channel_shift_range=0.,
    fill_mode='nearest',
    cval=0.,
    horizontal_flip=True,
    vertical_flip=True,
    rescale=None)

# X_data = X_data/255 - 0.5

#expand dimensions
X_data = np.expand_dims(X_data.reshape((-1,h, w)), axis=3)

datagen.fit(X_data)



new_images = datagen.flow(X_data, target, batch_size=1)
print('new images',len(new_images))
print('original data',len(X_data))
print(X_data.shape)

# for tmp in new_images:
#     imshow(tmp[0].squeeze(),cmap='bone')
#     break

new images 253
original data 253
(253, 80, 75, 1)
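The flip settings used in the generator can be pictured with plain NumPy. This sketch covers only what a horizontal and a vertical flip do to a channels-last array. It is not how `ImageDataGenerator` is implemented, it ignores the random rotations and shifts, and the tiny 2x3 array is a hypothetical stand-in for a scan.

```python
import numpy as np

# Tiny stand-in for one grayscale scan with shape (height, width, channels)
image = np.arange(6, dtype=np.float32).reshape(2, 3, 1)

flipped_lr = np.flip(image, axis=1)  # horizontal flip: reverse the width axis
flipped_ud = np.flip(image, axis=0)  # vertical flip: reverse the height axis

print(image[..., 0])
print(flipped_lr[..., 0])
print(flipped_ud[..., 0])
```

A flip changes only the pixel order and leaves the label untouched, which is why each augmented image can reuse the label of its source image.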

j = 0
for img in new_images:
    if j == 1000:
        break
    j += 1
    if j % 100 == 0:
        print(f'Appended {j} images')
    X_data = np.vstack((X_data,np.expand_dims(img[0][0].squeeze().reshape((-1,h,w)), axis=3)))
    target = np.append(target,img[1][0])

Appended 100 images
Appended 200 images
Appended 300 images
Appended 400 images
Appended 500 images
Appended 600 images
Appended 700 images
Appended 800 images
Appended 900 images
Appended 1000 images

classes = {0:"No Tumor", 1:"Tumor"}
# classes = {0:"Daisy", 1:"Dandelion", 2:"Rose", 3:"Sunflower", 4:"Tulip"}

# select random images to visualize

random.seed(1)

plt.figure(figsize=(2.0 * 7, 2.3 * 3))
plt.subplots_adjust(bottom=0, left=.01, right=.99, top=.90, hspace=.35)
random_sample = random.sample(range(0,X_data.shape[0]), k=21)
for n,i in enumerate(random_sample):
        plt.subplot(3,7, n + 1)
        imshow(X_data[i].reshape((h,w)))
        plt.title(classes[target[i]], size=12)
        plt.xticks(())
        plt.yticks(())

# Class distributions
labels_df = pd.DataFrame(data=target)
labels_df.hist()
plt.title("Class Distribution")

Text(0.5, 1.0, 'Class Distribution')

unique, counts = np.unique(target, return_counts=True)

print(np.asarray((unique, counts)).T)

[[  0 486]
 [  1 767]]

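# We could not use the recall metric built into Keras, so this one is adapted from:
# https://datascience.stackexchange.com/questions/45165/how-to-get-accuracy-f1-precision-and-recall-for-a-keras-model

def recall_m(y_true, y_pred):
    # Labels are one-hot, so score only the tumor column (index 1);
    # summing over both columns would reduce this formula to plain accuracy.
    y_true_pos = y_true[:, 1]
    y_pred_pos = y_pred[:, 1]
    true_positives = K.sum(K.round(K.clip(y_true_pos * y_pred_pos, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true_pos, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall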
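A quick NumPy check shows how a recall formula of this form behaves on one-hot labels. Summed over both columns it counts every correct prediction and so equals accuracy, while restricted to the tumor column (index 1) it gives tumor-class recall. The helper `recall_like` and the five-image batch are hypothetical and only mirror the arithmetic, not the Keras backend calls.

```python
import numpy as np


def recall_like(y_true, y_pred, eps=1e-7):
    """NumPy version of the formula: rounded hits divided by rounded positives."""
    true_positives = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))
    possible_positives = np.sum(np.round(np.clip(y_true, 0, 1)))
    return true_positives / (possible_positives + eps)


# Hypothetical batch: 3 tumor (class 1) and 2 non-tumor (class 0) images
y_true = np.array([[0, 1], [0, 1], [0, 1], [1, 0], [1, 0]], dtype=float)
y_pred = np.array([[0.2, 0.8],   # tumor, predicted tumor
                   [0.7, 0.3],   # tumor, missed
                   [0.1, 0.9],   # tumor, predicted tumor
                   [0.9, 0.1],   # non-tumor, predicted non-tumor
                   [0.6, 0.4]])  # non-tumor, predicted non-tumor

both_columns = recall_like(y_true, y_pred)            # 4 of 5 predictions correct
tumor_only = recall_like(y_true[:, 1], y_pred[:, 1])  # 2 of 3 tumors found

print(f"both columns: {both_columns:.3f}")
print(f"tumor column: {tumor_only:.3f}")
```

The two numbers differ whenever the model's errors are not spread evenly across classes, so the choice of columns decides whether the metric really tracks missed tumors.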

ResNet¶

kfold_data = StratifiedKFold(n_splits=5, random_state=42, shuffle=True).split(X_data, target)
X_train, X_test, y_train, y_test = train_test_split(X_data, target, test_size= 0.20, random_state=42)

y_cat = to_categorical(target)
y_train_cat = to_categorical(y_train)
y_test_cat = to_categorical(y_test)

NUM_CLASSES = 2

%%time

# Build a small ResNet-style network: conv/pool layers followed by one residual block.
# We use ReLU activations and dropout where appropriate.
from tensorflow.keras.layers import Add, Input
from tensorflow.keras.layers import average, concatenate
from tensorflow.keras.models import Model

l2_lambda = 0.000001
input_holder = Input(shape=(h, w, 1))

# start with a conv layer
x = Conv2D(filters=32,
               input_shape = (h,w,1),
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(input_holder)

x = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Conv2D(filters=32,
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x)

x_split = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Conv2D(filters=64,
               kernel_size=(1,1),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x_split)

x = Conv2D(filters=64,
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x)

x = Conv2D(filters=32,
               kernel_size=(1,1),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x)

# now add back in the split layer, x_split (residual added in)
x = Add()([x, x_split])
x = Activation("relu")(x)

x = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Flatten()(x)
x = Dropout(0.25)(x)
x = Dense(256)(x)
x = Activation("relu")(x)
x = Dropout(0.5)(x)
x = Dense(NUM_CLASSES)(x)
x = Activation('softmax')(x)

resnet1 = Model(inputs=input_holder,outputs=x)

resnet1.summary()

Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_2 (InputLayer)            [(None, 80, 75, 1)]  0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 80, 75, 32)   320         input_2[0][0]                    
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (None, 40, 37, 32)   0           conv2d[0][0]                     
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 40, 37, 32)   9248        max_pooling2d[0][0]              
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 20, 18, 32)   0           conv2d_1[0][0]                   
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 20, 18, 64)   2112        max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 20, 18, 64)   36928       conv2d_2[0][0]                   
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 20, 18, 32)   2080        conv2d_3[0][0]                   
__________________________________________________________________________________________________
add (Add)                       (None, 20, 18, 32)   0           conv2d_4[0][0]                   
                                                                 max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
activation (Activation)         (None, 20, 18, 32)   0           add[0][0]                        
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 10, 9, 32)    0           activation[0][0]                 
__________________________________________________________________________________________________
flatten (Flatten)               (None, 2880)         0           max_pooling2d_2[0][0]            
__________________________________________________________________________________________________
dropout (Dropout)               (None, 2880)         0           flatten[0][0]                    
__________________________________________________________________________________________________
dense (Dense)                   (None, 256)          737536      dropout[0][0]                    
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 256)          0           dense[0][0]                      
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 256)          0           activation_1[0][0]               
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 2)            514         dropout_1[0][0]                  
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 2)            0           dense_1[0][0]                    
==================================================================================================
Total params: 788,738
Trainable params: 788,738
Non-trainable params: 0
__________________________________________________________________________________________________
CPU times: user 108 ms, sys: 21.9 ms, total: 130 ms
Wall time: 147 ms
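The parameter counts in the summary can be reproduced by hand, which is a useful sanity check on the layer shapes. The sketch below assumes the usual counting rule (one weight per kernel element per input channel, plus one bias per filter or unit). The helper names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights per filter (kernel_h * kernel_w * in_channels) plus one bias each."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """One weight per input-output pair plus one bias per output unit."""
    return (in_units + 1) * out_units


layer_params = {
    "conv2d":   conv2d_params(3, 3, 1, 32),    # first 3x3 conv on the 1-channel input
    "conv2d_1": conv2d_params(3, 3, 32, 32),   # second 3x3 conv
    "conv2d_2": conv2d_params(1, 1, 32, 64),   # 1x1 conv that widens to 64 channels
    "conv2d_3": conv2d_params(3, 3, 64, 64),   # 3x3 conv inside the residual branch
    "conv2d_4": conv2d_params(1, 1, 64, 32),   # 1x1 conv back to 32 channels for the Add
    "dense":    dense_params(10 * 9 * 32, 256),  # flattened 10 x 9 x 32 feature map
    "dense_1":  dense_params(256, 2),
}
total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name:9s} {count:>8,d}")
print(f"total     {total_params:>8,d}")
```

Almost all of the weights sit in the first dense layer, because it connects all 2,880 flattened features to 256 units.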

resnet1.compile(loss='categorical_crossentropy', # 'categorical_crossentropy' 'mean_squared_error'
                optimizer='adam', # 'adadelta' 'rmsprop'
                metrics=[recall_m])



cnn3 = []
for i, (train_data, test_data) in enumerate(kfold_data):
    res_history1 = resnet1.fit(X_data[train_data], y_cat[train_data], batch_size=16, 
                      epochs=50, verbose=1,
                      validation_data=(X_data[test_data],y_cat[test_data]),
                      callbacks=[EarlyStopping(monitor='val_loss', patience=4)]
                     )
    
    cnn3.append(res_history1)

Epoch 1/50
63/63 [==============================] - 5s 84ms/step - loss: 89.8077 - recall_m: 0.5665 - val_loss: 2.0339 - val_recall_m: 0.6665
Epoch 2/50
63/63 [==============================] - 5s 79ms/step - loss: 2.8412 - recall_m: 0.6157 - val_loss: 1.0910 - val_recall_m: 0.6087
Epoch 3/50
63/63 [==============================] - 5s 80ms/step - loss: 1.4096 - recall_m: 0.6065 - val_loss: 0.8228 - val_recall_m: 0.6158
Epoch 4/50
63/63 [==============================] - 5s 80ms/step - loss: 0.9073 - recall_m: 0.6504 - val_loss: 0.6466 - val_recall_m: 0.6822
Epoch 5/50
63/63 [==============================] - 5s 76ms/step - loss: 0.7325 - recall_m: 0.6857 - val_loss: 0.6555 - val_recall_m: 0.6626
Epoch 6/50
63/63 [==============================] - 5s 75ms/step - loss: 0.6024 - recall_m: 0.7141 - val_loss: 0.6513 - val_recall_m: 0.6683
Epoch 7/50
63/63 [==============================] - 5s 75ms/step - loss: 0.5913 - recall_m: 0.7046 - val_loss: 0.6348 - val_recall_m: 0.6683
Epoch 8/50
63/63 [==============================] - 5s 80ms/step - loss: 0.5647 - recall_m: 0.7143 - val_loss: 0.6567 - val_recall_m: 0.6314
Epoch 9/50
63/63 [==============================] - 5s 80ms/step - loss: 0.5567 - recall_m: 0.7373 - val_loss: 0.6325 - val_recall_m: 0.6644
Epoch 10/50
63/63 [==============================] - 6s 89ms/step - loss: 0.4544 - recall_m: 0.7897 - val_loss: 0.6407 - val_recall_m: 0.6548
Epoch 11/50
63/63 [==============================] - 5s 82ms/step - loss: 0.4732 - recall_m: 0.7651 - val_loss: 0.6490 - val_recall_m: 0.6431
Epoch 12/50
63/63 [==============================] - 5s 80ms/step - loss: 0.4502 - recall_m: 0.7895 - val_loss: 0.6590 - val_recall_m: 0.6605
Epoch 13/50
63/63 [==============================] - 5s 81ms/step - loss: 0.4156 - recall_m: 0.7901 - val_loss: 0.6534 - val_recall_m: 0.6566
Epoch 1/50
63/63 [==============================] - 5s 85ms/step - loss: 0.4905 - recall_m: 0.7921 - val_loss: 0.3456 - val_recall_m: 0.9006
Epoch 2/50
63/63 [==============================] - 5s 80ms/step - loss: 0.4598 - recall_m: 0.7786 - val_loss: 0.3799 - val_recall_m: 0.8672
Epoch 3/50
63/63 [==============================] - 5s 79ms/step - loss: 0.4480 - recall_m: 0.8020 - val_loss: 0.3585 - val_recall_m: 0.8516
Epoch 4/50
63/63 [==============================] - 5s 81ms/step - loss: 0.4119 - recall_m: 0.8073 - val_loss: 0.3942 - val_recall_m: 0.8594
Epoch 5/50
63/63 [==============================] - 5s 81ms/step - loss: 0.4080 - recall_m: 0.8141 - val_loss: 0.3713 - val_recall_m: 0.8498
Epoch 1/50
63/63 [==============================] - 5s 78ms/step - loss: 0.3957 - recall_m: 0.8216 - val_loss: 0.2349 - val_recall_m: 0.9297
Epoch 2/50
63/63 [==============================] - 5s 78ms/step - loss: 0.3954 - recall_m: 0.8125 - val_loss: 0.2467 - val_recall_m: 0.8961
Epoch 3/50
63/63 [==============================] - 5s 85ms/step - loss: 0.3412 - recall_m: 0.8473 - val_loss: 0.2727 - val_recall_m: 0.8945
Epoch 4/50
63/63 [==============================] - 5s 86ms/step - loss: 0.3114 - recall_m: 0.8572 - val_loss: 0.2629 - val_recall_m: 0.9000
Epoch 5/50
63/63 [==============================] - 5s 79ms/step - loss: 0.3350 - recall_m: 0.8507 - val_loss: 0.3087 - val_recall_m: 0.8766
Epoch 1/50
63/63 [==============================] - 5s 81ms/step - loss: 0.3572 - recall_m: 0.8379 - val_loss: 0.1978 - val_recall_m: 0.9453
Epoch 2/50
63/63 [==============================] - 5s 76ms/step - loss: 0.3370 - recall_m: 0.8622 - val_loss: 0.2103 - val_recall_m: 0.9469
Epoch 3/50
63/63 [==============================] - 5s 77ms/step - loss: 0.3218 - recall_m: 0.8383 - val_loss: 0.1983 - val_recall_m: 0.9430
Epoch 4/50
63/63 [==============================] - 5s 77ms/step - loss: 0.2790 - recall_m: 0.8776 - val_loss: 0.2168 - val_recall_m: 0.9273
Epoch 5/50
63/63 [==============================] - 5s 76ms/step - loss: 0.2707 - recall_m: 0.8860 - val_loss: 0.2226 - val_recall_m: 0.9258
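The fold lengths in the log above (13 epochs, then 5) match what a patience-based stopping rule would produce. Below is a simplified pure-Python model of that rule, assuming that any strictly lower `val_loss` counts as an improvement. It is not Keras' `EarlyStopping` implementation, and the helper name `epochs_until_stop` is hypothetical. The loss values are copied from the first two folds of the log.

```python
def epochs_until_stop(val_losses, patience=4):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0      # new best: reset the patience counter
        else:
            wait += 1                 # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


# val_loss values from the first and second folds of the log above
fold1 = [2.0339, 1.0910, 0.8228, 0.6466, 0.6555, 0.6513, 0.6348,
         0.6567, 0.6325, 0.6407, 0.6490, 0.6590, 0.6534]
fold2 = [0.3456, 0.3799, 0.3585, 0.3942, 0.3713]

stop1 = epochs_until_stop(fold1)
stop2 = epochs_until_stop(fold2)
print(stop1, stop2)
```

In the second fold the best validation loss is already reached in epoch 1, so training stops as soon as the four patience epochs are used up.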

%matplotlib inline

plt.figure(figsize=(10,4))
plt.subplot(2,2,1)
plt.plot(res_history1.history['recall_m'])

plt.ylabel('Recall')
plt.title('Training')
plt.subplot(2,2,2)
plt.plot(res_history1.history['val_recall_m'])
plt.title('Validation')

plt.subplot(2,2,3)
plt.plot(res_history1.history['loss'])
plt.ylabel('Training Loss')
plt.xlabel('epochs')

plt.subplot(2,2,4)
plt.plot(res_history1.history['val_loss'])
plt.xlabel('epochs')

Text(0.5, 0, 'epochs')

2nd Variation of ResNet¶

This variation keeps the same architecture as the first ResNet and changes only the optimizer, from Adam to Nadam.

kfold_data = StratifiedKFold(n_splits=5, random_state=42, shuffle=True).split(X_data, target)
X_train, X_test, y_train, y_test = train_test_split(X_data, target, test_size= 0.20, random_state=42)

y_cat = to_categorical(target)
y_train_cat = to_categorical(y_train)
y_test_cat = to_categorical(y_test)

NUM_CLASSES = 2

%%time

input_holder = Input(shape=(h, w, 1))

# start with a conv layer
x = Conv2D(filters=32,
               input_shape = (h,w,1),
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(input_holder)

x = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Conv2D(filters=32,
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x)

x_split = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Conv2D(filters=64,
               kernel_size=(1,1),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x_split)

x = Conv2D(filters=64,
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x)

x = Conv2D(filters=32,
               kernel_size=(1,1),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x)

# now add back in the split layer, x_split (residual added in)
x = Add()([x, x_split])
x = Activation("relu")(x)

x = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Flatten()(x)
x = Dropout(0.25)(x)
x = Dense(256)(x)
x = Activation("relu")(x)
x = Dropout(0.5)(x)
x = Dense(NUM_CLASSES)(x)
x = Activation('softmax')(x)

resnet2 = Model(inputs=input_holder,outputs=x)

CPU times: user 87.3 ms, sys: 4.05 ms, total: 91.3 ms
Wall time: 92.4 ms

resnet2.compile(loss='categorical_crossentropy', # 'categorical_crossentropy' 'mean_squared_error'
                optimizer='nadam', # 'adadelta' 'rmsprop'
                metrics=[recall_m])



cnn5 = []
for i, (train_data, test_data) in enumerate(kfold_data):
    res_history2 = resnet2.fit(X_data[train_data], y_cat[train_data], batch_size=16, 
                      epochs=50, verbose=1,
                      validation_data=(X_data[test_data],y_cat[test_data]),
                      callbacks=[EarlyStopping(monitor='val_loss', patience=4)]
                     )
    
    cnn5.append(res_history2)

Epoch 1/50
63/63 [==============================] - 6s 97ms/step - loss: 52.2721 - recall_m: 0.5452 - val_loss: 1.3917 - val_recall_m: 0.5991
Epoch 2/50
63/63 [==============================] - 6s 94ms/step - loss: 1.1065 - recall_m: 0.6183 - val_loss: 0.7065 - val_recall_m: 0.5888
Epoch 3/50
63/63 [==============================] - 6s 99ms/step - loss: 0.8516 - recall_m: 0.6216 - val_loss: 0.6936 - val_recall_m: 0.6158
Epoch 4/50
63/63 [==============================] - 6s 93ms/step - loss: 0.6893 - recall_m: 0.6321 - val_loss: 0.6821 - val_recall_m: 0.6314
Epoch 5/50
63/63 [==============================] - 6s 93ms/step - loss: 0.6491 - recall_m: 0.6462 - val_loss: 0.8122 - val_recall_m: 0.6374
Epoch 6/50
63/63 [==============================] - 6s 90ms/step - loss: 0.6982 - recall_m: 0.6476 - val_loss: 0.6747 - val_recall_m: 0.6449
Epoch 7/50
63/63 [==============================] - 6s 89ms/step - loss: 0.6453 - recall_m: 0.6385 - val_loss: 0.6646 - val_recall_m: 0.6548
Epoch 8/50
63/63 [==============================] - 6s 89ms/step - loss: 0.6327 - recall_m: 0.6575 - val_loss: 0.6720 - val_recall_m: 0.6300
Epoch 9/50
63/63 [==============================] - 6s 91ms/step - loss: 0.6221 - recall_m: 0.6563 - val_loss: 0.7033 - val_recall_m: 0.6509
Epoch 10/50
63/63 [==============================] - 6s 91ms/step - loss: 0.6073 - recall_m: 0.6532 - val_loss: 0.7016 - val_recall_m: 0.6435
Epoch 11/50
63/63 [==============================] - 6s 91ms/step - loss: 0.6033 - recall_m: 0.6641 - val_loss: 0.6180 - val_recall_m: 0.6222
Epoch 12/50
63/63 [==============================] - 6s 91ms/step - loss: 0.6087 - recall_m: 0.6534 - val_loss: 0.6275 - val_recall_m: 0.6605
Epoch 13/50
63/63 [==============================] - 6s 93ms/step - loss: 0.5793 - recall_m: 0.6579 - val_loss: 0.6514 - val_recall_m: 0.6158
Epoch 14/50
63/63 [==============================] - 6s 90ms/step - loss: 0.5743 - recall_m: 0.6742 - val_loss: 0.6326 - val_recall_m: 0.6779
Epoch 15/50
63/63 [==============================] - 6s 90ms/step - loss: 0.6941 - recall_m: 0.6573 - val_loss: 0.6144 - val_recall_m: 0.6470
Epoch 16/50
63/63 [==============================] - 6s 90ms/step - loss: 0.5921 - recall_m: 0.6823 - val_loss: 0.7391 - val_recall_m: 0.6683
Epoch 17/50
63/63 [==============================] - 6s 90ms/step - loss: 0.5679 - recall_m: 0.6571 - val_loss: 0.6354 - val_recall_m: 0.6488
Epoch 18/50
63/63 [==============================] - 6s 92ms/step - loss: 0.5492 - recall_m: 0.6790 - val_loss: 0.6422 - val_recall_m: 0.6467
Epoch 19/50
63/63 [==============================] - 6s 90ms/step - loss: 0.5617 - recall_m: 0.6726 - val_loss: 0.6477 - val_recall_m: 0.6584
Epoch 1/50
63/63 [==============================] - 6s 91ms/step - loss: 0.6188 - recall_m: 0.6780 - val_loss: 0.5000 - val_recall_m: 0.7248
Epoch 2/50
63/63 [==============================] - 6s 92ms/step - loss: 0.5661 - recall_m: 0.6714 - val_loss: 0.5173 - val_recall_m: 0.7170
Epoch 3/50
63/63 [==============================] - 6s 91ms/step - loss: 0.5904 - recall_m: 0.6629 - val_loss: 0.5379 - val_recall_m: 0.7013
Epoch 4/50
63/63 [==============================] - 6s 93ms/step - loss: 0.5438 - recall_m: 0.6802 - val_loss: 0.5371 - val_recall_m: 0.7092
Epoch 5/50
63/63 [==============================] - 6s 93ms/step - loss: 0.6087 - recall_m: 0.6891 - val_loss: 0.5512 - val_recall_m: 0.6857
Epoch 1/50
63/63 [==============================] - 6s 93ms/step - loss: 0.6604 - recall_m: 0.6688 - val_loss: 0.5034 - val_recall_m: 0.7053
Epoch 2/50
63/63 [==============================] - 6s 93ms/step - loss: 0.5462 - recall_m: 0.6887 - val_loss: 0.4723 - val_recall_m: 0.7834
Epoch 3/50
63/63 [==============================] - 6s 91ms/step - loss: 0.5564 - recall_m: 0.6909 - val_loss: 0.4635 - val_recall_m: 0.7812
Epoch 4/50
63/63 [==============================] - 6s 94ms/step - loss: 0.5276 - recall_m: 0.7220 - val_loss: 0.4716 - val_recall_m: 0.7404
Epoch 5/50
63/63 [==============================] - 6s 92ms/step - loss: 0.5606 - recall_m: 0.6960 - val_loss: 0.4835 - val_recall_m: 0.7404
Epoch 6/50
63/63 [==============================] - 6s 99ms/step - loss: 0.5496 - recall_m: 0.6911 - val_loss: 0.4702 - val_recall_m: 0.7504
Epoch 7/50
63/63 [==============================] - 7s 106ms/step - loss: 0.5238 - recall_m: 0.7298 - val_loss: 0.4694 - val_recall_m: 0.7638
Epoch 1/50
63/63 [==============================] - 7s 103ms/step - loss: 0.5504 - recall_m: 0.7254 - val_loss: 0.4318 - val_recall_m: 0.7703
Epoch 2/50
63/63 [==============================] - 6s 96ms/step - loss: 0.5219 - recall_m: 0.7091 - val_loss: 0.4138 - val_recall_m: 0.7680
Epoch 3/50
63/63 [==============================] - 7s 104ms/step - loss: 0.5186 - recall_m: 0.6981 - val_loss: 0.4013 - val_recall_m: 0.8133
Epoch 4/50
63/63 [==============================] - 6s 98ms/step - loss: 0.5168 - recall_m: 0.7119 - val_loss: 0.4210 - val_recall_m: 0.7922
Epoch 5/50
63/63 [==============================] - 7s 109ms/step - loss: 0.4958 - recall_m: 0.7200 - val_loss: 0.4032 - val_recall_m: 0.7797
Epoch 6/50
63/63 [==============================] - 7s 111ms/step - loss: 0.4887 - recall_m: 0.7362 - val_loss: 0.4129 - val_recall_m: 0.7781
Epoch 7/50
63/63 [==============================] - 7s 104ms/step - loss: 0.4723 - recall_m: 0.7319 - val_loss: 0.4570 - val_recall_m: 0.7625
Epoch 1/50
63/63 [==============================] - 6s 94ms/step - loss: 0.4925 - recall_m: 0.7416 - val_loss: 0.3853 - val_recall_m: 0.8453
Epoch 2/50
63/63 [==============================] - 6s 92ms/step - loss: 0.4744 - recall_m: 0.7328 - val_loss: 0.3715 - val_recall_m: 0.8359
Epoch 3/50
63/63 [==============================] - 6s 93ms/step - loss: 0.4591 - recall_m: 0.7561 - val_loss: 0.3440 - val_recall_m: 0.8430
Epoch 4/50
63/63 [==============================] - 6s 94ms/step - loss: 0.4628 - recall_m: 0.7462 - val_loss: 0.3905 - val_recall_m: 0.7977
Epoch 5/50
63/63 [==============================] - 6s 98ms/step - loss: 0.4528 - recall_m: 0.7461 - val_loss: 0.3628 - val_recall_m: 0.8258
Epoch 6/50
63/63 [==============================] - 6s 92ms/step - loss: 0.4449 - recall_m: 0.7471 - val_loss: 0.4018 - val_recall_m: 0.8000
Epoch 7/50
63/63 [==============================] - 6s 94ms/step - loss: 0.4233 - recall_m: 0.7685 - val_loss: 0.3763 - val_recall_m: 0.8477

plt.figure(figsize=(10,4))
plt.subplot(2,2,1)
plt.plot(res_history2.history['recall_m'])

plt.ylabel('Recall')
plt.title('Training')
plt.subplot(2,2,2)
plt.plot(res_history2.history['val_recall_m'])
plt.title('Validation')

plt.subplot(2,2,3)
plt.plot(res_history2.history['loss'])
plt.ylabel('Training Loss')
plt.xlabel('epochs')

plt.subplot(2,2,4)
plt.plot(res_history2.history['val_loss'])
plt.xlabel('epochs')

Text(0.5, 0, 'epochs')

Comparing between the two variations of ResNet Architecture¶

'First ResNet Architecture AUC & ROC'
y_pred_resnet = np.argmax(resnet1.predict(X_test), axis=1)
# np.argmax(alexnet.predict(X_test), axis=1)

#false positive and true positive rates using roc
fpr_resnet, tpr_resnet, thresholds_resnet = mt.roc_curve(np.argmax(y_test_cat, axis=1), y_pred_resnet)

#area under the curve
auc_resnet = auc(fpr_resnet, tpr_resnet)

'Second ResNet Architecture AUC & ROC'
y_pred_resnet2 = np.argmax(resnet2.predict(X_test), axis=1)

#false positive and true positive rates using roc
fpr_resnet1, tpr_resnet1, thresholds_resnet1 = mt.roc_curve(np.argmax(y_test_cat, axis=1), y_pred_resnet2)

#area under the curve
auc_resnet1 = auc(fpr_resnet1, tpr_resnet1)

plt.figure(figsize=(12,12))

#plot halfway line
plt.plot([0,1], [0,1], 'k--')

#plot for ResNet1
plt.plot(fpr_resnet, tpr_resnet,label='ResNet1  (area = {:.3f})'.format(auc_resnet))

#plot for ResNet2
plt.plot(fpr_resnet1, tpr_resnet1,label='ResNet2  (area = {:.3f})'.format(auc_resnet1))

plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Resnet1 vs Resnet2')
plt.legend(loc='best')
plt.show()

Our first implementation of ResNet lies closer to the top left of the ROC plot and has a larger area under the curve (AUC): 0.914, which is higher than the second implementation's AUC. On this test split, we therefore conclude that the first ResNet is the better implementation.
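The area under a ROC curve is the trapezoid-rule integral of the true positive rate over the false positive rate. The sketch below shows that arithmetic in pure Python. The function name `trapezoid_auc` and the ROC points are hypothetical and are not the values behind the plot. When hard class predictions are used, the curve has a single interior point, and the area simplifies to (1 + TPR - FPR) / 2.

```python
def trapezoid_auc(fpr, tpr):
    """Area under a piecewise-linear ROC curve given points sorted by fpr."""
    area = 0.0
    for i in range(1, len(fpr)):
        width = fpr[i] - fpr[i - 1]
        mean_height = (tpr[i] + tpr[i - 1]) / 2
        area += width * mean_height
    return area


# Hypothetical ROC points for a classifier with hard 0/1 predictions:
# (0, 0), one operating point (FPR = 0.1, TPR = 0.9), and (1, 1)
fpr_points = [0.0, 0.1, 1.0]
tpr_points = [0.0, 0.9, 1.0]

example_auc = trapezoid_auc(fpr_points, tpr_points)
shortcut = (1 + 0.9 - 0.1) / 2  # closed form for a single operating point
print(f"AUC = {example_auc:.3f}, shortcut = {shortcut:.3f}")
```

The diagonal from (0, 0) to (1, 1) gives an area of 0.5, the baseline drawn as the dashed line in the plot.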

from mlxtend.evaluate import mcnemar_table, mcnemar

print('0:',y_test_cat.shape)
yn = np.argmax(y_test_cat, axis=-1)
print('1:',yn.shape)

contingency = mcnemar_table(y_target=yn ,
                            y_model1=y_pred_resnet,
                            y_model2=y_pred_resnet2)

contingency

0: (251, 2)
1: (251,)

array([[211,  24],
       [  8,   8]])

from mlxtend.plotting import checkerboard_plot

brd = checkerboard_plot(contingency,
                        figsize=(3, 3),
                        fmt='%d',

                        col_labels=['ResNet2 right', 'ResNet2 wrong'],
                        row_labels=['ResNet1 right', 'ResNet1 wrong'])
plt.show()

contingency2 = np.array(contingency)
chiSq, p = mcnemar(ary=contingency2, corrected=True)
print("Chi Squared: ", chiSq)
print("P-val: ", p)

Chi Squared:  7.03125
P-val:  0.008009942329880018

At a significance level of 0.05, our p-value of 0.008 is below the threshold, so we reject the null hypothesis that the two models have the same error rate. The models are significantly different, and since ResNet1 is right on 24 test images where ResNet2 is wrong, against 8 in the reverse case, we conclude that ResNet1 performs better than ResNet2.
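The test statistic can be checked by hand from the two off-diagonal cells of the contingency table. This sketch assumes the continuity-corrected form of McNemar's test, chi-squared = (|b - c| - 1)^2 / (b + c), with the p-value taken from a chi-squared distribution with one degree of freedom. The function name `mcnemar_corrected` is hypothetical, and only the standard library is used.

```python
import math


def mcnemar_corrected(b, c):
    """Continuity-corrected McNemar statistic and its p-value (1 degree of freedom)."""
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-squared with 1 dof: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# b: ResNet1 right and ResNet2 wrong, c: ResNet1 wrong and ResNet2 right
chi2, p_value = mcnemar_corrected(b=24, c=8)
print("Chi Squared:", chi2)
print("P-val:", p_value)
```

Only the 32 images on which the models disagree enter the statistic. The 211 images both classify correctly and the 8 both get wrong carry no information about which model is better.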

Xception¶

kfold_data = StratifiedKFold(n_splits=5, random_state=42, shuffle=True).split(X_data, target)
X_train, X_test, y_train, y_test = train_test_split(X_data, target, test_size= 0.20, random_state=42)

y_cat = to_categorical(target)
y_train_cat = to_categorical(y_train)
y_test_cat = to_categorical(y_test)

NUM_CLASSES = 2

# Xception style architecture
from tensorflow.keras.layers import SeparableConv2D
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Add, Input
from tensorflow.keras.layers import average, concatenate
from tensorflow.keras.models import Model

l2_lambda = 0.000001



input_holder = Input(shape=(h, w, 1))

# start with a conv layer
x = Conv2D(filters=32,
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(input_holder)

x = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Conv2D(filters=32,
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x)


x_split = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = SeparableConv2D(filters=32, # sets the number of output channels
               kernel_size=(3,3),
               kernel_initializer='he_uniform',
               kernel_regularizer=l2(l2_lambda),
               padding='same',
               activation='relu',
               depth_multiplier = 1, # depthwise output channels per input channel
               data_format="channels_last")(x_split)


x_split = Add()([x, x_split])

x = SeparableConv2D(filters=32, # sets the number of output channels
               kernel_size=(3,3),
               kernel_initializer='he_uniform',
               kernel_regularizer=l2(l2_lambda),
               padding='same',
               activation='relu',
               depth_multiplier = 1, # depthwise output channels per input channel
               data_format="channels_last")(x_split)

x_split = Add()([x, x_split])


x = Activation("relu")(x_split)

x = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Flatten()(x)
x = Dropout(0.25)(x)
x = Dense(256, activation="relu")(x)
x = Dropout(0.5)(x)
x = Dense(NUM_CLASSES,activation="softmax")(x)

xception1 = Model(inputs=input_holder,outputs=x)

xception1.summary()

Model: "functional_13"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_8 (InputLayer)            [(None, 80, 75, 1)]  0                                            
__________________________________________________________________________________________________
conv2d_27 (Conv2D)              (None, 80, 75, 32)   320         input_8[0][0]                    
__________________________________________________________________________________________________
max_pooling2d_18 (MaxPooling2D) (None, 40, 37, 32)   0           conv2d_27[0][0]                  
__________________________________________________________________________________________________
conv2d_28 (Conv2D)              (None, 40, 37, 32)   9248        max_pooling2d_18[0][0]           
__________________________________________________________________________________________________
max_pooling2d_19 (MaxPooling2D) (None, 20, 18, 32)   0           conv2d_28[0][0]                  
__________________________________________________________________________________________________
separable_conv2d_2 (SeparableCo (None, 20, 18, 32)   1344        max_pooling2d_19[0][0]           
__________________________________________________________________________________________________
add_7 (Add)                     (None, 20, 18, 32)   0           separable_conv2d_2[0][0]         
                                                                 max_pooling2d_19[0][0]           
__________________________________________________________________________________________________
separable_conv2d_3 (SeparableCo (None, 20, 18, 32)   1344        add_7[0][0]                      
__________________________________________________________________________________________________
add_8 (Add)                     (None, 20, 18, 32)   0           separable_conv2d_3[0][0]         
                                                                 add_7[0][0]                      
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 20, 18, 32)   0           add_8[0][0]                      
__________________________________________________________________________________________________
max_pooling2d_20 (MaxPooling2D) (None, 10, 9, 32)    0           activation_16[0][0]              
__________________________________________________________________________________________________
flatten_6 (Flatten)             (None, 2880)         0           max_pooling2d_20[0][0]           
__________________________________________________________________________________________________
dropout_12 (Dropout)            (None, 2880)         0           flatten_6[0][0]                  
__________________________________________________________________________________________________
dense_12 (Dense)                (None, 256)          737536      dropout_12[0][0]                 
__________________________________________________________________________________________________
dropout_13 (Dropout)            (None, 256)          0           dense_12[0][0]                   
__________________________________________________________________________________________________
dense_13 (Dense)                (None, 2)            514         dropout_13[0][0]                 
==================================================================================================
Total params: 750,306
Trainable params: 750,306
Non-trainable params: 0
__________________________________________________________________________________________________
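The parameter counts in the summary above can be checked by hand. The sketch below uses hypothetical helper functions (they are not part of Keras) and assumes each layer has a bias term and that 2x2 max pooling with the default padding floors odd dimensions, which is consistent with the shapes printed in the summary.

```python
def conv2d_params(k, c_in, c_out):
    # one k x k kernel per (input channel, output channel) pair, plus one bias per filter
    return k * k * c_in * c_out + c_out

def separable_conv2d_params(k, c_in, c_out, depth_multiplier=1):
    # depthwise step: one k x k kernel per input channel (times depth_multiplier)
    depthwise = k * k * c_in * depth_multiplier
    # pointwise step: 1 x 1 convolution mixing the depthwise outputs into c_out channels
    pointwise = c_in * depth_multiplier * c_out
    return depthwise + pointwise + c_out  # plus one bias per output channel

def dense_params(n_in, n_out):
    return n_in * n_out + n_out

# spatial size after the three 2x2 max-pooling layers
h, w = 80, 75
for _ in range(3):
    h, w = h // 2, w // 2
flat = h * w * 32  # input size of the first Dense layer

counts = {
    'conv2d_27': conv2d_params(3, 1, 32),
    'conv2d_28': conv2d_params(3, 32, 32),
    'separable_conv2d_2': separable_conv2d_params(3, 32, 32),
    'separable_conv2d_3': separable_conv2d_params(3, 32, 32),
    'dense_12': dense_params(flat, 256),
    'dense_13': dense_params(256, 2),
}
total = sum(counts.values())

for name, n in counts.items():
    print(name, n)
print('flattened size:', flat)
print('total:', total)
```

A separable 3x3 layer with 32 input and 32 output channels needs 1,344 parameters where the standard convolution needs 9,248, yet almost all of the model's parameters sit in the first Dense layer.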

# train without data augmentation to keep training fast
xception1.compile(loss='categorical_crossentropy', # alternative: 'mean_squared_error'
                optimizer='adam', # alternatives: 'adadelta', 'rmsprop'
                metrics=[recall_m])

cnn4 = []
for i, (train_data, test_data) in enumerate(kfold_data):
    x_history1 = xception1.fit(X_data[train_data], y_cat[train_data], batch_size=16,
                epochs=50, verbose=1,
                validation_data=(X_data[test_data],y_cat[test_data]),
                callbacks=[EarlyStopping(monitor='val_loss', patience=4)]
                )
    
    cnn4.append(x_history1)

Epoch 1/50
63/63 [==============================] - 5s 83ms/step - loss: 58.0825 - recall_m: 0.5605 - val_loss: 1.6026 - val_recall_m: 0.6200
Epoch 2/50
63/63 [==============================] - 5s 75ms/step - loss: 1.9115 - recall_m: 0.5909 - val_loss: 0.7867 - val_recall_m: 0.6005
Epoch 3/50
63/63 [==============================] - 5s 73ms/step - loss: 1.0044 - recall_m: 0.6216 - val_loss: 0.7166 - val_recall_m: 0.6005
Epoch 4/50
63/63 [==============================] - 5s 73ms/step - loss: 0.7614 - recall_m: 0.6528 - val_loss: 0.7193 - val_recall_m: 0.6143
Epoch 5/50
63/63 [==============================] - 5s 75ms/step - loss: 0.6925 - recall_m: 0.6841 - val_loss: 0.7033 - val_recall_m: 0.6161
Epoch 6/50
63/63 [==============================] - 5s 78ms/step - loss: 0.6186 - recall_m: 0.6881 - val_loss: 0.6969 - val_recall_m: 0.6083
Epoch 7/50
63/63 [==============================] - 5s 76ms/step - loss: 0.6106 - recall_m: 0.6867 - val_loss: 0.7055 - val_recall_m: 0.6161
Epoch 8/50
63/63 [==============================] - 5s 74ms/step - loss: 0.5591 - recall_m: 0.7200 - val_loss: 0.6972 - val_recall_m: 0.6491
Epoch 9/50
63/63 [==============================] - 5s 73ms/step - loss: 0.5876 - recall_m: 0.7103 - val_loss: 0.7168 - val_recall_m: 0.6062
Epoch 10/50
63/63 [==============================] - 5s 73ms/step - loss: 0.5585 - recall_m: 0.7129 - val_loss: 0.7071 - val_recall_m: 0.5966
Epoch 1/50
63/63 [==============================] - 5s 76ms/step - loss: 0.5993 - recall_m: 0.7040 - val_loss: 0.5110 - val_recall_m: 0.7287
Epoch 2/50
63/63 [==============================] - 5s 74ms/step - loss: 0.5835 - recall_m: 0.6958 - val_loss: 0.4966 - val_recall_m: 0.7738
Epoch 3/50
63/63 [==============================] - 5s 76ms/step - loss: 0.5545 - recall_m: 0.7179 - val_loss: 0.4901 - val_recall_m: 0.7777
Epoch 4/50
63/63 [==============================] - 5s 75ms/step - loss: 0.5422 - recall_m: 0.7321 - val_loss: 0.4893 - val_recall_m: 0.7717
Epoch 5/50
63/63 [==============================] - 5s 76ms/step - loss: 0.5275 - recall_m: 0.7228 - val_loss: 0.5136 - val_recall_m: 0.7539
Epoch 6/50
63/63 [==============================] - 5s 75ms/step - loss: 0.4998 - recall_m: 0.7482 - val_loss: 0.5308 - val_recall_m: 0.7482
Epoch 7/50
63/63 [==============================] - 5s 78ms/step - loss: 0.4788 - recall_m: 0.7643 - val_loss: 0.5320 - val_recall_m: 0.7230
Epoch 8/50
63/63 [==============================] - 5s 83ms/step - loss: 0.4674 - recall_m: 0.7675 - val_loss: 0.5429 - val_recall_m: 0.7386
Epoch 1/50
63/63 [==============================] - 5s 79ms/step - loss: 0.5082 - recall_m: 0.7546 - val_loss: 0.4003 - val_recall_m: 0.8313
Epoch 2/50
63/63 [==============================] - 5s 76ms/step - loss: 0.4897 - recall_m: 0.7600 - val_loss: 0.3939 - val_recall_m: 0.8422
Epoch 3/50
63/63 [==============================] - 5s 75ms/step - loss: 0.4564 - recall_m: 0.7794 - val_loss: 0.4536 - val_recall_m: 0.7922
Epoch 4/50
63/63 [==============================] - 5s 77ms/step - loss: 0.4776 - recall_m: 0.7650 - val_loss: 0.4212 - val_recall_m: 0.7750
Epoch 5/50
63/63 [==============================] - 5s 76ms/step - loss: 0.4560 - recall_m: 0.7774 - val_loss: 0.4305 - val_recall_m: 0.7922
Epoch 6/50
63/63 [==============================] - 5s 76ms/step - loss: 0.4505 - recall_m: 0.7923 - val_loss: 0.4832 - val_recall_m: 0.7484
Epoch 1/50
63/63 [==============================] - 5s 77ms/step - loss: 0.4732 - recall_m: 0.7715 - val_loss: 0.3448 - val_recall_m: 0.8805
Epoch 2/50
63/63 [==============================] - 5s 77ms/step - loss: 0.4422 - recall_m: 0.7873 - val_loss: 0.3391 - val_recall_m: 0.8687
Epoch 3/50
63/63 [==============================] - 5s 75ms/step - loss: 0.4438 - recall_m: 0.7918 - val_loss: 0.3765 - val_recall_m: 0.8523
Epoch 4/50
63/63 [==============================] - 5s 74ms/step - loss: 0.4586 - recall_m: 0.7814 - val_loss: 0.4039 - val_recall_m: 0.8609
Epoch 5/50
63/63 [==============================] - 5s 76ms/step - loss: 0.4564 - recall_m: 0.7839 - val_loss: 0.4356 - val_recall_m: 0.8211
Epoch 6/50
63/63 [==============================] - 5s 77ms/step - loss: 0.4337 - recall_m: 0.7977 - val_loss: 0.4249 - val_recall_m: 0.8156
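The point at which each run in the log above ends follows from the `EarlyStopping(monitor='val_loss', patience=4)` callback. The sketch below is a plain-Python imitation of that rule, not Keras code: `simulate_early_stopping` is a hypothetical helper, and it assumes the Keras defaults of `min_delta=0` and `restore_best_weights=False`. The loss values are copied from the first two runs in the log.

```python
def simulate_early_stopping(val_losses, patience=4):
    """Return (stop_epoch, best_epoch), both 1-indexed (hypothetical helper)."""
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # improvement: remember it and reset the patience counter
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

# val_loss per epoch, copied from the first two runs logged above
run1 = [1.6026, 0.7867, 0.7166, 0.7193, 0.7033,
        0.6969, 0.7055, 0.6972, 0.7168, 0.7071]
run2 = [0.5110, 0.4966, 0.4901, 0.4893,
        0.5136, 0.5308, 0.5320, 0.5429]

stop1, best1 = simulate_early_stopping(run1)
stop2, best2 = simulate_early_stopping(run2)
print('run 1: best epoch', best1, 'stopped at epoch', stop1)
print('run 2: best epoch', best2, 'stopped at epoch', stop2)
```

Training stops four epochs after the lowest validation loss. With `restore_best_weights` left at its default, the model keeps the weights from the last epoch rather than from the best one.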

# plot the training history of the last fold only
plt.figure(figsize=(10,4))
plt.subplot(2,2,1)
plt.plot(x_history1.history['recall_m'])

plt.ylabel('Recall')
plt.title('Training')
plt.subplot(2,2,2)
plt.plot(x_history1.history['val_recall_m'])
plt.title('Validation')

plt.subplot(2,2,3)
plt.plot(x_history1.history['loss'])
plt.ylabel('Training Loss')
plt.xlabel('epochs')

plt.subplot(2,2,4)
plt.plot(x_history1.history['val_loss'])
plt.xlabel('epochs')

Text(0.5, 0, 'epochs')

2nd Variation of Xception¶

In this variation we use Nadam as the optimizer and a batch size of 8 instead of 16. The architecture itself is unchanged.

kfold_data = StratifiedKFold(n_splits=5, random_state=42, shuffle=True).split(X_data, target)
X_train, X_test, y_train, y_test = train_test_split(X_data, target, test_size= 0.20, random_state=42)

y_cat = to_categorical(target)
y_train_cat = to_categorical(y_train)
y_test_cat = to_categorical(y_test)

NUM_CLASSES = 2

# start with a conv layer
x = Conv2D(filters=32,
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(input_holder)

x = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Conv2D(filters=32,
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x)


x_split = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = SeparableConv2D(filters=32, # sets the number of output channels
               kernel_size=(3,3),
               kernel_initializer='he_uniform',
               kernel_regularizer=l2(l2_lambda),
               padding='same',
               activation='relu',
               depth_multiplier = 1, # depthwise output channels per input channel
               data_format="channels_last")(x_split)


x_split = Add()([x, x_split])

x = SeparableConv2D(filters=32, # sets the number of output channels
               kernel_size=(3,3),
               kernel_initializer='he_uniform',
               kernel_regularizer=l2(l2_lambda),
               padding='same',
               activation='relu',
               depth_multiplier = 1, # depthwise output channels per input channel
               data_format="channels_last")(x_split)

x_split = Add()([x, x_split])


x = Activation("relu")(x_split)

x = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Flatten()(x)
x = Dropout(0.25)(x)
x = Dense(256, activation="relu")(x)
x = Dropout(0.5)(x)
x = Dense(NUM_CLASSES,activation="softmax")(x)

xception2 = Model(inputs=input_holder,outputs=x)

# train without data augmentation to keep training fast
xception2.compile(loss='categorical_crossentropy', # alternative: 'mean_squared_error'
                optimizer='nadam', # alternatives: 'adadelta', 'rmsprop'
                metrics=[recall_m])

cnn6 = []
for i, (train_data, test_data) in enumerate(kfold_data):
    x_history2 = xception2.fit(X_data[train_data], y_cat[train_data], batch_size=8,
                epochs=50, verbose=1,
                validation_data=(X_data[test_data],y_cat[test_data]),
                callbacks=[EarlyStopping(monitor='val_loss', patience=4)]
                )
    
    cnn6.append(x_history2)

Epoch 1/50
126/126 [==============================] - 6s 48ms/step - loss: 2.3709 - recall_m: 0.5942 - val_loss: 0.7029 - val_recall_m: 0.6315
Epoch 2/50
126/126 [==============================] - 6s 45ms/step - loss: 0.7149 - recall_m: 0.6369 - val_loss: 0.6728 - val_recall_m: 0.6120
Epoch 3/50
126/126 [==============================] - 6s 45ms/step - loss: 0.6462 - recall_m: 0.6498 - val_loss: 0.6801 - val_recall_m: 0.6081
Epoch 4/50
126/126 [==============================] - 5s 42ms/step - loss: 0.6359 - recall_m: 0.6786 - val_loss: 0.6888 - val_recall_m: 0.5990
Epoch 5/50
126/126 [==============================] - 5s 41ms/step - loss: 0.5854 - recall_m: 0.6984 - val_loss: 0.6379 - val_recall_m: 0.6146
Epoch 6/50
126/126 [==============================] - 5s 42ms/step - loss: 0.5774 - recall_m: 0.7113 - val_loss: 0.6771 - val_recall_m: 0.6211
Epoch 7/50
126/126 [==============================] - 5s 43ms/step - loss: 0.5455 - recall_m: 0.7034 - val_loss: 0.6501 - val_recall_m: 0.6185
Epoch 8/50
126/126 [==============================] - 6s 44ms/step - loss: 0.5437 - recall_m: 0.7133 - val_loss: 0.6932 - val_recall_m: 0.5977
Epoch 9/50
126/126 [==============================] - 6s 44ms/step - loss: 0.5773 - recall_m: 0.7083 - val_loss: 0.6689 - val_recall_m: 0.6198
Epoch 1/50
126/126 [==============================] - 6s 51ms/step - loss: 0.5738 - recall_m: 0.7034 - val_loss: 0.4828 - val_recall_m: 0.7578
Epoch 2/50
126/126 [==============================] - 7s 52ms/step - loss: 0.5594 - recall_m: 0.7153 - val_loss: 0.5199 - val_recall_m: 0.7435
Epoch 3/50
126/126 [==============================] - 6s 48ms/step - loss: 0.5516 - recall_m: 0.7083 - val_loss: 0.5196 - val_recall_m: 0.7539
Epoch 4/50
126/126 [==============================] - 6s 47ms/step - loss: 0.5515 - recall_m: 0.7192 - val_loss: 0.6276 - val_recall_m: 0.6758
Epoch 5/50
126/126 [==============================] - 6s 45ms/step - loss: 0.5447 - recall_m: 0.7113 - val_loss: 0.5473 - val_recall_m: 0.7005
Epoch 1/50
126/126 [==============================] - 6s 47ms/step - loss: 0.5500 - recall_m: 0.7183 - val_loss: 0.4385 - val_recall_m: 0.7695
Epoch 2/50
126/126 [==============================] - 6s 44ms/step - loss: 0.5231 - recall_m: 0.7288 - val_loss: 0.4453 - val_recall_m: 0.7852
Epoch 3/50
126/126 [==============================] - 5s 43ms/step - loss: 0.5115 - recall_m: 0.7411 - val_loss: 0.4768 - val_recall_m: 0.7812
Epoch 4/50
126/126 [==============================] - 5s 43ms/step - loss: 0.4766 - recall_m: 0.7609 - val_loss: 0.4997 - val_recall_m: 0.7656
Epoch 5/50
126/126 [==============================] - 6s 44ms/step - loss: 0.5609 - recall_m: 0.7424 - val_loss: 0.5707 - val_recall_m: 0.7227
Epoch 1/50
126/126 [==============================] - 6s 44ms/step - loss: 0.5852 - recall_m: 0.7378 - val_loss: 0.5029 - val_recall_m: 0.7773
Epoch 2/50
126/126 [==============================] - 5s 44ms/step - loss: 0.5013 - recall_m: 0.7589 - val_loss: 0.4408 - val_recall_m: 0.7656
Epoch 3/50
126/126 [==============================] - 6s 44ms/step - loss: 0.4785 - recall_m: 0.7817 - val_loss: 4.4344 - val_recall_m: 0.6953
Epoch 4/50
126/126 [==============================] - 5s 44ms/step - loss: 0.5470 - recall_m: 0.7758 - val_loss: 0.4448 - val_recall_m: 0.7812
Epoch 5/50
126/126 [==============================] - 6s 45ms/step - loss: 0.4943 - recall_m: 0.7718 - val_loss: 0.5252 - val_recall_m: 0.7344
Epoch 6/50
126/126 [==============================] - 6s 45ms/step - loss: 0.4888 - recall_m: 0.7672 - val_loss: 0.4562 - val_recall_m: 0.7812

# plot the training history of the last fold only
plt.figure(figsize=(10,4))
plt.subplot(2,2,1)
plt.plot(x_history2.history['recall_m'])

plt.ylabel('Recall')
plt.title('Training')
plt.subplot(2,2,2)
plt.plot(x_history2.history['val_recall_m'])
plt.title('Validation')

plt.subplot(2,2,3)
plt.plot(x_history2.history['loss'])
plt.ylabel('Training Loss')
plt.xlabel('epochs')

plt.subplot(2,2,4)
plt.plot(x_history2.history['val_loss'])
plt.xlabel('epochs')

Text(0.5, 0, 'epochs')

Comparing between the two variations of Xception Architecture¶

# First Xception architecture: ROC and AUC
y_pred_xception = np.argmax(xception1.predict(X_test), axis=1)

#false positive and true positive rates using roc
fpr_xception, tpr_xception, thresholds_xception = mt.roc_curve(np.argmax(y_test_cat, axis=1), y_pred_xception)

#area under the curve
auc_xception = auc(fpr_xception, tpr_xception)

# Second Xception architecture: ROC and AUC
y_pred_xception1 = np.argmax(xception2.predict(X_test), axis=1)

#false positive and true positive rates using roc
fpr_xception1, tpr_xception1, thresholds_xception1 = mt.roc_curve(np.argmax(y_test_cat, axis=1), y_pred_xception1)

#area under the curve
auc_xception1 = auc(fpr_xception1, tpr_xception1)

plt.figure(figsize=(12,12))

#plot halfway line
plt.plot([0,1], [0,1], 'k--')

#plot for xception1
plt.plot(fpr_xception, tpr_xception,label='xception1  (area = {:.3f})'.format(auc_xception))

#plot for xception2
plt.plot(fpr_xception1, tpr_xception1,label='xception2  (area = {:.3f})'.format(auc_xception1))

plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('xception1 vs xception2')
plt.legend(loc='best')
plt.show()

Our first implementation of the Xception architecture has the higher AUC of the two: 0.811, which is about 0.05 higher than the second implementation.

print('0:',y_test_cat.shape)
yn = np.argmax(y_test_cat, axis=-1)
print('1:',yn.shape)

contingency = mcnemar_table(y_target=yn ,
                            y_model1=y_pred_xception,
                            y_model2=y_pred_xception1)

contingency

0: (251, 2)
1: (251,)

array([[194,  20],
       [ 12,  25]])

from mlxtend.plotting import checkerboard_plot

brd = checkerboard_plot(contingency,
                        figsize=(3, 3),
                        fmt='%d',

                        col_labels=['Xception2 right', 'Xception2 wrong'],
                        row_labels=['Xception1 right', 'Xception1 wrong'])
plt.show()

contingency2 = np.array(contingency)
chiSq, p = mcnemar(ary=contingency2, corrected=True)
print("Chi Squared: ", chiSq)
print("P-val: ", p)

Chi Squared:  1.53125
P-val:  0.21592493894013678

At a significance level of 0.05, our p-value of 0.216 is greater than the significance level. Therefore we fail to reject our null hypothesis for the two models. The test gives no evidence that their error rates differ, so we cannot conclude that one Xception variation performs better than the other.

Comparing ResNet to Xception¶

from mlxtend.evaluate import mcnemar_table, mcnemar

print('0:',y_test_cat.shape)
yn = np.argmax(y_test_cat, axis=-1)
print('1:',yn.shape)

contingency = mcnemar_table(y_target=yn ,
                            y_model1=y_pred_resnet,
                            y_model2=y_pred_xception)

contingency

0: (251, 2)
1: (251,)

array([[207,  28],
       [  7,   9]])

from mlxtend.plotting import checkerboard_plot

brd = checkerboard_plot(contingency,
                        figsize=(3, 3),
                        fmt='%d',

                        row_labels=['ResNet1 right', 'ResNet1 wrong'],
                        col_labels=['Xception1 right', 'Xception1 wrong'])
plt.show()

contingency2 = np.array(contingency)
chiSq, p = mcnemar(ary=contingency2, corrected=True)
print("Chi Squared: ", chiSq)
print("P-val: ", p)

Chi Squared:  11.428571428571429
P-val:  0.0007232327164301936

At a significance level of 0.05, our p-value of 0.0007 is less than the significance level. Therefore we reject our null hypothesis for the two models. We can conclude that the models are significantly different and that the ResNet, which was right on 28 test images that the Xception model got wrong (versus 7 the other way), performs better than the Xception architecture.
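The direction of the difference can also be read off the contingency table by recovering each model's test accuracy from it. This is a small sketch with a hypothetical helper, `accuracies_from_mcnemar_table`. It assumes the layout used by the checkerboard plot above: rows are model 1 right/wrong and columns are model 2 right/wrong.

```python
def accuracies_from_mcnemar_table(table):
    """Return (accuracy of model 1, accuracy of model 2) (hypothetical helper)."""
    n = sum(sum(row) for row in table)
    model1_right = table[0][0] + table[0][1]  # first row: model 1 right
    model2_right = table[0][0] + table[1][0]  # first column: model 2 right
    return model1_right / n, model2_right / n

# ResNet1 (rows) vs Xception1 (columns), from the table above
acc_resnet, acc_xception = accuracies_from_mcnemar_table([[207, 28], [7, 9]])
print('ResNet1 accuracy:   {:.3f}'.format(acc_resnet))
print('Xception1 accuracy: {:.3f}'.format(acc_xception))
```

ResNet1 classifies 235 of the 251 test images correctly and Xception1 classifies 214 correctly, which agrees with the conclusion drawn from the test.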

Comparing the CNN to the Standard MLP¶

flattened_train_data = np.asarray([x.flatten() for x in X_train ])
flattened_test_data = np.asarray([x.flatten() for x in X_test])
flattened_train_data.shape

(1002, 6000)

mlp = Sequential()
mlp.add( Dense(input_dim=flattened_train_data.shape[1], units=30, activation='relu') )
mlp.add( Dense(units=15, activation='relu') )
mlp.add( Dense(2) )
mlp.add( Activation('softmax') )

mlp.compile(loss='mean_squared_error',
              optimizer='rmsprop',
              metrics=['Precision'])

mlp.fit(flattened_train_data, y_train_cat, 
        batch_size=32, epochs=150, 
        shuffle=True, verbose=0)

<tensorflow.python.keras.callbacks.History at 0x1488e7dc0>

# ROC and AUC for ResNet
y_pred_cnn = np.argmax(resnet1.predict(X_test), axis=1)

#false positive and true positive rates using roc
fpr_cnn, tpr_cnn, thresholds_cnn = roc_curve(np.argmax(y_test_cat,axis=1), y_pred_cnn)

#area under the curve
auc_cnn = auc(fpr_cnn, tpr_cnn)

# ROC and AUC for MLP
y_pred_mlp = np.argmax(mlp.predict(flattened_test_data), axis=1)

#false positive and true positive rates using roc
fpr_mlp, tpr_mlp, thresholds_mlp = roc_curve(np.argmax(y_test_cat, axis=1), y_pred_mlp)

#area under the curve
auc_mlp = auc(fpr_mlp, tpr_mlp)

plt.figure(figsize=(12,12))

#plot halfway line
plt.plot([0,1], [0,1], 'k--')

#plot model 1 ROC
plt.plot(fpr_cnn, tpr_cnn, label='Resnet (area = {:.3f})'.format(auc_cnn))

#plot model 2 ROC
plt.plot(fpr_mlp, tpr_mlp, label='MLP (area = {:.3f})'.format(auc_mlp))


plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('CNN vs MLP')
plt.legend(loc='best')
plt.show()

We can conclude that our ResNet implementation performed better than the multilayer perceptron (MLP). The ROC curve of the ResNet approaches the top left of the graph, while the MLP stays along the diagonal. The ResNet's AUC is 0.914, which is much better than the MLP's AUC of 0.5, the value expected from random guessing.
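Because the ROC curves here are built from hard class labels (the `argmax` of the predictions) rather than from predicted probabilities, each curve has a single operating point between (0, 0) and (1, 1), and its area reduces to the mean of the true positive rate and the true negative rate. The sketch below illustrates this on toy labels that are made up for the example and are not taken from the dataset. `hard_label_auc` is a hypothetical helper.

```python
def hard_label_auc(y_true, y_pred):
    """Area under the 3-point ROC curve (0,0) -> (FPR,TPR) -> (1,1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    # trapezoid rule over the two segments of the curve
    area = 0.5 * fpr * tpr + 0.5 * (1 - fpr) * (tpr + 1)
    return area, tpr, 1 - fpr

# toy labels, made up for illustration
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_mixed = [0, 0, 0, 1, 1, 1, 1, 0]     # one mistake in each class
y_constant = [1, 1, 1, 1, 1, 1, 1, 1]  # always predicts class 1

auc_mixed, tpr_mixed, tnr_mixed = hard_label_auc(y_true, y_mixed)
auc_constant, tpr_constant, tnr_constant = hard_label_auc(y_true, y_constant)
print('mixed:   ', auc_mixed)
print('constant:', auc_constant)
```

A classifier that assigns every sample to the same class lands exactly on the diagonal with an area of 0.5. That is one possible explanation for the MLP's curve, although it cannot be confirmed from the output shown here.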

from mlxtend.evaluate import mcnemar_table, mcnemar

print('0:',y_test_cat.shape)
yn = np.argmax(y_test_cat, axis=-1)
print('1:',yn.shape)
print('2:',y_pred_cnn.shape)
print('3:',y_pred_mlp.shape )

contingency = mcnemar_table(y_target=yn ,
                            y_model1=y_pred_cnn,
                            y_model2=y_pred_mlp)

contingency

0: (251, 2)
1: (251,)
2: (251,)
3: (251,)

array([[161,  74],
       [  2,  14]])

from mlxtend.plotting import checkerboard_plot

brd = checkerboard_plot(contingency,
                        figsize=(3, 3),
                        fmt='%d',
                        col_labels=['MLP right', 'MLP wrong'],
                        row_labels=['CNN right', 'CNN wrong'])
plt.show()

contingency2 = np.array(contingency)
chiSq, p = mcnemar(ary=contingency2, corrected=True)
print("Chi Squared: ", chiSq)
print("P-val: ", p)

Chi Squared:  66.32894736842105
P-val:  3.816130163654068e-16

At a significance level of 0.05, our p-value of 3.8e-16 is less than the significance level. Therefore we can reject our null hypothesis for the two models. We can conclude that the models are significantly different and that the CNN performs better than the MLP.

3. Exceptional Work¶

We will be implementing another ResNet, but this time we will add the skip connection (x_split) back into the residual block twice and then compare its performance to our first implementation of ResNet.

kfold_data = StratifiedKFold(n_splits=5, random_state=42, shuffle=True).split(X_data, target)
X_train, X_test, y_train, y_test = train_test_split(X_data, target, test_size= 0.20, random_state=42)

y_cat = to_categorical(target)
y_train_cat = to_categorical(y_train)
y_test_cat = to_categorical(y_test)

NUM_CLASSES = 2

%%time

# start with a conv layer
x = Conv2D(filters=32,
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(input_holder)

x = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Conv2D(filters=32,
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x)

x_split = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Conv2D(filters=64,
               kernel_size=(1,1),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x_split)

x = Conv2D(filters=64,
               kernel_size=(3,3),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x)

x = Conv2D(filters=32,
               kernel_size=(1,1),
               kernel_initializer='he_uniform', 
               kernel_regularizer=l2(l2_lambda),
               padding='same', 
               activation='relu', 
               data_format="channels_last")(x)

# now add the split layer, x_split, back in twice (residual added in two times)
x = Add()([x, x_split])
x = Add()([x, x_split])
x = Activation("relu")(x)

x = MaxPooling2D(pool_size=(2, 2), data_format="channels_last")(x)

x = Flatten()(x)
x = Dropout(0.25)(x)
x = Dense(256)(x)
x = Activation("relu")(x)
x = Dropout(0.5)(x)
x = Dense(NUM_CLASSES)(x)
x = Activation('softmax')(x)

resnet3 = Model(inputs=input_holder,outputs=x)

resnet3.summary()

Model: "functional_17"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_8 (InputLayer)            [(None, 80, 75, 1)]  0                                            
__________________________________________________________________________________________________
conv2d_31 (Conv2D)              (None, 80, 75, 32)   320         input_8[0][0]                    
__________________________________________________________________________________________________
max_pooling2d_24 (MaxPooling2D) (None, 40, 37, 32)   0           conv2d_31[0][0]                  
__________________________________________________________________________________________________
conv2d_32 (Conv2D)              (None, 40, 37, 32)   9248        max_pooling2d_24[0][0]           
__________________________________________________________________________________________________
max_pooling2d_25 (MaxPooling2D) (None, 20, 18, 32)   0           conv2d_32[0][0]                  
__________________________________________________________________________________________________
conv2d_33 (Conv2D)              (None, 20, 18, 64)   2112        max_pooling2d_25[0][0]           
__________________________________________________________________________________________________
conv2d_34 (Conv2D)              (None, 20, 18, 64)   36928       conv2d_33[0][0]                  
__________________________________________________________________________________________________
conv2d_35 (Conv2D)              (None, 20, 18, 32)   2080        conv2d_34[0][0]                  
__________________________________________________________________________________________________
add_11 (Add)                    (None, 20, 18, 32)   0           conv2d_35[0][0]                  
                                                                 max_pooling2d_25[0][0]           
__________________________________________________________________________________________________
add_12 (Add)                    (None, 20, 18, 32)   0           add_11[0][0]                     
                                                                 max_pooling2d_25[0][0]           
__________________________________________________________________________________________________
activation_19 (Activation)      (None, 20, 18, 32)   0           add_12[0][0]                     
__________________________________________________________________________________________________
max_pooling2d_26 (MaxPooling2D) (None, 10, 9, 32)    0           activation_19[0][0]              
__________________________________________________________________________________________________
flatten_8 (Flatten)             (None, 2880)         0           max_pooling2d_26[0][0]           
__________________________________________________________________________________________________
dropout_16 (Dropout)            (None, 2880)         0           flatten_8[0][0]                  
__________________________________________________________________________________________________
dense_19 (Dense)                (None, 256)          737536      dropout_16[0][0]                 
__________________________________________________________________________________________________
activation_20 (Activation)      (None, 256)          0           dense_19[0][0]                   
__________________________________________________________________________________________________
dropout_17 (Dropout)            (None, 256)          0           activation_20[0][0]              
__________________________________________________________________________________________________
dense_20 (Dense)                (None, 2)            514         dropout_17[0][0]                 
__________________________________________________________________________________________________
activation_21 (Activation)      (None, 2)            0           dense_20[0][0]                   
==================================================================================================
Total params: 788,738
Trainable params: 788,738
Non-trainable params: 0
__________________________________________________________________________________________________
CPU times: user 101 ms, sys: 6.87 ms, total: 108 ms
Wall time: 101 ms

resnet3.compile(loss='categorical_crossentropy', # 'categorical_crossentropy' 'mean_squared_error'
                optimizer='adam', # 'adadelta' 'rmsprop'
                metrics=[recall_m])



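cnn10 = []  # one Keras History object per fold
# Note: resnet3 is not rebuilt between folds, so its weights carry over from one fold to the next
for i, (train_data, test_data) in enumerate(kfold_data):
    res_history3 = resnet3.fit(X_data[train_data], y_cat[train_data], batch_size=16,
                      epochs=50, verbose=1,
                      validation_data=(X_data[test_data], y_cat[test_data]),
                      # stop a fold once val_loss has not improved for 4 epochs
                      callbacks=[EarlyStopping(monitor='val_loss', patience=4)]
                     )

    cnn10.append(res_history3)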

Epoch 1/50
63/63 [==============================] - 6s 94ms/step - loss: 179.3074 - recall_m: 0.5581 - val_loss: 4.9705 - val_recall_m: 0.5966
Epoch 2/50
63/63 [==============================] - 5s 83ms/step - loss: 4.3932 - recall_m: 0.5937 - val_loss: 2.1495 - val_recall_m: 0.6005
Epoch 3/50
63/63 [==============================] - 5s 81ms/step - loss: 2.0691 - recall_m: 0.6244 - val_loss: 1.3789 - val_recall_m: 0.5973
Epoch 4/50
63/63 [==============================] - 5s 82ms/step - loss: 1.2702 - recall_m: 0.6236 - val_loss: 1.0043 - val_recall_m: 0.6026
Epoch 5/50
63/63 [==============================] - 5s 82ms/step - loss: 0.8666 - recall_m: 0.6901 - val_loss: 0.9171 - val_recall_m: 0.6104
Epoch 6/50
63/63 [==============================] - 5s 79ms/step - loss: 0.8118 - recall_m: 0.6877 - val_loss: 0.8356 - val_recall_m: 0.6165
Epoch 7/50
63/63 [==============================] - 5s 79ms/step - loss: 0.7189 - recall_m: 0.7135 - val_loss: 0.7925 - val_recall_m: 0.6101
Epoch 8/50
63/63 [==============================] - 5s 79ms/step - loss: 0.6442 - recall_m: 0.7095 - val_loss: 0.7724 - val_recall_m: 0.6065
Epoch 9/50
63/63 [==============================] - 5s 80ms/step - loss: 0.5673 - recall_m: 0.7224 - val_loss: 0.7502 - val_recall_m: 0.6474
Epoch 10/50
63/63 [==============================] - 5s 80ms/step - loss: 0.5573 - recall_m: 0.7367 - val_loss: 0.7696 - val_recall_m: 0.6413
Epoch 11/50
63/63 [==============================] - 5s 83ms/step - loss: 0.6070 - recall_m: 0.7200 - val_loss: 0.6992 - val_recall_m: 0.6396
Epoch 12/50
63/63 [==============================] - 5s 81ms/step - loss: 0.5644 - recall_m: 0.7355 - val_loss: 0.7202 - val_recall_m: 0.6669
Epoch 13/50
63/63 [==============================] - 5s 78ms/step - loss: 0.4914 - recall_m: 0.7615 - val_loss: 0.7171 - val_recall_m: 0.6751
Epoch 14/50
63/63 [==============================] - 5s 79ms/step - loss: 0.4688 - recall_m: 0.7722 - val_loss: 0.7044 - val_recall_m: 0.6651
Epoch 15/50
63/63 [==============================] - 5s 82ms/step - loss: 0.4655 - recall_m: 0.7911 - val_loss: 0.7576 - val_recall_m: 0.6587
Epoch 1/50
63/63 [==============================] - 5s 80ms/step - loss: 0.5916 - recall_m: 0.7480 - val_loss: 0.3701 - val_recall_m: 0.8945
Epoch 2/50
63/63 [==============================] - 5s 79ms/step - loss: 0.5046 - recall_m: 0.7518 - val_loss: 0.4057 - val_recall_m: 0.8420
Epoch 3/50
63/63 [==============================] - 5s 78ms/step - loss: 0.4905 - recall_m: 0.7621 - val_loss: 0.4092 - val_recall_m: 0.7969
Epoch 4/50
63/63 [==============================] - 5s 81ms/step - loss: 0.4739 - recall_m: 0.7718 - val_loss: 0.4114 - val_recall_m: 0.7955
Epoch 5/50
63/63 [==============================] - 5s 79ms/step - loss: 0.4782 - recall_m: 0.7825 - val_loss: 0.4311 - val_recall_m: 0.8111
Epoch 1/50
63/63 [==============================] - 5s 77ms/step - loss: 0.4809 - recall_m: 0.7653 - val_loss: 0.3382 - val_recall_m: 0.8867
Epoch 2/50
63/63 [==============================] - 5s 78ms/step - loss: 0.4449 - recall_m: 0.7821 - val_loss: 0.3349 - val_recall_m: 0.9123
Epoch 3/50
63/63 [==============================] - 5s 81ms/step - loss: 0.4087 - recall_m: 0.8058 - val_loss: 0.3267 - val_recall_m: 0.9240
Epoch 4/50
63/63 [==============================] - 5s 78ms/step - loss: 0.4085 - recall_m: 0.8073 - val_loss: 0.3494 - val_recall_m: 0.8654
Epoch 5/50
63/63 [==============================] - 5s 79ms/step - loss: 0.4066 - recall_m: 0.8032 - val_loss: 0.3469 - val_recall_m: 0.8732
Epoch 6/50
63/63 [==============================] - 5s 82ms/step - loss: 0.3651 - recall_m: 0.8234 - val_loss: 0.3249 - val_recall_m: 0.8714
Epoch 7/50
63/63 [==============================] - 5s 87ms/step - loss: 0.3713 - recall_m: 0.8236 - val_loss: 0.3430 - val_recall_m: 0.8402
Epoch 8/50
63/63 [==============================] - 5s 84ms/step - loss: 0.3587 - recall_m: 0.8208 - val_loss: 0.3625 - val_recall_m: 0.8402
Epoch 9/50
63/63 [==============================] - 5s 79ms/step - loss: 0.3242 - recall_m: 0.8571 - val_loss: 0.3534 - val_recall_m: 0.8306
Epoch 10/50
63/63 [==============================] - 5s 79ms/step - loss: 0.3076 - recall_m: 0.8669 - val_loss: 0.3883 - val_recall_m: 0.8168
Epoch 1/50
63/63 [==============================] - 5s 78ms/step - loss: 0.3798 - recall_m: 0.8240 - val_loss: 0.2139 - val_recall_m: 0.9648
Epoch 2/50
63/63 [==============================] - 5s 78ms/step - loss: 0.3700 - recall_m: 0.8295 - val_loss: 0.2000 - val_recall_m: 0.9453
Epoch 3/50
63/63 [==============================] - 5s 79ms/step - loss: 0.3395 - recall_m: 0.8493 - val_loss: 0.2052 - val_recall_m: 0.9492
Epoch 4/50
63/63 [==============================] - 5s 81ms/step - loss: 0.3433 - recall_m: 0.8473 - val_loss: 0.2321 - val_recall_m: 0.9273
Epoch 5/50
63/63 [==============================] - 5s 79ms/step - loss: 0.3499 - recall_m: 0.8571 - val_loss: 0.2417 - val_recall_m: 0.9195
Epoch 6/50
63/63 [==============================] - 5s 80ms/step - loss: 0.3501 - recall_m: 0.8493 - val_loss: 0.2473 - val_recall_m: 0.9062
Epoch 1/50
63/63 [==============================] - 5s 80ms/step - loss: 0.3451 - recall_m: 0.8329 - val_loss: 0.1877 - val_recall_m: 0.9609
Epoch 2/50
63/63 [==============================] - 5s 79ms/step - loss: 0.3091 - recall_m: 0.8562 - val_loss: 0.1894 - val_recall_m: 0.9430
Epoch 3/50
63/63 [==============================] - 5s 79ms/step - loss: 0.2996 - recall_m: 0.8686 - val_loss: 0.2034 - val_recall_m: 0.9453
Epoch 4/50
63/63 [==============================] - 5s 79ms/step - loss: 0.3097 - recall_m: 0.8597 - val_loss: 0.1819 - val_recall_m: 0.9508
Epoch 5/50
63/63 [==============================] - 5s 80ms/step - loss: 0.2817 - recall_m: 0.8766 - val_loss: 0.1952 - val_recall_m: 0.9414
Epoch 6/50
63/63 [==============================] - 5s 80ms/step - loss: 0.3162 - recall_m: 0.8667 - val_loss: 0.2187 - val_recall_m: 0.9414
Epoch 7/50
63/63 [==============================] - 5s 79ms/step - loss: 0.2998 - recall_m: 0.8711 - val_loss: 0.2131 - val_recall_m: 0.9414
Epoch 8/50
63/63 [==============================] - 5s 79ms/step - loss: 0.2207 - recall_m: 0.8974 - val_loss: 0.2010 - val_recall_m: 0.9039

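# res_history3 holds only the history of the last fold, so these plots show that fold alone
plt.figure(figsize=(10,4))

# top left: training recall
plt.subplot(2,2,1)
plt.plot(res_history3.history['recall_m'])
plt.ylabel('Recall')
plt.title('Training')

# top right: validation recall
plt.subplot(2,2,2)
plt.plot(res_history3.history['val_recall_m'])
plt.title('Validation')

# bottom left: training loss
plt.subplot(2,2,3)
plt.plot(res_history3.history['loss'])
plt.ylabel('Training Loss')
plt.xlabel('epochs')

# bottom right: validation loss
plt.subplot(2,2,4)
plt.plot(res_history3.history['val_loss'])
plt.xlabel('epochs')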

Text(0.5, 0, 'epochs')
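The plots above cover only the last fold, because res_history3 is overwritten on every pass through the loop. A minimal sketch of summarising all five folds is given here. It assumes the final-epoch val_recall_m values are copied by hand from the training log; in the notebook they could instead be read with `[h.history['val_recall_m'][-1] for h in cnn10]`. The variable names are illustrative and not part of the original notebook.

```python
# Final-epoch validation recall for each of the five folds,
# copied from the training log (hypothetical variable names).
from statistics import mean, stdev

final_val_recall = [0.6587, 0.8111, 0.8168, 0.9062, 0.9039]

for fold, recall in enumerate(final_val_recall, start=1):
    print(f"fold {fold}: val_recall_m = {recall:.4f}")

# Mean and sample standard deviation across folds
mean_recall = mean(final_val_recall)
std_recall = stdev(final_val_recall)
print(f"mean = {mean_recall:.4f}, sample std = {std_recall:.4f}")
```

Since the same resnet3 instance is trained on every fold, the later folds start from weights that have already seen their validation images, so the rising per-fold recall should be read with that in mind.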

# First ResNet architecture: ROC and AUC
# argmax converts the network outputs into hard 0/1 class labels
y_pred_resnet = np.argmax(resnet1.predict(X_test), axis=1)

# false positive and true positive rates using roc
fpr_resnet, tpr_resnet, thresholds_resnet = mt.roc_curve(np.argmax(y_test_cat, axis=1), y_pred_resnet)

# area under the curve
auc_resnet = auc(fpr_resnet, tpr_resnet)

# New ResNet architecture: ROC and AUC
y_pred_resnet2 = np.argmax(resnet3.predict(X_test), axis=1)

# false positive and true positive rates using roc
fpr_resnet1, tpr_resnet1, thresholds_resnet1 = mt.roc_curve(np.argmax(y_test_cat, axis=1), y_pred_resnet2)

# area under the curve
auc_resnet1 = auc(fpr_resnet1, tpr_resnet1)

plt.figure(figsize=(12,12))

#plot halfway line
plt.plot([0,1], [0,1], 'k--')

# plot for the first ResNet
plt.plot(fpr_resnet, tpr_resnet,label='ResNet1  (area = {:.3f})'.format(auc_resnet))

# plot for the new ResNet
plt.plot(fpr_resnet1, tpr_resnet1,label='ResNet_New  (area = {:.3f})'.format(auc_resnet1))

plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Resnet1 vs Resnet_New')
plt.legend(loc='best')
plt.show()

As shown by the graph above, the new ResNet's curve approaches the top-left corner of the graph faster than our original implementation. The new ResNet also has a larger AUC of 0.944, which suggests that it is a slightly better implementation.
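Because the ROC code above passes argmax class labels (0 or 1) to roc_curve, each curve has a single interior point, and its area reduces to (1 + TPR - FPR) / 2. The sketch below checks that identity with the trapezoidal rule. The labels are made up for illustration and are not the lab's test data, and the helper name is hypothetical.

```python
def hard_label_roc_auc(y_true, y_pred):
    """Return (fpr, tpr, area) for the three-point ROC curve of 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    # With hard labels the curve is (0,0) -> (fpr,tpr) -> (1,1)
    points = [(0.0, 0.0), (fpr, tpr), (1.0, 1.0)]
    # Trapezoidal rule over the two segments
    area = sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return fpr, tpr, area

# Toy labels (assumed, not from the notebook): 4 positives, 4 negatives
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

fpr, tpr, area = hard_label_roc_auc(y_true, y_pred)
print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}, AUC = {area:.2f}")
```

A fuller curve would need continuous scores rather than labels. Assuming the final activation is a softmax, one option would be to pass the class-1 column of the predictions, such as `resnet3.predict(X_test)[:, 1]`, to roc_curve instead of the argmax.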

print('0:',y_test_cat.shape)
yn = np.argmax(y_test_cat, axis=-1)
print('1:',yn.shape)

contingency = mcnemar_table(y_target=yn ,
                            y_model1=y_pred_resnet,
                            y_model2=y_pred_resnet2)

contingency

0: (251, 2)
1: (251,)

array([[225,  10],
       [ 11,   5]])

brd = checkerboard_plot(contingency,
                        figsize=(3, 3),
                        fmt='%d',
                        col_labels=['ResNet_New right', 'ResNet_New wrong'],
                        row_labels=['ResNet1 right', 'ResNet1 wrong'])
plt.show()

contingency2 = np.array(contingency)
chiSq, p = mcnemar(ary=contingency2, corrected=True)
print("Chi Squared: ", chiSq)
print("P-val: ", p)

Chi Squared:  0.0
P-val:  1.0

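However, at a significance level of 0.05, our p-value of 1.0 is greater than the significance level. Therefore we fail to reject the null hypothesis that the two models have the same error rate. The models disagree on only 21 of the 251 test images, split almost evenly (10 where only ResNet1 is right, 11 where only ResNet_New is right), so with a p-value of 1 we conclude that there is no statistically significant difference between them.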
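The test result can be checked by hand from the contingency table above. The sketch below assumes the continuity-corrected McNemar statistic, chi^2 = (|b - c| - 1)^2 / (b + c), where b and c are the two off-diagonal counts, and a chi-squared distribution with one degree of freedom for the p-value. It uses only the standard library, so it is an independent check rather than the mcnemar function called in the notebook.

```python
import math

# Contingency table from the notebook output:
# rows = ResNet1 right/wrong, columns = ResNet_New right/wrong
contingency = [[225, 10],
               [11, 5]]

b = contingency[0][1]  # ResNet1 right, ResNet_New wrong
c = contingency[1][0]  # ResNet1 wrong, ResNet_New right

# Continuity-corrected McNemar statistic
chi_sq = (abs(b - c) - 1) ** 2 / (b + c)

# Survival function of chi-squared with 1 degree of freedom:
# P(X > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_sq / 2))

print("Chi Squared: ", chi_sq)
print("P-val: ", p_value)
```

With b = 10 and c = 11, the numerator (|10 - 11| - 1)^2 is zero, which is why the statistic is 0.0 and the p-value is 1.0.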