Update README.md

Commit 2a83f255 authored Oct 27, 2020 by BRIGGAN2
--- a/README.md
+++ b/README.md
@@ -2,115 +2,175 @@

 This repository contains:
 * cifar10.py: subroutines for loading and downloading cifar10 data
-* omp.py: STARTER CODE for OMP whitening subroutine
-* dictlearn.py: STARTER CODE for dictlearn subroutine
+* cnn_helper.py: helper functions for CNN code
+* activations.py: contains activation functions and useful derivatives
+* cnnff.py: STARTER CODE for CNN feed-forward subroutine
+* cnnbp.py: STARTER CODE for CNN backpropagation subroutine
+* cnn.py: STARTER CODE for training CNNs


 ## Instructions
-### 1. Complete zca.py:
+### 1. Complete cnnff.py:

 `cnnff.py` contains the following subroutine

 ```python
-def zca_white(x):
-  """ perform zca whitening 
+def cnnff(x, net):
+    ''' cnnff
+    
+        perform feed-forward pass for convolutional network
        
        inputs: 
-        x: numpy array of images
+            x: batch of input images (N x H x W x Cin numpy array)
+            net: List structure describing the network architecture (see cnn.py for details)
        
        outputs:
-        y: numpy array of whitened images        
-  """
+            net: updated net data structure that stores outputs from each layer
+    '''
    
-  # *** put code for zca whitening here ***
+    # set input layer
    
-  return y
-```
+    # loop over layers 1...L
+    for n in range(1,len(net)):
+        # current input
+        inp = net[n-1]['output'] 
+        # current layer
+        layer = net[n] 

-which need to completed using only the `numpy` package.  
+        # if layer type is Conv
+        if layer['type'] == 'Conv':
+            # conv followed by activation function
+            ''' *** put code here *** '''    
+        # if layer type is Pool
+        elif layer['type'] == 'Pool':
+            ''' *** put code here *** '''
+    
+    return net
+```
+which needs to be completed using only the `numpy` package and the `correlate` or `convolve` functions from `scipy.signal`.  
 **No other packages should be used**   

-### 2. Complete ica.py
+### 2. Complete cnnbp.py

-`ica.py` includes the following subroutines that need to be completed. The first
-subroutine `sample` should sample patches from images.  
+`cnnbp.py` includes the following subroutine 

 ```python
-def sample(x, patch_size=16, num_patches=10):
-    ''' randomly sample patches from x 
-    
-        inputs:
-            x: numpy array of images
-            patch_size: patch dims for patch_size x patch_size images 
-                        (default 16)
-            num_patches: number of patches to sample (default 10)
-
-        outputs:
-            y: numpy array of patches
-    '''
-
-    return y
+def cnnbp(labels, net):
+    
+    # batch size
+    batch_size = net[0]['output'].shape[0]
+      
+    # local gradient final layer:
+    # derivative of softmax loss w.r.t presynaptic response
+    ''' *** put code here *** '''
+   
+    # compute local gradients other layers
+    for n in range(len(net)-2,0,-1):
+        
+        # current layer
+        layer = net[n] 
+        # next_layer
+        layer_ = net[n+1]
+        
+        # if next layer type is Conv
+        if layer_['type'] == 'Conv':
+            ''' *** put code here *** '''
+        # if next layer type is Pool
+        elif layer_['type'] == 'Pool':
+            ''' *** put code here *** '''
+    
+    # compute gradW and gradb for each layer
+    for n in range(1,len(net)):
+        # current
+        layer = net[n]
+        # prev
+        layer_ = net[n-1]
+        
+        if layer['type'] == 'Conv':
+            # compute gradient wrt convolutional filters
+            ''' *** put code here *** '''
+            # compute gradient wrt biases           
+            ''' *** put code here *** '''
+            # save for gradient update (see cnn.py)
+            layer['gradW'] = gradW
+            layer['gradb'] = gradb 
+    
+    return net
 ```
+which needs to be completed using only the `numpy` package and the `correlate` or `convolve` functions from `scipy.signal`.  
+**No other packages should be used**  

-The second subroutine `ica` should perform gradient descent with the backtracking 
-line search to adapted the learning ratefor the ICA objective function. 
+### 3. Complete cnn.py
+
+`cnn.py` includes the following subroutine

 ```python
-def ica(x, **args):
-    ''' perform independent component analysis (ICA)
+def trainCNN(X, Y, **args):
+    ''' trainCNN
    
        inputs:
-            x: numpy array of images
+            X: images (n x 32 x 32 x 3 array)
+            Y: labels (n x 10 one_hot array)
            args:
-                lr: learning rate (default 1e-3)
-                nsteps: maximum iterations (default 1000)
-                k: number of latent variables (defualt 20)
+                nepochs: number of epochs (default 100)
+                bsize: batch size (default 32)
+                lr: learning rate (default 0.001)
        
        returns:
-            L: numpy array of loss function value for all iterations
-            W: numpy array of ICA basis vectors
+            L: loss per epoch
+            A: accuracy per epoch 
+            
    '''

    # default parameters
    if not len(args):
-        args['lr'] = 1
-        args['nsteps'] = 200
-        args['k'] = 64
+        args['nepochs'] = 100
+        args['bsize'] = 32
+        args['lr'] = 0.001
    
+    nepochs = args['nepochs']
+    bsize = args['bsize']
    lr = args['lr']
-    nsteps = args['nsteps']
-    k = args['k']

-    # ***initialize variables here***
+    # define CNN
+    net = [ {'type': 'Input', 'output': None}, # Layer 0 
+            {'type': 'Conv', 'shape': (16, 5, 5, 3), 'stride': 1, 'activation': 'ReLU', 'W': None, 'b': None, 'd': None, 'gradW': None, 'gradb': None, 'output': None}, # Layer 1
+            {'type': 'Pool', 'shape': (1, 2, 2, 1), 'stride': 2, 'activation': None, 'd': None, 'output': None}, # Layer 2
+            {'type': 'Conv', 'shape': (32, 5, 5, 16), 'stride': 1, 'activation': 'ReLU', 'W': None, 'b': None, 'd': None, 'gradW': None, 'gradb': None, 'output': None}, # Layer 3
+            {'type': 'Pool', 'shape': (1, 2, 2, 1), 'stride': 2, 'activation': None, 'd': None, 'output': None}, # Layer 4
+            {'type': 'Conv', 'shape': (64, 5, 5, 32), 'stride': 1, 'activation': 'ReLU', 'W': None, 'b': None, 'd': None, 'gradW': None, 'gradb': None, 'output': None}, # Layer 5
+            {'type': 'Conv', 'shape': (10, 1, 1, 64), 'stride': 1, 'activation': 'softmax', 'W': None, 'b': None, 'd': None, 'gradW': None, 'gradb': None, 'output': None}] # Layer 6
    
-    '''training loop using graident descent'''
-    for step in range(nsteps):
-        # ***insert gradient descent code here***
-        ''' use backtracking line search '''
+    # initialize CNN
+    net = initCNN(net)
    
-        # print loss
-        print('step: {} / {}, L: {}'.format(step, nsteps, L[step]))
+    for epoch in range(nepochs):
+        # shuffle images and labels
        
-    return L, W
-```
+        # compute cross_entropy_loss and accuracy over all images
+        #*** feed forward
+        #*** loss
+        #*** accuracy
        
-`ica` and `sample` need to completed only the following packages:
-* `numpy`
-* `scipy.linalg`
-**No other packages should be used**
+        # print loss and accuracy every epoch
+        print('epoch: ', epoch, 'loss: ', loss, 'acc: ', acc)
        
-### Parameters
-`ica.py` also provides a sample main that loads cifar10 using `cifar10.py`,  
-whitens the images using `zca.py`, performs ICA using the `ica(x,**args)`, and displays
-and displays the learned basis images `W`.
+        # for each batch of images
+        for i in range(X.shape[0] // bsize):
+            # batch i
            
-**Note that values of parameters such as the learning rate `lr`, number of basis
- images `k`, and number of optimization steps `nsteps` may need to be changed**
+            # feedforward
            
-### Example
+            # backprop           
            
-The following image show an example result of applying ICA to whitened cifar10 data
+            # apply / update gradients
            
-![Test Image 1](cifar_ica_basis_64.png)
+        
+    return L, A
+```
+which can be completed using
+* `numpy`
+* `scipy.linalg`
+**No other packages should be used**
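
As a sanity check on the architecture defined in `trainCNN`, the output shape of each layer can be traced with a few lines of Python. This is a minimal sketch, not part of the starter code: it assumes unpadded ("valid") convolution and pooling, and that `layer['shape']` is ordered `(Cout, kH, kW, Cin)`, neither of which the README states explicitly. The helper names `out_size` and `trace_shapes` are hypothetical.

```python
def out_size(size, kernel, stride):
    """Spatial output size of an unpadded ('valid') sliding window."""
    return (size - kernel) // stride + 1


def trace_shapes(net, in_shape=(32, 32, 3)):
    """Return the (H, W, C) shape after every layer of `net`.

    Assumes 'valid' windows and that layer['shape'] is
    (Cout, kH, kW, Cin) for Conv and (1, kH, kW, 1) for Pool.
    """
    h, w, c = in_shape
    shapes = [(h, w, c)]
    for layer in net[1:]:
        _, kh, kw, _ = layer['shape']
        h = out_size(h, kh, layer['stride'])
        w = out_size(w, kw, layer['stride'])
        if layer['type'] == 'Conv':
            c = layer['shape'][0]  # Conv sets the channel count; Pool keeps it
        shapes.append((h, w, c))
    return shapes


# Architecture copied from trainCNN, keeping only the fields used here
net = [
    {'type': 'Input'},
    {'type': 'Conv', 'shape': (16, 5, 5, 3), 'stride': 1},
    {'type': 'Pool', 'shape': (1, 2, 2, 1), 'stride': 2},
    {'type': 'Conv', 'shape': (32, 5, 5, 16), 'stride': 1},
    {'type': 'Pool', 'shape': (1, 2, 2, 1), 'stride': 2},
    {'type': 'Conv', 'shape': (64, 5, 5, 32), 'stride': 1},
    {'type': 'Conv', 'shape': (10, 1, 1, 64), 'stride': 1},
]

for n, shape in enumerate(trace_shapes(net)):
    print('layer', n, 'output shape (H, W, C):', shape)
```

Under these assumptions the last layer produces a 1 x 1 x 10 output per image, which lines up with the 10 one-hot classes in `Y`.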