Update model.ipynb

8b26d097 · Zeynep Hakguder · 0745b868 · 8b26d097
Commit 8b26d097 authored 6 years ago by Zeynep Hakguder
--- a/ProgrammingAssignment_1/model.ipynb
+++ b/ProgrammingAssignment_1/model.ipynb
@@ -122,7 +122,12 @@
    "    Returns:\n",
    "        dist: float\n",
    "    '''\n",
-
+    "if metric == 'Euclidean':\n",
+        "raise NotImplementedError",
+    "elif metric == 'Manhattan':\n",
+        "raise NotImplementedError\n",
+    "else:\n",
+        "raise ValueError('{} is not a valid metric.'.format(metric))\n",
    "    return dist # scalar distance btw x and y"
   ]
  },

 %% Cell type:markdown id: tags:

 # JUPYTER NOTEBOOK TIPS

 Each rectangular box is called a cell.
 * Ctrl+ENTER evaluates the current cell: if it contains Python code, the code is run; if it contains Markdown, the rendered text is shown.
 * Alt+ENTER evaluates the current cell and adds a new cell below it.
 * If you click to the left of a cell, you'll notice the frame changes color to blue. While the frame is blue, you can delete the cell by hitting 'dd' (that's two "d"s in a row).

 %% Cell type:markdown id: tags:

 # Supervised Learning Model Skeleton

 We'll use this skeleton for implementing different supervised learning algorithms.

 %% Cell type:code id: tags:

 ``` python
 class Model:

    def fit(self):

        raise NotImplementedError

    def predict(self, test_points):
        raise NotImplementedError
 ```
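As an illustration of how the skeleton might be used, here is a minimal sketch of a subclass. The class name `MajorityClassSketch` and the choice to pass the labels to the constructor are assumptions made for this example (the skeleton's `fit` takes no arguments, so the data has to live on the instance somehow); they are not part of the assignment.

```python
import numpy as np


class Model:
    # Same skeleton as in the cell above, repeated so the sketch is self-contained.
    def fit(self):
        raise NotImplementedError

    def predict(self, test_points):
        raise NotImplementedError


class MajorityClassSketch(Model):
    """Hypothetical baseline: always predicts the most frequent training label."""

    def __init__(self, labels):
        # Assumption: training labels are handed over at construction time.
        self.labels = np.asarray(labels).ravel()
        self.majority = None

    def fit(self):
        # Count each distinct label and remember the most frequent one.
        values, counts = np.unique(self.labels, return_counts=True)
        self.majority = values[np.argmax(counts)]

    def predict(self, test_points):
        # One prediction per test point, all equal to the majority label.
        return np.full(len(test_points), self.majority)
```

Any real model would replace the bodies of `fit` and `predict`, but keep the same two-method interface.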

 %% Cell type:code id: tags:

 ``` python
 def preprocess(feature_file, label_file):
    '''
    Args:
        feature_file: str
            file containing features
        label_file: str
            file containing labels
    Returns:
        features: ndarray
            nxd features
        labels: ndarray
            nx1 labels
    '''

    # read in features and labels
    return features, labels
 ```
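The skeleton above leaves the reading step open. One possible way to fill it in is sketched below, assuming both files are plain text with whitespace-separated numeric values and one example per line. The file format is not stated in the notebook, so treat that (and the name `preprocess_sketch`) as assumptions.

```python
import numpy as np


def preprocess_sketch(feature_file, label_file):
    # Assumption: whitespace-delimited numeric text, one example per row.
    # ndmin=2 keeps the features two-dimensional (n x d) even when d == 1.
    features = np.loadtxt(feature_file, ndmin=2)
    # Read labels as a 1D array, then reshape to n x 1 to match the docstring.
    labels = np.loadtxt(label_file, ndmin=1).reshape(-1, 1)
    return features, labels
```

If the data comes as CSV instead, passing `delimiter=','` to `np.loadtxt` would be the corresponding change.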

 %% Cell type:code id: tags:

 ``` python
 def partition(size, t, v = 0):
    '''
    Args:
        size: int
            number of examples in the whole dataset
        t: float
            proportion kept for test
        v: float
            proportion kept for validation
    Returns:
        test_indices: ndarray
            1D array containing test set indices
        val_indices: ndarray
            1D array containing validation set indices
        train_indices: ndarray
            1D array containing training set indices
    '''

    # number of test and validation examples

    return test_indices, val_indices, train_indices
 ```
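A minimal sketch of how such a split could be done is shown below. It shuffles the indices once and slices the result into three disjoint pieces. The name `partition_sketch`, the extra `seed` parameter, and the choice to round the set sizes down are assumptions for illustration only.

```python
import numpy as np


def partition_sketch(size, t, v=0, seed=None):
    # Hypothetical `seed` argument, added only to make the split reproducible.
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(size)

    # Number of test and validation examples (rounded down).
    n_test = int(size * t)
    n_val = int(size * v)

    # Slice the shuffled indices into three disjoint sets.
    test_indices = shuffled[:n_test]
    val_indices = shuffled[n_test:n_test + n_val]
    train_indices = shuffled[n_test + n_val:]
    return test_indices, val_indices, train_indices
```

Because all three arrays are slices of one permutation, every example lands in exactly one set.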

 %% Cell type:markdown id: tags:

 ## TASK 1: Implement `distance` function

 %% Cell type:markdown id: tags:

 The "distance" function will be used in calculating the cost of *k*-NN. It should take two data points and the name of the metric, and return a scalar value.

 %% Cell type:code id: tags:

 ``` python
 #TODO: Programming Assignment 1
 def distance(x, y, metric):
    '''
    Args:
        x: ndarray
            1D array containing coordinates for a point
        y: ndarray
            1D array containing coordinates for a point
        metric: str
            Euclidean, Manhattan
    Returns:
        dist: float
    '''
+    if metric == 'Euclidean':
+        raise NotImplementedError
+    elif metric == 'Manhattan':
+        raise NotImplementedError
+    else:
+        raise ValueError('{} is not a valid metric.'.format(metric))
    return dist # scalar distance btw x and y
 ```
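For reference, the two metrics named in the docstring are defined as $\sqrt{\sum_i (x_i - y_i)^2}$ (Euclidean) and $\sum_i |x_i - y_i|$ (Manhattan). The sketch below shows one way these formulas could be written with NumPy. The name `distance_sketch` is hypothetical, and this is an illustration of the definitions, not the intended solution.

```python
import numpy as np


def distance_sketch(x, y, metric):
    # Convert inputs so that plain lists also work.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    if metric == 'Euclidean':
        # Square root of the sum of squared coordinate differences.
        dist = np.sqrt(np.sum((x - y) ** 2))
    elif metric == 'Manhattan':
        # Sum of absolute coordinate differences.
        dist = np.sum(np.abs(x - y))
    else:
        raise ValueError('{} is not a valid metric.'.format(metric))
    return float(dist)  # scalar distance between x and y
```

For example, the points (0, 0) and (3, 4) are 5 apart under the Euclidean metric and 7 apart under the Manhattan metric.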

 %% Cell type:markdown id: tags:

 ## General supervised learning performance-related functions

 %% Cell type:markdown id: tags:

 Implement the "conf_matrix" function, which takes as input an array of true labels (*true*), an array of predicted labels (*pred*), and the number of classes (*n_classes*). It should return the confusion matrix as a numpy.ndarray.

 %% Cell type:code id: tags:

 ``` python
 # TODO: Programming Assignment 1

 def conf_matrix(true, pred, n_classes):
    '''
    Args:
        true:  ndarray
            nx1 array of true labels for test set
        pred: ndarray
            nx1 array of predicted labels for test set
        n_classes: int
    Returns:
        result: ndarray
            n_classes x n_classes array confusion matrix
    '''
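    # Start from zeros: np.ndarray([...]) would leave the entries uninitialized.
    # Assumes numpy has been imported as np earlier in the notebook.
    result = np.zeros([n_classes, n_classes])

    # TODO: fill in the counts, then remove this placeholder
    raise NotImplementedError

    # returns the confusion matrix as numpy.ndarray
    return result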
 ```
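One way the counting could work is sketched below. It assumes the labels are integers in the range 0 to `n_classes - 1`, and it puts true labels on the rows and predicted labels on the columns. The notebook does not fix either convention, so both are assumptions, as is the name `conf_matrix_sketch`.

```python
import numpy as np


def conf_matrix_sketch(true, pred, n_classes):
    # Flatten nx1 inputs to 1D integer arrays so they can be used as indices.
    true = np.asarray(true, dtype=int).ravel()
    pred = np.asarray(pred, dtype=int).ravel()

    # Assumed layout: result[i, j] counts examples with true label i
    # that were predicted as label j.
    result = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true, pred):
        result[t, p] += 1
    return result
```

With this layout, the diagonal holds the correctly classified examples and every off-diagonal entry is a particular kind of mistake.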

 %% Cell type:markdown id: tags:

 ROC curves are a good way to visualize sensitivity vs. 1-specificity for varying cutoff points. "ROC" takes a list of different *threshold* values to try and returns two arrays: one where each entry is the sensitivity at a given threshold, and the other where each entry is the corresponding 1-specificity.

 %% Cell type:code id: tags:

 ``` python
 # TODO: Programming Assignment 1

 def ROC(true_labels, preds, value_list):
    '''
    Args:
        true_labels: ndarray
            1D array containing true labels
        preds: ndarray
            1D array containing the values to be thresholded (e.g. proportion of neighbors in kNN)
        value_list: ndarray
            1D array containing different threshold values
    Returns:
        sens: ndarray
            1D array containing sensitivities
        spec_: ndarray
            1D array containing 1-specificities
    '''

    # calculate sensitivity, 1-specificity
    # return two arrays

    raise NotImplementedError

    return sens, spec_
 ```
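The computation described above can be sketched as follows. Sensitivity is TP / (TP + FN), and 1-specificity is FP / (FP + TN). The sketch assumes binary labels with 1 as the positive class, assumes an example is predicted positive when its value is greater than or equal to the threshold, and assumes both classes occur in `true_labels`. None of these conventions is stated in the notebook, and `roc_sketch` is a hypothetical name.

```python
import numpy as np


def roc_sketch(true_labels, preds, value_list):
    true_labels = np.asarray(true_labels).ravel()
    preds = np.asarray(preds, dtype=float).ravel()

    # Assumption: label 1 is the positive class, everything else is negative.
    positives = true_labels == 1
    negatives = ~positives

    sens = np.zeros(len(value_list))
    spec_ = np.zeros(len(value_list))
    for i, threshold in enumerate(value_list):
        # Assumption: predict positive when the value reaches the threshold.
        predicted_pos = preds >= threshold
        tp = np.sum(predicted_pos & positives)
        fp = np.sum(predicted_pos & negatives)
        sens[i] = tp / positives.sum()   # TP / (TP + FN)
        spec_[i] = fp / negatives.sum()  # FP / (FP + TN), i.e. 1-specificity
    return sens, spec_
```

Plotting `spec_` on the horizontal axis against `sens` on the vertical axis gives the ROC curve for the thresholds tried.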