%% Cell type:markdown id: tags:
# JUPYTER NOTEBOOK TIPS
Each rectangular box is called a cell.
* Ctrl+ENTER evaluates the current cell: if it contains Python code, it runs the code; if it contains Markdown, it renders the text.
* Alt+ENTER evaluates the current cell and adds a new cell below it.
* If you click to the left of a cell, the frame turns blue. You can delete a cell by hitting 'dd' (two "d"s in a row) while the frame is blue.
%% Cell type:markdown id: tags:
# Supervised Learning Model Skeleton
We'll use this skeleton for implementing different supervised learning algorithms.
%% Cell type:code id: tags:
``` python
import numpy as np

class Model:
    # Base class: concrete models override fit and predict.
    def fit(self):
        raise NotImplementedError

    def predict(self, test_points):
        raise NotImplementedError
```
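%% Cell type:markdown id: tags:
As a quick illustration of how this skeleton is meant to be used, here is a minimal sketch of a hypothetical subclass (the name `MajorityClassModel` and its behavior are illustrative only, not part of the assignment): it memorizes the training labels in `fit` and predicts the most frequent label for every test point.
%% Cell type:code id: tags:
``` python
class MajorityClassModel(Model):
    # Illustrative only: always predicts the most frequent training label.
    def fit(self, labels):
        values, counts = np.unique(labels, return_counts=True)
        self.majority_label = values[np.argmax(counts)]

    def predict(self, test_points):
        # One prediction per test point, all equal to the majority label.
        return np.full(len(test_points), self.majority_label)
```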
%% Cell type:code id: tags:
``` python
def preprocess(feature_file, label_file):
    '''
    Args:
        feature_file: str
            file containing features
        label_file: str
            file containing labels
    Returns:
        features: ndarray
            nxd features
        labels: ndarray
            nx1 labels
    '''
    # read in features and labels (assumes plain-text, whitespace-delimited files)
    features = np.genfromtxt(feature_file)
    labels = np.genfromtxt(label_file)
    return features, labels
```
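%% Cell type:markdown id: tags:
For example, assuming whitespace-delimited text files, `preprocess` would be called as below. The file names are placeholders, not the assignment's actual data files.
%% Cell type:code id: tags:
``` python
# Placeholder file names; substitute the data files provided with the assignment.
features, labels = preprocess('features.txt', 'labels.txt')
print(features.shape, labels.shape)  # expect (n, d) features and n labels
```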
%% Cell type:code id: tags:
``` python
def partition(size, t, v=0):
    '''
    Args:
        size: int
            number of examples in the whole dataset
        t: float
            proportion kept for test
        v: float
            proportion kept for validation
    Returns:
        test_indices: ndarray
            1D array containing test set indices
        val_indices: ndarray
            1D array containing validation set indices
        train_indices: ndarray
            1D array containing training set indices
    '''
    # number of test and validation examples
    n_test = int(np.ceil(size * t))
    n_val = int(np.ceil(size * v))
    # shuffle all indices, then slice off the test, validation and training portions
    shuffled = np.random.permutation(size)
    test_indices = shuffled[:n_test]
    val_indices = shuffled[n_test:n_test + n_val]
    train_indices = shuffled[n_test + n_val:]
    return test_indices, val_indices, train_indices
```
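%% Cell type:markdown id: tags:
A quick sanity check of the split sizes (illustrative only): partitioning 100 examples with 20% test and 10% validation should leave 70 training indices.
%% Cell type:code id: tags:
``` python
test_idx, val_idx, train_idx = partition(100, 0.2, v=0.1)
print(len(test_idx), len(val_idx), len(train_idx))  # 20 10 70
```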
%% Cell type:markdown id: tags:
## TASK 1: Implement `distance` function
%% Cell type:markdown id: tags:
The `distance` function will be used in calculating the cost of *k*-NN. It should take two data points and the name of the metric, and return a scalar value.
%% Cell type:code id: tags:
``` python
#TODO: Programming Assignment 1
def distance(x, y, metric):
    '''
    Args:
        x: ndarray
            1D array containing coordinates for a point
        y: ndarray
            1D array containing coordinates for a point
        metric: str
            Euclidean, Manhattan
    Returns:
        dist: float
    '''
    if metric == 'Euclidean':
        raise NotImplementedError
    elif metric == 'Manhattan':
        raise NotImplementedError
    else:
        raise ValueError('{} is not a valid metric.'.format(metric))
    return dist  # scalar distance btw x and y
```
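%% Cell type:markdown id: tags:
As a reference for what the two metrics compute (a standalone sketch, not necessarily how you must implement the function above), one standard way to express them with numpy, assuming `x` and `y` are 1D arrays of equal length:
%% Cell type:code id: tags:
``` python
x_example = np.array([0.0, 0.0])
y_example = np.array([3.0, 4.0])

# Euclidean: square root of the sum of squared coordinate differences.
print(np.sqrt(np.sum((x_example - y_example) ** 2)))  # 5.0

# Manhattan: sum of absolute coordinate differences.
print(np.sum(np.abs(x_example - y_example)))  # 7.0
```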
%% Cell type:markdown id: tags:
## General supervised learning performance-related functions
%% Cell type:markdown id: tags:
Implement the `conf_matrix` function that takes as input an array of true labels (*true*), an array of predicted labels (*pred*), and the number of classes (*n_classes*). It should output the confusion matrix as a numpy.ndarray.
%% Cell type:code id: tags:
``` python
# TODO: Programming Assignment 1
def conf_matrix(true, pred, n_classes):
    '''
    Args:
        true: ndarray
            nx1 array of true labels for test set
        pred: ndarray
            nx1 array of predicted labels for test set
        n_classes: int
    Returns:
        result: ndarray
            n_classes x n_classes confusion matrix
    '''
    raise NotImplementedError
    # start from a zero matrix of counts rather than an uninitialized array
    result = np.zeros((n_classes, n_classes), dtype=int)
    # returns the confusion matrix as numpy.ndarray
    return result
```
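%% Cell type:markdown id: tags:
To make the expected output concrete, here is a sketch of the standard counting logic (not necessarily the required implementation; the convention used here, rows = true class and columns = predicted class, is an assumption):
%% Cell type:code id: tags:
``` python
true_example = np.array([0, 0, 1, 1, 1])
pred_example = np.array([0, 1, 1, 1, 0])

cm = np.zeros((2, 2), dtype=int)
for t_label, p_label in zip(true_example, pred_example):
    cm[t_label, p_label] += 1  # count each (true, predicted) pair

print(cm)
# [[1 1]
#  [1 2]]
```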
%% Cell type:markdown id: tags:
ROC curves are a good way to visualize sensitivity vs. 1-specificity for varying cut-off points. `ROC` takes the true labels, the predicted values, and a list of different *threshold* values to try, and returns two arrays: one where each entry is the sensitivity at a given threshold, and one where each entry is the corresponding 1-specificity.
%% Cell type:code id: tags:
``` python
# TODO: Programming Assignment 1
def ROC(true_labels, preds, value_list):
    '''
    Args:
        true_labels: ndarray
            1D array containing true labels
        preds: ndarray
            1D array containing the values to be thresholded (e.g. proportion of neighbors in kNN)
        value_list: ndarray
            1D array containing different threshold values
    Returns:
        sens: ndarray
            1D array containing sensitivities
        spec_: ndarray
            1D array containing 1-specificities
    '''
    # calculate sensitivity and 1-specificity at each threshold
    # and return the two arrays
    raise NotImplementedError
    return sens, spec_
```
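%% Cell type:markdown id: tags:
For reference, here is a self-contained sketch of how sensitivity and 1-specificity can be computed at a set of thresholds for binary labels in {0, 1} (the thresholding rule used here, "predict positive when the score is at least the threshold", is an assumption, not a requirement of the assignment):
%% Cell type:code id: tags:
``` python
labels_example = np.array([1, 1, 0, 0])
scores_example = np.array([0.9, 0.4, 0.6, 0.1])
thresholds = np.array([0.0, 0.5, 1.0])

sens_example, one_minus_spec_example = [], []
for thr in thresholds:
    predicted_pos = scores_example >= thr
    tp = np.sum(predicted_pos & (labels_example == 1))
    fp = np.sum(predicted_pos & (labels_example == 0))
    fn = np.sum(~predicted_pos & (labels_example == 1))
    tn = np.sum(~predicted_pos & (labels_example == 0))
    sens_example.append(tp / (tp + fn))            # sensitivity = TP / (TP + FN)
    one_minus_spec_example.append(fp / (fp + tn))  # 1 - specificity = FP / (FP + TN)

print(sens_example, one_minus_spec_example)
```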