{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# JUPYTER NOTEBOOK TIPS\n",
"\n",
"Each rectangular box is called a cell. \n",
"* Ctrl+ENTER evaluates the current cell; if it contains Python code, it runs the code, if it contains Markdown, it returns rendered text.\n",
"* Alt+ENTER evaluates the current cell and adds a new cell below it.\n",
"* If you click to the left of a cell, you'll notice the frame changes color to blue. You can erase a cell by hitting 'dd' (that's two \"d\"s in a row) when the frame is blue."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Supervised Learning Model Skeleton\n",
"\n",
"We'll use this skeleton for implementing different supervised learning algorithms."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"class Model:\n",
" \n",
" def fit(self):\n",
" \n",
" raise NotImplementedError\n",
" \n",
" def predict(self, test_points):\n",
" raise NotImplementedError"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"def preprocess(feature_file, label_file):\n",
" '''\n",
" Args:\n",
" feature_file: str \n",
" file containing features\n",
" label_file: str\n",
" file containing labels\n",
" Returns:\n",
" features: ndarray\n",
" nxd features\n",
" labels: ndarray\n",
" nx1 labels\n",
" '''\n",
" \n",
" # read in features and labels\n",
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
" return features, labels"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"def partition(size, t, v = 0):\n",
" '''\n",
" Args:\n",
" size: int\n",
" number of examples in the whole dataset\n",
" t: float\n",
" proportion kept for test\n",
" v: float\n",
" proportion kept for validation\n",
" Returns:\n",
" test_indices: ndarray\n",
" 1D array containing test set indices\n",
" val_indices: ndarray\n",
" 1D array containing validation set indices\n",
" '''\n",
" \n",
" # number of test and validation examples\n",
" \n",
" return test_indices, val_indices, train_indices"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## TASK 1: Implement `distance` function"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\"distance\" function will be used in calculating cost of *k*-NN. It should take two data points and the name of the metric and return a scalar value."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"#TODO: Programming Assignment 1\n",
"def distance(x, y, metric):\n",
" '''\n",
" Args:\n",
" x: ndarray \n",
" 1D array containing coordinates for a point\n",
" y: ndarray\n",
" 1D array containing coordinates for a point\n",
" metric: str\n",
" Euclidean, Manhattan \n",
" Returns:\n",
" dist: float\n",
" '''\n",
"if metric == 'Euclidean':\n",
"raise NotImplementedError",
"elif metric == 'Manhattan':\n",
"raise NotImplementedError\n",
"else:\n",
"raise ValueError('{} is not a valid metric.'.format(metric))\n",
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
" return dist # scalar distance btw x and y"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## General supervised learning performance related functions "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Implement the \"conf_matrix\" function that takes as input an array of true labels (*true*) and an array of predicted labels (*pred*). It should output a numpy.ndarray."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# TODO: Programming Assignment 1\n",
"\n",
"def conf_matrix(true, pred, n_classes):\n",
" '''\n",
" Args: \n",
" true: ndarray\n",
" nx1 array of true labels for test set\n",
" pred: ndarray \n",
" nx1 array of predicted labels for test set\n",
" n_classes: int\n",
" Returns:\n",
" result: ndarray\n",
" n_classes x n_classes array confusion matrix\n",
" '''\n",
" raise NotImplementedError\n",
" result = np.ndarray([n_classes, n_classes])\n",
" \n",
" \n",
" # returns the confusion matrix as numpy.ndarray\n",
" return result"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"ROC curves are a good way to visualize sensitivity vs. 1-specificity for varying cut off points. \"ROC\" takes a list containing different *threshold* parameter values to try and returns two arrays; one where each entry is the sensitivity at a given threshold and the other where entries are 1-specificities."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# TODO: Programming Assignment 1\n",
"\n",
"def ROC(true_labels, preds, value_list):\n",
" '''\n",
" Args:\n",
" true_labels: ndarray\n",
" 1D array containing true labels\n",
" preds: ndarray\n",
" 1D array containing thresholded value (e.g. proportion of neighbors in kNN)\n",
" value_list: ndarray\n",
" 1D array containing different threshold values\n",
" Returns:\n",
" sens: ndarray\n",
" 1D array containing sensitivities\n",
" spec_: ndarray\n",
" 1D array containing 1-specifities\n",
" '''\n",
" \n",
" # calculate sensitivity, 1-specificity\n",
" # return two arrays\n",
" \n",
" raise NotImplementedError\n",
" \n",
" return sens, spec_"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}