diff --git a/ProgrammingAssignment1.ipynb b/ProgrammingAssignment1.ipynb
index 0951185f36643a0d18b3b8e31866d574cd73b20e..3dc81dcb07c601505ac7eb145934dbc6cb254760 100644
--- a/ProgrammingAssignment1.ipynb
+++ b/ProgrammingAssignment1.ipynb
@@ -6,7 +6,12 @@
    "source": [
     "# $k$-Nearest Neighbor\n",
     "\n",
-    "We'll implement $k$-Nearest Neighbor ($k$-NN) algorithm for this assignment. A skeleton of a general supervised learning model is provided in \"model.ipynb\". Please look through it and complete the \"preprocess\" and \"partition\" methods.\n",
+    "We'll implement the $k$-Nearest Neighbor ($k$-NN) algorithm for this assignment. We recommend using the [Madelon](https://archive.ics.uci.edu/ml/datasets/Madelon) dataset, although it is not mandatory. If you choose to use a different dataset, it should meet the following criteria:\n",
+    "* the dependent variable should be binary (suited for binary classification)\n",
+    "* the number of features (attributes) should be at least 50\n",
+    "* the number of examples (instances) should be at least 1,000\n",
+    "\n",
+    "A skeleton of a general supervised learning model is provided in \"model.ipynb\". Please look through it and complete the \"preprocess\" and \"partition\" methods.\n",
     "\n",
     "### Assignment Goals:\n",
     "In this assignment, we will:\n",
@@ -53,7 +58,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Choice of distance metric plays an important role in the performance of $k$-NN. Let's start by implementing a distance method in the \"distance\" function below. It should take two data points and the name of the metric and return a scalar value."
+    "The choice of distance metric plays an important role in the performance of $k$-NN. Let's start by implementing a distance method in the \"distance\" function below. It should take two data points and the name of the metric, and return a scalar value."
   ]
  },
  {
@@ -84,7 +89,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "We can start implementing our $k$-NN classifier. $k$-NN class inherits Model class. You'll need to implement \"fit\" and \"predict\" methods. Use the \"distance\" function you defined above. \"fit\" method takes $k$ as an argument. \"predict\" takes as input the feature vector for a single test point and outputs the predicted class and the proportion of predicted class labels in $k$ nearest neighbors."
+    "We can start implementing our $k$-NN classifier. The kNN class inherits the Model class. You'll need to implement the \"fit\" and \"predict\" methods. Use the \"distance\" function you defined above. The \"fit\" method takes $k$ as an argument. \"predict\" takes as input an $m \\times d$ array containing the $d$-dimensional feature vectors of $m$ examples, and outputs the proportion of positive labels among the $k$ nearest neighbors of each example."
   ]
  },
  {
@@ -97,32 +102,32 @@
     "    '''\n",
     "    Inherits Model class. Implements the k-NN algorithm for classification.\n",
     "    '''\n",
-    "    def __init__(self, preprocessor_f, partition_f, distance_f):\n",
-    "        super().__init__(preprocessor_f, partition_f)\n",
-    "        \n",
-    "        # set self.distance_f and self.distance_metric\n",
-    "        \n",
-    "        \n",
-    "    def fit(self, k):\n",
+    "    \n",
+    "    def fit(self, k, distance_f, **kwargs):\n",
     "        '''\n",
     "        Fit the model. This is pretty straightforward for k-NN.\n",
     "        '''\n",
-    "        \n",
+    "        # set self.k, self.distance_f, self.distance_metric\n",
     "        raise NotImplementedError\n",
     "        \n",
     "        return\n",
     "    \n",
     "    \n",
-    "    def predict(self, test_point):\n",
+    "    def predict(self, test_indices):\n",
     "        \n",
     "        raise NotImplementedError\n",
     "        \n",
+    "        pred = []\n",
+    "        # For each test point, use your implementation\n",
+    "        # of the distance function,\n",
+    "        #   distance_f(..., distance_metric),\n",
+    "        # to find the labels of its k nearest neighbors.\n",
     "        \n",
-    "        # use self.distance_f(...,self.distance_metric)\n",
+    "        # Find the ratio of positive labels among them\n",
+    "        # and append it to pred with pred.append(ratio).\n",
     "        \n",
-    "        # return the predicted class label and the following ratio: \n",
-    "        # number of points that have the same label as the test point / k\n",
-    "        return predicted_label, ratio\n",
+    "\n",
+    "        return np.array(pred)\n",
     "    "
   ]
  },
@@ -147,32 +152,16 @@
   "outputs": [],
   "source": [
     "# populate the keyword arguments dictionary kwargs\n",
-    "kwargs = {'p': 0.3, 'v': 0.1, 'file_path': 'mnist_test.csv', 'metric': 'Euclidean'}\n",
+    "kwargs = {'p': 0.3, 'v': 0.1, 'seed': 123, 'file_path': 'madelon_train'}\n",
     "# initialize the model\n",
-    "my_model = kNN(preprocessor_f=preprocess, partition_f=partition, distance=distance, **kwargs)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Assign a value to $k$ and fit the $k$-NN model."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "my_model.fit(k=10)"
+    "my_model = kNN(preprocessor_f=preprocess, partition_f=partition, **kwargs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "You can use \"predict_batch\" function below to evaluate your model on the test data. You do not need to change the value of the threshold yet."
+    "Assign a value to $k$ and fit the $k$-NN model."
   ]
  },
  {
@@ -181,29 +170,15 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "def predict_batch(model, indices, threshold=0.5):\n",
-    "    '''\n",
-    "    model: a fitted k-NN model\n",
-    "    indices: for data points to predict\n",
-    "    threshold: lower limit on the ratio for a point to be considered positive\n",
-    "    '''\n",
-    "    \n",
-    "    predicted_labels = []\n",
-    "    true_labels = []\n",
-    "\n",
-    "    for index in indices:\n",
-    "        # vary the threshold value for ROC analysis\n",
-    "        predicted_classes.append(model.predict(model.features[index], threshold))\n",
-    "        true_classes.append(model.labels[index])\n",
-    "\n",
-    "    return predicted_labels, true_labels"
+    "kwargs_f = {'metric': 'Euclidean'}\n",
+    "my_model.fit(k=10, distance_f=distance, **kwargs_f)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Use \"predict_batch\" function above to report your model's accuracy on the test set. Also, calculate and report the confidence interval on the generalization error estimate."
+    "Evaluate your model on the test data and report your accuracy. Also, calculate and report the confidence interval on the generalization error estimate."
   ]
  },
  {
@@ -212,7 +187,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "predict_batch(my_model, my_model.test_indices)\n",
+    "final_labels = my_model.predict(my_model.test_indices)\n",
     "# Calculate accuracy and generalization error with confidence interval here."
   ]
  },
@@ -220,9 +195,9 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# TODO: leaa \n",
+    "# TODO: learning curve\n",
     "\n",
-    "Now that we have the true labels and the predicted ones from our model, we can build a confusion matrix and see how accurate our model is. Implement the \"conf_matrix\" function that takes as input an array of true labels ($true$) and an array of predicted labels ($pred$). It should output a numpy.ndarray. "
+    "Now that we have the true labels and the predicted ones from our model, we can build a confusion matrix and see how accurate our model is. Implement the \"conf_matrix\" function (in model.ipynb) that takes as input an array of true labels ($true$) and an array of predicted labels ($pred$). It should output a numpy.ndarray. You do not need to change the value of the threshold parameter yet."
   ]
  },
  {
@@ -231,14 +206,8 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "def conf_matrix(true, pred):\n",
-    "    '''\n",
-    "    true: nx1 array of true labels for test set\n",
-    "    pred: nx1 array of predicted labels for test set\n",
-    "    '''\n",
-    "    raise NotImplementedError\n",
-    "    # returns the confusion matrix as numpy.ndarray\n",
-    "    return c_mat"
+    "# With seed 123 you should see array([196, 106, 193, 105]), i.e. [tp, tn, fp, fn]\n",
+    "conf_matrix(my_model.labels[my_model.test_indices], final_labels, threshold=0.5)"
   ]
  },
  {
@@ -259,7 +228,7 @@
   "outputs": [],
   "source": [
     "# Change values of $k. \n",
-    "# Calculate accuracies and confusion matrices for the validation set.\n",
+    "# Calculate accuracies for the validation set.\n",
     "# Report a good k value that you'll use in the following analyses."
   ]
  },
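For reference alongside the ProgrammingAssignment1 changes above, here is a minimal sketch of the pieces the notebook asks for: a "distance" function, the positive-label ratio that "predict" computes for one test point, and the usual normal-approximation confidence interval on the error estimate. This is not part of the commit; it assumes NumPy arrays, the metric names used in kwargs, positive examples labeled 1, and a hypothetical train_indices array standing in for however the training examples are indexed.

import numpy as np

def distance(x, y, metric):
    # scalar distance between two feature vectors x and y
    if metric == 'Euclidean':
        return np.sqrt(np.sum((x - y) ** 2))
    elif metric == 'Manhattan':
        return np.sum(np.abs(x - y))
    else:
        raise ValueError('unknown metric: {}'.format(metric))

def knn_ratio(features, labels, train_indices, test_index, k, metric):
    # ratio of positive labels among the k nearest training neighbors
    dists = [distance(features[i], features[test_index], metric)
             for i in train_indices]
    nearest = np.asarray(train_indices)[np.argsort(dists)[:k]]
    return np.mean(labels[nearest] == 1)

def error_ci(err, n, z=1.96):
    # normal-approximation confidence interval on the generalization
    # error estimate from n test examples (z = 1.96 gives a 95% interval)
    half = z * np.sqrt(err * (1 - err) / n)
    return err - half, err + half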
diff --git a/ProgrammingAssignment2.ipynb b/ProgrammingAssignment2.ipynb
index 428855aa73245f97c4298e2c421df8e0737a1f45..41badf7aafb60b063f53955f8e76c2e87c35f603 100644
--- a/ProgrammingAssignment2.ipynb
+++ b/ProgrammingAssignment2.ipynb
@@ -31,7 +31,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -48,7 +48,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -57,7 +57,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -176,7 +176,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "If the data is better suited for quadratic/cubic regression, regions of positive and negative residuals will alternate in the plot. Regardless, modify the fit and predict in the class definition to raise the feature values to $polynomial\\_degree$. You can directly make the modification in the above definition, do not repeat. Use the validation set to find among the degree of polynomial that results in lowest \"mse\"."
+    "If the data is better suited for quadratic/cubic regression, regions of positive and negative residuals will alternate in the plot. Regardless, modify \"fit\" and \"predict\" in the class definition to raise the feature values to $polynomial\\_degree$. You can make the modification directly in the definition above; do not repeat it. Use the validation set to find the degree of polynomial that results in the lowest \"mse\"."
   ]
  },
  {
@@ -211,7 +211,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 27,
+   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
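The degree-selection step described in the ProgrammingAssignment2 hunk above could look like the following sketch (not from the commit; x_train, y_train, x_val, y_val are hypothetical names, and np.vander builds the polynomial design matrix):

import numpy as np

def poly_design(x, degree):
    # (m,) feature vector -> (m, degree + 1) matrix [1, x, x^2, ..., x^degree]
    return np.vander(x, N=degree + 1, increasing=True)

def fit_poly(x, y, degree):
    # least-squares coefficients for the polynomial model
    coeffs, *_ = np.linalg.lstsq(poly_design(x, degree), y, rcond=None)
    return coeffs

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def best_degree(x_train, y_train, x_val, y_val, degrees=(1, 2, 3)):
    # pick the degree (linear, quadratic or cubic) with the lowest validation mse
    return min(degrees, key=lambda d: mse(
        y_val, poly_design(x_val, d) @ fit_poly(x_train, y_train, d)))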
diff --git a/model-Solution.ipynb b/model-Solution.ipynb
index 943c73d631331e5d1aa8c6584e909a21c74a1149..ccdf6101e2a4027f83ac529c3f66d6e214faed01 100644
--- a/model-Solution.ipynb
+++ b/model-Solution.ipynb
@@ -74,7 +74,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -97,6 +97,36 @@
     "    def predict(self):\n",
     "        raise NotImplementedError"
    ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def conf_matrix(true_l, pred, threshold):\n",
+    "    tp = tn = fp = fn = 0\n",
+    "    \n",
+    "    for i in range(len(true_l)):\n",
+    "        tmp = -1\n",
+    "        # predicted label is 1 if the ratio exceeds the threshold, else -1\n",
+    "        if pred[i] > threshold:\n",
+    "            tmp = 1\n",
+    "        if tmp == true_l[i]:\n",
+    "            \n",
+    "            if true_l[i] == 1:\n",
+    "                tp += 1\n",
+    "            else:\n",
+    "                tn += 1\n",
+    "        else:\n",
+    "            if true_l[i] == 1:\n",
+    "                fn += 1\n",
+    "            else:\n",
+    "                fp += 1\n",
+    "    \n",
+    "    # returns the confusion matrix as numpy.ndarray\n",
+    "    return np.array([tp, tn, fp, fn])"
+   ]
  }
 ],
 "metadata": {
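The loop in the model-Solution.ipynb cell above can also be written in vectorized form. This sketch assumes, as the solution's comparison tmp == true_l[i] implies, that true labels take the values -1 and 1 and that pred holds positive-label ratios:

import numpy as np

def conf_matrix_vectorized(true_l, pred, threshold):
    true_l = np.asarray(true_l)
    # threshold the ratios into hard -1/1 predictions
    hard = np.where(np.asarray(pred) > threshold, 1, -1)
    tp = np.sum((hard == 1) & (true_l == 1))
    tn = np.sum((hard == -1) & (true_l == -1))
    fp = np.sum((hard == 1) & (true_l == -1))
    fn = np.sum((hard == -1) & (true_l == 1))
    return np.array([tp, tn, fp, fn])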
diff --git a/model.ipynb b/model.ipynb
index 7ba1ea9653ea6209dbc9a2f7c20506e67130ab4a..fa80a4feb258f1a42087db329c474f7fb7ed3d73 100644
--- a/model.ipynb
+++ b/model.ipynb
@@ -71,6 +71,9 @@
     "    # np.random.choice might come in handy. Do not sample with replacement!\n",
     "    # Be sure to not use the same indices in test and validation sets!\n",
     "    \n",
+    "    # Use the first np.ceil(size*p) of the sampled indices for the test set\n",
+    "    # and the following np.ceil(size*v) for the validation set.\n",
+    "    \n",
     "    raise NotImplementedError\n",
     "    \n",
     "    # return two 1d arrays: one keeping validation set indices, the other keeping test set indices \n",
@@ -79,7 +82,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -103,6 +106,42 @@
     "    def predict(self, testpoint):\n",
     "        raise NotImplementedError"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## General supervised learning related functions"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Implement the \"conf_matrix\" function that takes as input an array of true labels ($true$), an array of predicted labels ($pred$), and a $threshold$ on the positive-label ratio. It should output a numpy.ndarray."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def conf_matrix(true, pred, threshold=0.5):\n",
+    "    '''\n",
+    "    true: nx1 array of true labels for test set\n",
+    "    pred: nx1 array of predicted labels (or positive-label ratios) for test set\n",
+    "    threshold: ratio above which a prediction counts as positive\n",
+    "    '''\n",
+    "    raise NotImplementedError\n",
+    "    \n",
+    "    tp = tn = fp = fn = 0\n",
+    "    # calculate true positives (tp), true negatives (tn),\n",
+    "    # false positives (fp) and false negatives (fn)\n",
+    "    \n",
+    "    # returns the confusion matrix as numpy.ndarray\n",
+    "    return np.array([tp, tn, fp, fn])"
+   ]
  }
 ],
 "metadata": {
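Finally, a sketch of the "partition" logic described in the model.ipynb comments above: sample all indices without replacement, then slice off the test and validation portions. The exact signature and the seed argument (mirroring the 'seed' entry in kwargs) are assumptions, not the skeleton's actual API:

import numpy as np

def partition(size, p, v, seed=None):
    rng = np.random.RandomState(seed)
    # permutation via sampling without replacement, as the hint suggests
    indices = rng.choice(size, size=size, replace=False)
    n_test = int(np.ceil(size * p))
    n_val = int(np.ceil(size * v))
    test_indices = indices[:n_test]
    val_indices = indices[n_test:n_test + n_val]  # disjoint from the test set
    # return validation indices first, matching the skeleton's comment
    return val_indices, test_indices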