diff --git a/ProgrammingAssignment_1/ProgrammingAssignment1.ipynb b/ProgrammingAssignment_1/ProgrammingAssignment1.ipynb
index 62d7633202aea1bc6377448376d54f1f0b250b19..60b7992745ddac818bc5d878b136409b069fe2c8 100644
--- a/ProgrammingAssignment_1/ProgrammingAssignment1.ipynb
+++ b/ProgrammingAssignment_1/ProgrammingAssignment1.ipynb
@@ -341,56 +341,7 @@
     "print('Confidence interval: {}-{}'.format(lower_bound, upper_bound))"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    " ## TASK 4: Plotting a learning curve\n",
-    " \n",
-    "A learning curve shows how error changes as the training set size increases. For more information, see [learning curves](https://www.dataquest.io/blog/learning-curves-machine-learning/).\n",
-    "We'll plot the error values for training and validation data while varying the size of the training set. Report a good size for training set for which there is a good balance between bias and variance."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Rubric:\n",
-    "* Correct training error calculation for different training set sizes +8, +8\n",
-    "* Correct validation error calculation for different training set sizes +8, +8\n",
-    "* Reasonable learning curve +4, +4"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# train using %10, %20, %30, ..., 100% of training data\n",
-    "training_proportions = np.arange(0.10, 1.01, 0.10)\n",
-    "train_size = len(train_indices)\n",
-    "training_sizes = np.int(np.ceil(train_size*proportion))\n",
-    "\n",
-    "# TODO\n",
-    "error_train = []\n",
-    "error_val = []\n",
-    "\n",
-    "# For each size in training_sizes\n",
-    "for size in training_sizes:\n",
-    " # fit the model using \"size\" data point\n",
-    " # Calculate error for training and validation sets\n",
-    " # populate error_train and error_val arrays. \n",
-    " # Each entry in these arrays\n",
-    " # should correspond to each entry in training_sizes.\n",
-    "\n",
-    "# plot the learning curve\n",
-    "plt.plot(training_sizes, error_train, 'r', label = 'training_error')\n",
-    "plt.plot(training_sizes, error_val, 'g', label = 'validation_error')\n",
-    "plt.legend()\n",
-    "plt.show()"
-   ]
-  },
+
   {
    "cell_type": "markdown",
    "metadata": {},