Module 1 · Quiz Prep
Quiz #1 Topics and Structure
Structure
- 60 minutes to complete the quiz
- 20 multiple choice questions (or true/false) [4 points each]
- Two (2) short answer questions [30 and 10 points]
- DO NOT leave any questions blank.
Topics
- AI and ML
- AI: artificial intelligence (the overall field/science)
- ML: machine learning (a specific approach to AI)
- Types of ML
- Supervised
- Labels ("answers") are given
- Unsupervised
- No labels are given; the algorithm looks for structure in the data on its own
- Reinforcement
- Learns from feedback given by the environment
- Supervised Learning
- Classification
- Discrete/fixed set of target variable categories (e.g., dog/not dog)
- Ex: kNN, decision tree, random forest, neural net
- Regression
- Continuous/infinite possible values for target variable (e.g., house prices)
- linear regression
- logistic regression: despite the name, almost always used for binary classification
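To make the classification vs. regression split concrete, here is a minimal sketch in plain Python. The helper names (`knn_predict`, `fit_line`) and the toy data are made up for illustration; they are not from the course materials. The classifier returns one of a fixed set of labels, while the regressor returns a number.

```python
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

# Classification: the target is one of a fixed set of categories.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["not dog", "not dog", "not dog", "dog", "dog", "dog"]
predicted_class = knn_predict(points, labels, (5.1, 5.0), k=3)

# Regression: the target is a continuous number (toy "house prices").
sizes = [1.0, 2.0, 3.0, 4.0]
prices = [150.0, 250.0, 350.0, 450.0]  # toy data lying exactly on 100 * size + 50
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 2.5 + intercept
```

Here `predicted_class` is a label ("dog" or "not dog"), whereas `predicted_price` can be any real number.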
- Unsupervised Learning
- Clustering
- k-means
- hyperparameter 'k' is the number of clusters you want the algorithm to find
- starts with a random position for each 'centroid' (i.e., the center of each cluster), then moves the centroids step by step toward the actual cluster centers
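The k-means loop described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not a reference implementation: it assumes the random starting positions are drawn from the data points themselves (one common choice), and the names `kmeans`, `squared_distance`, `max_iterations`, and `seed` are made up for this example.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iterations=20, seed=0):
    """Return k centroids found by alternating assignment and update steps."""
    rng = random.Random(seed)
    # Assumption: start each centroid at a randomly chosen data point.
    centroids = rng.sample(points, k)
    for _ in range(max_iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    return centroids

# Two obvious groups of points; k=2 asks the algorithm to find two clusters.
points = [(1.0, 1.0), (1.5, 1.5), (2.0, 2.0), (10.0, 10.0), (10.5, 10.5), (11.0, 11.0)]
centroids = sorted(kmeans(points, k=2))
```

On this toy data the two centroids settle at the middle of each group. On harder data the result can depend on the random start, which is why implementations often rerun with several initializations.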
- Hyperparameters vs. Parameters
- Parameters: variables inside the model; learned (set) by the training algorithm
- Hyperparameters: configuration of the training algorithm; set by us (e.g., 'k' in k-means)
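A small sketch of the distinction, using gradient descent on a one-feature linear model. The function name `train_linear_model` and the toy data are made up for illustration. `learning_rate` and `epochs` are hyperparameters (we pick them before training), while `weight` and `bias` are parameters (the training loop sets them).

```python
def train_linear_model(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = weight * x + bias by gradient descent on mean squared error."""
    # Parameters: start at arbitrary values and are updated by training.
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [weight * x + bias - y for x, y in zip(xs, ys)]
        grad_weight = sum(2 * e * x for e, x in zip(errors, xs)) / n
        grad_bias = sum(2 * e for e in errors) / n
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # toy data generated from y = 2x + 1

# Hyperparameters are chosen by us and passed in; parameters come back out.
weight, bias = train_linear_model(xs, ys, learning_rate=0.05, epochs=2000)
```

With these settings the learned `weight` and `bias` end up very close to 2 and 1. A different learning rate or number of epochs changes how (and whether) training gets there, but we never set `weight` or `bias` by hand.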
- Metrics
- Accuracy -- # correct / # total samples
- Confusion matrix -- grid of actual vs. predicted classes (correct predictions fall on the diagonal)
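Both metrics are easy to compute by hand. The sketch below uses made-up labels and hypothetical helper names (`accuracy`, `confusion_matrix`); it assumes rows are the actual class and columns are the predicted class, which is one common layout.

```python
from collections import Counter

def accuracy(actual, predicted):
    """Number of correct predictions divided by the total number of samples."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def confusion_matrix(actual, predicted, classes):
    """Rows = actual class, columns = predicted class (same order as `classes`)."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

actual    = ["dog", "dog",     "dog", "not dog", "not dog", "not dog", "not dog", "dog"]
predicted = ["dog", "not dog", "dog", "not dog", "dog",     "not dog", "not dog", "dog"]

acc = accuracy(actual, predicted)
matrix = confusion_matrix(actual, predicted, classes=["dog", "not dog"])
```

For this toy data, 6 of 8 predictions are correct, so `acc` is 0.75. `matrix` is `[[3, 1], [1, 3]]`: the diagonal holds the correct predictions, and the off-diagonal cells show which classes get confused with which.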
- Generalization (overfitting, underfitting, bias and variance)
- Generalization: how well does the model extend to unseen examples (things not in the training set)
- Overfit: 'memorizes' training set; high variance, low bias
- Underfit: does not capture data complexity enough; high bias, low variance
- training vs. test accuracy
- training: accuracy from the training data
- test: accuracy from test data
- High training accuracy, low test accuracy --> overfit / variance too high
- Low training accuracy (whether test accuracy is low or high; it is usually low too) --> underfit / bias too high
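The training vs. test comparison can be seen on a deliberately extreme toy problem. Everything here is made up for illustration: the rule `true_label`, the data, and the three "models". A lookup table stands in for an overfit model, a constant guess for an underfit model, and a learned threshold for a reasonable fit.

```python
from collections import Counter

def accuracy(model, xs, ys):
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(xs)

def true_label(x):
    return "high" if x >= 5 else "low"  # the pattern hidden in the data

train_x = [1, 2, 3, 4, 6, 8, 9]
test_x = [0, 5, 7, 10]  # unseen examples: none of these appear in train_x
train_y = [true_label(x) for x in train_x]
test_y = [true_label(x) for x in test_x]

majority = Counter(train_y).most_common(1)[0][0]  # "low" (4 of 7 training labels)

# Overfit: memorizes the training set and has no real rule for unseen inputs.
lookup = dict(zip(train_x, train_y))
def memorizer(x):
    return lookup.get(x, majority)

# Underfit: too simple to capture the pattern; always predicts the same class.
def constant(x):
    return majority

# Reasonable fit: learns a threshold halfway between the two classes.
highest_low = max(x for x, y in zip(train_x, train_y) if y == "low")
lowest_high = min(x for x, y in zip(train_x, train_y) if y == "high")
threshold = (highest_low + lowest_high) / 2
def threshold_model(x):
    return "high" if x >= threshold else "low"

results = {
    name: (accuracy(model, train_x, train_y), accuracy(model, test_x, test_y))
    for name, model in [("memorizer", memorizer), ("constant", constant), ("threshold", threshold_model)]
}
```

Each entry in `results` is a (training accuracy, test accuracy) pair. The memorizer scores 1.0 on training but 0.25 on test (overfit), the constant model is low on both (underfit), and the threshold model scores 1.0 on both.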