
Day 1

Identify the following machine learning task:

This task predicts a numerical value given some input. To solve this task, the learning algorithm is asked to output a function f: R^n -> R.
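
As a concrete illustration (a sketch of my own, not from the resources): ordinary least squares learns such a function f: R^3 -> R. All data and names below are made up.

    # Regression sketch: learn f: R^n -> R by least squares (numpy only)
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))                      # inputs in R^3
    true_w = np.array([2.0, -1.0, 0.5])                # assumed "true" weights
    y = X @ true_w + rng.normal(scale=0.1, size=100)   # noisy numerical targets

    w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)      # closed-form fit
    f = lambda x: x @ w_hat                            # the learned f: R^3 -> R
    print(f(np.array([1.0, 0.0, 0.0])))                # approximately 2.0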


Resources


Day 2

Distinguish between these two machine learning algorithms:

Value Description
1 In this task, the algorithm is given a new example x, but with some entries of x missing; it must predict the values of the missing entries
2 In this algorithm, the input is a corrupted example x̃ obtained from a clean example x; the algorithm should predict the clean example x
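
A toy sketch of the two settings (illustrative only; all arrays are my own assumptions):

    import numpy as np

    # 1) Missing value imputation: fill in missing entries (NaN) of x,
    #    here with per-feature training means.
    X_train = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    col_mean = X_train.mean(axis=0)
    x_new = np.array([np.nan, 5.0])                    # one entry is missing
    x_filled = np.where(np.isnan(x_new), col_mean, x_new)

    # 2) Denoising: the input is a corrupted example x_tilde; the goal
    #    is to predict the clean example x.
    x = np.array([1.0, 2.0, 3.0])                      # clean example
    x_tilde = x + np.random.default_rng(0).normal(scale=0.5, size=3)
    # A learned denoiser g should satisfy g(x_tilde) ~ x.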

Resources


Day 3

Which of the following is not a machine learning technique:

Resources


Day 4

The ability of a machine learning model to perform well on previously unobserved inputs is called:
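
One way to see this concretely (assumed toy data, numpy only): compare error on the training split with error on a held-out split the learner never observed.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)

    X_tr, y_tr = X[:150], y[:150]              # training split
    X_te, y_te = X[150:], y[150:]              # previously unobserved inputs

    w, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
    train_err = np.mean((X_tr @ w - y_tr) ** 2)
    test_err = np.mean((X_te @ w - y_te) ** 2)  # estimates performance on new inputs
    print(train_err, test_err)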

Resources


Day 5

Define Regularization
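
One concrete instance is L2 regularization (weight decay), which adds a penalty lam * ||w||^2 to the training objective. A minimal sketch; the penalty weight lam is an assumed hyperparameter.

    import numpy as np

    def ridge_fit(X, y, lam=1.0):
        # Closed-form minimizer of ||Xw - y||^2 + lam * ||w||^2
        d = X.shape[1]
        return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 3))
    y = X @ np.array([1.0, 0.0, -1.0]) + rng.normal(scale=0.1, size=50)
    w_small_lam = ridge_fit(X, y, lam=0.01)    # close to plain least squares
    w_large_lam = ridge_fit(X, y, lam=100.0)   # weights shrunk toward zero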

Resources


Day 6

Which of the following are true about a validation set?

Value Description
1 A validation set contains examples that the training algorithm does not observe
2 No example from the test set can be used in the validation set
3 A validation set should always be constructed from the training set
4 A validation set should always be constructed from the test set
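
A sketch of the split implied by statements 1 and 2 (proportions are my own assumption): the validation set is carved out of the training data and stays disjoint from the test set.

    import numpy as np

    data = np.arange(100)                      # indices of 100 examples
    rng = np.random.default_rng(0)
    rng.shuffle(data)

    test = data[:20]                           # never used for validation
    train_full = data[20:]
    val = train_full[:16]                      # ~20% of remaining training data
    train = train_full[16:]                    # seen by the training algorithm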

Resources


Day 7

Which of the following techniques can be used to help compensate for small datasets?

Resources


Day 8

It is common to say that algorithm A is better than algorithm B if the upper bound of the 95 percent confidence interval for the error of algorithm A is less than the lower bound of the 95 percent confidence interval for the error of algorithm B.
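
A worked sketch of this comparison rule, using normal-approximation 95 percent intervals for an error rate; the error rates and sample size below are invented for illustration.

    import numpy as np

    def ci95(err_rate, n):
        # 95% normal-approximation confidence interval for an error rate
        half = 1.96 * np.sqrt(err_rate * (1 - err_rate) / n)
        return err_rate - half, err_rate + half

    n = 1000
    lo_a, hi_a = ci95(0.10, n)                 # algorithm A: 10% test error
    lo_b, hi_b = ci95(0.15, n)                 # algorithm B: 15% test error
    print("A better than B:", hi_a < lo_b)     # True for these numbers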

Resources


Day 9

Distinguish between these two statistical approaches:

Value Description
1 This approach is based on estimating a single value of θ, then making all predictions thereafter based on that one estimate
2 This approach considers all possible values of θ when making a prediction
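
A toy contrast (my own example, not from the resources): predicting the next coin flip after seeing 7 heads in 10 tosses.

    # Approach 1: a single point estimate theta_hat, then predict with it.
    theta_hat = 7 / 10
    p_heads_point = theta_hat                  # 0.70

    # Approach 2: integrate over all values of theta. With a uniform
    # Beta(1, 1) prior, the posterior is Beta(8, 4), and the posterior
    # predictive probability of heads is its mean, (7 + 1) / (10 + 2).
    p_heads_averaged = (7 + 1) / (10 + 2)      # about 0.67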

Resources


Day 10

Unlike logistic regression, the support vector machine (SVM) does not provide probabilities, but only outputs a class identity.
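
A minimal sketch of that output: a linear SVM's decision is just the sign of w.x + b. The weights below are made up, as if already trained.

    import numpy as np

    w = np.array([0.8, -0.3])                  # assumed trained weights
    b = 0.1                                    # assumed trained bias
    x = np.array([1.0, 2.0])

    score = w @ x + b                          # a score, not a probability
    predicted_class = int(np.sign(score))      # class identity only: +1 or -1
    print(predicted_class)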

Resources


Day 11

The category of algorithms that employ the kernel trick is known as:
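
A sketch of the trick itself (toy vectors assumed): the quadratic kernel k(x, z) = (x.z)^2 equals the inner product of an explicit degree-2 feature map, computed without ever forming that map.

    import numpy as np

    def phi(v):
        # One explicit degree-2 feature map for 2-d input
        x1, x2 = v
        return np.array([x1 * x1, x2 * x2, np.sqrt(2) * x1 * x2])

    def k(x, z):
        # Same inner product, computed in the original space
        return (x @ z) ** 2

    x = np.array([1.0, 2.0])
    z = np.array([3.0, 0.5])
    print(phi(x) @ phi(z), k(x, z))            # both equal (x.z)^2 = 16.0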

Resources


Day 12

When can supervised learning algorithms be useful?

Resources


Day 13

Which of the following could be accomplished with unsupervised learning?

Resources


Day 14

Which of the following techniques are not useful for simplifying data representations:

Resources


Day 15

Which of the following are true about Principal Component Analysis?

Value Description
1 This algorithm provides a means of compressing data
2 It is an unsupervised learning algorithm that learns a representation of data
3 It learns a representation that has lower dimensionality than the original input
4 It learns a representation whose elements have no linear correlation with each other
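
A numpy-only sketch touching statements 2-4 (random data assumed): project centered data onto the top eigenvectors of its covariance; the resulting elements are linearly uncorrelated.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 4))
    Xc = X - X.mean(axis=0)                    # center the data

    cov = Xc.T @ Xc / (len(Xc) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues ascending
    W = eigvecs[:, ::-1][:, :2]                # top-2 principal directions

    Z = Xc @ W                                 # compressed 2-d representation
    print(np.cov(Z.T))                         # off-diagonals ~ 0: decorrelated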

Resources


Day 16

To achieve full independence, a representation learning algorithm must also remove the nonlinear relationships between variables.


Day 17

Which of these are true about p-values?


Day 18

Which of the following is true regarding singular value decomposition (SVD) and PCA?
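
One standard connection, shown as a sketch (random data assumed): the right singular vectors of the centered data matrix match the covariance eigenvectors used in PCA, up to sign.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    Xc = X - X.mean(axis=0)

    # PCA via SVD: Xc = U S V^T, rows of V^T are principal directions
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components_svd = Vt

    # Same directions from the covariance eigendecomposition (up to sign)
    eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc / (len(Xc) - 1))
    components_eig = eigvecs[:, ::-1].T
    print(np.allclose(np.abs(components_svd), np.abs(components_eig)))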

Resources


Day 19

Which of the following are advantages of using a sparse representation (e.g., one-hot encoding) in a clustering algorithm?

Value Description
1 It naturally conveys the idea that all examples in the same cluster are similar to each other
2 It confers the computational advantage that the entire representation may be captured by a single integer
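
A sketch of statement 2 (toy values): the one-hot vector and a single integer index carry the same information.

    import numpy as np

    k = 4                                      # number of clusters (assumed)
    cluster_index = 2                          # whole representation: one integer
    one_hot = np.zeros(k)
    one_hot[cluster_index] = 1.0               # sparse representation [0, 0, 1, 0]

    # Round trip: the two encodings are interchangeable
    assert int(np.argmax(one_hot)) == cluster_index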

Resources


Day 20

Which of the following are common difficulties pertaining to clustering?

Value Description
1 There is no single criterion that measures how well a clustering of the data corresponds to the real world
2 There may be many different clusterings that all correspond well to some property
3 It is possible to obtain a different, equally valid clustering not relevant to the task
4 It is difficult to measure Euclidean distance from a cluster centroid to the members of the cluster

Resources


Day 21

Nearly all of deep learning is powered by one very important algorithm: stochastic gradient descent (SGD).
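
A minimal SGD sketch (toy linear-regression objective; the learning rate and batch size are assumed hyperparameters): repeatedly sample a minibatch and step against its gradient.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=1000)

    w = np.zeros(3)
    lr, batch = 0.1, 32                        # assumed hyperparameters
    for step in range(500):
        idx = rng.integers(0, len(X), size=batch)   # sample a minibatch
        Xb, yb = X[idx], y[idx]
        err = Xb @ w - yb
        w -= lr * (2 * Xb.T @ err / batch)     # minibatch gradient of MSE
    print(w)                                   # close to [1.0, -2.0, 0.5]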

Resources


Day 22

A recurring problem in machine learning is that large training sets are necessary for good generalization, but large training sets are also more computationally expensive.

Resources