Choose a topic to test your knowledge and improve your Machine Learning (ML) skills
Lasso can be interpreted as least-squares linear regression where
How can we best represent “support” for the following association rule: “If X and Y, then Z”?
Choose the correct statement with respect to the “confidence” metric in association rules
What are tree-based classifiers?
What is the Gini index?
Which of the following sentences are correct in reference to information gain? a. It is biased towards single-valued attributes b. It is biased towards multi-valued attributes c. ID3 makes use of information gain d. The approach used by ID3 is greedy
This clustering approach initially assumes that each data instance represents a single cluster.
Which statement is true about the K-Means algorithm?
KDD represents extraction of
The most general form of distance is
Which of the following algorithms comes under classification?
How is hierarchical agglomerative clustering typically visualized?
The _______ step eliminates those extensions of (k-1)-itemsets that are not found to be frequent from being considered when counting support
The distance between two points calculated using Pythagoras theorem is
Which one of these is not a tree-based learner?
Which one of these is a tree-based learner?
What is the approach of the basic algorithm for decision tree induction?
Which of the following classifications would best suit the student performance classification systems?
This clustering algorithm terminates when the mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration.
The number of iterations in Apriori ___________
Frequent item sets are
A good clustering method will produce high quality clusters with
Which Association Rule would you prefer
In a rule-based classifier, if there is a rule for each combination of attribute values, what is that rule set R called?
The Apriori property means
If an item set “XYZ” is a frequent item set, then all subsets of that frequent item set are
Clustering is ___________ and is an example of ____________ learning
To determine association rules from frequent item sets
If {A,B,C,D} is a frequent itemset, which of the following candidate rules is not possible?
Classification rules are extracted from _____________
What does K refer to in the K-Means algorithm, which is a non-hierarchical clustering approach?
How will you counter over-fitting in decision tree?
What are the two steps of tree pruning?
Which of the following sentences are true?
Assume that you are given a data set and a neural network model trained on the data set. You are asked to build a decision tree model with the sole purpose of understanding/interpreting the built neural network model. In such a scenario, which among the following measures would you concentrate most on optimising?
Which of the following properties are characteristic of decision trees? (a) High bias (b) High variance (c) Lack of smoothness of prediction surfaces (d) Unbounded parameter set
To control the size of the tree, we need to control the number of regions. One approach to do this would be to split tree nodes only if the resultant decrease in the sum of squares error exceeds some threshold. For the described method, which among the following are true? (a) It would, in general, help restrict the size of the trees (b) It has the potential to affect the performance of the resultant regression/classification model (c) It is computationally infeasible
Which among the following statements best describes our approach to learning decision trees?
Having built a decision tree, we are using reduced error pruning to reduce the size of the tree. We select a node to collapse. For this particular node, on the left branch, there are 3 training data points with the following outputs: 5, 7, 9.6 and for the right branch, there are four training data points with the following outputs: 8.7, 9.8, 10.5, 11. What were the original responses for data points along the two branches (left & right respectively) and what is the new response after collapsing the node?
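A minimal sketch of the computation behind this question, assuming the standard regression-tree convention that a leaf predicts the mean of the training outputs it contains, and that collapsing a node replaces its two leaves with a single leaf predicting the mean over all points reaching the node:

```python
# Training outputs reaching the node's left and right branches (from the question)
left = [5, 7, 9.6]
right = [8.7, 9.8, 10.5, 11]

# Each leaf originally predicts the mean of its own outputs
left_mean = sum(left) / len(left)
right_mean = sum(right) / len(right)

# After collapsing, the single leaf predicts the mean over all 7 points
merged_mean = sum(left + right) / (len(left) + len(right))

print(round(left_mean, 1), right_mean, round(merged_mean, 1))  # 7.2 10.0 8.8
```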
Suppose on performing reduced error pruning, we collapsed a node and observed an improvement in the prediction accuracy on the validation set. Which among the following statements are possible in light of the performance improvement observed? (a) The collapsed node helped overcome the effect of one or more noise affected data points in the training set (b) The validation set had one or more noise affected data points in the region corresponding to the collapsed node (c) The validation set did not have any data points along at least one of the collapsed branches (d) The validation set did have data points adversely affected by the collapsed node
Time Complexity of k-means is given by
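As a hedged illustration of where the commonly quoted O(n · k · t · d) bound for k-means comes from: the assignment step below compares each of n points against each of k centroids in d dimensions, and the full algorithm repeats this (plus a cheaper centroid update) for t iterations. The `assign` helper is a made-up name for this sketch, not a standard API.

```python
def assign(points, centroids):
    """One k-means assignment step: O(n * k * d) distance comparisons."""
    labels = []
    for p in points:                                       # n points
        best, best_d2 = 0, float("inf")
        for j, c in enumerate(centroids):                  # k centroids
            d2 = sum((a - b) ** 2 for a, b in zip(p, c))   # d dimensions
            if d2 < best_d2:
                best, best_d2 = j, d2
        labels.append(best)
    return labels

print(assign([(0, 0), (5, 5), (1, 0)], [(0, 0), (5, 5)]))  # [0, 1, 0]
```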
In the Apriori algorithm, if there are 100 frequent 1-itemsets, then the number of candidate 2-itemsets is
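A worked sketch for this count, assuming the usual Apriori join step: every unordered pair of distinct frequent 1-itemsets yields one candidate 2-itemset, so the count is C(100, 2).

```python
from math import comb

n_frequent_1 = 100
# Candidate 2-itemsets: choose 2 distinct items out of the 100 frequent ones
n_candidate_2 = comb(n_frequent_1, 2)
print(n_candidate_2)  # 4950
```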
Machine learning techniques differ from statistical techniques in that machine learning methods
The probability that a person owns a sports car given that they subscribe to an automotive magazine is 40%. We also know that 3% of the adult population subscribes to the automotive magazine. The probability of a person owning a sports car given that they don't subscribe to the automotive magazine is 30%. Use this information to compute the probability that a person subscribes to the automotive magazine given that they own a sports car.
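A worked solution sketch using the law of total probability and Bayes' theorem, with the figures stated in the question:

```python
# Given quantities (M = subscribes to the magazine, C = owns a sports car)
p_car_given_mag = 0.40      # P(C | M)
p_mag = 0.03                # P(M)
p_car_given_no_mag = 0.30   # P(C | not M)

# Law of total probability: P(C) = P(C|M)P(M) + P(C|not M)P(not M)
p_car = p_car_given_mag * p_mag + p_car_given_no_mag * (1 - p_mag)

# Bayes' theorem: P(M | C) = P(C | M) P(M) / P(C)
p_mag_given_car = p_car_given_mag * p_mag / p_car

print(round(p_car, 3), round(p_mag_given_car, 4))  # 0.303 0.0396
```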
What is the final resultant cluster size in Divisive algorithm, which is one of the hierarchical clustering approaches?
Given a frequent itemset L, if |L| = k, then there are
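A hedged sketch of the standard counting argument behind this question: each non-empty proper subset of L can serve as a rule antecedent (the remaining items form the consequent), giving 2^k − 2 candidate association rules.

```python
def n_candidate_rules(k: int) -> int:
    """Candidate rules from a frequent itemset of size k: 2**k - 2
    (all subsets except the empty set and L itself)."""
    return 2 ** k - 2

print(n_candidate_rules(3))  # 6 rules from a 3-itemset such as {X, Y, Z}
print(n_candidate_rules(4))  # 14 rules from a 4-itemset
```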
Which statement is not true?
In which of the following cases will K-Means clustering give poor results? 1. Data points with outliers 2. Data points with different densities 3. Data points with round shapes 4. Data points with non-convex shapes
What is Decision Tree?
A database has 5 transactions. Of these, 4 transactions include milk and bread. Further, of the given 4 transactions, 2 transactions include cheese. Find the support percentage for the following association rule: “if milk and bread are purchased, then cheese is also purchased”.
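A worked sketch for this support calculation, using the standard definition: support of a rule is the fraction of all transactions that contain every item in the rule (antecedent and consequent together).

```python
n_transactions = 5         # total transactions in the database
n_milk_bread_cheese = 2    # of the 4 milk-and-bread transactions, 2 also have cheese

# Support("milk and bread -> cheese") = count(milk, bread, cheese) / total
support = n_milk_bread_cheese / n_transactions
print(f"{support:.0%}")  # 40%
```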
Which of the following option is true about k-NN algorithm?
How do you select the best hyperparameters in tree-based models?
What is true about K-Mean Clustering? 1. K-means is extremely sensitive to cluster center initializations 2. Bad initialization can lead to Poor convergence speed 3. Bad initialization can lead to bad overall clustering