You are approached by the marketing director of a local company, who believes that he has devised a foolproof way to measure customer satisfaction. He explains his scheme as follows: “It’s so simple that I can’t believe that no one has thought of it before. I just keep track of the number of customer complaints about each product. I read in a data mining book that counts are ratio attributes, and so, my measure of product satisfaction must be a ratio attribute. But when I rated the products based on my new customer satisfaction measure and showed them to my boss, he told me that I had overlooked the obvious and that my measure was worthless. I think that he was just mad because our best-selling product had the worst satisfaction since it had the most complaints. Could you help me set him straight?” What can you say about the attribute type of the original product satisfaction attribute?

Let us say the cost of a false positive is 6 units and the cost of a false negative is 2 units. In this context, answer the following.
(a) (6) How would you adjust the decision tree algorithm so that its performance on unseen cases minimizes the total expected cost instead of maximizing the accuracy?
(b) (6) Recall the Support Vector Machine formulation discussed in class, specifically the case in which we minimize the cost of misclassifications using a constant parameter C. Suggest a solution for learning an SVM classifier in which the costs of the two types of errors are different. Do not write any formulas; describe your ideas in words.
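One possible way to think about part (a) is to label each leaf with the class that has the lower expected cost rather than the majority class. The sketch below is only an illustration of that idea, not the method from class. The costs 6 and 2 come from the question; the function names and the assumption that a leaf provides an estimated probability of the positive class are hypothetical.

```python
COST_FP = 6.0  # cost of a false positive (from the question)
COST_FN = 2.0  # cost of a false negative (from the question)


def expected_costs(p_positive):
    """Return (expected cost of predicting positive, of predicting negative).

    p_positive is an assumed estimate of P(class = positive) at a leaf,
    e.g. the fraction of positive training examples that reach it.
    """
    cost_if_predict_pos = (1.0 - p_positive) * COST_FP  # wrong when truly negative
    cost_if_predict_neg = p_positive * COST_FN          # wrong when truly positive
    return cost_if_predict_pos, cost_if_predict_neg


def min_cost_label(p_positive):
    """Pick the label (1 = positive, 0 = negative) with the lower expected cost."""
    cost_pos, cost_neg = expected_costs(p_positive)
    return 1 if cost_pos < cost_neg else 0


# Equivalent rule: predict positive only when p_positive exceeds this threshold.
THRESHOLD = COST_FP / (COST_FP + COST_FN)
```

With these costs the threshold works out to 0.75, so a leaf that is 70% positive would still be labeled negative under this rule. The same costs could also be used during pruning or when choosing splits, which this sketch does not cover.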

Consider the following training data for a perceptron:

X    Y    Z    Class
0    3    5    1
1    4    8    0
7    1    2    1
-1   5    5    0
2    6    7    0

Use (3 1 3 2) as the initial weight vector. Execute the perceptron training algorithm as discussed in class and report the following:
1. (4) The updated weight vector after the first data point is processed.
2. (4) The updated weight vector after the second data point is processed.
3. (6) The updated weight vector after the fifth data point is processed.
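The question refers to the algorithm "as discussed in class", which is not reproduced here, so the following is only a sketch under stated assumptions: the first component of the weight vector is treated as a bias weight paired with a constant input of 1, the learning rate is 1, the unit outputs 1 when the weighted sum is strictly positive and 0 otherwise, and the update is w + rate * (target - output) * x. If the class convention differs (bias last, a different rate, or -1/+1 labels), the numbers will differ too. The function name `train_one_pass` is made up for this example.

```python
def train_one_pass(weights, data, learning_rate=1.0):
    """Run one pass of the perceptron rule and record the weights after each point.

    Assumptions (not confirmed by the question): weights[0] is the bias weight,
    the output is 1 if the weighted sum is > 0 and 0 otherwise.
    """
    w = list(weights)
    history = []
    for features, target in data:
        x = [1.0] + list(features)  # prepend the constant bias input
        activation = sum(wi * xi for wi, xi in zip(w, x))
        predicted = 1 if activation > 0 else 0
        error = target - predicted  # 0 when correct, +1 or -1 when wrong
        w = [wi + learning_rate * error * xi for wi, xi in zip(w, x)]
        history.append(tuple(w))
    return history


# Training data from the question: ((X, Y, Z), Class)
DATA = [
    ((0, 3, 5), 1),
    ((1, 4, 8), 0),
    ((7, 1, 2), 1),
    ((-1, 5, 5), 0),
    ((2, 6, 7), 0),
]
INITIAL_WEIGHTS = (3, 1, 3, 2)  # from the question

history = train_one_pass(INITIAL_WEIGHTS, DATA)
for step, w in enumerate(history, start=1):
    print(f"after point {step}: {w}")
```

Entries 1, 2, and 5 of `history` correspond to the three weight vectors the question asks for, under the assumptions listed above.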

You are approached by the marketing director of a local company, who believes that he has devised a foolproof way to measure customer satisfaction. He explains his scheme as follows: “It’s so simple that I can’t believe that no one has thought of it before. I just keep track of the number of customer complaints about each product. I read in a data mining book that counts are ratio attributes, and so, my measure of product satisfaction must be a ratio attribute. But when I rated the products based on my new customer satisfaction measure and showed them to my boss, he told me that I had overlooked the obvious and that my measure was worthless. I think that he was just mad because our best-selling product had the worst satisfaction since it had the most complaints. Could you help me set him straight?” Who is right, the marketing director or his boss? If you answered "his boss", what would you do to fix the measure of satisfaction?
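One commonly suggested repair is to normalize the complaint count by sales volume, so that a product is not penalized simply for selling more. The sketch below illustrates the arithmetic only. The product names and figures are entirely hypothetical and are not taken from the question.

```python
def complaint_rate(complaints, units_sold):
    """Complaints per unit sold; a hypothetical normalized dissatisfaction measure."""
    if units_sold <= 0:
        raise ValueError("units_sold must be positive")
    return complaints / units_sold


# Hypothetical figures: (number of complaints, units sold)
products = {
    "best_seller": (500, 100_000),
    "niche_item": (50, 1_000),
}

rates = {name: complaint_rate(c, u) for name, (c, u) in products.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.3f} complaints per unit")
```

In this made-up example the best seller has ten times as many complaints as the niche item but a complaint rate ten times lower, which is the effect the raw count hides.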

Consider the context of selecting the best attribute for decision tree construction. Explain briefly the difference between the “information gain” and “gain ratio” metrics for selecting the best attribute. Is either of the two better than the other? Explain why. Do not write any formulas. Explain in words only.
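Although the question asks for an answer in words, a small numeric illustration can make the contrast concrete. The sketch below uses the standard textbook definitions (information gain as the reduction in class entropy, gain ratio as information gain divided by the entropy of the split itself). The toy data set and the attribute names `row_id` and `useful` are invented for this example.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(attribute, labels):
    """Class entropy before the split minus the weighted entropy after it."""
    n = len(labels)
    groups = {}
    for value, label in zip(attribute, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(attribute, labels):
    """Information gain divided by the split information of the attribute."""
    split_info = entropy(attribute)
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return information_gain(attribute, labels) / split_info


# Invented toy data: 8 examples, balanced classes.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
row_id = list(range(8))                           # unique value per example
useful = ["a", "a", "a", "a", "b", "b", "b", "b"]  # two-valued, separates the classes

for name, attr in [("row_id", row_id), ("useful", useful)]:
    print(name, information_gain(attr, labels), gain_ratio(attr, labels))
```

Both attributes reach the same information gain on this toy data, but the identifier-like attribute splits the data into eight singleton branches and so receives a much lower gain ratio.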

Answer the following in the context of the AdaBoost algorithm. (No formulas, only language description.)
(a) (4) Which points are given higher/lower weights after learning each weak classifier?
(b) (4) How is the weight assigned to each data point used by the algorithm?
(c) (4) How do we assign weights to weak classifiers for their contribution to the global decision?
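For readers who want to see the reweighting in action, the following is a minimal sketch of a single round of the standard discrete AdaBoost update, assuming labels in {-1, +1} and a weighted error strictly between 0 and 1. It may differ in detail from the variant covered in class, and the function name `adaboost_round` and the four-point example are invented for illustration.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One round of discrete AdaBoost reweighting.

    weights: current point weights, assumed to sum to 1
    y_true, y_pred: labels and weak-classifier predictions in {-1, +1}
    Returns (alpha, new_weights), where alpha is the classifier's vote weight.
    """
    # Weighted error: total weight of the points the weak classifier gets wrong.
    err = sum(w for w, y, h in zip(weights, y_true, y_pred) if y != h)
    if not 0.0 < err < 1.0:
        raise ValueError("this sketch assumes 0 < weighted error < 1")
    # Lower weighted error gives a larger vote in the final ensemble.
    alpha = 0.5 * math.log((1.0 - err) / err)
    # y * h is +1 for correct points (weight shrinks) and -1 for mistakes (weight grows).
    updated = [w * math.exp(-alpha * y * h) for w, y, h in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Invented example: four points with uniform weights, the last one misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 1, -1, -1]
y_pred = [1, 1, -1, 1]

alpha, new_weights = adaboost_round(weights, y_true, y_pred)
print(alpha, new_weights)
```

After this round the single misclassified point carries half of the total weight, so the next weak classifier is pushed to get it right.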