This refers to the Camping affinity analysis script and output. What is the very best rule to use? Give the rule number, the rule text, and what it says in plain English, and outline your reasoning. Make sure you tell me which areas of the output you used to derive your answer.
What is unsupervised learning?
You have a data set of 1000 records, representing the income and age data of 1000 different survey respondents. You have standardized your income and age variables. If you run a k-means algorithm with k=10 on your standardized data, what will your output be?
We evaluate the quality of a classification model (note: not a classification decision tree model) using
You have a data set with the following variables:

Household income, which runs from $30,000 to $120,000
Number of children in family, from 0 to 5
Number of years in current location, from 0 to 20+
Whether they rent or own, as a text field
Number of miles driven per year, from 0 to 50,000
Population in their location, from 100 to 1 million +

Where possible, you have standardized the variables, and you have recoded the rent/own text field into a binary 0 (for rent) and 1 (for own). You want to run a k-means algorithm on this entire data set, to try to determine different demographic niches. For example, you may want to separate out urban apartment-dwellers from rural retirees.

Is it possible to run k-means clustering on all of these data fields? Say you have loaded the standardized variables above into columns 1 through 6 in a data frame called responses. Let’s say you want to try for 7 clusters. In particular, can you do something like this?

> fit
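The command in the question is cut off after `fit`, so the following is not the original command. It is only a minimal sketch of what a 7-cluster k-means call on columns 1 through 6 might look like with base R's `kmeans()`. The `responses` data frame here is a hypothetical stand-in filled with random values; the column names, the row count of 200, and the `nstart` and `iter.max` settings are all assumptions made for illustration.

```r
# Hypothetical stand-in for the `responses` data frame described in the question.
# All values are random; only the shape (six numeric columns) follows the text.
set.seed(42)
n <- 200
responses <- data.frame(
  income     = as.numeric(scale(runif(n, 30000, 120000))),
  children   = as.numeric(scale(sample(0:5, n, replace = TRUE))),
  years      = as.numeric(scale(sample(0:20, n, replace = TRUE))),
  owns       = sample(0:1, n, replace = TRUE),   # 0 = rent, 1 = own
  miles      = as.numeric(scale(runif(n, 0, 50000))),
  population = as.numeric(scale(runif(n, 100, 1e6)))
)

# k-means on columns 1 through 6 with 7 clusters.
# nstart = 25 tries several random starts and keeps the best one.
fit <- kmeans(responses[, 1:6], centers = 7, nstart = 25, iter.max = 50)

fit$size      # number of records assigned to each cluster
fit$centers   # one row per cluster, one column per variable
```

`kmeans()` accepts any all-numeric input, so the call runs with the 0/1 column included. Whether Euclidean distance on a binary column yields meaningful niches is a separate judgment that the code does not settle.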
You are running the website for a camping store, and you’ve downloaded transaction shopping cart data. You have run some affinity analysis to determine which items might be most highly associated with other items. Attached is the R script you wrote, and then the output when you ran it on your camping data. (The camping data field names sometimes have a ___ at the end. That’s just to keep them all the same length; please ignore those in your analysis.)

You are welcome to download these files to your desktop and use Notepad, Excel, or any other program you like to view them. Please delete them at the end of your exam.

Here is the script: Camping_Script.R
Here is the output: Camping_Script_Output.txt
You have run a data set for speeding tickets. The first few rows are shown to you below:

RecordID  Car_color  Actual_speed  Occupant_age  Number_occupants  Speeding_ticket
1         Other      63            79            1                 No_ticket
2         Red        67            88            2                 Got_ticket
3         Other      73            34            1                 Got_ticket
4         Red        57            60            1                 Got_ticket
5         Other      70            52            1                 No_ticket

Here is the classification tree. Note it uses the left branch as a ‘yes’ and the right branch as a ‘no.’ Even if the yes/no are not displayed, you can assume a left branch is a yes.

speedingticketstreev03.png

Based only on what you can see from the tree above, in which node would you classify a person with an actual speed of 50 and a car color of “Other”? Enter just the node number. If you want to choose Node 1 as your answer, type in just a 1.

Why is Node 3 (outlined in red) marked as No Ticket? Enter the number of your answer below.

1 – Because 85% of its cases had no ticket
2 – Because 32% of its cases received a ticket
3 – Because it has 13 cases in it
4 – It randomly chooses the labels and this is just lucky
5 – Cannot tell from available information
This question refers to the Classification Tree information presented above. Your boss doesn’t understand R code, and has asked you to give a general overview of what is going on here. Write a 1-2 paragraph overview in response to the question. Make sure you include the overall objective of this analysis, and any dead ends or good results.
This question refers to the Classification Tree information presented above. What is the a priori probability (for the entire data set) that a car will be acceptable? Note where in the script you found the information that you used for your conclusions. (Just copy the portion of the script you are drawing from.)
This question refers to the Classification Tree information presented above. Using all the output available to you here, what sort of car would you recommend this company look to acquire, if it is looking to acquire acceptable cars? Be specific and tell which R commands, which output text, and/or which images you are using when you write this answer.