Instructions

The R Markdown/Jupyter Notebook file includes the questions, the empty code chunk sections for your code, and the text blocks for your responses. Answer the questions below by completing the R Markdown/Jupyter Notebook file. You may make slight adjustments to get the file to knit/convert, but otherwise keep the formatting the same. Once you’ve finished answering the questions, submit your responses in a single knitted file as HTML only. Partial credit may be given if your code is correct but your conclusion is incorrect, or vice versa.

Next Steps:
1. Save the .Rmd/.ipynb file in your working directory, the same directory into which you will download the “diabetes_dataset.csv” data file. Having both files in the same directory makes it easier to read “diabetes_dataset.csv”.
2. Read each question and write the necessary code in the code chunk section immediately below it. Knitting this file will generate the output and insert it into the section below the code chunk.
3. Type your answer to each question in the text block provided immediately after the response prompt.
4. Once you’ve finished answering all questions, knit this file and submit the knitted file as HTML on Canvas.

Mock Example Question

This will be the exam question. Each question is already copied from Canvas and inserted into an individual text block below, so you do not need to copy/paste the questions from the online Canvas exam.

```{r}
# Example code chunk area. Enter your code below the comment
```

Mock Response to Example Question:

This is the section where you type your written answers to the question. Depending on the question asked, your typed response may be a number, a list of variables, a few sentences, or a combination of these elements.

Ready? Let’s begin. We wish you the best of luck!
Data Set
diabetes_dataset.csv

Starter Templates
You may use either the R Markdown or Jupyter Notebook Starter Template:
R Markdown Starter Template: Final_Exam_starter_template_Fall24_R.Rmd
Jupyter Notebook Python Starter Template: Final Exam_starter_template_Fall24_Python.ipynb
Jupyter Notebook R Starter Template: Final_Exam_starter_template_Fall24_R.ipynb

Background

The dataset includes 9 baseline numeric variables: age, body mass index, average blood pressure, and six blood serum measurements for each of n = 442 diabetes patients. The response of interest is a quantitative measure of diabetes disease progression one year after baseline. The dataset is obtained from sklearn.datasets. We will be fitting multiple linear regression models to the train dataset and making predictions on the test dataset.

Attribute Information:
age: age in years
bmi: body mass index
bp: average blood pressure
s1: tc, total serum cholesterol
s2: ldl, low-density lipoproteins
s3: hdl, high-density lipoproteins
s4: tch, total cholesterol / HDL
s5: ltg, possibly log of serum triglycerides level
s6: glu, blood sugar level
Target: quantitative measure of disease progression one year after baseline (Response variable)

Note: The features have NOT been standardized.

Question 6: Prediction – 9 points

For this question, use the testData. Using testData and the models previously built in Questions 2, 3, and 5, predict the Target, output the average of these predictions for each of the models below, and summarize the results:
i) Full linear regression model from question 2b (model1)
ii) Reduced model from question 2b (model2)
iii) Stepwise forward model from question 3a (forward_model)
iv) Stepwise backward model from question 3c (backward_model)
v) Stepwise forward-backward model from question 3f (both_model)
vi) Ridge regression model from question 5a (ridge.model)
vii) Regular Lasso model from question 5c (lasso.model)
viii) Group Lasso model from question 5f (group_lasso)
ix) Elastic Net model from question 5i (enet.model)
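The averaging step can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic (not the exam data), the two least-squares fits are hypothetical stand-ins for model1 and model2, and `fit_ols`, `models`, and `mean_predictions` are names introduced here, not part of the exam template. Any other fitted model that can predict on new rows would slot into the same dictionary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for trainData / testData (NOT the exam data).
n_train, n_test, p = 80, 20, 3
X_train = rng.normal(size=(n_train, p))
X_test = rng.normal(size=(n_test, p))
beta_true = np.array([5.0, -2.0, 0.0])
y_train = 150.0 + X_train @ beta_true + rng.normal(scale=1.0, size=n_train)


def fit_ols(X, y, cols):
    """Least-squares fit with intercept on the chosen columns; returns a predict function."""
    cols = list(cols)
    A = np.column_stack([np.ones(len(X)), X[:, cols]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    def predict(X_new):
        A_new = np.column_stack([np.ones(len(X_new)), X_new[:, cols]])
        return A_new @ coef

    return predict


# Hypothetical stand-ins for two of the fitted models named in the question.
models = {
    "model1": fit_ols(X_train, y_train, [0, 1, 2]),  # full model
    "model2": fit_ols(X_train, y_train, [0, 1]),     # reduced model
}

# Mean predicted Target on the test set, one value per model.
mean_predictions = {
    name: float(np.mean(predict(X_test))) for name, predict in models.items()
}

for name, value in mean_predictions.items():
    print(f"{name}: mean prediction = {value:.3f}")
```

Keeping the models in one dictionary makes it easy to print the averages side by side when summarizing the results.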

A spring is attached to the hook of the IOLab. The following data is collected, showing the spring force (in N) on the vertical axis plotted as a function of position (in meters) on the horizontal axis. From this data, determine the approximate equilibrium position of the spring. In case you find it hard to read the summary statistics on the right, the slope of the best-fit line is approximately -2 and the intercept is approximately 0.4.
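As a worked sketch, assume the equilibrium position is where the fitted spring force is zero and that the linear fit holds near that point. With the slope in N/m and the intercept in N, the best-fit line gives:

$$
F(x) = -2x + 0.4, \qquad F(x_{\text{eq}}) = 0 \;\Rightarrow\; x_{\text{eq}} = \frac{0.4}{2} = 0.2\ \text{m}
$$

Under the same assumption, the magnitude of the slope corresponds to a spring constant of roughly 2 N/m.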

Question 4: Full Model Search – 8 points

For this question, use the trainData.
a) How many models can be constructed using subsets drawn from the full set of variables? (2 points)
b) Compare all possible models using Mallows’s Cp. Display the variables included in the best model and the corresponding Mallows’s Cp value. (2 points)
c) Use the selected variables from Q4b to fit another multiple linear regression model; call it best_model. Display the model summary. (2 points)
d) Compare the models (model1, model2, forward_model, backward_model, best_model, both_model) using Adjusted R^2 and AIC. Which model is preferred based on this? (2 points)
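The exhaustive search and the comparison criteria can be sketched as below. The data are synthetic with four hypothetical predictors (not the exam data), and `sse`, `mallows_cp`, `adjusted_r2`, and `aic` are helper names introduced for this illustration. With k candidate predictors there are 2^k subsets when the intercept-only model is counted, so nine predictors would give 2^9 = 512. The AIC shown is a common Gaussian-likelihood form; software packages may differ from it by an additive constant.

```python
from itertools import combinations
import math

import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for trainData (NOT the exam data): 4 hypothetical predictors.
names = ["x1", "x2", "x3", "x4"]
n, k = 100, len(names)
X = rng.normal(size=(n, k))
y = 10.0 + 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=n)


def sse(cols):
    """Residual sum of squares of an OLS fit with intercept on the given columns."""
    A = np.column_stack([np.ones(n), X[:, list(cols)]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


# Every subset of the predictors, including the empty (intercept-only) one.
subsets = [c for r in range(k + 1) for c in combinations(range(k), r)]

sigma2_full = sse(range(k)) / (n - k - 1)  # error variance estimate from the full model
sst = sse(())                              # intercept-only SSE is the total sum of squares


def mallows_cp(cols):
    p = len(cols) + 1                      # parameters, counting the intercept
    return sse(cols) / sigma2_full - n + 2 * p


def adjusted_r2(cols):
    return 1.0 - (sse(cols) / (n - len(cols) - 1)) / (sst / (n - 1))


def aic(cols):
    # Counts the regression coefficients plus the error variance as parameters.
    n_params = len(cols) + 2
    return n * (math.log(2 * math.pi) + math.log(sse(cols) / n) + 1) + 2 * n_params


best = min(subsets, key=mallows_cp)
print("number of candidate models:", len(subsets))
print("best subset:", [names[j] for j in best], "Cp =", round(mallows_cp(best), 3))
print("adjusted R^2:", round(adjusted_r2(best), 4), "AIC:", round(aic(best), 2))
```

A lower Cp and a lower AIC indicate the preferred model, while a higher Adjusted R^2 is better; the full model always has Cp equal to its number of parameters, which is a useful sanity check on the computation.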