Q7: (8 points) After taking the action suggested in the previous question, suppose the discount factor is γ = 0.9, the state transitions from s₂ to s₄ after taking action aₜ, and the reward r is 0.6. Update the Q-table and write down the updated table. Note: Only one value in the table needs updating, and you might need the Bellman Equation:
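The equation itself is not reproduced in this copy. As a rough sketch only, one commonly used form of the tabular Q-learning update is Q(s, a) ← Q(s, a) + α [r + γ max over a' of Q(s', a') − Q(s, a)], which reduces to the direct Bellman backup Q(s, a) = r + γ max over a' of Q(s', a') when α = 1. The learning rate `alpha`, the function name `q_update`, the action names, and every Q-value below are assumptions for illustration; the real table comes from the previous question.

```python
def q_update(q_table, state, action, reward, next_state, gamma=0.9, alpha=1.0):
    """Return a copy of q_table with only Q(state, action) updated.

    alpha=1.0 is an assumption: it gives Q(s, a) = r + gamma * max_a' Q(s', a').
    """
    # Best value reachable from the next state under the current table.
    best_next = max(q_table[next_state].values())
    target = reward + gamma * best_next

    # Copy so the original table is left untouched.
    updated = {s: dict(actions) for s, actions in q_table.items()}
    updated[state][action] += alpha * (target - q_table[state][action])
    return updated


# Hypothetical Q-values, NOT the table from the exam.
q = {
    "s2": {"a1": 0.2, "a2": 0.5},
    "s4": {"a1": 0.1, "a2": 1.0},
}

# Transition s2 -> s4 with reward 0.6 and gamma = 0.9, as in the question.
new_q = q_update(q, "s2", "a2", reward=0.6, next_state="s4")
print(new_q["s2"]["a2"])
```

Only the entry for the visited state-action pair changes; every other cell is copied as is, which matches the note that a single value needs updating.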

Q2: (6 points) Assume we aim to build a more advanced recommendation system for an online bookstore using matrix factorization-based methods, similar to the one that won the Netflix Prize. Suppose the global mean rating of books is 3.6 stars. Bob, a loyal customer, has rated 400 books, and his average rating is 0.3 stars higher than the global average. Meanwhile, Pride and Prejudice is a book in the bookstore that has 200,000 ratings, with an average rating that is 0.5 stars lower than the global average. What would be a baseline estimate of Bob’s rating for Pride and Prejudice? (2 points) Illustrate how you arrived at your answer. (2 points)
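One way to reason about this, assuming the standard baseline predictor used in Netflix Prize-style models (global mean plus a user bias plus an item bias, with no shrinkage or regularization of the biases), is sketched below. The function name `baseline_rating` is hypothetical; the numbers are those given in the question.

```python
def baseline_rating(global_mean, user_bias, item_bias):
    """Baseline estimate b_ui = mu + b_u + b_i (no regularization assumed)."""
    return global_mean + user_bias + item_bias


# Values from the question: mu = 3.6, Bob rates 0.3 above the mean,
# and the book is rated 0.5 below the mean.
estimate = baseline_rating(global_mean=3.6, user_bias=0.3, item_bias=-0.5)
print(round(estimate, 2))
```

Under that assumption the estimate is 3.6 + 0.3 − 0.5, or about 3.4 stars. The rating counts (400 and 200,000) do not enter this unregularized form; they would matter only if the biases were shrunk toward zero for sparsely rated users or items.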

Q4: (8 points) Designing a Machine Learning System. Given user features, item features, and a user-item rating matrix, if we formulate the problem of recommending personalized items for users as a ranking task, how can we develop a personalized Learning To Rank (LTR) model for recommendations? Please specify: (a) how you will use the data; (b) what your model structure is; (c) what your objective function is; (d) how to use the learned ranking model to conduct personalized recommendations.
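The four parts can be illustrated with a minimal pairwise sketch. It is one possible design, not the expected answer: the data become (user, preferred item, less preferred item) triples taken from the rating matrix, the model is a linear scorer over user features, item features, and their outer-product interaction, the objective is a pairwise logistic loss with L2 regularization, and recommendation ranks a user's unrated items by score. All function names, hyperparameters, and the toy data are assumptions.

```python
import numpy as np


def make_pairs(ratings):
    """(a) Data: triples (u, i, j) where user u rated item i above item j.

    ratings is a 2-D float array; np.nan marks an unobserved rating.
    """
    pairs = []
    for u in range(ratings.shape[0]):
        rated = np.where(~np.isnan(ratings[u]))[0]
        for i in rated:
            for j in rated:
                if ratings[u, i] > ratings[u, j]:
                    pairs.append((u, i, j))
    return pairs


def features(user_feats, item_feats, u, i):
    """(b) Model input: user, item, and user-item interaction features.

    The interaction term is what makes the ranking personalized; pure user
    features cancel out when two items are compared for the same user.
    """
    cross = np.outer(user_feats[u], item_feats[i]).ravel()
    return np.concatenate([user_feats[u], item_feats[i], cross])


def train(user_feats, item_feats, ratings, epochs=200, lr=0.1, reg=1e-3, seed=0):
    """(c) Objective: minimize -log sigmoid(score_i - score_j) + L2 penalty."""
    rng = np.random.default_rng(seed)
    pairs = make_pairs(ratings)
    w = np.zeros(features(user_feats, item_feats, 0, 0).shape[0])
    for _ in range(epochs):
        for idx in rng.permutation(len(pairs)):
            u, i, j = pairs[idx]
            diff = features(user_feats, item_feats, u, i) - features(
                user_feats, item_feats, u, j
            )
            margin = w @ diff
            # Gradient of -log sigmoid(margin) plus the L2 term.
            grad = -diff / (1.0 + np.exp(margin)) + reg * w
            w -= lr * grad
    return w


def recommend(w, user_feats, item_feats, ratings, u, k=2):
    """(d) Inference: score every item for user u, skip rated ones, take top k."""
    n_items = item_feats.shape[0]
    scores = np.array(
        [w @ features(user_feats, item_feats, u, i) for i in range(n_items)]
    )
    scores[~np.isnan(ratings[u])] = -np.inf
    return np.argsort(-scores)[:k]


# Toy data (hypothetical): two users with opposite tastes, four items.
U = np.array([[1.0, 0.0], [0.0, 1.0]])
V = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
R = np.array([[5.0, 1.0, np.nan, np.nan], [1.0, 5.0, np.nan, np.nan]])

w = train(U, V, R)
top_for_user0 = recommend(w, U, V, R, u=0, k=1)
top_for_user1 = recommend(w, U, V, R, u=1, k=1)
print(top_for_user0, top_for_user1)
```

A pairwise loss is chosen here because it optimizes relative order directly rather than absolute rating values; listwise objectives or a neural scorer would fit the same four-part outline.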