Consider a logistics domain where there are 2 cities, 2 trucks, 3 drivers, and 2 packages. Each truck, driver, or package can be in any of the cities. If an atomic representation is used, what would be the minimum number of states?
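The count above is a simple product of independent choices; a minimal sketch (variable names are illustrative only):

```python
# Counting atomic states: each of the 2 trucks, 3 drivers, and
# 2 packages can independently be in either of the 2 cities, so
# the number of distinct world states is 2^(2 + 3 + 2).
cities = 2
trucks, drivers, packages = 2, 3, 2

num_states = cities ** (trucks + drivers + packages)
print(num_states)  # 2**7 = 128
```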
Review Table: Gridworld MDP and Figure: Transition Function. The gridworld MDP operates like the one discussed in lecture. The states are grid squares, identified by their column (A, B, or C) and row (1 or 2) values, as presented in the table. The agent always starts in state (A,1), marked with the letter S. There are two terminal goal states: (B,1) with reward -5, and (B,2) with reward +5. Rewards are 0 in non-terminal states. (The reward for a state is received before the agent applies the next action.) The transition function in Figure: Transition Function is such that the intended agent movement (Up, Down, Left, or Right) happens with probability 0.8. The probability that the agent ends up in one of the states perpendicular to the intended direction is 0.1 each. If a collision with a wall happens, the agent stays in the same state, and the drift probability is added to the probability of remaining in the same state. Assume that V1(A,1) = 0, V1(C,1) = 0, V1(C,2) = 4, V1(A,2) = 4, V1(B,1) = -5, and V1(B,2) = +5. Given this information, what is the second-round value iteration update (V2) for state (A,1) with a discount of 1?
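The update is a single Bellman backup over the four actions from (A,1). A sketch of that computation, assuming row 2 lies above row 1 (the figure would confirm the layout); the dictionaries and helper are illustrative, not part of the question:

```python
# One value-iteration backup for state (A,1) in the 3x2 grid
# (columns A, B, C; rows 1 and 2, with row 2 above row 1):
#   V2(A,1) = R(A,1) + gamma * max_a sum_s' P(s' | (A,1), a) V1(s')
V1 = {('A', 1): 0, ('C', 1): 0, ('C', 2): 4,
      ('A', 2): 4, ('B', 1): -5, ('B', 2): 5}
gamma = 1.0

# Successor of (A,1) for each movement direction; None marks a wall
# collision, in which case the agent stays in (A,1).
intended = {'Up': ('A', 2), 'Down': None, 'Left': None, 'Right': ('B', 1)}
perpendicular = {'Up': ['Left', 'Right'], 'Down': ['Left', 'Right'],
                 'Left': ['Up', 'Down'], 'Right': ['Up', 'Down']}

def q_value(action):
    # 0.8 in the intended direction, 0.1 for each perpendicular drift.
    total = 0.0
    for move, prob in [(action, 0.8)] + [(p, 0.1) for p in perpendicular[action]]:
        succ = intended[move] or ('A', 1)   # wall -> stay put
        total += prob * V1[succ]
    return total

V2_A1 = 0 + gamma * max(q_value(a) for a in intended)  # R(A,1) = 0
print(V2_A1)
```

The maximizing action is Up: 0.8·4 + 0.1·0 (left wall) + 0.1·(−5) = 2.7.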
Consider a logistics domain where there are 5 cities, 50 trucks, and 50 packages. Each truck can be at any of the cities. A package can be either at one of the cities or in one of the trucks. Assuming that a truck can go from any city to any other city, what is the minimum number of variables needed to represent this problem using a factored representation?
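Under a factored representation, one variable per movable object suffices; a minimal sketch of the count (names are illustrative):

```python
# One location variable per truck (5 possible cities) and one per
# package (5 cities + 50 trucks = 55 possible values). The domain
# sizes affect the number of values per variable, not the variable count.
cities, trucks, packages = 5, 50, 50

num_variables = trucks + packages    # one variable per truck and package
truck_domain = cities                # each truck variable: 5 values
package_domain = cities + trucks     # each package variable: 55 values
print(num_variables)
```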
Actions:

(:action moveTruck
  :parameters (?t - truck ?source_loc ?dest_loc - location)
  :precondition (and (truck_at ?t ?source_loc) (path ?source_loc ?dest_loc))
  :effect (and (not (truck_at ?t ?source_loc)) (truck_at ?t ?dest_loc)))

(:action load
  :parameters (?p - package ?t - truck ?loc - location)
  :precondition (and (package_at ?p ?loc) (truck_at ?t ?loc))
  :effect (and (not (package_at ?p ?loc)) (in ?p ?t)))

(:action unload
  :parameters (?p - package ?t - truck ?loc - location)
  :precondition (and (truck_at ?t ?loc) (in ?p ?t))
  :effect (and (not (in ?p ?t)) (package_at ?p ?loc)))

Current State: (truck_at truck_2 location_1) (truck_at truck_1 location_2) (package_at package_1 location_1) (package_at package_2 location_2) (path location_1 location_2) (path location_2 location_1)

Consider the provided action descriptions and current state. Given this information, which action can be executed?
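Checking executability reduces to testing whether every precondition atom of a grounded action holds in the current state. A small sketch of that check, with the state encoded as a set of ground atoms (the helper and encoding are illustrative, not part of the PDDL domain):

```python
# Current state as a set of ground atoms.
state = {('truck_at', 'truck_2', 'location_1'),
         ('truck_at', 'truck_1', 'location_2'),
         ('package_at', 'package_1', 'location_1'),
         ('package_at', 'package_2', 'location_2'),
         ('path', 'location_1', 'location_2'),
         ('path', 'location_2', 'location_1')}

def applicable(preconditions):
    """A grounded action is executable when all its preconditions hold."""
    return all(p in state for p in preconditions)

# (moveTruck truck_2 location_1 location_2)
print(applicable([('truck_at', 'truck_2', 'location_1'),
                  ('path', 'location_1', 'location_2')]))      # True
# (load package_1 truck_2 location_1)
print(applicable([('package_at', 'package_1', 'location_1'),
                  ('truck_at', 'truck_2', 'location_1')]))     # True
# (load package_1 truck_1 location_1) -- truck_1 is at location_2
print(applicable([('package_at', 'package_1', 'location_1'),
                  ('truck_at', 'truck_1', 'location_1')]))     # False
```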
Consider a logistics domain where there are 100 cities, 5 trucks, and 6 packages. Each truck can be at any of the cities. A package can be either at one of the cities or in one of the trucks. Assuming that a truck can go from any city to any other city, what is the minimum number of variables needed to represent this problem using a factored representation?
Which task identifies the category of an object in an image?
Review Table: Gridworld MDP and Figure: Transition Function. The gridworld MDP operates like the one discussed in lecture. The states are grid squares, identified by their column (A, B, or C) and row (1 or 2) values, as presented in the table. The agent always starts in state (A,1), marked with the letter S. There are two terminal goal states: (C,2) with reward +1, and (A,2) with reward -1. Rewards are 0 in non-terminal states. (The reward for a state is received before the agent applies the next action.) The transition function in Figure: Transition Function is such that the intended agent movement (Up, Down, Left, or Right) happens with probability 0.8. The probability that the agent ends up in one of the states perpendicular to the intended direction is 0.1 each. If a collision with a wall happens, the agent stays in the same state, and the drift probability is added to the probability of remaining in the same state. The discounting factor is 1. The agent starts with the policy that always chooses to go Up, and it executes three trials: the first trial is (A,1)–(A,2), the second is (A,1)–(A,2), and the third is (A,1)–(B,1)–(C,1)–(C,2). Given these traces, what is the Monte Carlo (direct utility) estimate for state (A,1)?
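With gamma = 1 and a reward collected on every state visit, the return from (A,1) in each trial is just the sum of rewards along the trace, and the direct-utility estimate is the average over trials. A minimal sketch (the reward table restates the question's setup):

```python
# Rewards: terminals (A,2) = -1 and (C,2) = +1; 0 elsewhere.
reward = {('A', 1): 0, ('B', 1): 0, ('C', 1): 0,
          ('A', 2): -1, ('B', 2): 0, ('C', 2): 1}

# The three observed trials, each starting in (A,1).
trials = [
    [('A', 1), ('A', 2)],
    [('A', 1), ('A', 2)],
    [('A', 1), ('B', 1), ('C', 1), ('C', 2)],
]

# Undiscounted return of each trial, then the sample average.
returns = [sum(reward[s] for s in trial) for trial in trials]
estimate = sum(returns) / len(returns)
print(estimate)  # (-1 - 1 + 1) / 3
```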
Review Table: Gridworld MDP and Figure: Transition Function. The gridworld MDP operates like the one discussed in lecture. The states are grid squares, identified by their column (A, B, or C) and row (1 or 2) values, as presented in the table. The agent always starts in state (A,1), marked with the letter S. There are two terminal goal states: (B,1) with reward -5, and (B,2) with reward +5. Rewards are -0.1 in non-terminal states. (The reward for a state is received before the agent applies the next action.) The transition function in Figure: Transition Function is such that the intended agent movement (Up, Down, Left, or Right) happens with probability 0.8. The probability that the agent ends up in one of the states perpendicular to the intended direction is 0.1 each. If a collision with a wall happens, the agent stays in the same state, and the drift probability is added to the probability of remaining in the same state. The discounting factor is 1. Given this information, what will be the optimal policy for state (C,1)?
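One way to answer this is to run value iteration to convergence and read off the greedy action at (C,1). The sketch below assumes row 2 lies above row 1 (the figure would confirm the layout); all helper names are illustrative:

```python
# Value iteration for the 3x2 gridworld: columns A, B, C; rows 1, 2.
# Terminals: (B,1) = -5, (B,2) = +5; every other state costs -0.1.
cols, rows = ['A', 'B', 'C'], [1, 2]
states = [(c, r) for c in cols for r in rows]
terminal = {('B', 1): -5.0, ('B', 2): 5.0}

delta = {'Up': (0, 1), 'Down': (0, -1), 'Left': (-1, 0), 'Right': (1, 0)}
perp = {'Up': ('Left', 'Right'), 'Down': ('Left', 'Right'),
        'Left': ('Up', 'Down'), 'Right': ('Up', 'Down')}

def step(s, move):
    # Deterministic successor for one movement direction.
    ci, r = cols.index(s[0]), s[1]
    dc, dr = delta[move]
    ci2, r2 = ci + dc, r + dr
    if 0 <= ci2 < len(cols) and r2 in rows:
        return (cols[ci2], r2)
    return s                      # wall collision: stay put

def q(s, a, V):
    # 0.8 intended direction, 0.1 for each perpendicular drift.
    pairs = [(a, 0.8), (perp[a][0], 0.1), (perp[a][1], 0.1)]
    return sum(p * V[step(s, m)] for m, p in pairs)

V = {s: terminal.get(s, 0.0) for s in states}
for _ in range(200):              # plenty of sweeps to converge
    V = {s: terminal.get(s, -0.1 + max(q(s, a, V) for a in delta))
         for s in states}

policy_C1 = max(delta, key=lambda a: q(('C', 1), a, V))
print(policy_C1, round(V[('C', 1)], 2))
```

Under these assumptions, Right wins at (C,1): bouncing off the east wall risks nothing (drift only reaches (C,2)), while going Up carries a 0.1 chance of drifting into the -5 terminal, and the -0.1 step cost is too small to make that gamble pay.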
Which equation describes the job of a camera-based robotic perception model?
Motion estimation is only possible under specific prior assumptions. Which situation satisfies the requirements for a camera to estimate motion?