Sunday, March 1, 2015

Econ 502 -- Optimal Stopping Rule

Topics: Bellman Equation, optimal stopping

Suppose that each period you receive an offer x, drawn randomly from a uniform distribution over [0,1]. You can accept the offer and walk away with x, or reject it and move on to the next period, when a new random draw is offered, ad infinitum. You discount the future by a factor B per period, e.g. B=0.9, so that $1 in the next period is worth B, or $0.90, to you today. [Otherwise you would be content to wait around for a draw of x=$1 till the cows come home. Discounting gives you an incentive to eventually say yes.]
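
Before doing any math, you can get a feel for the tradeoff by simulation. Here is a quick Monte Carlo sketch (my own illustration, not part of the problem; the candidate cutoffs and trial count are arbitrary) that plays the game with a fixed cutoff rule and reports the average discounted payoff:

import random

def average_payoff(cutoff, B=0.9, trials=100_000, seed=0):
    """Average discounted payoff from accepting the first offer at or above `cutoff`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        discount = 1.0
        while True:
            x = rng.random()            # offer drawn uniformly from [0, 1]
            if x >= cutoff:             # accept and walk away with x, valued in today's dollars
                total += discount * x
                break
            discount *= B               # reject: wait a period, so future dollars shrink by B
    return total / trials

for cutoff in (0.3, 0.63, 0.9):
    print(cutoff, round(average_payoff(cutoff), 3))   # being too picky or too eager both cost you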

This is a dynamic programming problem because each period you face essentially the same choice: you are only choosing at all if you rejected every previous offer, and the problem then looks exactly like the one you started with. The value function satisfies the Bellman equation

V(x) = max { x, B*E[V(x_t+1)] }

where E[.] is the expectation over next period's draw.
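
One way to sanity-check this equation is to iterate it numerically until it stops changing. A small value-iteration sketch (my own, assuming a grid approximation to the uniform draw; not part of the original derivation):

import numpy as np

B = 0.9
grid = np.linspace(0.0, 1.0, 1001)       # discretized offers on [0, 1]
V = np.zeros_like(grid)

for _ in range(1000):                    # iterate V <- max(x, B*E[V]) to a fixed point
    continuation = B * V.mean()          # B*E[V(x_t+1)], with the expectation taken over the grid
    V_new = np.maximum(grid, continuation)
    if np.max(np.abs(V_new - V)) < 1e-12:
        break
    V = V_new

print(round(B * V.mean(), 3))            # the accept/reject cutoff: about 0.627 when B = 0.9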

If we conjecture that the solution to this problem is a threshold rule, i.e. there is some x value, denoted y, such that you will:

Accept if x>y
Reject if x<y

Then we can use this to solve the problem by observing the following about V(x):

V(x) = x if x>y
V(x) = B*E[V(x_t+1)] if x<y

If x=y, then you would be exactly indifferent between accepting and rejecting, i.e.

V(x) = x = y = B*E[V(x_t+1)]
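
In code, the conjectured value function is just a max: you take the offer when it clears the threshold, and otherwise you sit on the waiting value, which at the indifference point is exactly y. A tiny sketch (my own, and only valid when y is that indifference threshold):

def V(x, y):
    """Value under the threshold rule: the offer if it clears y, else the waiting value y."""
    return x if x > y else y     # below the threshold you are worth exactly y, by indifference

print(V(0.8, 0.63), V(0.4, 0.63))   # 0.8 (accept) and 0.63 (keep waiting)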

The expected value of next period's draw is

E[V(x_t+1)] = y*y + (1-y)(1+y)/2

because with probability y the new offer falls below y, you reject again, and being back in the waiting position is worth exactly y (that is what the indifference condition says); and with probability (1-y) you get something above y, on average halfway between y and 1, or (1+y)/2.

So B*E[V(x_t+1)] = B*y*y + B(1-y)(1+y)/2
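
You can check the continuation-value formula by brute force: next period you are worth the draw itself if it clears y, and y otherwise, so the expectation is E[max(x, y)]. A quick Monte Carlo sketch (my own; the candidate y and the trial count are arbitrary):

import random

def continuation_value(y, trials=1_000_000, seed=1):
    """Compare a simulated E[V(x_t+1)] with the formula y*y + (1-y)(1+y)/2."""
    rng = random.Random(seed)
    simulated = sum(max(rng.random(), y) for _ in range(trials)) / trials
    formula = y * y + (1 - y) * (1 + y) / 2
    return simulated, formula

print(continuation_value(0.63))    # the two numbers should agree to about three decimals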

And by the indifference condition, B*E[V(x_t+1)] must equal y, so we are solving

y = B*y*y + B(1-y)(1+y)/2
y = B*y^2 + B(1-y^2)/2
y = B*y^2/2 + B/2
0 = B*y^2/2 - y + B/2

Apply quadratic formula with a = B/2, b = -1, c = B/2

 y = (1 +/- sqrt(1 - 4*(B/2)*(B/2))) / (2*(B/2))
 y = (1 +/- sqrt(1 - B^2)) / B
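
To see which sign we want, it helps to just evaluate both roots; a quick sketch (my own, assuming B = 0.9):

import math

B = 0.9
disc = math.sqrt(1 - B**2)
print((1 + disc) / B)   # about 1.60 -- bigger than any possible offer, so not a usable threshold
print((1 - disc) / B)   # about 0.63 -- the economically sensible root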

The square root is positive, so the plus root has a numerator greater than 1, and dividing by B<1 only inflates it further; that root exceeds 1 and can never be a valid threshold on [0,1]. Since we need y<1, it must be the minus:

 y = (1 - sqrt(1 - B^2)) / B
is our solution!

If B=0.9, then y = (1 - sqrt(1 - 0.81))/0.9 = 0.6268, or about 0.63.
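
As a final check, here is a short sketch (my own) that evaluates the formula and confirms the indifference condition y = B*(y*y + (1-y)(1+y)/2) holds at that value:

import math

def threshold(B):
    """Reservation offer y = (1 - sqrt(1 - B^2)) / B."""
    return (1 - math.sqrt(1 - B**2)) / B

y = threshold(0.9)
print(round(y, 4))                                             # 0.6268, i.e. about 0.63
print(math.isclose(y, 0.9 * (y * y + (1 - y) * (1 + y) / 2)))  # True: accepting y equals waiting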