Agency theory: Limits to the Complexity of Contracts

In most of this book, we have deliberately chosen to emphasize simple models, where shocks are discretely distributed both in the case of adverse selection and moral hazard. However, the analysis in appendices 3.1 and 4.2 also suggests that optimal contracts may have very complex shapes in the richer case where those shocks are continuously distributed. This complexity has often been viewed as a failure of contract theory to capture the simplicity of real-world contracting environments. We now illustrate in this section how incentive theory can be reconciled with this observed simplicity, provided that the contractual environment is sufficiently structured.

1. Menu of Linear Contracts under Adverse Selection

Let us reconsider the optimal contract obtained in appendix 3.1, in the case of a continuum of types distributed according to the cumulative distribution $F(\cdot)$ with density $f(\cdot)$ on the interval $[\underline{\theta}, \bar{\theta}]$. Let us slightly generalize the framework of that appendix and also assume that the agent has a cost function $\theta c(q)$, where $c' > 0$ and $c'' > 0$, with the Inada condition $c'(0) = 0$. This extension is straightforward and we leave it unsolved as an exercise for the reader. The optimal second-best production levels $q^{SB}(\theta)$ under asymmetric information are characterized by

$$S'(q^{SB}(\theta)) = \left(\theta + \frac{F(\theta)}{f(\theta)}\right) c'(q^{SB}(\theta)). \tag{9.42}$$

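When the monotone hazard rate property is satisfied, namely $\frac{d}{d\theta}\left(\frac{F(\theta)}{f(\theta)}\right) \geq 0$, the schedule of output $q^{SB}(\theta)$ is invertible. Let $\theta^{SB}(q)$ be its inverse function. The transfer $t^{SB}(\theta)$ paid to the agent is such that

$$t^{SB}(\theta) - \theta c(q^{SB}(\theta)) = \int_{\theta}^{\bar{\theta}} c(q^{SB}(x))\,dx, \tag{9.43}$$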

where  the  right-hand  side  above  is  θ-type’s  information  rent  U(θ).

Instead of using the truthful direct revelation mechanism $\{(t^{SB}(\theta), q^{SB}(\theta))\}$, the principal could give up any communication with the agent and let him choose an output directly, within a nonlinear schedule $T^{SB}(q)$. This procedure is basically the reverse of the revelation principle; it is sometimes called the taxation principle. To reconstruct the indirect mechanism $T^{SB}(q)$ from the direct mechanism $\{(t^{SB}(\theta), q^{SB}(\theta))\}$ is rather easy. Indeed, we must have $T^{SB}(q) = t^{SB}(\theta^{SB}(q))$.

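When he faces the nonlinear payment $T^{SB}(q)$, the agent replicates the same choice of output as with the direct revelation mechanism $\{(t^{SB}(\theta), q^{SB}(\theta))\}$. Indeed, we have $T^{SB\prime}(q) = \dot{t}^{SB}(\theta^{SB}(q))\,\frac{d\theta^{SB}}{dq}(q)$, and thus

$$T^{SB\prime}(q^{SB}(\theta)) = \frac{\dot{t}^{SB}(\theta)}{\dot{q}^{SB}(\theta)}. \tag{9.44}$$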

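Differentiating (9.43) with respect to $\theta$ immediately yields $\dot{t}^{SB}(\theta) = \theta c'(q^{SB}(\theta))\,\dot{q}^{SB}(\theta)$. Inserting it into (9.44), we obtain

$$T^{SB\prime}(q^{SB}(\theta)) = \theta c'(q^{SB}(\theta)), \tag{9.45}$$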

which is precisely the first-order condition of the agent’s problem when he chooses an  output  within  the  nonlinear  schedule  TSB(·).
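The taxation principle can be checked numerically on a small example. The sketch below is only an illustration under assumed primitives that do not come from the text: $S(q) = q$, $c(q) = q^2/2$, and types uniform on $[1, 2]$, so that $F(\theta)/f(\theta) = \theta - 1$. All function names are hypothetical. It builds $T^{SB}(q) = t^{SB}(\theta^{SB}(q))$ from (9.42) and (9.43) and lets the agent pick an output on a grid.

```python
# Assumed primitives, chosen only for illustration (none come from the text):
# S(q) = q, c(q) = q**2 / 2, and types uniform on [1, 2], so F/f = theta - 1.
THETA_MIN, THETA_MAX = 1.0, 2.0


def cost(q):
    return 0.5 * q * q


def q_sb(theta):
    # (9.42) with S'(q) = 1 and c'(q) = q:  1 = (2 * theta - 1) * q
    return 1.0 / (2.0 * theta - 1.0)


def rent(theta, steps=1000):
    # Information rent: integral of c(q_sb(x)) for x in [theta, THETA_MAX],
    # approximated with the midpoint rule.
    h = (THETA_MAX - theta) / steps
    return h * sum(cost(q_sb(theta + (i + 0.5) * h)) for i in range(steps))


def t_sb(theta):
    # (9.43): transfer = production cost + information rent
    return theta * cost(q_sb(theta)) + rent(theta)


def theta_sb(q):
    # Inverse of q_sb on [q_sb(THETA_MAX), q_sb(THETA_MIN)]
    return 0.5 * (1.0 + 1.0 / q)


def T_sb(q):
    # Indirect (nonlinear) schedule: T(q) = t(theta(q))
    return t_sb(theta_sb(q))


def best_output(theta, grid_size=300):
    # The agent picks q on a grid of implementable outputs
    # to maximize T(q) - theta * c(q).
    q_lo, q_hi = q_sb(THETA_MAX), q_sb(THETA_MIN)
    grid = [q_lo + (q_hi - q_lo) * i / grid_size for i in range(grid_size + 1)]
    payoffs = [T_sb(q) - theta * cost(q) for q in grid]
    return grid[payoffs.index(max(payoffs))]


if __name__ == "__main__":
    for theta in (1.0, 1.25, 1.5, 2.0):
        # Second and third columns should agree up to the grid resolution.
        print(theta, round(q_sb(theta), 3), round(best_output(theta), 3))
```

Under these assumed primitives the schedule has the closed form $T^{SB}(q) = q^2/4 + q/2 - 1/12$ on $[1/3, 1]$, so $T^{SB\prime}(q^{SB}(\theta)) = \theta c'(q^{SB}(\theta))$ can also be verified by hand.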

It is important that the agent’s choice can be implemented with a nonlinear payment TSB(·). However, in practice, one often observes menus of linear contracts to choose from. This is the case, for instance, in the relationship between regulatory agencies and regulated firms or the relationship between a buyer and a seller.

To obtain an implementation of the second-best outcome with a menu of linear  contracts,  we  need  to  be  able  to  replace  the  nonlinear  schedule  TSB(q) by the menu of its tangents. The slope of the tangent at a given point qSB(θ) is the same as that of TSB(q) at this point. Hence, the type θ agent’s marginal incentives to deviate away from qSB(θ) are the same with both mechanisms. Moreover, the tangent also has the same value as TSB(q) at qSB(θ). Hence, the nonlinear schedule TSB(·)  and  its  menu  of  tangents  provide  the  agent  with  the  same  information rent. This equivalence is nevertheless only possible when TSB(q) is in fact convex. Figure 9.6 represents this case.

Figure  9.6: Convexity  of  the  Nonlinear  Schedule  TSB(q)

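Let us thus derive the conditions ensuring this convexity. Differentiating (9.45), we obtain

$$T^{SB\prime\prime}(q^{SB}(\theta)) = \theta c''(q^{SB}(\theta)) + \frac{c'(q^{SB}(\theta))}{\dot{q}^{SB}(\theta)}, \tag{9.46}$$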

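where $\dot{q}^{SB}(\theta)$ is obtained by differentiating (9.42) with respect to $\theta$. Now we find that

$$\dot{q}^{SB}(\theta) = \frac{\left(1 + \frac{d}{d\theta}\left(\frac{F(\theta)}{f(\theta)}\right)\right) c'(q^{SB}(\theta))}{S''(q^{SB}(\theta)) - \left(\theta + \frac{F(\theta)}{f(\theta)}\right) c''(q^{SB}(\theta))}. \tag{9.47}$$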

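Inserting this latter expression into (9.46), we get

$$T^{SB\prime\prime}(q^{SB}(\theta)) = \frac{S''(q^{SB}(\theta)) + \left(\theta \frac{d}{d\theta}\left(\frac{F(\theta)}{f(\theta)}\right) - \frac{F(\theta)}{f(\theta)}\right) c''(q^{SB}(\theta))}{1 + \frac{d}{d\theta}\left(\frac{F(\theta)}{f(\theta)}\right)}. \tag{9.48}$$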

Now we obtain proposition 9.6.

Proposition 9.6: Assume that $\frac{F(\theta)}{\theta f(\theta)}$ is increasing with $\theta$ and that $S''(q) = 0$ for all $q$. Then, $T^{SB}(\cdot)$ is convex and can be implemented with the menu of its tangents.

Indeed, let us now consider the menu of tangents to $T^{SB}(\cdot)$. The equation of the tangent $T(\cdot, q_0)$ to $T^{SB}(\cdot)$ at a point $q_0$ can be obtained as

$$T(q, q_0) = T^{SB}(q_0) + T^{SB\prime}(q_0)(q - q_0). \tag{9.49}$$

Facing the family of linear contracts $\{T(\cdot, q_0)\}$, the agent now has to choose which tangent he prefers and what output to produce under that contract. Therefore, the agent solves

$$\max_{(q, q_0)} \; T(q, q_0) - \theta c(q).$$

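Substituting $T(q, q_0)$ with its expression coming from (9.49), the first-order conditions for this problem with respect to $q$ and $q_0$ are, respectively,

$$T^{SB\prime}(q_0) = \theta c'(q), \tag{9.50}$$

$$T^{SB\prime\prime}(q_0)(q - q_0) = 0. \tag{9.51}$$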
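If these necessary conditions are also sufficient, the agent with type $\theta$ chooses $q = q_0 = q^{SB}(\theta)$. The sufficiency of (9.50) and (9.51) is guaranteed when $T(q, q_0) - \theta c(q)$ is concave in $(q, q_0)$. Computing the corresponding Hessian $H$ of second-order derivatives at the point $(q^{SB}(\theta), q^{SB}(\theta))$ yields

$$H = \begin{pmatrix} -\theta c''(q^{SB}(\theta)) & T^{SB\prime\prime}(q^{SB}(\theta)) \\ T^{SB\prime\prime}(q^{SB}(\theta)) & -T^{SB\prime\prime}(q^{SB}(\theta)) \end{pmatrix}.$$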

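This Hessian is strictly negative definite when $c''(\cdot) > 0$ ($c(\cdot)$ is convex, as already assumed) and $T^{SB\prime\prime}(q^{SB}(\theta))\left(\theta c''(q^{SB}(\theta)) - T^{SB\prime\prime}(q^{SB}(\theta))\right) > 0$. With $T^{SB}(\cdot)$ convex, the latter condition reduces to $\theta c''(q^{SB}(\theta)) > T^{SB\prime\prime}(q^{SB}(\theta))$, which is satisfied, as can easily be seen by using (9.46): the difference equals $-c'(q^{SB}(\theta))/\dot{q}^{SB}(\theta)$, which is positive because output decreases with $\theta$.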

Remark: Note that if the principal observes only a random signal of q, say q + ε˜, where ε˜ is a random variable with zero mean (E(ε˜) = 0), the menu of linear contracts still implements the same allocation. Indeed, the random variable ε˜ disappears in the incentive problem of the risk-neutral agent by the linearity of the expectation operator. Hence, the menu of linear contracts is robust to the addition of some noise.
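The implementation by tangents and its robustness to noise can be illustrated with a short numerical sketch. It relies on assumptions that are not in the text: $S(q) = q$, $c(q) = q^2/2$, and types uniform on $[1, 2]$, for which the second-best schedule works out to $T^{SB}(q) = q^2/4 + q/2 - 1/12$ on $[1/3, 1]$ and $q^{SB}(\theta) = 1/(2\theta - 1)$. The function names, the noise level, and the grid size are hypothetical choices.

```python
import random

# Illustrative assumptions (not from the text): S(q) = q, c(q) = q**2 / 2,
# types uniform on [1, 2]. Under these assumptions the second-best schedule
# is T(q) = q**2 / 4 + q / 2 - 1 / 12 on [1/3, 1].


def T_sb(q):
    return q * q / 4.0 + q / 2.0 - 1.0 / 12.0


def T_sb_prime(q):
    return q / 2.0 + 0.5


def tangent(q, q0):
    # (9.49): linear contract tangent to T_sb at q0
    return T_sb(q0) + T_sb_prime(q0) * (q - q0)


def q_sb(theta):
    return 1.0 / (2.0 * theta - 1.0)


def best_choice(theta, n=200):
    # Joint grid search over the output q and the tangent index q0.
    grid = [1.0 / 3.0 + (2.0 / 3.0) * i / n for i in range(n + 1)]
    best = max((tangent(q, q0) - theta * 0.5 * q * q, q, q0)
               for q in grid for q0 in grid)
    return best[1], best[2]


def mean_noisy_payment(q, q0, noise_sd=0.1, draws=20000, seed=0):
    # Average payment when the linear contract is applied to q + eps,
    # where eps is Gaussian noise with zero mean.
    rng = random.Random(seed)
    total = sum(tangent(q + rng.gauss(0.0, noise_sd), q0) for _ in range(draws))
    return total / draws


if __name__ == "__main__":
    for theta in (1.0, 1.5, 2.0):
        q, q0 = best_choice(theta)
        # Both choices should be close to q_sb(theta).
        print(theta, round(q_sb(theta), 3), round(q, 3), round(q0, 3))
    # The average noisy payment should be close to the noiseless payment.
    print(round(tangent(0.5, 0.5), 4), round(mean_noisy_payment(0.5, 0.5), 4))
```

Because each contract in the menu is linear in observed output, the average payment under noise stays close to the noiseless payment, so the agent's ranking of (q, q0) pairs is unchanged in expectation.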

The implementation of the optimal contract with the menu of its tangents has been extensively used in the field of regulation by Laffont and Tirole (1986, 1993) and Rogerson (1987). Caillaud, Guesnerie, and Rey (1992) survey various extensions of these results and in particular their robustness when the observation of production is noisy. On this latter point, see also Melumad and Reichelstein (1989).

2. Linear Sharing Rules and Moral Hazard

Moral hazard environments can also have enough structure to let the linearity of contracts emerge at the optimum. To see that, we now consider a twice-repeated version of the model of section 5.3.2 without discounting. Thus the agent has a  utility  function  U  = u(t − ψ(e)),  which  is  defined  over  monetary  gains,  and the disutility of effort is counted in monetary terms. We assume constant risk aversion,  so  that  u(x) = − exp(−rx) for  some  r  >  0.  Note  that  u(0) = −1.  We denote the outcome of the stochastic production function in period i by qi. These realizations are independently distributed over time. In each period, the agent can exert a binary effort e in {0, 1} with a monetary cost normalized so that ψ(1) = ψ and ψ(0) = 0. We assume that only the whole history of outputs can be used by the principal to remunerate the agent. Given a history (q1, q2), the agent receives a transfer t(q1, q2) only at date t = 3 (see figure 9.7 for a timing of the game after the offer of the contract by the principal and its acceptance by the agent). Because there are two possible outputs in each period, the number of possible histories, and thus the number of final transfers t(q1, q2), is 2 × 2 = 4.

Figure 9.7: Timing of the Contractual Game.

The logic underlying this model is that the principal can only incentivize the agent at the end of the working period (date t = 3), but the agent has to choose an effort at dates t = 0.5 and t = 1.5.

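Let us denote the agent’s value function from date t = 0.5 on by $U_1$, i.e., his expected payoff if he exerts a positive effort in each period. We have

$$U_1 = E_{(\tilde{q}_1, \tilde{q}_2)}\big[u\big(t(\tilde{q}_1, \tilde{q}_2) - 2\psi\big)\big],$$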

where $E_{(\tilde{q}_1, \tilde{q}_2)}(\cdot)$ denotes the expectation operator with respect to the distribution of histories induced by the agent exerting a high effort in both periods. Note that with the agent’s disutility of effort being counted as a monetary term, one must subtract the total cost of efforts along the whole history to evaluate the net monetary gain of the agent.

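Using that u(x) = − exp(−rx) and the Law of Iterated Expectations, we obtain

$$U_1 = E_{\tilde{q}_1}\Big[\exp(r\psi)\, E_{\tilde{q}_2}\big(u(t(\tilde{q}_1, \tilde{q}_2) - \psi)\big)\Big], \tag{9.54}$$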

where $U_2(q_1) = E_{\tilde{q}_2}\big(u(t(q_1, \tilde{q}_2) - \psi)\big)$ is actually the agent’s value function from exerting a positive effort in period 2 following an output $q_1$ at date t = 1.

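Using the certainty equivalents (denoted by $w_2(q_1)$) of the random continuation monetary gains $t(q_1, \tilde{q}_2) - \psi$, we can in fact rewrite $U_2(q_1) = u(w_2(q_1))$. Hence, inserting this into (9.54), we obtain $U_1 = E_{\tilde{q}_1}\big(u(w_2(\tilde{q}_1) - \psi)\big)$.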

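Inducing effort at date t = 0.5 requires that the following incentive constraint be satisfied, where $\pi_1$ and $\pi_0$ denote the probabilities of the high output $\bar{q}$ when the agent exerts effort and when he does not:

$$\pi_1 u(w_2(\bar{q}) - \psi) + (1 - \pi_1) u(w_2(\underline{q}) - \psi) \geq \pi_0 u(w_2(\bar{q})) + (1 - \pi_0) u(w_2(\underline{q})).$$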

Similarly, inducing participation from date t = 0.5 on requires that the agent gets more utility than by refusing to work and obtaining a zero wealth certainty equivalent. Hence, the agent’s participation constraint is written as

$$\pi_1 u(w_2(\bar{q}) - \psi) + (1 - \pi_1) u(w_2(\underline{q}) - \psi) \geq u(0) = -1.$$

From the analysis of section 5.3.2, the pair of certainty equivalents $(w_2(\bar{q}), w_2(\underline{q}))$ belongs to the set of incentive feasible transfer pairs that induce effort and participation, with the agent being given a zero wealth outside opportunity in the static model of section 5.3.2. Let us denote by F(0) the set of incentive feasible transfers defined by the constraints in (5.83) and (5.84). We have thus $w_2(\bar{q}) = \bar{t}_1$ and $w_2(\underline{q}) = \underline{t}_1$ for some pair $(\bar{t}_1, \underline{t}_1)$, which belongs to F(0).

Let us now move to date t = 1.5. Following a first-period output $q_1$, the agent knows that he will receive the certainty equivalent $w_2(q_1)$. Hence, the following participation constraint is satisfied: $\pi_1 u(t(q_1, \bar{q}) - \psi) + (1 - \pi_1) u(t(q_1, \underline{q}) - \psi) \geq u(w_2(q_1))$. To induce effort at date t = 1.5 following a first-period output $q_1$, it must be that the following incentive constraints (which are dependent on $q_1$) are also satisfied:

$$\pi_1 u(t(q_1, \bar{q}) - \psi) + (1 - \pi_1) u(t(q_1, \underline{q}) - \psi) \geq \pi_0 u(t(q_1, \bar{q})) + (1 - \pi_0) u(t(q_1, \underline{q})).$$

Immediate observation shows that the pair of transfers $(t(q_1, \bar{q}), t(q_1, \underline{q}))$ must belong to the set of incentive feasible transfers inducing effort and participation in the static model of section 5.3.2 when the agent has an outside opportunity leaving him a wealth certainty equivalent $w_2(q_1)$.

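From remark 2 in section 5.3.2, we know that we can write those transfers as $t(q_1, q_2) = w_2(q_1) + t_2(q_2)$, where the pair $(\bar{t}_2, \underline{t}_2)$ belongs, as does $(\bar{t}_1, \underline{t}_1)$, to F(0). Hence, the overall transfers $t(q_1, q_2) = t_1(q_1) + t_2(q_2)$ are the sum of two contracts belonging to F(0). This property constitutes a significant reduction of the space of useful contracts.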

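Using the fact that shocks in each period are independently distributed, the principal’s problem now becomes:

$$\max_{\{(\bar{t}_i, \underline{t}_i) \in F(0)\}_{i=1,2}} \; \sum_{i=1}^{2} \Big(\pi_1(\bar{S} - \bar{t}_i) + (1 - \pi_1)(\underline{S} - \underline{t}_i)\Big),$$

where $\bar{S}$ and $\underline{S}$ denote the principal’s benefits from a high and a low output in a given period.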

It is apparent that the optimal solution to this problem is the twofold replica of the solution to the static problem discussed in section 5.3.2. We immediately obtain the linearity of the optimal schedule.

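Proposition 9.7: The optimal sharing rule $t^{SB}(\tilde{q}_1, \tilde{q}_2)$ is linear in the number of successes or failures of the production process. We have:

$$t^{SB}(\bar{q}, \bar{q}) = 2\bar{t}^{SB}, \qquad t^{SB}(\bar{q}, \underline{q}) = t^{SB}(\underline{q}, \bar{q}) = \bar{t}^{SB} + \underline{t}^{SB}, \qquad t^{SB}(\underline{q}, \underline{q}) = 2\underline{t}^{SB},$$

where $(\bar{t}^{SB}, \underline{t}^{SB})$ is the solution to the static problem of section 5.3.2.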

This result obviously generalizes to T ≥ 2 periods and more than two outcomes. Understanding this linearity result requires us to return to the main features of the solution to the static problem of section 5.3.2. The CARA specification for the agent’s utility function implies the absence of any wealth effect. The wage as well as the cost of effort in period 1 (which is counted in monetary terms) are sunk from the point of view of period 2 and have no impact on the incentive pressure that is then needed to induce effort. This incentive pressure is exactly the same as in a static one-shot moral hazard problem. Therefore, the principal views periods 1 and 2 as equivalent, in terms of both the stochastic processes generating output in each period and the incentive pressure needed to induce effort. The principal offers the same contract in each period, and the overall sharing rule, based only on the whole history of outputs up to date t = 2, is linear in the number of successes and failures.
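The additive structure can be illustrated numerically. The sketch below is not the book's derivation: it assumes that both the incentive and the participation constraints of the static CARA problem bind, solves the resulting two linear equations in $\exp(-r\bar{t})$ and $\exp(-r\underline{t})$, and then checks that the sum of two static contracts leaves the agent exactly indifferent across effort plans. Function names and parameter values are hypothetical.

```python
import math


def static_cara_contract(r, psi, pi1, pi0):
    """Static transfers (t_high, t_low) at which the incentive and the
    participation constraints both bind, for u(x) = -exp(-r x) and a zero
    outside wealth. Requires pi1 > pi0 and (1 - pi0) * exp(-r * psi) > 1 - pi1."""
    k = math.exp(-r * psi)
    x_high = ((1.0 - pi0) * k - (1.0 - pi1)) / (pi1 - pi0)  # exp(-r * t_high)
    x_low = (pi1 - pi0 * k) / (pi1 - pi0)                   # exp(-r * t_low)
    return -math.log(x_high) / r, -math.log(x_low) / r


def expected_utility(t_high, t_low, r, psi, pi1, pi0, efforts):
    """Expected CARA utility under the additive rule t(q1, q2) = t(q1) + t(q2)
    when the agent follows the history-independent effort plan `efforts`."""
    cost = psi * sum(efforts)
    p = [pi1 if e == 1 else pi0 for e in efforts]
    total = 0.0
    for s1 in (1, 0):          # 1 = success (high output), 0 = failure
        for s2 in (1, 0):
            prob = (p[0] if s1 else 1.0 - p[0]) * (p[1] if s2 else 1.0 - p[1])
            pay = (t_high if s1 else t_low) + (t_high if s2 else t_low)
            total += prob * -math.exp(-r * (pay - cost))
    return total


def linear_rule(t_high, t_low, q_high, q_low, periods):
    """Intercept and slope of the sharing rule as a function of aggregate output X."""
    dq = q_high - q_low
    intercept = periods * (q_high * t_low - q_low * t_high) / dq
    slope = (t_high - t_low) / dq
    return intercept, slope


if __name__ == "__main__":
    r, psi, pi1, pi0 = 1.0, 0.1, 0.8, 0.4   # hypothetical parameter values
    t_high, t_low = static_cara_contract(r, psi, pi1, pi0)
    for plan in ((1, 1), (1, 0), (0, 1), (0, 0)):
        print(plan, expected_utility(t_high, t_low, r, psi, pi1, pi0, plan))
```

With binding constraints every effort plan yields an expected utility of u(0) = −1, which is the sense in which the repeated static contract exactly induces effort and participation. The helper `linear_rule` rewrites n successes out of T periods as an affine function of aggregate output.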

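Assume now that there are T ≥ 2 periods. Then, following the same logic as above, the transfer associated with n successes and T − n failures only depends on the total production and not on the dates at which these successes and failures take place. More precisely, when we denote the common value of these transfers by $t^{SB}(n, T - n)$, we get $t^{SB}(n, T - n) = n\bar{t}^{SB} + (T - n)\underline{t}^{SB}$, where n is the number of successes. When we denote by X the aggregate output in the T Bernoulli trials, where at each date $\bar{q}$ is obtained with probability $\pi_1$ and $\underline{q}$ is obtained with probability $1 - \pi_1$, we have $X = n\bar{q} + (T - n)\underline{q}$ and

$$t^{SB}(X) = T\,\frac{\bar{q}\,\underline{t}^{SB} - \underline{q}\,\bar{t}^{SB}}{\Delta q} + \frac{\Delta t^{SB}}{\Delta q}\,X, \tag{9.58}$$

where $\Delta q = \bar{q} - \underline{q}$ and $\Delta t^{SB} = \bar{t}^{SB} - \underline{t}^{SB}$.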

This relationship shows that the sharing rule between the principal and the agent is linear in X. However, in the analysis above, the fixed fee in (9.58) becomes infinitely large as T goes to infinity. Holmström and Milgrom (1987) solved this difficulty by using a continuous time model where in each period the agent controls the drift of a Brownian process. Typically, on an infinitesimal interval of time [t, t + dt], the aggregate output q(t) up to date t jumps up only by a term proportional to the effort performed during the interval of length dt plus some noise. More precisely, q(t + dt) − q(t) − edt is the sum of dt independently and identically distributed random variables with a mean of zero. The aggregate output is a unidimensional Brownian motion that follows the stochastic differential equation

$$dq(t) = e\,dt + \sigma\,dB(t), \tag{9.59}$$

where B is a unidimensional Brownian motion with unit variance, and e is the agent’s effort on the interval [t, t + dt].

In the continuous time model presented here, the principal can only use the overall aggregate output q(1) = q at the end of a [0, 1] interval of time to incentivize the agent. Note that (9.59) holds on all intervals [t, t + dt] and that the principal offers the same incentive pressure on each of those infinitesimal intervals so that effort is constant over time. Hence, the aggregate output q(1) is a normal variable with mean e and variance σ². Holmström and Milgrom (1987, 321) show that the optimal contract is a linear contract t(q) = a + bq. Then, the agent’s final wealth is also a normal variable with mean a + be and variance b²σ². Because the agent has constant absolute risk aversion, his certainty equivalent income $w_e$ is such that

$$-\exp(-r w_e) = -\int_{-\infty}^{+\infty} \exp\big(-r(a + bq - \psi(e))\big)\, \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(q - e)^2}{2\sigma^2}\right) dq,$$

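where $\frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(q - e)^2}{2\sigma^2}\right)$ is the density of the normal distribution with mean e and variance σ². We easily find that

$$w_e = a + be - \psi(e) - \frac{r b^2 \sigma^2}{2}.$$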

When the agent’s disutility of effort is quadratic, i.e., $\psi(e) = \frac{e^2}{2}$, the sufficient and necessary condition for the optimal choice of effort is e = b. The fixed fee a can be set so that the agent’s certainty equivalent income is zero and thus $a = -\frac{e^2}{2} + \frac{r\sigma^2 e^2}{2}$.

The risk-neutral principal’s expected payoff can be computed as

$$E\big(\tilde{q} - t(\tilde{q})\big) = (1 - b)e - a.$$

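The principal’s problem is thus written in a reduced form as follows:

$$\max_{\{a, b, e\}} \; (1 - b)e - a \quad \text{subject to} \quad e = b \quad \text{and} \quad a = -\frac{e^2}{2} + \frac{r\sigma^2 e^2}{2}.$$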

Replacing b and a with their values as functions of e, the principal’s problem can be reduced to

$$\max_{e} \; e - \frac{e^2}{2} - \frac{r\sigma^2 e^2}{2}.$$

Optimizing, we easily find the second-best effort eSB:

$$e^{SB} = \frac{1}{1 + r\sigma^2} < 1 = e^{FB},$$

where eFB is the first-best level of effort.

It is interesting to note that, as the index of absolute risk aversion increases, the second-best effort is further distorted downward. Similarly, as the output becomes a less informative measure of the agent’s effort, i.e., as σ² increases, this effort is also reduced. These insights were already highlighted by our basic model in chapter 4.
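These comparative statics can be reproduced with a few lines of code. The sketch below assumes the quadratic disutility ψ(e) = e²/2 and the linear contract t(q) = a + bq discussed above; it checks the closed-form certainty equivalent against a direct numerical integration of the CARA expected utility, and compares a grid search over effort with the closed form 1/(1 + rσ²). Function names, grid sizes, and parameter values are hypothetical.

```python
import math


def certainty_equivalent(a, b, e, r, sigma2):
    # Closed form: mean wealth minus effort cost minus the CARA risk premium.
    return a + b * e - 0.5 * e * e - 0.5 * r * b * b * sigma2


def ce_by_quadrature(a, b, e, r, sigma2, steps=20000):
    # Certainty equivalent computed directly from E[-exp(-r * wealth)],
    # with q normal (mean e, variance sigma2), using the midpoint rule
    # on [e - 10 sigma, e + 10 sigma].
    sigma = math.sqrt(sigma2)
    lo, hi = e - 10.0 * sigma, e + 10.0 * sigma
    h = (hi - lo) / steps
    expected_u = 0.0
    for i in range(steps):
        q = lo + (i + 0.5) * h
        density = math.exp(-(q - e) ** 2 / (2.0 * sigma2)) / (sigma * math.sqrt(2.0 * math.pi))
        expected_u += -math.exp(-r * (a + b * q - 0.5 * e * e)) * density * h
    return -math.log(-expected_u) / r


def principal_payoff(e, r, sigma2):
    # Reduced-form objective after substituting b = e and the fixed fee
    # that leaves the agent a zero certainty equivalent.
    return e - 0.5 * e * e - 0.5 * r * sigma2 * e * e


def e_second_best(r, sigma2):
    return 1.0 / (1.0 + r * sigma2)


def argmax_effort(r, sigma2, n=1000):
    # Grid search over effort levels in [0, 1].
    grid = [i / n for i in range(n + 1)]
    return max(grid, key=lambda e: principal_payoff(e, r, sigma2))


if __name__ == "__main__":
    for r, sigma2 in ((1.0, 0.25), (2.0, 0.25), (2.0, 1.0)):
        # Effort falls as risk aversion or output noise increases.
        print(r, sigma2, round(e_second_best(r, sigma2), 4), argmax_effort(r, sigma2))
```

The first-best effort in this parametrization is 1, so the gap between 1 and the printed values measures the downward distortion.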

The Holmström and Milgrom (1987) model can be extended to the case of a multidimensional Brownian process that corresponds to the case where the agent’s output is multidimensional and to the case of a multidimensional effort as we show below. Schättler and Sung (1993) showed that a time-dependent technology calls for the optimal contract to be nonlinear. Sung (1995) showed that the optimal contract can still be linear when the agent controls the variance of the stochastic process. Hellwig and Schmidt (1997) proposed further links between the discrete and the continuous time model. Lastly, Bolton and Harris (1997) generalized the Brownian motion. Diamond (1998b) proposed a static model with limited liability, risk neutrality and three possible outcomes; linearity emerges because the agent has a rich set of choices in the distribution of these outcomes. More generally, it is also worth stressing that the linearity of contracts may be imposed when the principal must prevent arbitrage or resale among his agents.

Once the Holmström-Milgrom framework is accepted (i.e., the CARA utility function, the Brownian motion of output whose drift is affected by effort, and transfers depending only on the final aggregate output), the analysis is made very easy because contracts can be taken to be linear, and these linear contracts can be developed in many directions. Below we give an example of an interesting use of the model.

A Multitask Illustration

The model above with a CARA utility function and a normal distribution of the performance is also useful when analyzing multitask agency models. To see that, let us assume that the agent exerts a vector of efforts $e = (e_1, e_2)'$ for the principal. The disutility of effort is written as ψ(e), where ψ(·) is convex. To make things simpler we will assume that ψ(·) is a quadratic form, and we will write $\psi(e) = \frac{1}{2} e'\psi e$ for some semidefinite positive and symmetric matrix

$$\psi = \begin{pmatrix} \psi_{11} & \psi_{12} \\ \psi_{12} & \psi_{22} \end{pmatrix}.$$

We denote $\psi^{-1}$ as the inverse matrix, which is also symmetric. Contrary to the model of section 5.2.3, the disutility of effort is counted in monetary terms. Each of these efforts affects the mean of a random variable q˜i in the following manner:

$$\tilde{q}_i = e_i + \tilde{\varepsilon}_i, \quad i = 1, 2,$$

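where ε˜i is a random variable. We will denote the vector of performances by $\tilde{q} = (\tilde{q}_1, \tilde{q}_2)'$. The vector of random variables $\tilde{\varepsilon} = (\tilde{\varepsilon}_1, \tilde{\varepsilon}_2)'$ is normally distributed with a zero mean and a covariance matrix given by

$$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}.$$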

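Let us thus write the optimal linear contract offered by the principal as $t(\tilde{q}) = a + b_1\tilde{q}_1 + b_2\tilde{q}_2$ or, using the inner product,

$$t(\tilde{q}) = a + b'\tilde{q}, \quad \text{where } b = (b_1, b_2)'.$$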

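With this type of linear contract, the agent gets a certainty equivalent income $w_e$ such that

$$-\exp(-r w_e) = E\left[-\exp\left(-r\left(a + b'\tilde{q} - \tfrac{1}{2} e'\psi e\right)\right)\right].$$
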
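Using the moment generating function of the normal distribution, we find that

$$w_e = a + b'e - \frac{r}{2}\, b'\Sigma b - \frac{1}{2}\, e'\psi e. \tag{9.66}$$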

Optimizing (9.66) with respect to e yields the agent’s response to any linear contract. We obtain the following incentive constraints:

$$b = \psi e, \quad \text{i.e.,} \quad e = \psi^{-1} b.$$

This yields the following expression of the certainty equivalent income of the agent:

$$w_e = a + \frac{1}{2}\, b'\psi^{-1} b - \frac{r}{2}\, b'\Sigma b.$$

Of course, this certainty equivalent income must be nonnegative in order to provide the agent with at least his outside opportunity.

The risk-neutral principal benefits from the sum of the performances on each particular task. The principal’s expected payoff is written as

$$E\big(\tilde{q}_1 + \tilde{q}_2 - t(\tilde{q})\big).$$

Expressing this term, we find an expected payoff for the principal of $(\mathbf{1} - b')e - a$, where $\mathbf{1} = (1, 1)$ is the row vector having all of its components equal to one.

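We can finally write the principal’s problem as:

$$\max_{\{a, b, e\}} \; (\mathbf{1} - b')e - a \quad \text{subject to} \quad e = \psi^{-1} b \quad \text{and} \quad a + \frac{1}{2}\, b'\psi^{-1} b - \frac{r}{2}\, b'\Sigma b \geq 0.$$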

Of course, these two constraints are binding at the optimum, and we can rewrite the principal’s problem in reduced form as

$$\max_{b} \; \mathbf{1}\psi^{-1} b - \frac{1}{2}\, b'\psi^{-1} b - \frac{r}{2}\, b'\Sigma b.$$

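This problem is concave in b, and the second-best vector of optimal marginal incentives is obtained as a solution to

$$b^{SB} = (I + r\psi\Sigma)^{-1}\mathbf{1}',$$

where $I$ is the identity matrix. Equivalently, because ψ and ∑ are symmetric, $b^{SB\prime} = \mathbf{1}(I + r\Sigma\psi)^{-1}$.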

The matrix ∑ψ, which is computed from the technology and the informativeness of the different signals, plays a significant role in determining the second-best marginal incentives and the optimal effort. This matrix measures the discrepancy between second-best and first-best incentives.

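As an example, let us assume that efforts are imperfect substitutes, with ψ11 = ψ22 = 1 > ψ12, and off-diagonal terms of ∑ are all equal to zero. Then, the values of the marginal incentives are given by

$$b_1^{SB} = \frac{1 + r\sigma_2^2(1 - \psi_{12})}{1 + r\sigma_1^2 + r\sigma_2^2 + r^2\sigma_1^2\sigma_2^2(1 - \psi_{12}^2)} \quad \text{and} \quad b_2^{SB} = \frac{1 + r\sigma_1^2(1 - \psi_{12})}{1 + r\sigma_1^2 + r\sigma_2^2 + r^2\sigma_1^2\sigma_2^2(1 - \psi_{12}^2)}.$$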

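Assume for instance that $\sigma_2^2 = \infty$ to capture the fact that the performance q˜2 may not be observable at all by the principal. Then we have

$$b_1^{SB} = \frac{1 - \psi_{12}}{1 + r\sigma_1^2(1 - \psi_{12}^2)} \quad \text{and} \quad b_2^{SB} = 0.$$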

It becomes impossible to reward the barely observable task q˜2, and so all incentive pressure is put on the first task. This points to the complementarity between rewarding different substitute tasks in a second-best environment that we already stressed in the bare-bones model of section 5.2.5. On this, see also Holmström and Milgrom (1994).
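The two-task example can be explored numerically. The sketch below assumes the column-vector form b = (I + rψ∑)⁻¹(1, 1)′ of the second-best marginal incentives and inverts the 2 × 2 matrix by hand; the function name and the parameter values are hypothetical. A very large variance on the second signal stands in for the limit where that task is unobservable.

```python
def second_best_incentives(psi, sigma, r):
    """Marginal incentives b = (I + r * psi * Sigma)^{-1} (1, 1)' for two tasks.

    psi and sigma are 2 x 2 nested lists (cost matrix and covariance matrix)."""
    # m = I + r * psi * Sigma
    m = [[(1.0 if i == j else 0.0)
          + r * sum(psi[i][k] * sigma[k][j] for k in range(2))
          for j in range(2)] for i in range(2)]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det;
    # applying it to the vector (1, 1)' gives the two components below.
    b1 = (m[1][1] - m[0][1]) / det
    b2 = (m[0][0] - m[1][0]) / det
    return b1, b2


if __name__ == "__main__":
    r = 2.0
    psi = [[1.0, 0.5], [0.5, 1.0]]          # imperfect substitutes: psi12 = 0.5
    for var2 in (0.25, 4.0, 100.0, 1e9):    # noise on the second performance
        sigma = [[0.25, 0.0], [0.0, var2]]
        b1, b2 = second_best_incentives(psi, sigma, r)
        print(var2, round(b1, 4), round(b2, 6))
```

As the variance of the second signal grows, the printed b2 shrinks toward zero and b1 approaches (1 − ψ12)/(1 + rσ1²(1 − ψ12²)). With ψ equal to the identity matrix, the same function returns 1/(1 + rσi²) for each task, the single-task expression.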

Source: Laffont Jean-Jacques, Martimort David (2002), The Theory of Incentives: The Principal-Agent Model, Princeton University Press.
