In most of this book, we have deliberately chosen to emphasize simple models, where shocks are discretely distributed in the case of both adverse selection and moral hazard. However, the analysis in appendices 3.1 and 4.2 also suggests that optimal contracts may have very complex shapes in the richer case where those shocks are continuously distributed. This complexity has often been viewed as a failure of contract theory to capture the simplicity of real world contracting environments. We now illustrate in this section how incentive theory can be reconciled with this observed simplicity, provided that the contractual environment is sufficiently structured.

### 1. Menu of Linear Contracts under Adverse Selection

Let us reconsider the optimal contract obtained in appendix 3.1, in the case of a continuum of types distributed according to the cumulative distribution $F(\cdot)$ with density $f(\cdot)$ on the interval $[\underline{\theta}, \bar{\theta}]$. Let us slightly generalize the framework of that appendix and also assume that the agent has a cost function $\theta c(q)$, where $c' > 0$ and $c'' > 0$, with the Inada condition $c'(0) = 0$. This extension is straightforward, and we leave it as an exercise for the reader. The optimal second-best production levels $q^{SB}(\theta)$ under asymmetric information are characterized by

$$S'(q^{SB}(\theta)) = \left(\theta + \frac{F(\theta)}{f(\theta)}\right) c'(q^{SB}(\theta)). \tag{9.42}$$

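When the monotone hazard rate property is satisfied, namely $\frac{d}{d\theta}\left(\frac{F(\theta)}{f(\theta)}\right) \geq 0$, the schedule of output $q^{SB}(\theta)$ is invertible. Let $\theta^{SB}(q)$ be its inverse function. The transfer $t^{SB}(\theta)$ paid to the agent is such that

$$t^{SB}(\theta) - \theta c(q^{SB}(\theta)) = \int_{\theta}^{\bar{\theta}} c(q^{SB}(x))\,dx, \tag{9.43}$$

where the right-hand side above is the $\theta$-type's information rent $U(\theta)$.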

Instead of using the truthful direct revelation mechanism $\{(t^{SB}(\theta), q^{SB}(\theta))\}$, the principal could give up any communication with the agent and let him choose an output directly within a nonlinear schedule $T^{SB}(q)$. This procedure is basically the reverse of the revelation principle; it is sometimes called the taxation principle. Reconstructing the indirect mechanism $T^{SB}(q)$ from the direct mechanism $\{(t^{SB}(\theta), q^{SB}(\theta))\}$ is rather easy. Indeed, we must have $T^{SB}(q) = t^{SB}(\theta^{SB}(q))$.

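When he faces the nonlinear payment $T^{SB}(q)$, the agent replicates the same choice of output as with the direct revelation mechanism $\{(t^{SB}(\theta), q^{SB}(\theta))\}$. Indeed, we have $\frac{dT^{SB}}{dq}(q) = \dot{t}^{SB}(\theta^{SB}(q)) \cdot \frac{d\theta^{SB}}{dq}(q)$, and thus

$$\frac{dT^{SB}}{dq}(q^{SB}(\theta)) = \frac{\dot{t}^{SB}(\theta)}{\dot{q}^{SB}(\theta)}. \tag{9.44}$$

Differentiating (9.43) with respect to $\theta$ immediately yields $\dot{t}^{SB}(\theta) = \theta c'(q^{SB}(\theta))\,\dot{q}^{SB}(\theta)$. Inserting it into (9.44), we obtain

$$\frac{dT^{SB}}{dq}(q^{SB}(\theta)) = \theta c'(q^{SB}(\theta)), \tag{9.45}$$

which is precisely the first-order condition of the agent's problem when he chooses an output within the nonlinear schedule $T^{SB}(\cdot)$.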
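The taxation principle can be illustrated numerically. The sketch below is only an illustration under assumed primitives that are not in the text: types uniform on $[1, 2]$ (so $F/f = \theta - 1$), $c(q) = q^2/2$, and a linear surplus $S(q) = q$. All function names are hypothetical. It builds $t^{SB}$ from the rent formula, composes it with the inverse schedule to get $T^{SB}$, and checks by grid search that each type picks $q^{SB}(\theta)$ when facing $T^{SB}$.

```python
# Hypothetical primitives chosen for illustration (not from the text):
# theta ~ U[THETA_LO, THETA_HI], c(q) = q^2 / 2, S(q) = S_SLOPE * q.
THETA_LO, THETA_HI, S_SLOPE = 1.0, 2.0, 1.0


def cost(q):
    """Agent's cost kernel c(q); the total cost of type theta is theta * c(q)."""
    return 0.5 * q * q


def q_sb(theta):
    """Second-best output from S'(q) = (theta + F/f) c'(q), with F/f = theta - THETA_LO."""
    return S_SLOPE / (2.0 * theta - THETA_LO)


def rent(theta, steps=400):
    """Information rent: midpoint-rule integral of c(q_sb(x)) over [theta, THETA_HI]."""
    h = (THETA_HI - theta) / steps
    return h * sum(cost(q_sb(theta + (i + 0.5) * h)) for i in range(steps))


def t_sb(theta):
    """Direct-mechanism transfer: production cost plus information rent."""
    return theta * cost(q_sb(theta)) + rent(theta)


def theta_sb(q):
    """Inverse of q_sb: the type that is assigned output q."""
    return 0.5 * (S_SLOPE / q + THETA_LO)


def T_sb(q):
    """Nonlinear schedule T(q) = t_sb(theta_sb(q))."""
    return t_sb(theta_sb(q))


def best_output(theta, grid=400):
    """Output chosen by type theta when facing T_sb, found by grid search."""
    lo, hi = q_sb(THETA_HI), q_sb(THETA_LO)
    candidates = [lo + (hi - lo) * i / grid for i in range(grid + 1)]
    return max(candidates, key=lambda q: T_sb(q) - theta * cost(q))


for theta in (1.2, 1.5, 1.8):
    print(theta, round(q_sb(theta), 3), round(best_output(theta), 3))
```

For these particular primitives the schedule has the closed form $T^{SB}(q) = q/2 + q^2/4 - 1/12$, which is convex; other primitives would need their own check.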

It is important that the agent's choice can be implemented with a nonlinear payment $T^{SB}(\cdot)$. However, in practice, one often observes menus of linear contracts to choose from. This is the case, for instance, in the relationship between regulatory agencies and regulated firms or the relationship between a buyer and a seller.

To obtain an implementation of the second-best outcome with a menu of linear contracts, we need to be able to replace the nonlinear schedule $T^{SB}(q)$ by the menu of its tangents. The slope of the tangent at a given point $q^{SB}(\theta)$ is the same as that of $T^{SB}(q)$ at this point. Hence, the type $\theta$ agent's marginal incentives to deviate away from $q^{SB}(\theta)$ are the same with both mechanisms. Moreover, the tangent also has the same value as $T^{SB}(q)$ at $q^{SB}(\theta)$. Hence, the nonlinear schedule $T^{SB}(\cdot)$ and its menu of tangents provide the agent with the same information rent. This equivalence is nevertheless only possible when $T^{SB}(q)$ is in fact *convex*. Figure 9.6 represents this case.

Figure 9.6: Convexity of the Nonlinear Schedule $T^{SB}(q)$

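Let us thus derive the conditions ensuring this convexity. Differentiating (9.45), we obtain

$$\frac{d^2T^{SB}}{dq^2}(q^{SB}(\theta)) = \theta c''(q^{SB}(\theta)) + \frac{c'(q^{SB}(\theta))}{\dot{q}^{SB}(\theta)}, \tag{9.46}$$

where $\dot{q}^{SB}(\theta)$ is obtained by differentiating (9.42) with respect to $\theta$. Now we find that

$$\dot{q}^{SB}(\theta) = \frac{\left(1 + \frac{d}{d\theta}\left(\frac{F(\theta)}{f(\theta)}\right)\right) c'(q^{SB}(\theta))}{S''(q^{SB}(\theta)) - \left(\theta + \frac{F(\theta)}{f(\theta)}\right) c''(q^{SB}(\theta))}. \tag{9.47}$$

Inserting this latter expression into (9.46), we get

$$\frac{d^2T^{SB}}{dq^2}(q^{SB}(\theta)) = \frac{S''(q^{SB}(\theta)) + \left(\theta \frac{d}{d\theta}\left(\frac{F(\theta)}{f(\theta)}\right) - \frac{F(\theta)}{f(\theta)}\right) c''(q^{SB}(\theta))}{1 + \frac{d}{d\theta}\left(\frac{F(\theta)}{f(\theta)}\right)}. \tag{9.48}$$

Now we obtain proposition 9.6.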

**Proposition 9.6:** *Assume that $\frac{F(\theta)}{\theta f(\theta)}$ is increasing with $\theta$ and that $S''(q) = 0$ for all $q$. Then, $T^{SB}(\cdot)$ is convex and can be implemented with the menu of its tangents.*

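Indeed, let us now consider the menu of tangents to $T^{SB}(\cdot)$. The equation of the tangent $T(\cdot, q_0)$ to $T^{SB}(\cdot)$ at a point $q_0$ can be obtained as

$$T(q, q_0) = T^{SB}(q_0) + \frac{dT^{SB}}{dq}(q_0)\,(q - q_0). \tag{9.49}$$

Facing the family of linear contracts $\{T(\cdot, q_0)\}$, the agent now has to choose which tangent is his most preferred one and what output to produce according to that contract. Therefore, the agent solves

$$\max_{(q,\, q_0)} \; T(q, q_0) - \theta c(q).$$

Substituting $T(q, q_0)$ with its expression coming from (9.49), the first-order conditions for this problem with respect to $q$ and $q_0$ are, respectively,

$$\frac{dT^{SB}}{dq}(q_0) - \theta c'(q) = 0 \tag{9.50}$$

and

$$\frac{d^2T^{SB}}{dq^2}(q_0)\,(q - q_0) = 0. \tag{9.51}$$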

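If these necessary conditions are also sufficient, the agent with type $\theta$ chooses $q = q_0 = q^{SB}(\theta)$. The sufficiency of (9.50) and (9.51) is guaranteed when $T(q, q_0) - \theta c(q)$ is concave in $(q, q_0)$. Computing the corresponding Hessian $H$ of second-order derivatives at the point $(q^{SB}(\theta), q^{SB}(\theta))$ yields

$$H = \begin{pmatrix} -\theta c''(q^{SB}(\theta)) & \dfrac{d^2T^{SB}}{dq^2}(q^{SB}(\theta)) \\[2ex] \dfrac{d^2T^{SB}}{dq^2}(q^{SB}(\theta)) & -\dfrac{d^2T^{SB}}{dq^2}(q^{SB}(\theta)) \end{pmatrix}.$$

This Hessian is negative definite when $\frac{d^2T^{SB}}{dq^2}(q^{SB}(\theta)) > 0$ ($T^{SB}(\cdot)$ is convex, as already assumed) and $\theta c''(q^{SB}(\theta)) > \frac{d^2T^{SB}}{dq^2}(q^{SB}(\theta))$. The latter condition is satisfied, as can easily be seen by using (9.46): the difference between the two terms is $c'(q^{SB}(\theta)) / \dot{q}^{SB}(\theta)$, which is negative because the second-best output is decreasing in $\theta$.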
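The implementation by tangents can be checked on a small example. The sketch below assumes primitives that are not in the text: types uniform on $[1, 2]$, $c(q) = q^2/2$ and $S(q) = q$, for which the second-best schedule works out to $T^{SB}(q) = q/2 + q^2/4 - 1/12$ on $[1/3, 1]$ and $q^{SB}(\theta) = 1/(2\theta - 1)$. Function names are hypothetical. The agent picks a tangent $q_0$ and an output $q$ jointly by grid search.

```python
# Hypothetical example (not from the text): theta ~ U[1, 2], c(q) = q^2/2, S(q) = q.
# For these primitives the second-best schedule is T(q) = q/2 + q^2/4 - 1/12
# on [1/3, 1], which is convex.


def T_sb(q):
    """Nonlinear second-best schedule for the assumed example."""
    return 0.5 * q + 0.25 * q * q - 1.0 / 12.0


def T_sb_slope(q):
    """Derivative of T_sb."""
    return 0.5 + 0.5 * q


def tangent(q, q0):
    """Linear contract T(q, q0): the tangent to T_sb at q0, evaluated at q."""
    return T_sb(q0) + T_sb_slope(q0) * (q - q0)


def q_sb(theta):
    """Second-best output in the example, from theta * c'(q) = T_sb'(q)."""
    return 1.0 / (2.0 * theta - 1.0)


def choose_contract_and_output(theta, grid=300):
    """Agent's joint choice of an output q and a tangent q0, by grid search."""
    points = [1.0 / 3.0 + (2.0 / 3.0) * i / grid for i in range(grid + 1)]
    return max(
        ((q, q0) for q in points for q0 in points),
        key=lambda pair: tangent(pair[0], pair[1]) - theta * 0.5 * pair[0] ** 2,
    )


for theta in (1.2, 1.5, 1.8):
    q, q0 = choose_contract_and_output(theta)
    print(theta, round(q_sb(theta), 3), round(q, 3), round(q0, 3))
```

Because each tangent lies below the convex schedule and touches it only at $q_0$, the agent gains nothing by selecting a tangent other than the one at his intended output, so the grid search returns $q = q_0$ close to $q^{SB}(\theta)$.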

**Remark:** Note that if the principal observes only a random signal of $q$, say $q + \tilde{\varepsilon}$, where $\tilde{\varepsilon}$ is a random variable with zero mean ($E(\tilde{\varepsilon}) = 0$), the menu of linear contracts still implements the same allocation. Indeed, the random variable $\tilde{\varepsilon}$ disappears in the incentive problem of the risk-neutral agent by the linearity of the expectation operator. Hence, the menu of linear contracts is robust to the addition of some noise.

The implementation of the optimal contract with the menu of its tangents has been extensively used in the field of regulation by Laffont and Tirole (1986, 1993) and Rogerson (1987). Caillaud, Guesnerie, and Rey (1992) survey various extensions of these results and in particular their robustness when the observation of production is noisy. On this latter point, see also Melumad and Reichelstein (1989).

### 2. Linear Sharing Rules and Moral Hazard

Moral hazard environments can also have enough structure to let the linearity of contracts emerge at the optimum. To see that, we now consider a twice-repeated version of the model of section 5.3.2 without discounting. Thus the agent has a utility function $U = u(t - \psi(e))$, which is defined over monetary gains, and the disutility of effort is counted in monetary terms. We assume constant absolute risk aversion, so that $u(x) = -\exp(-rx)$ for some $r > 0$. Note that $u(0) = -1$. We denote the outcome of the stochastic production function in period $i$ by $q_i$. These realizations are independently distributed over time. In each period, the agent can exert a binary effort $e$ in $\{0, 1\}$ with a monetary cost normalized so that $\psi(1) = \psi$ and $\psi(0) = 0$. We assume that only the whole history of outputs can be used by the principal to remunerate the agent. Given a history $(q_1, q_2)$, the agent receives a transfer $t(q_1, q_2)$ only at date $t = 3$ (see figure 9.7 for a timing of the game after the offer of the contract by the principal and its acceptance by the agent). Because there are two possible outputs in each period, the number of possible histories, and thus the number of final transfers $t(q_1, q_2)$, is $2 \times 2 = 4$.

Figure 9.7: Timing of the Contractual Game.

The logic underlying this model is that the principal can only incentivize the agent at the end of the working period (date $t = 3$), but the agent has to choose an effort at dates $t = 0.5$ and $t = 1.5$.

Let us denote the agent's value function from date $t = 0.5$ on by $U_1$, i.e., his expected payoff if he exerts a positive effort in each period. We have

$$U_1 = E_{(\tilde{q}_1, \tilde{q}_2)}\left(u\left(t(\tilde{q}_1, \tilde{q}_2) - 2\psi\right)\right),$$

where $E_{(\tilde{q}_1, \tilde{q}_2)}(\cdot)$ denotes the expectation operator with respect to the distribution of histories induced by the agent exerting a high effort in both periods. Note that with the agent's disutility of effort being counted as a monetary term, one must subtract the total cost of efforts along the whole history to evaluate the net monetary gain of the agent.

Using that $u(x) = -\exp(-rx)$ and the Law of Iterated Expectations, we obtain

$$U_1 = E_{\tilde{q}_1}\left(\exp(r\psi)\, U_2(\tilde{q}_1)\right), \tag{9.54}$$

where $U_2(q_1) = E_{\tilde{q}_2}\left(u\left(t(q_1, \tilde{q}_2) - \psi\right)\right)$ is actually the agent's value function from exerting a positive effort in period 2 following an output $q_1$ at date $t = 1$.

Using the certainty equivalents (denoted by $w_2(q_1)$) of the random continuation monetary gains $t(q_1, \tilde{q}_2) - \psi$, we can in fact rewrite $U_2(q_1) = u(w_2(q_1))$. Hence, inserting this into (9.54), we obtain $U_1 = E_{\tilde{q}_1}\left(u\left(w_2(\tilde{q}_1) - \psi\right)\right)$.

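Inducing effort at date $t = 0.5$ requires that the following incentive constraint be satisfied:

$$\pi_1 u(w_2(\bar{q}) - \psi) + (1 - \pi_1) u(w_2(\underline{q}) - \psi) \geq \pi_0 u(w_2(\bar{q})) + (1 - \pi_0) u(w_2(\underline{q})),$$

where $\pi_1$ and $\pi_0$ denote the probabilities that the high output $\bar{q}$ is realized when the agent exerts effort and when he does not, respectively.

Similarly, inducing participation from date $t = 0.5$ on requires that the agent gets more utility than by refusing to work and obtaining a zero wealth certainty equivalent. Hence, the agent's participation constraint is written as

$$\pi_1 u(w_2(\bar{q}) - \psi) + (1 - \pi_1) u(w_2(\underline{q}) - \psi) \geq u(0) = -1.$$

From the analysis of section 5.3.2, the pair of certainty equivalents $(w_2(\bar{q}), w_2(\underline{q}))$ belongs to the set of incentive feasible transfer pairs that induce effort and participation, with the agent being given a zero wealth outside opportunity in the static model of section 5.3.2. Let us denote by $F(0)$ the set of incentive feasible transfers defined by the constraints in (5.83) and (5.84). We have thus $w_2(\bar{q}) = \bar{t}_1$ and $w_2(\underline{q}) = \underline{t}_1$ for some pair $(\bar{t}_1, \underline{t}_1)$ which belongs to $F(0)$.

Let us now move to date $t = 1.5$. Following a first-period output $q_1$, the agent knows then that he will receive the certainty equivalent $w_2(q_1)$. Hence, the following participation constraint is satisfied:

$$\pi_1 u(t(q_1, \bar{q}) - \psi) + (1 - \pi_1) u(t(q_1, \underline{q}) - \psi) = u(w_2(q_1)).$$

To induce effort at date $t = 1.5$ following a first-period output $q_1$, it must be that the following incentive constraints (which are dependent on $q_1$) are also satisfied:

$$\pi_1 u(t(q_1, \bar{q}) - \psi) + (1 - \pi_1) u(t(q_1, \underline{q}) - \psi) \geq \pi_0 u(t(q_1, \bar{q})) + (1 - \pi_0) u(t(q_1, \underline{q})).$$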

Immediate observation shows that the pair of transfers $(t(q_1, \bar{q}), t(q_1, \underline{q}))$ must belong to the set of incentive feasible transfers inducing effort and participation in the static model of section 5.3.2 when the agent has an outside opportunity leaving him a wealth certainty equivalent $w_2(q_1)$.

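From remark 2 in section 5.3.2, we know that we can write those transfers as $t(q_1, \bar{q}) = w_2(q_1) + \bar{t}_2(q_1)$ and $t(q_1, \underline{q}) = w_2(q_1) + \underline{t}_2(q_1)$, where the pair $(\bar{t}_2(q_1), \underline{t}_2(q_1))$ belongs, as $(\bar{t}_1, \underline{t}_1)$, to $F(0)$. Hence, the overall transfers are the *sum* of two contracts belonging to $F(0)$. This property constitutes a significant reduction of the space of useful contracts.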

Using the fact that shocks in each period are independently distributed, the principal’s problem now becomes:

It is apparent that the optimal solution to this problem is the twofold replication of the solution to the static problem discussed in section 5.3.2. We immediately obtain the linearity of the optimal schedule.

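**Proposition 9.7:** *The optimal sharing rule $t^{SB}(q_1, q_2)$ is linear in the number of successes or failures of the production process. We have:*

$$t^{SB}(\bar{q}, \bar{q}) = 2\bar{t}^{SB}, \qquad t^{SB}(\bar{q}, \underline{q}) = t^{SB}(\underline{q}, \bar{q}) = \bar{t}^{SB} + \underline{t}^{SB}, \qquad t^{SB}(\underline{q}, \underline{q}) = 2\underline{t}^{SB},$$

*where $(\bar{t}^{SB}, \underline{t}^{SB})$ denotes the solution to the static problem of section 5.3.2.*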

This result obviously generalizes to $T \geq 2$ periods and more than two outcomes. Understanding this linearity result requires us to return to the main features of the solution to the static problem of section 5.3.2. The CARA specification for the agent's utility function implies the absence of any wealth effect. The wage as well as the cost of effort in period 1 (which is counted in monetary terms) are sunk from the point of view of period 2 and have no impact on the incentive pressure that is then needed to induce effort. This incentive pressure is exactly the same as in a static one-shot moral hazard problem. Therefore, the principal views periods 1 and 2 as equivalent, in terms of both the stochastic processes generating output in each period and the incentive pressure needed to induce effort. The principal offers the same contract in each period, and the overall sharing rule, based only on the whole history of outputs up to date $t = 2$, is linear in the number of successes and failures.
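The absence of wealth effects can be seen in a small numerical sketch. It assumes, as an illustration and not as the book's own formulas, that both the incentive and the participation constraints bind in the static CARA problem; solving the two resulting linear equations in $\exp(-r\bar{t})$ and $\exp(-r\underline{t})$ gives the transfers computed in `static_contract`. The parameter values and all function names are hypothetical, and $\pi_0$ denotes the success probability without effort.

```python
import math
from itertools import product


def static_contract(r, psi, pi1, pi0):
    """Static CARA transfers (success, failure) when both constraints bind.

    Requires 1 - (1 - pi0) * k > 0 so that the logarithm is defined.
    """
    k = (1.0 - math.exp(-r * psi)) / (pi1 - pi0)
    t_high = -math.log(1.0 - (1.0 - pi0) * k) / r
    t_low = -math.log(1.0 + pi0 * k) / r
    return t_high, t_low


def two_period_transfer(q1, q2, t_high, t_low):
    """Sum of two static contracts; outcomes are coded 1 (success) or 0 (failure)."""
    successes = q1 + q2
    return successes * t_high + (2 - successes) * t_low


def expected_utility(e1, e2_rule, r, psi, pi1, pi0, t_high, t_low):
    """Expected CARA utility of effort e1 at date 0.5 and e2_rule[q1] at date 1.5."""
    success_prob = {1: pi1, 0: pi0}
    total = 0.0
    for q1, q2 in product((1, 0), repeat=2):
        e2 = e2_rule[q1]
        p1 = success_prob[e1] if q1 else 1.0 - success_prob[e1]
        p2 = success_prob[e2] if q2 else 1.0 - success_prob[e2]
        wealth = two_period_transfer(q1, q2, t_high, t_low) - psi * (e1 + e2)
        total += p1 * p2 * -math.exp(-r * wealth)
    return total


# Hypothetical parameter values.
R, PSI, PI1, PI0 = 1.0, 0.1, 0.8, 0.4
T_HIGH, T_LOW = static_contract(R, PSI, PI1, PI0)
work = expected_utility(1, {1: 1, 0: 1}, R, PSI, PI1, PI0, T_HIGH, T_LOW)
print("transfers:", T_HIGH, T_LOW, "utility from working twice:", work)
```

Since the constraints bind in this sketch, the agent is exactly indifferent between all effort plans, including plans that condition second-period effort on $q_1$, and obtains $u(0) = -1$. The transfer depends on the history only through the number of successes.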

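Assume now that there are $T \geq 2$ periods. Then, following the same logic as above, the transfer associated with $n$ successes and $T - n$ failures only depends on the total production and not on the dates at which these successes and failures take place. More precisely, when we denote the common value of these transfers by $t^{SB}(\cdot)$, we get $t^{SB}(n) = n\bar{t}^{SB} + (T - n)\underline{t}^{SB}$, where $n$ is the number of successes. When we denote by $X$ the aggregate output in the $T$ Bernoulli trials, where at each date $\bar{q}$ is obtained with probability $\pi_1$ and $\underline{q}$ is obtained with probability $1 - \pi_1$, we have

$$X = n\bar{q} + (T - n)\underline{q}$$

and

$$t^{SB}(X) = T\,\frac{\bar{q}\,\underline{t}^{SB} - \underline{q}\,\bar{t}^{SB}}{\bar{q} - \underline{q}} + \frac{\bar{t}^{SB} - \underline{t}^{SB}}{\bar{q} - \underline{q}}\, X. \tag{9.58}$$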

This relationship shows that the sharing rule between the principal and the agent is linear in $X$. However, in the analysis above, the fixed fee in (9.58) becomes infinitely large as $T$ goes to infinity. Holmström and Milgrom (1987) solved this difficulty by using a continuous time model where at each instant the agent controls the drift of a Brownian process. Typically, on an infinitesimal interval of time $[t, t + dt]$, the aggregate output $q(t)$ up to date $t$ jumps up only by a term proportional to the effort performed during the interval of length $dt$ plus some noise. More precisely, $q(t + dt) - q(t) - e\,dt$ is a sum of independently and identically distributed random variables with a mean of zero. The aggregate output is a unidimensional Brownian motion that follows the stochastic differential equation

$$dq(t) = e\,dt + \sigma\,dB(t), \tag{9.59}$$

where $B$ is a unidimensional Brownian motion with unit variance, and $e$ is the agent's effort on the interval $[t, t + dt]$.
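The linearity of the discrete $T$-period rule in aggregate output, and the growth of its fixed fee with $T$, can be checked with a few lines. The sketch below uses hypothetical output levels and per-period transfers (none of these numbers come from the text) and verifies that $n\bar{t} + (T - n)\underline{t}$ coincides with a fixed fee plus a slope times $X = n\bar{q} + (T - n)\underline{q}$.

```python
def history_transfer(n, periods, t_high, t_low):
    """Transfer after n successes and (periods - n) failures."""
    return n * t_high + (periods - n) * t_low


def linear_rule(periods, q_high, q_low, t_high, t_low):
    """Fixed fee and slope of the same transfer written as a function of aggregate output X."""
    slope = (t_high - t_low) / (q_high - q_low)
    fixed_fee = periods * (q_high * t_low - q_low * t_high) / (q_high - q_low)
    return fixed_fee, slope


# Hypothetical numbers for illustration only.
Q_HIGH, Q_LOW, T_HIGH, T_LOW = 2.0, 1.0, 0.15, -0.09

for periods in (2, 10, 100):
    fee, slope = linear_rule(periods, Q_HIGH, Q_LOW, T_HIGH, T_LOW)
    for n in range(periods + 1):
        aggregate = n * Q_HIGH + (periods - n) * Q_LOW
        gap = fee + slope * aggregate - history_transfer(n, periods, T_HIGH, T_LOW)
        assert abs(gap) < 1e-9
    print(periods, fee, slope)
```

The slope does not depend on the number of periods, while the fixed fee is proportional to it, which is the scaling problem that the continuous time formulation avoids.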

In the continuous time model presented here, the principal can only use the overall aggregate output $q(1) = q$ at the end of a $[0, 1]$ interval of time to incentivize the agent. Note that (9.59) holds on all intervals $[t, t + dt]$ and that the principal offers the same incentive pressure on each of those infinitesimal intervals so that effort is constant over time. Hence, the aggregate output $q(1)$ is a normal variable with mean $e$ and variance $\sigma^2$. Holmström and Milgrom (1987, 321) show that the optimal contract is a linear contract $t(q) = a + bq$. Then, the agent's final wealth is also a normal variable with mean $a + be$ and variance $b^2\sigma^2$. Because the agent has constant absolute risk aversion, his certainty equivalent income $w_e$ is such that

$$-\exp(-r w_e) = \int_{-\infty}^{+\infty} -\exp\left(-r\left(a + bq - \psi(e)\right)\right) \varphi(q)\, dq,$$

where $\varphi(q) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(q - e)^2}{2\sigma^2}\right)$ is the density of the normal distribution with mean $e$ and variance $\sigma^2$. We easily find that

$$w_e = a + be - \psi(e) - \frac{r b^2 \sigma^2}{2}.$$
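The closed form of a CARA certainty equivalent under normal noise, namely mean wealth minus $r/2$ times its variance, can be checked by numerical integration. The sketch below compares that closed form with a midpoint-rule evaluation of the expected utility; the parameter values and function names are hypothetical.

```python
import math


def certainty_equivalent_numeric(a, b, e, psi_e, r, sigma, steps=20000):
    """Certainty equivalent of a + b*q - psi_e for q ~ N(e, sigma^2), by midpoint integration."""
    lo, hi = e - 12.0 * sigma, e + 12.0 * sigma
    h = (hi - lo) / steps
    expected_utility = 0.0
    for i in range(steps):
        q = lo + (i + 0.5) * h
        density = math.exp(-0.5 * ((q - e) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        expected_utility += -math.exp(-r * (a + b * q - psi_e)) * density * h
    # Invert u(w) = -exp(-r w) to recover the certainty equivalent.
    return -math.log(-expected_utility) / r


def certainty_equivalent_closed_form(a, b, e, psi_e, r, sigma):
    """Mean wealth minus the CARA risk premium r * b^2 * sigma^2 / 2."""
    return a + b * e - psi_e - 0.5 * r * b * b * sigma * sigma


# Hypothetical parameter values.
params = dict(a=0.1, b=0.5, e=0.5, psi_e=0.125, r=2.0, sigma=0.8)
print(certainty_equivalent_numeric(**params), certainty_equivalent_closed_form(**params))
```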

When the agent's disutility of effort is quadratic, i.e., $\psi(e) = \frac{e^2}{2}$, the sufficient and necessary condition for the optimal choice of effort is $e = b$. The fixed fee $a$ can be set so that the agent's certainty equivalent income is zero, and thus $a = -\frac{e^2}{2} + \frac{r\sigma^2 e^2}{2}$.

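The risk-neutral principal's expected payoff can be computed as

$$E\left(\tilde{q} - t(\tilde{q})\right) = (1 - b)e - a.$$

The principal's problem is thus written in a reduced form as follows:

$$\max_{(a,\, b,\, e)} \; (1 - b)e - a \quad \text{subject to} \quad e = b \quad \text{and} \quad a = -\frac{e^2}{2} + \frac{r\sigma^2 e^2}{2}.$$

Replacing $b$ and $a$ with their values as functions of $e$, the principal's problem can be reduced to

$$\max_{e} \; e - \frac{e^2}{2}\left(1 + r\sigma^2\right).$$

Optimizing, we easily find the second-best effort $e^{SB}$:

$$e^{SB} = \frac{1}{1 + r\sigma^2} < e^{FB} = 1,$$

where $e^{FB}$ is the first-best level of effort.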

It is interesting to note that, as the index of absolute risk aversion increases, the second-best effort is further distorted downward. Similarly, as the output becomes a less informative measure of the agent's effort, i.e., as $\sigma^2$ increases, this effort is also reduced. These insights were already highlighted by our basic model in chapter 4.
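These comparative statics can be reproduced with a short grid search. The sketch below assumes a quadratic disutility $\psi(e) = e^2/2$, so that the agent's best response to a slope $b$ is $e = b$, and a fixed fee that leaves the agent a zero certainty equivalent. Under those assumptions the principal's payoff reduces to $e - (1 + r\sigma^2)e^2/2$ and is maximized at $1/(1 + r\sigma^2)$. Function names and parameter values are hypothetical.

```python
def principal_payoff(e, r, sigma2):
    """Principal's expected payoff when effort e is induced with slope b = e."""
    b = e  # incentive constraint under psi(e) = e^2 / 2
    # Fixed fee that sets the agent's certainty equivalent to zero.
    a = -b * e + 0.5 * e * e + 0.5 * r * b * b * sigma2
    return (1.0 - b) * e - a


def second_best_effort(r, sigma2, grid=30000):
    """Effort maximizing the principal's payoff on a grid over [0, 1.5]."""
    candidates = [1.5 * i / grid for i in range(grid + 1)]
    return max(candidates, key=lambda e: principal_payoff(e, r, sigma2))


for r, sigma2 in ((0.0, 0.5), (2.0, 0.5), (3.0, 0.5), (2.0, 1.0)):
    print(r, sigma2, second_best_effort(r, sigma2), 1.0 / (1.0 + r * sigma2))
```

With $r = 0$ the grid search returns the first-best effort of one; raising either $r$ or $\sigma^2$ lowers the induced effort.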

The Holmström and Milgrom (1987) model can be extended to the case of a multidimensional Brownian process that corresponds to the case where the agent's output is multidimensional and to the case of a multidimensional effort as we show below. Schättler and Sung (1993) showed that a time-dependent technology calls for the optimal contract to be nonlinear. Sung (1995) showed that the optimal contract can still be linear when the agent controls the variance of the stochastic process. Hellwig and Schmidt (1997) proposed further links between the discrete and the continuous time model. Lastly, Bolton and Harris (1997) generalized the Brownian motion. Diamond (1998b) proposed a static model with limited liability, risk neutrality and three possible outcomes; linearity emerges because the agent has a rich set of choices in the distribution of these outcomes. More generally, it is also worth stressing that the linearity of contracts may be imposed when the principal must prevent arbitrage or resale among his agents.

Once the Holmström-Milgrom framework is accepted (i.e., the CARA utility function, the Brownian motion of output whose drift is affected by effort, and transfers depending only on the final aggregate output), the analysis is made very easy because contracts can be taken to be linear, and these linear contracts can be developed in many directions. Below we give an example of an interesting use of the model.

#### A Multitask Illustration

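The model above with a CARA utility function and a normal distribution of the performance is also useful when analyzing multitask agency models. To see that, let us assume that the agent exerts a vector of efforts $e = (e_1, e_2)'$ for the principal. The disutility of effort is written as $\psi(e)$, where $\psi(\cdot)$ is convex. To make things simpler we will assume that $\psi(\cdot)$ is a quadratic form, and we will write $\psi(e) = \frac{1}{2} e'\psi e$ for some positive definite and symmetric matrix

$$\psi = \begin{pmatrix} \psi_{11} & \psi_{12} \\ \psi_{12} & \psi_{22} \end{pmatrix}.$$

We denote as $\psi^{-1}$ the inverse matrix, which is also symmetric. Contrary to the model of section 5.2.3, the disutility of effort is counted in monetary terms. Each of these efforts affects the mean of a random variable $\tilde{q}_i$ in the following manner:

$$\tilde{q}_i = e_i + \tilde{\varepsilon}_i, \qquad i = 1, 2,$$

where $\tilde{\varepsilon}_i$ is a random variable. We will denote the vector of performances by $\tilde{q} = (\tilde{q}_1, \tilde{q}_2)'$. The vector of random variables $\tilde{\varepsilon} = (\tilde{\varepsilon}_1, \tilde{\varepsilon}_2)'$ is normally distributed with a zero mean and a covariance matrix given by

$$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}.$$

Let us thus write the optimal linear contract offered by the principal as $t(\tilde{q}) = a + b_1\tilde{q}_1 + b_2\tilde{q}_2$ or, using the inner product, $t(\tilde{q}) = a + b'\tilde{q}$, where $b = (b_1, b_2)'$.

With this type of linear contract, the agent gets a certainty equivalent income $w_e$ such that

$$-\exp(-r w_e) = E\left(-\exp\left(-r\left(a + b'\tilde{q} - \tfrac{1}{2} e'\psi e\right)\right)\right).$$

Using the moment generating function of the normal distribution, we find that

$$w_e = a + b'e - \frac{1}{2} e'\psi e - \frac{r}{2}\, b'\Sigma b. \tag{9.66}$$

Optimizing (9.66) with respect to $e$ yields the agent's response to any linear contract. We obtain the following incentive constraints:

$$b = \psi e, \quad \text{or equivalently} \quad e = \psi^{-1} b.$$

This yields the following expression of the certainty equivalent income of the agent:

$$w_e = a + \frac{1}{2}\, b'\psi^{-1} b - \frac{r}{2}\, b'\Sigma b.$$

Of course, this certainty equivalent income must be nonnegative in order to provide the agent with at least his outside opportunity.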

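The risk-neutral principal benefits from the sum of the performances on each particular task. The principal's expected payoff is written as

$$E\left(\tilde{q}_1 + \tilde{q}_2 - t(\tilde{q})\right).$$

Expressing this term, we find an expected payoff for the principal of $(\mathbf{1}' - b')e - a$, where $\mathbf{1}' = (1, 1)$ is the row vector having all of its components equal to one.

We can finally write the principal's problem as:

$$\max_{(a,\, b,\, e)} \; (\mathbf{1}' - b')e - a \quad \text{subject to} \quad e = \psi^{-1} b \quad \text{and} \quad a + \frac{1}{2}\, b'\psi^{-1} b - \frac{r}{2}\, b'\Sigma b \geq 0.$$

Of course, these two constraints are binding at the optimum, and we can rewrite the principal's problem in reduced form as

$$\max_{b} \; \mathbf{1}'\psi^{-1} b - \frac{1}{2}\, b'\psi^{-1} b - \frac{r}{2}\, b'\Sigma b.$$

This problem is concave in $b$, and the second-best vector of optimal marginal incentives is obtained as a solution to

$$\left(I + r\psi\Sigma\right) b^{SB} = \mathbf{1},$$

where $I$ is the identity matrix. Since $\psi$ and $\Sigma$ are both symmetric, $\psi\Sigma$ is the transpose of $\Sigma\psi$.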

The matrix $\Sigma\psi$, which is computed from the technology and the informativeness of the different signals, plays a significant role in determining the second-best marginal incentives and the optimal effort. This matrix measures the discrepancy between second-best and first-best incentives.

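As an example, let us assume that efforts are imperfect substitutes, with $\psi_{11} = \psi_{22} = 1 > \psi_{12}$, and that the off-diagonal terms of $\Sigma$ are all equal to zero. Then, the values of the marginal incentives are given by

$$b_1^{SB} = \frac{1 + r\sigma_2^2(1 - \psi_{12})}{1 + r\sigma_1^2 + r\sigma_2^2 + r^2\sigma_1^2\sigma_2^2(1 - \psi_{12}^2)}, \qquad b_2^{SB} = \frac{1 + r\sigma_1^2(1 - \psi_{12})}{1 + r\sigma_1^2 + r\sigma_2^2 + r^2\sigma_1^2\sigma_2^2(1 - \psi_{12}^2)}.$$

Assume for instance that $\sigma_2 = \infty$ to capture the fact that the performance $\tilde{q}_2$ may not be observable at all by the principal. Then we have

$$b_1^{SB} = \frac{1 - \psi_{12}}{1 + r\sigma_1^2(1 - \psi_{12}^2)}, \qquad b_2^{SB} = 0.$$

It becomes impossible to reward the hardly observable task $\tilde{q}_2$, and so all incentive pressure is put on the first task. This points to the complementarity between rewarding different substitute tasks in a second-best environment that we already stressed in the bare-bones model of section 5.2.5. On this, see also Holmström and Milgrom (1994).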
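The two-task example can be explored numerically. The sketch below assumes the quadratic cost $\frac{1}{2}e'\psi e$, normal noise with covariance $\Sigma$, and a binding participation constraint, so that the principal's reduced payoff is $\mathbf{1}'\psi^{-1}b - \frac{1}{2}b'\psi^{-1}b - \frac{r}{2}b'\Sigma b$ and its first-order condition is the linear system $(I + r\psi\Sigma)b = \mathbf{1}$. The helper names and the parameter values are hypothetical, and the linear algebra is written out for the $2 \times 2$ case only.

```python
def solve_2x2(m, v):
    """Solve m x = v for a 2x2 matrix m by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [(d * v[0] - b * v[1]) / det, (a * v[1] - c * v[0]) / det]


def matmul_2x2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def second_best_incentives(r, psi, sigma):
    """Marginal incentives solving (I + r * psi * Sigma) b = (1, 1)'."""
    ps = matmul_2x2(psi, sigma)
    system = [[1.0 + r * ps[0][0], r * ps[0][1]], [r * ps[1][0], 1.0 + r * ps[1][1]]]
    return solve_2x2(system, [1.0, 1.0])


def principal_payoff(b, r, psi, sigma):
    """Reduced payoff: expected output minus effort cost minus the agent's risk premium."""
    e = solve_2x2(psi, b)  # agent's best response e = psi^{-1} b
    effort_cost = 0.5 * (b[0] * e[0] + b[1] * e[1])  # (1/2) e' psi e = (1/2) b' e
    sb = [sigma[0][0] * b[0] + sigma[0][1] * b[1], sigma[1][0] * b[0] + sigma[1][1] * b[1]]
    risk_premium = 0.5 * r * (b[0] * sb[0] + b[1] * sb[1])
    return e[0] + e[1] - effort_cost - risk_premium


# Hypothetical parameters: substitutes (psi12 = 0.5) and independent noises.
R, PSI12 = 1.0, 0.5
PSI = [[1.0, PSI12], [PSI12, 1.0]]
for var2 in (2.0, 1e9):
    SIGMA = [[1.0, 0.0], [0.0, var2]]
    print(var2, second_best_incentives(R, PSI, SIGMA))
```

When the second performance measure becomes extremely noisy, the computed $b_2$ goes to zero and $b_1$ approaches $(1 - \psi_{12})/(1 + r\sigma_1^2(1 - \psi_{12}^2))$ in this parameterization.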

Source: Laffont Jean-Jacques, Martimort David (2002), *The Theory of Incentives: The Principal-Agent Model*, Princeton University Press.