# Analysis in ecological perspective: Models for Transition Rates

A major goal in our research is to explore the effects of population heterogeneity on transition rates. This means analyzing how measured characteristics of organizations, populations, and environments affect rates of initiation, merger, and dissolution.

### 1. The Exponential Model

The baseline for comparison is the model of a constant rate:

$$r(t) = \lambda,$$

which implies that the survivor function is

$$G(t) = \exp(-\lambda t).$$

Because the constant-rate model implies an exponential distribution of durations, it is commonly referred to as the exponential model. Alternatively, we can express the constant rate (or exponential) model as a “regression model” for the log of the waiting time (age or duration, depending on the context):

$$\log T = \alpha + W,$$

where α = −log λ and the disturbance, W, has an extreme value distribution. That is, W has the density

$$f(w) = \exp(w - e^{w}), \qquad -\infty < w < \infty$$

(see Johnson and Kotz 1970, chap. 2.1). Since failure times are not always observed because of censoring, procedures must be developed to estimate such regressions from censored data, as we discuss in the last section of this chapter.

In considering the effects of covariates, the most important question confronting analysts is substantive: how are transition rates and explanatory variables related? The theories presented in Part I do not provide guidance about the exact functional forms of organizational processes. Therefore, we choose simple, tractable functional forms that agree qualitatively with the substantive arguments. We use a variety of models that specify log-linear relationships between transition rates and explanatory variables:

$$r(t) = \exp(\beta' x), \tag{8.6}$$

assuming for the moment that the vector of covariates, x, does not vary over time.

In terms of the regression model for the log of duration (or age), this model implies that

$$\log T = -\beta' x + W, \tag{8.7}$$

where the disturbance term, W, again has an extreme value distribution. Note that the signs of the effects have been reversed (that is, multiplied by -1) in (8.7) because a variable that increases the hazard decreases the expected time to failure or exit.

We often estimate models that specify more complicated dependence of rates on covariates than in (8.6). All that is required for the class of models we are considering is that the model be log-linear in parameters. Restricting models to ones that are log-linear in parameters is not very confining. In particular, we rely heavily on quadratic relationships between density in a population and founding and disbanding rates as a way to analyze intrapopulation competition.

Since we use the log-linear specification repeatedly, it is worth discussing interpretation of its parameters. The situation is simplest for covariates that are dummy variables. Consider the effect of X1. Under the log-linear specification, the model can be written as

$$r = r^{*} \exp(\beta_1 X_1),$$

where r* is the rate given by all of the other covariates. That is, r* = exp(β0 + β2X2 + ... + βNXN). When the dummy variable X1 equals zero, the rate is just r*. When the dummy variable equals unity, the rate is r* multiplied by exp(β1). Thus the antilog of the coefficient of a dummy variable gives the ratio of the rate for those whose value on the dummy variable is unity to the rate for those whose value is zero. Put differently, it is the multiplier that must be applied to the rate for those with the value of zero on the covariate to obtain the rate for those with the value of unity. For example, in analyzing the disbanding rates of labor unions, we use a covariate that equals one in years of economic depression and zero otherwise. The estimated effect of this variable is approximately 0.30. Since exp(0.30) ≈ 1.35, this finding means that economic depressions increased the disbanding rate by a factor of 1.35. In other words, they increased the rate by 35 percent.
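The arithmetic behind this interpretation can be checked in a few lines. The sketch below is only an illustration: the function names are hypothetical, and the single input is the coefficient of 0.30 quoted in the text.

```python
import math


def rate_multiplier(beta: float) -> float:
    """Factor by which the rate is multiplied when a dummy covariate equals 1."""
    return math.exp(beta)


def percent_change(beta: float) -> float:
    """Percentage change in the rate implied by a dummy-variable coefficient."""
    return 100.0 * (math.exp(beta) - 1.0)


depression_effect = 0.30  # coefficient quoted in the text
print(round(rate_multiplier(depression_effect), 2))  # 1.35
print(round(percent_change(depression_effect)))      # 35
```

A negative coefficient works the same way: the multiplier falls below one and the percentage change is negative.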

When the covariate is metric, the coefficient βn gives the effect of a marginal change in Xn on the logarithm of the rate. When such effects have substantive importance, we describe them using plots and numerical calculations.

Because transition rates, like any other instantaneous quantity, cannot be observed directly, it may be hard to think substantively about such relationships. However, there is a simple relationship that often proves helpful in interpretation. When transition rates are independent of time, the expected duration in a state j equals the inverse of the hazard function:

$$E(T_j) = \frac{1}{r_j}.$$

For instance, we find that the hazard of mortality from all causes for national labor unions created by a merger of existing unions equals 0.027. With this hazard, the expected lifetime of a union has been 37.3 years.

Because distributions of durations tend to be positively skewed, calculations of expected durations are usually dominated by extreme values. Therefore it is often useful to summarize the implications of a transition rate in another way, using the concept of half-life. The half-life of a stochastic process is the duration (or age) by which half of the units at risk of leaving a state are expected to have left, that is, the value of t at which G(t) = .5. For example, if the hazard of leaving a state does not vary over time, using (8.1) and setting G(t) = .5 gives

$$\exp(-\lambda t_{.5}) = .5,$$

or

$$t_{.5} = -\frac{\log(.5)}{\lambda} \approx \frac{0.69}{\lambda},$$

since log(.5) ≈ −0.69. For example, the half-life of labor unions implied by the constant rate of 0.027 is 25.6 years. Notice that the half-life implied by this hazard is substantially smaller than the expected lifetime (25.6 versus 37.3 years). This difference reflects the fact that extreme values play a less powerful role in determining half-lives.
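Both summaries are easy to compute for a constant rate. The sketch below uses hypothetical function names and the rounded rate of 0.027 quoted in the text. It gives values slightly different from the 37.3 and 25.6 years reported above, presumably because the published figures were computed from an unrounded estimate; that explanation is a guess, not something the text states.

```python
import math


def expected_duration(rate: float) -> float:
    """Mean waiting time when the rate is constant: 1 / rate."""
    return 1.0 / rate


def half_life(rate: float) -> float:
    """Time by which half of the units have left: log(2) / rate."""
    return math.log(2.0) / rate


union_rate = 0.027  # rounded rate quoted in the text
print(round(expected_duration(union_rate), 1))  # 37.0
print(round(half_life(union_rate), 1))          # 25.7
```

Whatever the rate, the half-life is log(2) ≈ 0.69 times the expected duration, which is why it is always the smaller of the two under a constant rate.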

### 2. Changing Covariates

To this point we have treated the covariates as fixed over time. However, the environmental factors and competitive relations that shape population dynamics are rarely stable. The simplest way of allowing the levels of covariates to change over time is to generalize the model in (8.6) to include effects of preselected periods, denoted by p:

$$r_p(t) = \exp(\beta_p' x_p).$$

In the simplest case, we constrain the effect parameters to be constant over periods but update the values of the covariates at the beginning of each period:

$$r_p(t) = \exp(\beta' x_p).$$

We usually choose the periods to coincide with calendar years because the values of covariates are often supplied as yearly time series. That is, we assume that the covariates are step functions of time: they are constant within years and change values at the start of each year.
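The step-function assumption makes the cumulative hazard a simple sum of rate times exposure over periods. The sketch below assumes the log-linear form exp(β'x) within each period, with constant effects and yearly updated covariates. The coefficients, the covariate values, and the function names are all hypothetical.

```python
import math


def period_rates(beta, covariates_by_period):
    """Rate in each period, exp(beta'x_p), with x_p held fixed within the period."""
    return [math.exp(sum(b * x for b, x in zip(beta, x_p)))
            for x_p in covariates_by_period]


def survivor(t, rates, period_length=1.0):
    """Probability of no event by time t under a step-function rate."""
    if t < 0 or t > len(rates) * period_length:
        raise ValueError("t must lie within the observed periods")
    cumulative = 0.0
    for p, rate in enumerate(rates):
        start = p * period_length
        if t <= start:
            break
        # Exposure in period p is capped at the period length.
        cumulative += rate * (min(t, start + period_length) - start)
    return math.exp(-cumulative)


# Hypothetical example: a constant term plus a depression dummy that
# equals one only in the second year.
beta = [-3.0, 0.30]
x_by_year = [[1, 0], [1, 1], [1, 0]]
rates = period_rates(beta, x_by_year)
print([round(r, 4) for r in rates])
print(round(survivor(3.0, rates), 4))
```

Estimation software typically handles such data by splitting each history into one record per year, which is the same bookkeeping as the loop above.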

Time dependence in rates of organizational change is relevant to a number of core theoretical issues in organizational demography and ecology, as we pointed out in Part I. Models in which the hazard or rate depends on a set of covariates but not on time imply that the distribution of times of events is exponential. Because of the memory-less property of the exponential distribution, this model implies that the distribution of event times does not depend on the length of time since the previous event.

We now want to consider models in which the hazard does depend on the time since the previous event, that is, on duration. The Gompertz model (and its extension, the Makeham model) has formed the basis of most previous research on age dependence in organizational mortality (Carroll 1983; Freeman, Carroll, and Hannan 1983; Singh, House, and Tucker 1986). This model assumes that the rate is an exponential function of the waiting time:

$$r(t) = \lambda \exp(-\gamma t).$$

Of course, this model assumes that time dependence is monotonic. When γ is positive, as in the case of organizational mortality, the rate declines from λ to zero. Since it is unreasonable to assume that very old organizations escape the risk of mortality completely, organizational researchers have used a variation of the Gompertz model called the Makeham model, which introduces a nonzero asymptote:

$$r(t) = \delta + \lambda \exp(-\gamma t), \qquad \delta > 0.$$

Although this slight change makes analysis much more complicated (because the model is no longer a member of the exponential family and is also not a proportional-hazards model), the qualitative behavior of the model in the crucial early portion of the waiting time distribution is essentially the same as that of the Gompertz model.

Note that the logarithm of the rate of the Gompertz model is a linear function of the waiting time. We use this fact in evaluating the fit of the Gompertz model.
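The linearity of the log-rate can be verified numerically. The sketch below assumes the Gompertz rate is written as λ·exp(−γt), which matches the statement that the rate declines from λ to zero when γ is positive, and treats the Makeham variant as the same curve plus a constant floor. The parameter values and the name `floor` are hypothetical.

```python
import math


def gompertz_rate(t, lam, gamma):
    """Gompertz rate, assumed here to have the form lam * exp(-gamma * t)."""
    return lam * math.exp(-gamma * t)


def makeham_rate(t, lam, gamma, floor):
    """Makeham rate: the Gompertz rate plus a nonzero asymptote."""
    return floor + gompertz_rate(t, lam, gamma)


lam, gamma, floor = 0.10, 0.05, 0.01  # hypothetical values
ages = [0, 10, 20, 30]

# Equal steps in age produce equal steps in the log of the Gompertz rate.
log_rates = [math.log(gompertz_rate(a, lam, gamma)) for a in ages]
steps = [b - a for a, b in zip(log_rates, log_rates[1:])]
print([round(s, 3) for s in steps])  # [-0.5, -0.5, -0.5]

# The Makeham rate levels off at the floor instead of going to zero.
print(round(makeham_rate(500, lam, gamma, floor), 4))  # 0.01
```

The same steps computed for the Makeham rate are not constant, which is one way a plot of log-hazards against age can distinguish the two models at older ages.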

### 3. The Weibull Model

The second model of time dependence that we use, the Weibull model, holds that the rate is a power function of the waiting time:

$$r(t) = \lambda p (\lambda t)^{p-1}.$$

The rate is a monotone-decreasing function of duration (or age) if p < 1, a monotone-increasing function of duration for p > 1, and it reduces to the constant rate of the exponential model when p = 1. We use the last relationship to construct likelihood ratio tests of Weibull models against the null hypothesis of an exponential model.
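To make the likelihood ratio test concrete, the following is a minimal sketch for the simplest possible case: no covariates and no censored durations, neither of which holds in the analyses reported in this book. It assumes the integrated hazard (λt)^p, profiles out the scale, and finds the shape by bisection. The function names and the simulated data are hypothetical.

```python
import math
import random


def weibull_shape_mle(times, lo=0.01, hi=50.0, tol=1e-10):
    """Solve the profile score equation for the shape p by bisection."""
    top = max(times)
    s = [t / top for t in times]  # rescale so s ** p cannot overflow
    mean_log = sum(math.log(x) for x in s) / len(s)

    def score(p):
        w = [x ** p for x in s]
        weighted = sum(wi * math.log(x) for wi, x in zip(w, s)) / sum(w)
        return 1.0 / p + mean_log - weighted

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def weibull_loglik(times, p):
    """Log-likelihood at shape p with the scale set to its MLE given p."""
    n = len(times)
    scale_p = n / sum(t ** p for t in times)  # MLE of lambda ** p
    sum_log = sum(math.log(t) for t in times)
    return n * math.log(p) + n * math.log(scale_p) + (p - 1.0) * sum_log - n


def lr_test_weibull_vs_exponential(times):
    """Return (shape estimate, LR statistic, p-value on 1 degree of freedom)."""
    top = max(times)
    s = [t / top for t in times]  # the LR statistic is unchanged by rescaling
    p_hat = weibull_shape_mle(s)
    lr = max(2.0 * (weibull_loglik(s, p_hat) - weibull_loglik(s, 1.0)), 0.0)
    p_value = math.erfc(math.sqrt(lr / 2.0))  # chi-square tail, 1 df
    return p_hat, lr, p_value


# Simulated durations with a declining rate (true shape 0.6).
rng = random.Random(0)
durations = [rng.weibullvariate(10.0, 0.6) for _ in range(500)]
print(lr_test_weibull_vs_exponential(durations))
```

Setting p = 1 in `weibull_loglik` gives the exponential log-likelihood, so the statistic compares nested models that differ by one parameter.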

The fit of a Weibull model can also be evaluated by considering plots of empirical log-hazards against age or duration. Whereas the Gompertz model implies that the log-hazard is a linear function of duration (or age), the Weibull model implies that it is a linear function of the logarithm of duration (or age).

According to the Weibull model, the integrated hazard is a power function of the waiting time, H(t) = (λt)^p, so that

$$\log H(t) = p(\log t + \log \lambda). \tag{8.11}$$

So another way to evaluate the fit of the Weibull model is with plots of the logarithm of the estimated integrated hazard against the logarithm of the waiting time (age or duration, depending on the context), which should be approximately linear with slope p.
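The relationship in (8.11) can be illustrated directly. The sketch below assumes the integrated hazard (λt)^p that (8.11) implies, with the rate as its derivative. The parameter values and function names are hypothetical.

```python
import math


def weibull_rate(t, lam, p):
    """Weibull rate: the derivative of the integrated hazard (lam * t) ** p."""
    return lam * p * (lam * t) ** (p - 1.0)


def weibull_integrated_hazard(t, lam, p):
    """Integrated hazard implied by (8.11)."""
    return (lam * t) ** p


def loglog_slope(t1, t2, lam, p):
    """Slope of log H(t) against log t between two durations."""
    rise = (math.log(weibull_integrated_hazard(t2, lam, p))
            - math.log(weibull_integrated_hazard(t1, lam, p)))
    return rise / (math.log(t2) - math.log(t1))


lam, p = 0.05, 0.7  # hypothetical values
print(round(loglog_slope(2.0, 40.0, lam, p), 3))              # 0.7
print([round(weibull_rate(t, lam, p), 4) for t in (1, 10, 50)])
```

With p below one the printed rates decline with duration, and the slope of the log-log plot recovers p regardless of which two durations are compared.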

We generalize the Weibull model to include the effects of measured covariates, using the assumption:

$$r(t \mid x) = \lambda p (\lambda t)^{p-1} \exp(\beta' x).$$

So the baseline Weibull hazard is multiplied (“accelerated”) by the effects of the vector of covariates according to this generalization. In terms of a regression model for the logarithm of the waiting time, this generalization of the Weibull model implies that

$$\log T = \alpha - \beta^{*\prime} x + \sigma W,$$

where α = −log λ, σ = p⁻¹, and β* = σβ (see Kalbfleisch and Prentice 1980, pp. 31-32). Note again that the signs of the effects of covariates have been reversed from their effects on the rate.

### 4. A Gamma Model

Although we have tried to measure the most important determinants of the various rates studied, we surely have not measured all of them. Suppose that a transition rate such as an organizational disbanding rate varies within a population, but we assume mistakenly that the rate is a constant (conditional on a set of measured covariates). What difference does this error make? The substantively most important implication is that ignoring unobserved variation in transition rates produces spurious time dependence in estimated transition rates. So we explore whether apparent time dependence in rates might plausibly reflect the operation of unobservables rather than genuine time dependence in the rates at the level of the individual organization.
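A small numerical illustration shows how spurious time dependence arises. The sketch assumes a population made up of two unobserved subgroups, each with its own constant rate. The rates, the shares, and the function name are hypothetical, and a two-group mixture is used only because it is the simplest case; it is not the parametric approach adopted below.

```python
import math


def population_hazard(t, rates, shares):
    """Hazard among survivors at time t when each subgroup has a constant rate."""
    surviving = [s * math.exp(-r * t) for r, s in zip(rates, shares)]
    density = sum(r * g for r, g in zip(rates, surviving))
    return density / sum(surviving)


rates = [0.10, 0.02]   # hypothetical frail and robust subgroups
shares = [0.5, 0.5]    # equal shares at time zero

for t in (0, 10, 25, 50, 100):
    print(t, round(population_hazard(t, rates, shares), 4))
```

No individual rate changes with time, yet the population-level hazard falls from the average of the two rates toward the lower one, because the high-rate subgroup is depleted first.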

We use a parametric approach. That is, we specify a particular parametric distribution for the effects of the unobservables. We assume a gamma distribution for the disturbances. Because it is so flexible, the gamma distribution is used commonly in contexts like the ones we are considering. Of the various tractable nonnegative probability distributions, it is perhaps the most flexible. Depending on the values of its parameters, a gamma distribution can range from a highly skewed J-shape to a nearly symmetric unimodal shape.

So we introduce a gamma-distributed disturbance into the regression model for the logarithm of the waiting time. That is, we specify that

$$\log T = \alpha + W,$$

and that the density of the disturbance, W, equals

$$f(w) = \frac{\exp(kw - e^{w})}{\Gamma(k)}, \qquad -\infty < w < \infty. \tag{8.13}$$

The implied transition rate is

$$r(t) = \frac{\lambda (\lambda t)^{k-1} \exp(-\lambda t)}{\Gamma(k) - \Gamma_k(\lambda t)},$$

where α = −log λ and Γk(λt), the integral of u^(k−1) exp(−u) from 0 to λt, is an incomplete gamma integral. Notice that the presence of this form of unobserved heterogeneity produces duration dependence or age dependence in the transition rate. The rate is a monotonic-increasing function of duration or age if k > 1 and a decreasing function of duration or age if k < 1. This model has the exponential as a special case (k = 1), which corresponds to negligible unobserved heterogeneity. We use this fact to construct a likelihood ratio test of the gamma model against the exponential.
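The behavior of the gamma rate for different values of k can be checked numerically. The sketch below assumes the standard hazard of a gamma-distributed waiting time, that is, the gamma density divided by its survivor function, and evaluates the incomplete gamma integral with a textbook power series that is adequate only for moderate values of λt. The function names and parameter values are hypothetical.

```python
import math


def lower_incomplete_gamma(k, x, terms=200):
    """Integral of u**(k-1) * exp(-u) from 0 to x, by its power series."""
    term = 1.0 / k
    total = term
    for n in range(1, terms):
        term *= x / (k + n)
        total += term
    return x ** k * math.exp(-x) * total


def gamma_rate(t, lam, k):
    """Hazard of a gamma waiting time with scale parameter lam and shape k."""
    x = lam * t
    numerator = lam * x ** (k - 1.0) * math.exp(-x)
    return numerator / (math.gamma(k) - lower_incomplete_gamma(k, x))


lam = 0.2  # hypothetical value
for k in (0.5, 1.0, 2.0):
    print(k, [round(gamma_rate(t, lam, k), 4) for t in (1.0, 5.0, 10.0)])
```

For k = 1 the three printed rates are identical and equal to λ, which is the exponential special case used in the likelihood ratio test.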

### 5. A Generalized Gamma Model

It turns out to be useful in our substantive work to use a model proposed by Stacy (1962) that generalizes the gamma model (see also Kalbfleisch and Prentice 1980, p. 227). This model is most easily specified in terms of the regression model of the logarithm of the waiting time:

$$\log T = -\beta' x + \sigma W, \tag{8.15}$$

where σ = p⁻¹ and W has the density given by (8.13). This model has as special cases the exponential (σ = 1, k = 1), the Weibull (k = 1, σ ≠ 1), and the gamma (σ = 1, k ≠ 1). So we estimate this generalized model by maximum likelihood and conduct likelihood ratio tests to see whether the generalized model including the gamma-distributed disturbance improves the fit significantly compared to the Weibull model with covariates and a gamma model with covariates.

### 6. Log-logistic Model

There are suggestions from previous research that hazards of mortality for organizations may be non-monotonic functions of age. Carroll and Huo (1988) report that hazards of mortality for local unions of the Knights of Labor rose over the first two years before beginning the expected decline with aging. Langton (1984) reports a similar pattern for unions in the service sector. Singh, House, and Tucker (1986) have reported a similar pattern for mortality rates of voluntary social service agencies in Toronto. Given these findings, we have also explored the fit of two widely used parametric models of age dependence that imply non-monotonic relationships between hazards and duration or age: the log-logistic and the lognormal models. Because the two typically had equally good fits to our data and have similar interpretations, we concentrated on the log-logistic model, which has the nice property that it can imply either monotonic or non-monotonic duration dependence or age dependence, depending on the value of its scale parameter. The log-logistic model for the rate holds that

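$$r(t) = \frac{\lambda p (\lambda t)^{p-1}}{1 + (\lambda t)^{p}}.$$

The numerator of this expression is the same as the Weibull rate. The log-logistic rate is a monotone-decreasing function of duration or age if p ≤ 1. It is a non-monotonic function of duration or age, increasing and then decreasing with duration or aging, if p > 1. In terms of the regression model for the logarithm of the waiting time, the log-logistic model is

$$\log T = \alpha + \sigma W,$$

where α = −log λ, σ = p⁻¹, and W has the logistic density exp(w)/[1 + exp(w)]².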
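The two regimes of the log-logistic rate can be seen in a short calculation. The sketch assumes the usual form of the log-logistic hazard, a Weibull-type numerator divided by 1 + (λt)^p. Setting its derivative to zero places the peak at (λt)^p = p − 1 when p > 1. The parameter values and function names are hypothetical.

```python
def loglogistic_rate(t, lam, p):
    """Log-logistic rate: Weibull-type numerator over 1 + (lam * t) ** p."""
    x = lam * t
    return lam * p * x ** (p - 1.0) / (1.0 + x ** p)


def peak_time(lam, p):
    """Duration at which the rate peaks, or None when the rate only declines."""
    if p <= 1.0:
        return None
    return (p - 1.0) ** (1.0 / p) / lam


lam = 0.5  # hypothetical value
print(peak_time(lam, 2.0))   # 2.0
print([round(loglogistic_rate(t, lam, 2.0), 4) for t in (0.5, 2.0, 8.0)])
print([round(loglogistic_rate(t, lam, 0.8), 4) for t in (0.5, 2.0, 8.0)])
```

With p = 2 the rate rises to its maximum at the peak time and then falls, while with p = 0.8 the printed rates decline throughout.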

Source: Hannan Michael T., Freeman John (1993), Organizational Ecology, Harvard University Press; Reprint edition.