%Paper: ewp-em/9308001
%From: bparks@wuecona.wustl.edu (Bob Parks)
%Date: Thu, 5 Aug 1993 14:31:27 -0500 (CDT)

\documentstyle[12pt]{article}
\topmargin -.25in
\textheight 8.25in
\textwidth 6in
\oddsidemargin .25in
\evensidemargin .25in
\newcommand{\inv}{^{\mbox{}-1}}
\newcommand{\Cov}{\mbox{Cov}}
\newcommand{\minus}{\mbox{}-}
\newcommand{\figspepsf}{\mbox{ } \vspace{1.7in} \mbox{ }}
\newcommand{\figsppsbox}{\mbox{ } \vspace{2in} \mbox{ }}
\begin{document}
\reversemarginpar
\newlength{\single}
\setlength{\single}{1.0\baselineskip}
\newlength{\double}
\setlength{\double}{1.0\baselineskip}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%  the 8 figures were included with a figures command - they are         %
%  tar'ed, compressed and uuencoded (358k->1.8meg)                       %
%  For Unix users, just csh figtarz.uu.  For DOS users,                  %
%  supposing that you have called it figtarz.uu                          %
%    1. uudecode figtarz.uu to produce figtar.Z                          %
%    2. uncompress figtar.Z  to produce figtar                           %
%    3. tar -xf figtar   to produce fig1.ps -- fig8.ps                   %
%  4. just below choose your favorite macro package by (un)commenting    %
%     the appropriate line below                                         %
%  5. If you choose something other than epsf, then you will have to     %
%     edit the end of this document for the appropriate figure commands. %
%                     OR                                                 %
%        just comment out the \input below, uncomment the first          %
%        \end{document} and do without the figures                       %
%        (8 references will then be unresolved)                          %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


\input{epsf}
%\input{psbox}
%\input{psfig}

%\bibliographystyle{/local/tex/lib/texinput/bibtex/plain}
%\bibliographystyle{/local/tex/lib/texinput/local/emt}
%\bibliographystyle{emt}
\title{A Predictive Approach to Model Selection and
Multicollinearity\thanks{Thanks to Professor Siddhartha Chib for
comments.  An earlier version of this paper was presented at the
Midwest Econometrics Group meeting at Notre Dame in September 1991.}}
\author{Edward Greenberg\\
Department of Economics \\Washington
University\\One Brookings Drive \\St. Louis MO 63130
\and Robert P. Parks \\
Department of Economics \\Washington
University\\One Brookings Drive\\St. Louis MO 63130}
\maketitle
\begin{abstract}

    We argue for the adoption of a predictive approach to model
specification.  Specifically, we derive the difference between means and
the ratio of determinants of covariance matrices when a subset of
explanatory variables is included or excluded from a regression.  For
several special cases these measures are shown to be related to widely
used tools for studying model specification.  Results for a set of
simulated data and for two economic applications are presented as
examples.

\end{abstract}
\newpage
\baselineskip\double
\section{Introduction}

    This paper addresses the question of when it might be reasonable to
simplify a linear regression model by not including a set of variables
in an equation that contains another set of variables whose inclusion is
clearly warranted on the basis of theory or previous empirical evidence.
We are thus concerned with the problem of model selection, which we
approach from a predictive Bayesian viewpoint.  This problem, of course,
has been intensively and extensively studied from many other viewpoints.
Since nonexperimental data in general, and economics data in particular,
are often highly correlated, model specification is closely related to
the problem of multicollinearity.

    Our approach is to compare the predictive density for an equation
with and without the set of variables in question and to argue that the
set may be safely omitted if the omission has little or no effect on the
predictive density.  This approach can be utilized in more general
settings; for example, Marriott et al.  \cite{Marriott} take a very
similar approach in the context of a model with serially correlated
errors.  The predictive density seems to us the best instrument for
making this kind of choice because it is defined in terms of observable
values of the dependent variable, about which an investigator is likely
to have information.  In contrast, little or nothing is known about
values of regression coefficients in almost all econometric studies.
The pervasive use of the null hypothesis value of zero for a coefficient
is a sign of this ignorance.

    Gaver and Geisel \cite{GaverGeisel} and Amemiya \cite{Amemiya}
review a number of methods that have been proposed to deal with the
model selection problem, both in the sampling-theory and Bayesian
frameworks.  We show below that our approach is closely related to
several of these methods.  Zellner's \cite{Zellner} Bayesian approach to
model selection requires that a prior probability of the truth of each
model be specified.  Although our method does not require the
specification
of such probabilities and may therefore be somewhat lacking in
Bayesian rigor, our analysis and examples show that the method has
practical advantages and provides useful information. In addition, we
note below the relationship between the Bayes factor and our measures.

    Emphasis on the predictive distribution has been championed by
Geisser and co-authors; see, for example, Geisser \cite{Geis85} and
Johnson and Geisser \cite{JohnsGeis}.  Johnson and Geisser's
\cite{JohnsGeis} application of predictive methods to the problem of
detecting influential observations provides new insights into the
problem.  They find that some of the measures they examine have been
proposed in the sampling-theory approach.
    Geisser and Eddy \cite{GeisserEddy} propose a predictive sample
reuse technique that involves the computation of a predictive density,
where the density is evaluated as the product of the conditional
densities at the observed sample points.  Likelihoods evaluated at the
sample points are then compared for each model under consideration.
This approach does not require specifying probabilities of models.  Like
most of the other approaches to model selection, however, the method
attempts to rank the models without regard to the point at which the
prediction will take place.  As will be seen below, our method can be
used to examine prediction at particular points.  We do not, however,
attempt to reduce the problem of model selection to ranking by one
criterion, because we believe that there may be much to be learned from
examination of a richer information set.

Other statisticians have argued persuasively for a predictive approach.
For
example, Aitchison and Dunsmore \cite{AitchisonDunsmore} note on page
{\it x}:
\begin{quote} Now the purpose of such [parametric] inference statements
is surely to convey to some second party information about what is
likely to happen if the experiment is performed again, or perhaps
repeated a number of times.  It is surprising therefore that greater
thought has not been given to the more direct practical type of
inference, where statements are required for what is likely to occur
when future experiments are performed.  \end{quote}

    A quotation from Clayton, Geisser, and Jennings \cite{ClayGeisJenn}
is pertinent to the model selection problem:  \begin{quote} Econometric
and other statistical models are often simplifications of extremely
complicated phenomena, and it is a mistake to assume that any particular
model is actually a true representation of the underlying process.  What
is hoped is that the model may be an adequate description and perhaps
useful for some purpose.  Hence it is often puzzling why there has been
so much effort, especially in the softer social sciences, devoted to
``testing'' parameters of a model as if they were true entities and not,
as in most instances, convenient artifices.  A more substantial
enterprise than testing should be model selection, i.e.\ selecting one
of several alternative models such that the selected model (irrespective
of its truth) would serve best some purpose of the investigator
(descriptive or predictive).  \end{quote} In our approach, comparing
predictions of various models at the values of the independent variables
observed in the sample would help evaluate the descriptive power of a
model, while comparing predictions at a value of the independent
variables for which a prediction is to be made would evaluate its
predictive power.

    Finally, Jaynes \cite{JaynesWhat}, in discussing the approach of
physicists
to the problem of hypothesis testing, puts the matter clearly (in what
follows $\lambda$ is a parameter value that is being compared to the
presently accepted value of $\lambda_0$): \begin{quote} When we retain
the null hypothesis, our reason is not that it has emerged from the test
with a high posterior probability, or even that it has accounted well
for the data.  $H_0$ is retained for the totally different reason that
if the most sensitive available test fails to detect its existence, the
new effect $(\lambda-\lambda_0)$ can have no observable consequences.
That is, we are still free to adopt the alternative $H_1$ if we wish to;
but then we shall be obliged to use a value of $\lambda$ so close to the
previous $\lambda_0$ that all our resulting predictive distributions
will be indistinguishable from those based on $H_0.$ \end{quote} For
example, there can be no practical value in adding a set of variables
that leaves the predictive variance unchanged and changes the predictive
mean only in the third decimal place of a variable that is observed to
one decimal place, even if the regression coefficient is highly
significant.  Examination of the effects on the predictive density is
therefore a way of avoiding the problem, which arises in hypothesis
testing, of including a variable whose coefficient is highly
significant, perhaps because of a large sample size, but which has only
a negligible effect on the dependent variable.


    We conclude that, for the Bayesian, the natural basis on which to
judge whether a model is satisfactory is the predictive density.  Since
this density is defined over observations, the researcher can judge
whether the differences between one model and another are minor or
major.  Although it is usually impossible in a parametric model to
observe parameters directly, the data are observed.  A Bayesian's
statements about the world are conditioned on the data, and the predictive
density is such a statement.  Hence, in considering model selection, it
is natural for the Bayesian to turn to the predictive density to decide
whether to include a subset of variables.  A difference between
predictive densities indicates whether two models imply any real
differences about the world, although neither may in fact be true.

    Our main results are derived in Section 2; Section 3 presents
examples; and Section 4 contains our conclusions.

\section{Predictive Densities} \label{PredictiveDensities}

    We divide the explanatory variables in a regression into two sets,
$X_1$ and $X_2.$ The full model is specified as \begin{equation}
\label{full} y=X_1 \beta_1 + X_2 \beta_2 + e, \end{equation} and the
predictive distribution for $y_0$ given $X_0=(X_{01}, X_{02})$ is
compared with the predictive distribution for $y_0$ given only $X_{01}$
from the partial model
 \begin{equation} \label{partial} y=X_1 \beta_1 +e_1.
\end{equation}
 Dimensions for vectors and matrices are as follows:
$y,e,$ and $e_1$ are $T\times 1,$ $X_1$ is $T \times k_1,$ $X_2$ is
$T\times k_2,$ $\beta_1$ is $k_1 \times 1,$ $\beta_2$ is $k_2 \times 1,$
$y_0$ is $n \times 1,$ $X_0$ is $n \times (k_1+k_2),$ $X_{01}$ is $n
\times k_1,$ and $X_{02}$ is $n \times k_2.$ We assume that $e\sim {\cal
N}(0, \sigma^2 I)$ and $e_1\sim {\cal N}(0, \sigma^2_1 I).$

    Throughout the paper we assume that all variables have zero means.
In most cases this is achieved by defining variables as deviations from
their sample means.  When the data are a pooled cross section-time
series, the variables are defined as deviations from cross-section and
time means.  Degrees of freedom are adjusted accordingly.
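
As a purely illustrative sketch of this convention (Python with NumPy; the
function names are ours and are not the code used for the examples below),
sample means, or firm and time means for pooled data, may be removed as
follows; the two-way form is exact only for a balanced panel.
{\footnotesize
\begin{verbatim}
# Illustrative sketch (Python/NumPy); not the code used for the examples.
import numpy as np

def demean(x):
    """Deviations from the sample means, column by column."""
    x = np.asarray(x, dtype=float)
    return x - x.mean(axis=0)

def within_transform(x, firm, time):
    """Deviations from firm and time means for pooled data
    (exact as a two-way within transformation only for a balanced panel)."""
    x = np.asarray(x, dtype=float)
    out = x.copy()
    for f in np.unique(firm):
        out[firm == f] -= x[firm == f].mean(axis=0)
    for t in np.unique(time):
        out[time == t] -= x[time == t].mean(axis=0)
    return out + x.mean(axis=0)
\end{verbatim}
}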

    Zellner \cite[pages 72--74]{Zellner} shows that the
predictive
density conditional on $X_0$ for the model
\begin{equation} \label{zellnermodel} y=X\beta + u,\end{equation}
where $X$
is $T\times k$, $\beta$ is $k\times 1,$ $u$ is $T \times 1,$
$u\sim{\cal
N}(0,\sigma^2I)$ and the (noninformative) prior for
$(\beta,\sigma)$ is
proportional to $1/\sigma,$ is a multivariate Student $t$
form with
\begin{equation} \label{predict} E(y_0|X_0) = X_0
\hat{\beta}\end{equation} when
$\nu=T-k-1
>1$ and covariance matrix \begin{equation} \label{cov}
\Cov(y_0|X_0)=
\frac{\nu}{\nu-2}s^2[I + X_0(X'X)\inv X_0']\end{equation}
when $\nu>2.$
In these equations,  $\nu s^2$ is the sum of squared
residuals, and
$\hat{\beta}$ is the least squares estimator of $\beta$ in
(\ref{zellnermodel}).
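
The moments (\ref{predict}) and (\ref{cov}) are easily evaluated; the
following sketch (Python with NumPy; the function and variable names are
ours and purely illustrative) computes them under the diffuse prior.
{\footnotesize
\begin{verbatim}
# Illustrative sketch (Python/NumPy) of the predictive mean and covariance
# under the diffuse prior; function and variable names are ours.
import numpy as np

def predictive_moments(y, X, X0):
    """Mean and covariance of the Student-t predictive density."""
    T, k = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    nu = T - k - 1                       # variables are in deviation form
    s2 = np.sum((y - X @ beta_hat) ** 2) / nu
    mean = X0 @ beta_hat
    A = np.eye(X0.shape[0]) + X0 @ np.linalg.inv(X.T @ X) @ X0.T
    cov = nu / (nu - 2) * s2 * A
    return mean, cov
\end{verbatim}
}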

    In view of our introductory remarks, it is possible
that previous empirical work has established prior information regarding
$\beta_1.$ In that case, the above results are easily extended to the
case where $\beta_1 \sim {\cal N}(\bar{\beta}_1, A)$, and $\beta_2$ may
have a diffuse or an informative prior.

    Applying (\ref{predict}) to (\ref{full}) and (\ref{partial}), it
follows that \begin{equation} \label{difference}
E(y_0|X_0)-E(y_0|X_{01})=(X_{02}- X_{01}\hat{\Gamma}_1)\hat{\beta}_2 =
\Delta'\hat{\beta}_2, \end{equation} where $\hat{\Gamma}_1 =
(X'_1X_1)\inv X'_1X_2 $ and $\hat{\beta}=
\left(\begin{array}{c}\hat{\beta}_1\\ \hat{\beta}_2\\
\end{array}\right)=(X'X)\inv X'y.  $ Note the role played by
$\Delta'=X_{02}- X_{01}\hat{\Gamma}_1.$ It equals zero when $X_{02} =
X_{01}\hat{\Gamma}_1,$ which implies that the point at which the
prediction is to be made lies on the regression of $X_2$ on $X_1$ in the
original sample.  Thus, a value of $\Delta$  far from zero indicates
that the prediction is to be made at a value of $X$ that is quite
different from those in the original sample.
%Equation
%(\ref{difference}) indicates that the predictive mean
%is unchanged if $X_{01}=0$ and $X_{02}=0,$ i.e.\ the prediction is made
%at the sample means.  More generally, it shows that the mean of the
%predictive density is unchanged if the prediction takes place at a point
%where $\Delta=0,$ i.e.\ the rows of $X_0$ lie on the regression plane of
%$X_2$ on $X_1$ in the original sample.
This relationship is the
Bayesian counterpart of the principle that multicollinearity does not
affect prediction if it persists in the period for which predictions are
to be made.
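
A corresponding sketch of (\ref{difference}) follows; again the code is
illustrative only and the names are ours.
{\footnotesize
\begin{verbatim}
# Illustrative sketch (Python/NumPy) of the difference in predictive means;
# names are ours.
import numpy as np

def mean_difference(y, X1, X2, X01, X02):
    """(X02 - X01 Gamma1_hat) beta2_hat, one value per prediction point."""
    X = np.hstack([X1, X2])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    beta2_hat = beta_hat[X1.shape[1]:]
    Gamma1_hat, *_ = np.linalg.lstsq(X1, X2, rcond=None)
    Delta_t = X02 - X01 @ Gamma1_hat     # Delta' in the text's notation
    return Delta_t @ beta2_hat
\end{verbatim}
}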

    From the sampling-theory viewpoint, $\Delta' \beta_2$ is the bias in
the least squares prediction of $y$ when the true model is (\ref{full}),
but the statistician estimates on the basis of (\ref{partial});
$\Delta'\hat{\beta}_2$ can therefore be viewed as an estimate of the
bias.  When $n=1$, (\ref{cov}) is proportional to the
sampling-theory estimator for the variance of the prediction.  A
tradeoff of bias and variance plays a role in several sampling-theory
approaches to model selection.  The above comments can be extended to
cover the case of general linear restrictions of the form $R\beta=
{r},$ of which $\beta_2 = { 0}$ is a special case.

    The effect of including $X_2$ on the predictive covariance matrix is
indicated by the generalized variance ratio (GVR), defined as the ratio
of the determinants of the two covariance matrices:
\begin{eqnarray}
\label{covr} \mbox{GVR}=\frac{|\Cov(y_0|X_{01})|}{|\Cov(y_0|X_0)|}
&=&\left( \frac{\nu_1-k_2-2}{\nu_1-2} \right)^n \left( 1 + \frac{\Delta
R^2}{1-R^2} \right)^n \nonumber \\ &\times & \frac{|I +
X_{01}(X'_1X_1)\inv X'_{01}|}{|I +X_0(X'X)\inv X_0'|},
\end{eqnarray}
where $R^2$ is determined from the regression of $y$ on $X_1$ and
$X_2$,
$\Delta R^2$ is the increase in $R^2$ from adding $X_2$ to the
regression of $y$ on $X_1,$ and $\nu_1= T-k_1 - 1.$ When $n=1$ the GVR
reduces to the ratio of the two predictive variances.
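
Since the GVR is defined directly as a ratio of determinants, it may be
computed without reference to the special forms derived below; the
following sketch (Python with NumPy, names ours) does exactly that.
{\footnotesize
\begin{verbatim}
# Illustrative sketch (Python/NumPy): GVR as the ratio of determinants of
# the two predictive covariance matrices.  Names are ours.
import numpy as np

def pred_cov(y, X, X0):
    T, k = X.shape
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    nu = T - k - 1
    s2 = np.sum((y - X @ b) ** 2) / nu
    A = np.eye(X0.shape[0]) + X0 @ np.linalg.inv(X.T @ X) @ X0.T
    return nu / (nu - 2) * s2 * A

def gvr(y, X1, X2, X01, X02):
    X, X0 = np.hstack([X1, X2]), np.hstack([X01, X02])
    num = np.linalg.det(pred_cov(y, X1, X01))   # partial model
    den = np.linalg.det(pred_cov(y, X, X0))     # full model
    return num / den
\end{verbatim}
}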

    It may be seen that (\ref{covr}) is the product of three terms, the
first two of which are independent of the point at which the prediction
is to be made.  The first depends on the degrees of freedom for the two
equations and is always less than one.  The second, which is greater
than or equal to one, is a function of the improvement in the fit of the
regression when $X_2$ is included.  It can be written in terms of the
two $R^2$,
 \[ \left(\frac{1-R^2_1}{1-R^2}\right)^n, \]
where $R^2_1$ is based on $X_1$ only, or
in terms of the $F$ statistic for testing the
hypothesis that $\beta_2 =0,$
\[ \left(1+F\frac{k_2}{T-k_1-k_2-1}\right)^n.\]
 The third term depends on the point
at which the prediction is to be made. It is less than or
equal to one,
as may be seen from rewriting it as
\begin{equation}
\label{thirdterm}
 \frac{|I+ X_{01}(X'_1X_1)\inv X'_{01}|}
{|I+ X_{01}(X'_1X_1)\inv X'_{01}+
\Delta'[X_2'(I-H_1)X_2]\inv\Delta|},\end{equation}
where $H_1 = X_1(X_1'X_1)\inv X_1'.$
It is instructive to consider several special cases.
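
As a numerical check on this decomposition, the three factors of
(\ref{covr}) may be computed separately and their product compared with the
determinant ratio above; the sketch below (again illustrative, with names
of our own choosing) computes the three factors.
{\footnotesize
\begin{verbatim}
# Illustrative numerical check (Python/NumPy) of the three factors of the
# GVR; their product equals the determinant ratio.  Names are ours.
import numpy as np

def gvr_factors(y, X1, X2, X01, X02):
    T, k1 = X1.shape
    k2, n = X2.shape[1], X01.shape[0]
    X, X0 = np.hstack([X1, X2]), np.hstack([X01, X02])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    b1, *_ = np.linalg.lstsq(X1, y, rcond=None)
    sse_full = np.sum((y - X @ b) ** 2)
    sse_part = np.sum((y - X1 @ b1) ** 2)
    nu1 = T - k1 - 1
    f1 = ((nu1 - k2 - 2) / (nu1 - 2)) ** n       # degrees of freedom
    f2 = (sse_part / sse_full) ** n              # fit improvement
    A1 = np.eye(n) + X01 @ np.linalg.inv(X1.T @ X1) @ X01.T
    A = np.eye(n) + X0 @ np.linalg.inv(X.T @ X) @ X0.T
    f3 = np.linalg.det(A1) / np.linalg.det(A)    # point of prediction
    return f1, f2, f3
\end{verbatim}
}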

Suppose first that $\Delta=0$ and $n=1.$
Then the third term equals one, and GVR can be written in
terms of
 degrees of freedom and adjusted
$R^2$s; it is therefore related to several of the variable
selection
rules described in Judge, et al.\ \cite[section
21.2]{Judgeetal}
or in Amemiya \cite{Amemiya}. For
example, $C_p$ is a linear transformation of GVR:
\[
C_p = \frac{(T-k_1-2)(T-k_1-k_2-1)}{T-k_1-
k_2}\mbox{GVR}+2k_1-T.\]

    Suppose next that we are considering prediction at the observed
design matrix, i.e.\ $X_0 =X.$ Then by applying Theorem A.8.2 in
Judge, et al.\ \cite[page 950]{Judgeetal} it is easy to see that the
third term in (\ref{covr}) equals $2^{\mbox{}-k_2}.$ Again (\ref{covr})
depends only on degrees of freedom and the $R^2$s, and the GVR is thus
related to the selection rules mentioned above.

    The GVR can also be related to the posterior odds criterion and the
Bayes factor; see Zellner \cite{zellnerJBPO}, \cite{zellnerPORF},
Zellner and Siow \cite{zellnersiowPORF}, or Smith and Spiegelhalter
\cite{s&s}. The Bayes factor may be regarded as the ratio of the predictive
densities for the two models evaluated at the observed sample points. For the problem
we are considering, with the additional
assumption that both models have the same prior probability, the
posterior odds ratio for model (\ref{full}) relative to model
(\ref{partial}) is
\[ \left(\pi^{k_2/2} (\nu_1-k_2)^{k_2/2}
 \frac{\Gamma(\nu_2/2)}{\Gamma(\nu_1/2)} \right)
\left(1 + \frac{\Delta R^2}{1-R^2}\right)^{\nu_1/2}
\left( \frac{|X'_1X_1|}{|X'X|}\right)^{1/2}
 s^{k_2}. \]
The first of these terms involves degrees of freedom, the second the
relative $R^2$ (as in the GVR), the third the design matrices, and the
fourth term, which has no counterpart in the GVR, is a function of the
residual variance from the full model.

    Another special case is to set $X_{01}=0,$ $k_2 =1,$ and $n=1.$ Then
\[ \frac{\mbox{Var}(y_0|0)}{\mbox{Var}(y_0|0,X_{02})}=
\left(\frac{T-(k_1+4)}{T-(k_1+3)}\right) \left( 1 + \frac{\Delta R^2}
{1-R^2}\right) \left(1+\frac{X^2_{02}/X_2'X_2}
{1-R^2_{2\cdot1}}\right)\inv,\] where $R^2_{2\cdot1}$ is the $R^2$ in
the regression of $X_2$ on $X_1.$ The term $1/(1-R^2_{2\cdot1})$ is
called the ``variance inflation factor'' in the multicollinearity
literature.  If $X^2_{02} = X_2'X_2/\nu$ and $\nu_1/\nu\approx 1,$ then
GVR $\approx$ 1 when the $F$ statistic for testing the hypothesis that
$\beta_2 =0$ is approximately equal to the variance inflation factor.

    As a final special case, assume that $X_0 = x'_i,$ where $x'_i$ is
the $i$th row of $X$; i.e.\ the prediction is to take place at the
observed value of the $i$th observation.  Then the third term in
(\ref{covr}) can be written in terms of the $i$th diagonal element of
the ``hat'' matrices, $ X(X'X)\inv X'$ and $X_1(X_1'X_1)\inv X_1',$ as
\begin{equation} \label{hats} \frac{1+x'_{1i}(X'_1X_1)\inv x_{1i}}
{1+x_i' (X'X)\inv x_i}= \frac{1+h_{1i}}{1+h_i}.\end{equation} In
Belsley, Kuh, and Welsch \cite{BKW} the $h_i$ are extensively used to
find observations that have a large effect on parameter estimates
(influential observations).  See also Johnson and Geisser
\cite{JohnsGeis}.  Thus, in terms of this diagnostic tool, (\ref{covr})
can be written as the product of a degrees of freedom term, a term in
the relative goodness of fit of the two models, and a term in the
relative degree of influence of $x_i$ in the two models.  Let us
consider this third term in a bit more detail.  Since it is less than or
equal to 1, we have $h_i \geq h_{1i}.$ When $n=1$, we have from
(\ref{thirdterm}) and (\ref{hats}), \begin{eqnarray*}
h_i-h_{1i}&=&\Delta'[X'_2(I-H_1)X_2]\inv \Delta\\ &=&
\frac{(x'_{2i}-x'_{1i}\hat{\Gamma}_1) (x'_{2i}-x'_{1i}\hat{\Gamma}_1)'}
{\sum_t (x'_{2t}-x'_{1t}\hat{\Gamma}_1)(x'_{2t}-x'_{1t}\hat{\Gamma}_1)'}.
\end{eqnarray*}
 Thus, predicting at a point where $\Delta$ is large tends to
reduce the variance of the partial model relative to that of the full
model.
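
When prediction is at a sample point, the quantities in (\ref{hats}) are
ordinary leverage values; a brief illustrative sketch of their computation
(names ours) follows.
{\footnotesize
\begin{verbatim}
# Illustrative sketch (Python/NumPy): leverage values h_i and h_{1i} and
# the resulting third term at the sample points.  Names are ours.
import numpy as np

def hat_diagonals(X):
    """Diagonal of X (X'X)^{-1} X'."""
    return np.einsum('ij,ij->i', X, X @ np.linalg.inv(X.T @ X))

def third_term_at_sample_points(X1, X2):
    h1 = hat_diagonals(X1)
    h = hat_diagonals(np.hstack([X1, X2]))
    return (1 + h1) / (1 + h)        # one value per sample observation
\end{verbatim}
}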

    Two other matters are addressed before we turn to empirical
examples:  an information measure and the relation between $\Delta$
and GVR.  Johnson and Geisser \cite{JohnsGeis}, who are concerned with a
predictive view of detecting influential observations, propose the use
of the Kullback-Leibler (K-L) divergence to measure discrepancies
between densities; see Kullback and Leibler \cite{KL}.  In the case of
two normal distributions, the K-L divergence for comparing $f_1$ and
$f_2$ is given by
\begin{equation} \label{info}
I(f_2,f_1)= \frac{1}{2}[(\mu_2-\mu_1)'\Sigma\inv_1(\mu_2-
\mu_1) +
\mbox{tr} \Sigma_2 \Sigma\inv_1 -\ln |\Sigma_2 \Sigma\inv_1|
- M],
\end{equation}
where the $\mu_i$ are the means and the $\Sigma_i$ the covariance
matrices of the $M$ dimensional normal densities being compared.  The
information measure has the virtue of combining differences between
means and covariance matrices into one measure.
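
Equation (\ref{info}) is straightforward to evaluate; the following
illustrative sketch (names ours) computes $I(f_2,f_1)$ for two multivariate
normal densities.
{\footnotesize
\begin{verbatim}
# Illustrative sketch (Python/NumPy) of the divergence between two
# M-dimensional normal densities, with f1 as the reference.  Names are ours.
import numpy as np

def kl_normal(mu1, Sigma1, mu2, Sigma2):
    M = len(mu1)
    S1_inv = np.linalg.inv(Sigma1)
    d = mu2 - mu1
    quad = d @ S1_inv @ d
    trace = np.trace(Sigma2 @ S1_inv)
    logdet = np.log(np.linalg.det(Sigma2 @ S1_inv))
    return 0.5 * (quad + trace - logdet - M)
\end{verbatim}
}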

To explain the appearance of the graphs presented below,
we conclude this section with a few remarks on the
relation
between $\Delta'\hat{\beta}_2/\sigma_{\hat{y}_{01}}$ and GVR
when
$n=1$, where
\[
\sigma_{\hat{y}_{01}}=\sqrt{\frac{\nu_1}{\nu_1-2}} s_1
\sqrt{1 + X_{01}
(X_1'X_1)\inv X_{01}'}\] is the standard error of prediction
for
model (\ref{partial}) at $X_{01}.$ Consider first the case
$k_2=1.$ The product of the first two terms in (\ref{covr}) is
denoted by $K,$ which permits us to write, using
(\ref{thirdterm}),
  \begin{equation} \label{gvrcone}
\mbox{GVR}=\frac{K}{1+a
(\Delta'\hat{\beta}_2/\sigma_{\hat{y}_{01}})^2},
\end{equation}
where \[ a =\frac{\nu_1 s^2_1}{(\nu_1-2) \hat{\beta}_2^2
X'_2(I-H_1)X_2}.\]  Thus the points in a plot of GVR on
$(\Delta'
\hat{\beta}_2/\sigma_{\hat{y}_{01}})^2,$ where the variation
is in $\Delta/\sigma_{\hat{y}_{01}},$ lie on equation (\ref{gvrcone}).

For general $k_2$ and $n=1$,
we establish an inequality to show
that a similar equation acts as an upper bound to GVR for a
given
value of the standardized difference in means.  Let
\[\mbox{GVR}=\frac{K}{1+b\Delta'M_1\Delta/\sigma_{\hat{y}_{01}}^2},\]
where $b=\nu_1s_1^2/(\nu_1-2)$ and $M_1=[X_2'(I-
H_1)X_2]\inv.$  The
square of the standardized difference in means is
$\Delta'\hat{\beta}_2
\hat{\beta}_2'\Delta/\sigma_{\hat{y}_{01}}^2.$
Since the ranks of $M_1$ and $\hat{\beta}_2\hat{\beta}_2'$ are
$k_2$ and 1, respectively, and both are symmetric, there exist
matrices $R$ and $Q$, $Q'Q=I,$ such that
%\[R'M_1R=I=Q'R'M_1RQ\] and
%\[Q'R'\hat{\beta}_2\hat{\beta}_2'RQ= \left(
%\begin{array}{cc} \omega & 0
%\\ 0 & 0 \end{array} \right), \] where $\omega$ is a scalar.
$R'M_1R=I=Q'R'M_1RQ$ and
$Q'R'\hat{\beta}_2\hat{\beta}_2'RQ= \left(
\begin{array}{cc} \omega & 0
\\ 0 & 0 \end{array} \right), $ where $\omega$ is a scalar.
Then, defining $u$ by $\Delta=RQu,$ we have
%\[ \Delta'\hat{\beta}_2\hat{\beta}_2'
%\Delta = \omega u_1^2 \] and \[\Delta'M_1\Delta =
%\sum_1^{k_2} u_i^2.\]
$ \Delta'\hat{\beta}_2\hat{\beta}_2'
\Delta = \omega u_1^2 $ and $ \Delta'M_1\Delta =
\sum_1^{k_2} u_i^2.$
Therefore, \begin{eqnarray} \label{gvrctwo}
\mbox{GVR}&=&\frac{K}{1+b\Delta'M_1\Delta
/\sigma_{\hat{y}_{01}}^2
}\\&=&\frac{K}{1+b\sum u^2_i
/\sigma_{\hat{y}_{01}}^2
}\\ &\leq &\frac{K}{1+bu^2_1/\sigma_{\hat{y}_{01}}^2}\\
&=&\frac{K}{1+\frac{b}{\omega}\Delta'\hat{\beta}_2\hat{\beta}_2'
\Delta/\sigma_{\hat{y}_{01}}^2}.  \end{eqnarray}

\section{Examples}

    In the examples that follow we present several graphical and
numerical devices to compare predictive means and GVRs of various
models.  These include summary statistics, scatter plots of GVR and
$\Delta'\hat{\beta}_2/\sigma_{\hat{y}_{01}},$ K-L statistics, and
overlap statistics.  Before defining the latter, we note that Marriott et
al.\ \cite{Marriott} employ a somewhat different graphical device.
They present scatter plots of (in our notation)
$|y_i-E(y_i|y,X)|$
on $[\mbox{Var}(y_i|y,X)]^{1/2}.$  The expectation and variance for
each observation are
computed for several specifications of ARMA$(p,q)$ errors in a linear
regression model.  Models for which the points are close to zero
in both dimensions are considered preferable.
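
The scatter plots of GVR against
$\Delta'\hat{\beta}_2/\sigma_{\hat{y}_{01}}$ used below can be produced
along the following lines; the sketch (Python with NumPy, names ours)
computes one point per sample observation for the case $n=1$, and the
resulting pairs may then be plotted with any graphics package.
{\footnotesize
\begin{verbatim}
# Illustrative sketch (Python/NumPy) of the points plotted in the figures:
# for each sample observation (n = 1), the standardized difference in
# predictive means and the ratio of predictive variances.  Names are ours.
import numpy as np

def scatter_points(y, X1, X2):
    T, k1 = X1.shape
    X = np.hstack([X1, X2])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    b1, *_ = np.linalg.lstsq(X1, y, rcond=None)
    beta2 = b[k1:]
    Gamma1, *_ = np.linalg.lstsq(X1, X2, rcond=None)
    nu1, nu = T - k1 - 1, T - X.shape[1] - 1
    s1_sq = np.sum((y - X1 @ b1) ** 2) / nu1
    s_sq = np.sum((y - X @ b) ** 2) / nu
    pts = []
    for i in range(T):
        delta = X2[i] - Gamma1.T @ X1[i]
        h1 = X1[i] @ np.linalg.solve(X1.T @ X1, X1[i])
        h = X[i] @ np.linalg.solve(X.T @ X, X[i])
        var1 = nu1 / (nu1 - 2) * s1_sq * (1 + h1)   # partial model
        var = nu / (nu - 2) * s_sq * (1 + h)        # full model
        pts.append(((delta @ beta2) / np.sqrt(var1), var1 / var))
    return np.array(pts)     # columns: standardized difference, GVR
\end{verbatim}
}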

Our overlap statistic measures the extent to which 95\%
confidence
intervals computed at the sample $X$ values for a model
containing all
explanatory variables overlap confidence intervals from a
model
containing a subset.  Let a
confidence interval for the full model be $(a,b)$ and that
of the subset model
be $(c,d)$.  Then the overlap of the intervals is
$r/s,$ where $s=\max(b,d)-\min(a,c)$ and $r$ is
defined as:
\begin{equation}
r=\cases{d-c & if $c>a$ and $b>d,$\cr
        b-a & if $a>c$ and $d>b,$\cr
        \min(d-a,b-c) & otherwise.\cr}
%r = \left\{
%\begin{array}{llllll}
%d - c & \mbox{if} & c>a & \mbox{and}& b>d, \\
%b - a & \mbox{if} & a>c & \mbox{and} & d>b, \\
%\min(d-a,b-c) & \mbox{otherwise.}
% \end{array} \right. \end{equation}
\end{equation}
If $r/s$ is large, there is little
difference in the position and length of the confidence
intervals
generated by the models, and if it is small
there is a large difference. It is easy to see that $0\leq
r/s \leq 1.$
Note that the Bayesian and
sampling-theory confidence intervals coincide since we have
assumed diffuse
priors for the parameters.
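
The overlap measure is computed directly from the definition above; the
following illustrative sketch (names ours) implements it for a single pair
of intervals.
{\footnotesize
\begin{verbatim}
# Illustrative sketch (Python) of the overlap measure r/s for one pair of
# 95 per cent intervals; names are ours.
def overlap(a, b, c, d):
    """(a, b) is the full-model interval, (c, d) the subset-model one."""
    s = max(b, d) - min(a, c)
    if c > a and b > d:              # (c, d) lies inside (a, b)
        r = d - c
    elif a > c and d > b:            # (a, b) lies inside (c, d)
        r = b - a
    else:
        r = min(d - a, b - c)        # negative if the intervals are disjoint
    return r / s
\end{verbatim}
}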

\subsection{Simulated data}

Our first example is based on constructed data.  The model
is
\[ y_t = b_1  x_{1t} + x_{2t} + x_{3t} + x_{4t} + u_t, \;\;
 t=1,\ldots,30, \]
where $u_t$ is a pseudo-random normal variate with mean 0
and
variance 100;  $x_{jt}$ for $j=1,2,3,5$ are pseudo-random
variates with mean 100 and variance 400;  \[x_{4t} =
(x_{2t}+x_{3t})/2 + w_t,\] where $w_t$ is a pseudo-random
normal variate with mean 0 and variance 1/900; and $b_1$ takes
the values 0.1, 1.0, 2.0, and 5.0.  Note that $x_{5t}$ does not enter
the model generating the dependent variable but is included in the
regressions.
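
The data-generating process may be sketched as follows; the generator,
seed, and names are ours, so these draws will not reproduce the data behind
the figures and tables, and the $x_{jt}$ are taken here to be normal.
{\footnotesize
\begin{verbatim}
# Illustrative sketch (Python/NumPy) of the data-generating process; the
# seed and generator are ours, so these draws will not reproduce the data
# behind the figures and tables.
import numpy as np

rng = np.random.default_rng(0)
T, b1 = 30, 0.1                                # b1 also takes 1.0, 2.0, 5.0
# x1, x2, x3, x5: mean 100, variance 400 (taken here to be normal)
x1, x2, x3, x5 = (100.0 + 20.0 * rng.standard_normal(T) for _ in range(4))
w = rng.normal(0.0, 1.0 / 30.0, size=T)        # mean 0, variance 1/900
x4 = (x2 + x3) / 2.0 + w                       # nearly collinear with x2, x3
u = rng.normal(0.0, 10.0, size=T)              # mean 0, variance 100
y = b1 * x1 + x2 + x3 + x4 + u                 # x5 does not enter the model
\end{verbatim}
}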

    For this example we use changes in predictive densities to compare
various models, rather than specify an $X_1$ and $X_2,$ since there is
no relevant theory.  We calculate the mean posterior prediction for each
observation, its standard error, the difference in the predictions from
the `full' model (using $x_1$ to $x_5$), the K-L statistics, and the
overlap statistics.  The various statistics are calculated for the full
model and for models with 1, 2, 3, and 4 variables deleted.  Three lists
of the variables were specified to determine the effects of deleting
variables in different orders (with the last variable deleted first).
They are:

\baselineskip\single
{\begin{enumerate}
\item $x_1,x_2,x_3,x_4,x_5$;\label{vdl1}
\item $x_1,x_2,x_5,x_3,x_4;$\label{vdl3}
\item $x_2,x_3,x_4,x_1,x_5$.\label{vdl2}
\end{enumerate}}
\baselineskip\double

Figure 1 (page \pageref{fig1}) shows the effects of
deleting variables from list \ref{vdl1} with $b_1=.1$; the
data points are labeled with the number of variables removed
from the model, and GVR is plotted against $\Delta'
\hat{\beta}_2/\sigma_{\hat{y}_{01}},$ the standardized
difference in the mean of the posterior distribution.  The
standardization is by the standard deviation of the reduced
model (hence the standardization changes between
models).
    From Figure~1a we see that the curve defined by (\ref{gvrcone}) is
clearly indicated; the figure also reveals only very small changes in
standardized means and a GVR that is close to one for most observations.
Figure~1b displays the effect of deleting 2 variables ($x_5$ and $x_4$),
while Figures~1c and 1d, respectively, show the additional effects of
removing $x_3$ and $x_2$.  Note the large changes in both the GVR and
the mean differences when first $x_3$ and then $x_2$ are deleted.  These
effects are consistent with the true model:  $x_5$ does not appear in
it, and $x_4$ is almost a linear combination of $x_2$ and $x_3.$

    In Figure~2a (page \pageref{fig2})
    we combine the graphs of Figure~1.  Although the scale
of the GVR axis is now compressed, the graph clearly reveals the slight
effect on both standardized mean difference and GVR from dropping first
$x_5$ and then $x_4.$ In contrast, the points labeled `3' (resp.\ `4')
show that removing $x_3$ $(x_2)$ has a noticeable effect.  Figure~2b
shows the effects of removing the variables in a different order.  It
can be clearly seen that removing $x_4$ has little effect; removing
$x_3$ has a noticeable effect; removing $x_5$ has no effect; and
removing $x_2$ has a pronounced effect.  Again, these results are
consistent with our knowledge of the true model.  In Figures~2c and 2d,
we set $b_1=5$ and compare deleting variables according to lists
\ref{vdl1} and \ref{vdl3}.  It can be seen that the size of the
coefficient of $x_1$ has no effect on results.  We shall see below,
however, that this coefficient does play a role when variables are
removed according to list \ref{vdl2}.

    In some applications it may be of interest to know whether
particular variables affect particular observations in an important way.
Accordingly, we present Figure~3 (page \pageref{fig3}), which repeats
Figure~2 except that the points are unlabeled and lines connect the
observations.  These graphs show how the GVR and normalized deviations
change for a particular observation as the model is varied.  For
example, the standardized difference of the observation on the far left
of each graph increases as more variables are removed, while the line on
the right shows that the standardized difference decreases for that
observation as the fourth variable is removed.

    Figure~4 (page \pageref{fig4})
compares the results of removing variables according to
lists \ref{vdl1} and \ref{vdl2}.  Figures~4a and 4c are identical to
Figures~1a and 1c.  Figures~4b and 4d reveal again what we already know:
removing $x_5$ and $x_4$ has little effect on the predictive
distribution.  Note, however, the role played by the coefficient of
$x_1.$ In Figure~4b, where the coefficient is 0.1, removing $x_1$ has
little effect, whereas Figure~4d clearly shows that removing the same
variable when the coefficient is 5.0 has a large effect.  Hence, the
effect depends both on the size of coefficient and the order in which
the variable is removed.

    We present summaries of the overlap statistics for various models in
Tables~\ref{overlap1} to \ref{overlap3}.  Table~\ref{overlap1} reveals
that the average overlap remains over .9 when $x_5$ and $x_4$ are
excluded, then drops precipitously when $x_3$ is excluded, and
drops somewhat less when $x_2$ is also excluded.  Table~\ref{overlap2}
indicates the sensitivity of results to the size of $b_1$:  the mean
overlap drops much more for $b_1$ equal to 5.0 than when it equals 0.1.
Table~\ref{overlap3} reveals a large effect on the average overlap from
excluding $x_3$ when $x_4$ has already been eliminated.

{\footnotesize
\begin{table}   \caption{Overlap statistics, List 1, All
$b_1$}
\label{overlap1}
\centering
\begin{tabular}{|r|r|r|r|r|}
%\hline\hline %\\[1ex]
\hline\hline
 & \multicolumn{4}{c|}{Omitted variables} \\
\hline
\multicolumn{1}{|c|}{Statistic}  & \multicolumn{1}{c|}{$x_5$} &
\multicolumn{1}{c|}{$x_4,x_5$} &
\multicolumn{1}{c|}{$x_3,x_4,x_5$} &
\multicolumn{1}{c|}{$x_2,x_3,x_4,x_5$}   \\
\hline
Mean   & 0.965446 & 0.923488 & 0.341212 & 0.223044\\
 Max & 0.979343 & 0.977273 & 0.373777 & 0.241762\\
 Q3 & 0.977273 & 0.951598 & 0.351753 & 0.234447\\
 Med & 0.971485 & 0.932216 & 0.349555 & 0.228276\\
 Q1 & 0.959846 & 0.901245 & 0.340885 & 0.223308\\
Min & 0.920340 & 0.823271 & 0.187333 & 0.126265\\
\hline\hline
\end{tabular}
\end{table}
}


{\footnotesize
\begin{table}\caption{Overlap statistics, List 2}
\label{overlap2}
\centering
\begin{tabular}{|r|r|r|r|r|}
%\\[1ex]
\hline\hline
 & \multicolumn{4}{c|}{Omitted variables} \\
\hline
\multicolumn{1}{|c|}{Statistic}  & \multicolumn{1}{c|}{$x_5$} &
\multicolumn{1}{c|}{$x_1,x_5$} &
\multicolumn{1}{c|}{$x_1,x_4,x_5$} &
\multicolumn{1}{c|}{$x_1,x_3,x_4,x_5$} \\
%&&$b_1=.1$ & & \\
\hline & \multicolumn{4}{|c|}{$b_1=.1$} \\
\hline
Mean   & 0.965446 & 0.7057 & 0.710283 & 0.313172\\
 Max & 0.979343 & 0.798292 & 0.816929 & 0.342713\\
 Q3 & 0.977273 & 0.763943 & 0.78585 & 0.331103\\
 Med & 0.971485 & 0.752413 & 0.765924 & 0.322065\\
 Q1 & 0.959846 & 0.666576 & 0.650816 & 0.316346\\
 Min & 0.92034 & 0.411592 & 0.450845 & 0.141848\\
%&&$b_1=5$ & & \\
\hline & \multicolumn{4}{|c|}{$b_1=5$} \\
\hline
Mean   & 0.965446 & 0.096244 & 0.098293 & 0.090441\\
 Max & 0.979343 & 0.103856 & 0.107239 & 0.100961\\
 Q3 & 0.977273 & 0.100083 & 0.104496 & 0.098135\\
 Med & 0.971485 & 0.098411 & 0.101674 & 0.095215\\
 Q1 & 0.959846 & 0.097233 & 0.100281 & 0.093283\\
Min & 0.92034 & 0.013165 & -0.01965 & -0.06969\\
\hline\hline
\end{tabular}
\end{table}
}

{\footnotesize
\begin{table} \caption{Overlap Statistics, List 3, All
$b_1$}
\label{overlap3}
\centering
\begin{tabular}{|r|r|r|r|r|r|}
%\\[1ex]
\hline\hline
& \multicolumn{4}{c|}{Omitted variables} \\
\hline
\multicolumn{1}{|c|}{Statistic}   & \multicolumn{1}{c|}{$x_4$} &
\multicolumn{1}{c|}{$x_3,x_4$} &
\multicolumn{1}{c|}{$x_3,x_4,x_5$} &
\multicolumn{1}{c|}{$x_2,x_3,x_4,x_5$}   \\
\hline
Mean   & 0.932858 & 0.347052 & 0.341212 & 0.223044\\
 Max & 0.990849 & 0.373699 & 0.373777 & 0.241762\\
 Q3 & 0.974193 & 0.353109 & 0.351753 & 0.234447\\
 Med & 0.940615 & 0.347558 & 0.349555 & 0.228276\\
 Q1 & 0.902701 & 0.343195 & 0.340885 & 0.223308\\
 Min & 0.823524 & 0.276110 & 0.187333 & 0.126265\\
\hline\hline
\end{tabular}
\end{table}
}

    Figure~5 (page \pageref{fig5})
plots the confidence intervals that were used to construct
the overlap statistics for $b_1=0.1$ and list 1. The horizontal axis is
an observation label, and the vertical axis is the confidence interval
for each data point.  There are 5 confidence intervals for each data
point---one for the full model, and one for each model deleting one,
two, three, or four variables.  The observations are ordered by
their mean prediction, which is labeled with a small horizontal bar.
Deletion of relevant variables should widen the confidence intervals,
and deletion of irrelevant ($x_5$) or collinear ($x_4$) variables should
not change the confidence intervals much.  The figure reveals that
deleting the first two variables in list \ref{vdl1} does not have much
effect, but deleting $x_3$ and then $x_2$ does, as is shown by the much
wider
confidence intervals.  The confidence interval graphs tell the same
story that we have seen in Figures~1, 2, and 3, and in the overlap
tables---influential variables show large effects.

    Although the K-L statistics of Tables~\ref{KLlist1}, \ref{KLlist2},
and
\ref{KLlist3} yield conclusions similar to those reported above about
the importance of the
variables, it is difficult to know what represents a large change in
these numbers since the K-L statistic has a minimum of 0 and no maximum.
In addition, the K-L statistic depends upon which model is treated as
the reference distribution.  (In these tables, KLA denotes the full
model treated as the reference, and KL1 denotes the submodel treated as
the reference.)  For comparison purposes we also present the variance of
the difference in the predicted values.  For all values of $b_1$ and list
\ref{vdl1} the K-L statistics are contained in Table~\ref{KLlist1}.
{\footnotesize
\begin{table} \caption{K-L Statistics, List 1, All $b_1$}
\label{KLlist1} \centering
\begin{minipage}{4in}\begin{tabular}{|l|r|r|r|r|r|}
%\\[1ex]
\hline\hline
 \multicolumn{1}{|c|}{Omitted Variables} &
\multicolumn{1}{c|}{KLA\footnote{Reference distribution is full model.}}
&\multicolumn{1}{c|}{KL1\footnote{Reference distribution is partial
model.}}
&\multicolumn{1}{c|}{Variance\footnote{Variance of predicted $y_t.$}}
\\
\hline
$x_2, x_3, x_4,
x_5$ & 30890.56 & 61780.80 & 1992.89 \\ $x_3, x_4, x_5$ & 11571.12 &
23141.99 & 746.49 \\ $x_4, x_5$ & 60.92 & 121.68 & 3.91 \\ $x_5$ & 1.74
& 3.39 & 0.10
\\ \hline\hline \end{tabular}
\end{minipage} \end{table}
}
Table~\ref{KLlist2} indicates the effects of different values of
$b_1$.  For the third list, the values are the same across values
of $b_1$; see Table~\ref{KLlist3}. Note that both K-L statistics and the
variance jump more than two orders of magnitude when an important
variable is dropped.  Thus, in this example all three measures provide
similar information.

{\footnotesize
 \begin{table} \caption{K-L
Statistics, List 2} \label{KLlist2} \centering\begin{minipage}{4in}
\begin{tabular}{|l|r|r|r|}
%\\[1ex]
\hline\hline
 \multicolumn{1}{|c|}{Omitted Variables} & \multicolumn{1}{c|}{KLA%
\footnote{Reference distribution is full model.}}
&\multicolumn{1}{c|}{KL1\footnote{Reference distribution is partial
model.}} &\multicolumn{1}{c|}{Variance
\footnote{Variance of predicted $y_t.$}} \\
\hline
& \multicolumn{3}{|c|}{$b_1=.1$} \\
\hline $x_1, x_3,x_4,x_5$ & 14652.74 & 29305.17
& 945.29 \\ $x_1,x_4,x_5$ & 1241.84 & 2483.43 & 80.08 \\ $x_1,x_5$ &
1201.03 & 2401.90 & 77.46 \\ $x_5$ & 1.74 & 3.39 & 0.10 \\ \hline
& \multicolumn{3}{|c|}{$b_1=1$}\\
\hline $x_1, x_3,x_4,x_5$ &20645.80 & 41291.28 &
1331.94 \\ $x_1,x_4,x_5$ &5449.11 & 10897.99 & 351.52 \\ $x_1,x_5$
&5424.98 & 10849.80 & 349.98 \\ $x_5$ &1.74 & 3.39 & 0.10 \\ \hline
& \multicolumn{3}{|c|}{$b_1=2$} \\
\hline $x_1, x_3,x_4,x_5$ &42112.03 & 84223.75 &
2716.86\\ $x_1,x_4,x_5$ &23009.03 & 46017.83 & 1484.42\\ $x_1,x_5$
&23005.10 & 46010.04 & 1484.18\\ $x_5$ &1.74 & 3.39 & 0.10\\ \hline
& \multicolumn{3}{|c|}{$b_1=5$} \\
\hline $x_1, x_3,x_4,x_5$ &182351.70 & 364703.00
& 11764.57\\ $x_1,x_4,x_5$ &148851.70 & 297703.10 & 9603.30\\ $x_1,x_5$
&148803.30 & 297606.40 & 9600.20\\ $x_5$ &1.74 & 3.39 & 0.10
\\ \hline \hline
\end{tabular} \end{minipage}\end{table}
}
 {\footnotesize
\begin{table} \caption{K-L Statistics, List 3,
All $b_1$} \label{KLlist3} \centering
\begin{minipage}{4in}
\begin{tabular}{|l|r|r|r|}
%\\[1ex]
\hline\hline
 \multicolumn{1}{|c|}{Omitted Variables} & \multicolumn{1}{c|}{KLA%
\footnote{Reference distribution is full model.}}
&\multicolumn{1}{c|}{KL1\footnote{Reference distribution is partial
model.}
} &\multicolumn{1}{c|}{Variance\footnote{Variance of predicted $y_t.$}
} \\
\hline \hline $x_2,x_3,x_4,x_5$
& 30890.56 & 61780.80 & 1992.89 \\ $x_3,x_4,x_5$ & 11571.12 & 23141.99 &
746.49 \\ $x_3,x_4$ & 10741.51 & 21482.87 & 692.98 \\ $x_4$ & 57.12 &
114.15 & 3.67 \\
\hline\hline \end{tabular}\end{minipage} \end{table}
}

    We conclude from this example that the nature of the model that
generated the data is revealed clearly by scatter plots of GVR and
$\Delta'\hat{\beta}_2,$ the overlap statistics and graphs, and the K-L
values. For the next two examples, we confine our attention to the
scatter plots and the overlap tables.

    \subsection{Investment Model} This section is based on the work of
Fazzari, Hubbard, and Petersen \cite{FHP} and Fazzari and Petersen
\cite{FP}.  The reader should examine these sources for further details;
we present only a brief summary of their model.  These authors are
concerned with the role of cash flow variables as explanations for firm
investment in new plant and equipment.  Taking a Tobin-$q$ hypothesis as
the basic model, they study whether the addition of a cash flow variable
(CF$_t$) helps to explain the fixed investment to capital ratio
$(I_t/K_t)$
in a cross-section of firms.  In addition, they examine a variant of the
basic model in which sales and lagged sales $(S_t, S_{t-1})$ are also
included.  Firms are grouped by their dividend payout ratios as a method
of determining which are likely to be cash constrained and therefore
particularly sensitive to cash flows.
    We utilize a sample of 443 observations, consisting of 37 to 48
firms over 10 years; these are firms that paid little or no dividends
during the year.\footnote{We are grateful to Professor Steven Fazzari
for making these data available to us.} Individual firm and time dummy
variables are included
to eliminate unobservable firm differences and common time effects, but
are not reported.  The $x_i$ values for within-sample predictions
are therefore net of firm and time means.

    Tables~\ref{RegressionQ}--\ref{RegressionFull} present regression
results for several variants of the model.  Of particular interest is
the statistical significance of CF$_t$ in an equation that includes
$q_t,$
$S_t$, and $S_{t-1}.$ When we turn to the predictive analysis, however,
CF$_t$ seems to be somewhat less central.  Figures~6
and 7  (pages \pageref{fig6} and \pageref{fig7}) display GVR
against
standardized differences in means at the sample values of $X.$ As
above,
a `1', `2', or `3' represents an observation, and `1' indicates
that CF$_t$ has been removed from the full model, `2' indicates that
CF$_t$ and
$S_{t-1}$ have been removed, and `3' indicates that CF$_t$, $S_{t-1}$,
and
$S_t$ have been removed.  Note that there is little effect in either
predictive means or GVR when CF$_t$ is omitted.  For only two
observations
does the standardized predictive mean change by more than one standard
deviation.
It may be seen that removing $S_{t-1}$ has little effect on most
observations, but a considerable effect is evident when $S_t$
is removed. Thus,
although the coefficient of CF$_t$ is highly significant in the full
model, its presence in an equation that already contains $q_t$, $S_t$,
and
$S_{t-1}$ does not greatly affect the predictive densities evaluated at
the observed $X$ matrix.  Figure~7 reveals that one observation is
clearly an outlier.  It can also be
seen that for a number of observations GVR decreases upon removing
$S_{t-1}$ when CF$_t$ has already been removed.  By analogy to the
simulated data, it appears that $S_{t-1}$ for those observations is an
inconsequential variable.  Further analysis might be done to separate
those firms and determine how they differ from the rest of the firms,
for which the GVR increases.

{\footnotesize
\begin{table}\caption{Regression analysis:  $q$ model} \centering
\begin{minipage}{5in} \label{RegressionQ}
\centering \begin{tabular}{|c|c|c|c|c|}
%\\[1ex]
\hline
%&
SSE & DFE & MSE & RMSE & RSQUARE \\
%Residual Error: &
 10.16142 & 384&0.023467&0.153191&0.248068
\\ \hline
 \end{tabular}
\\[1ex]
\centering \begin{tabular}{|c|c|c|c|c|}
\hline
%\\[1ex]
VARIABLE &    BETA&  STDERR&  TRATIO&   PROBT \\
\hline
$q_t$   &0.007861&0.000777&10.11823&       0 \\
\hline
\end{tabular}
\\[1ex]
\centering \begin{tabular}{|c|c|c|c|}
\hline
 %\\[1ex]
  \multicolumn{4}{|c|}{$\hat{\Gamma}_1$}
%\footnote{$X_1=q_t;X_2=(S_t, S_{t-1}, \mbox{CF}).$}
 \\
\hline
& $S_t$ & $S_{t-1}$ & $\mbox{CF}_t$ \\
%0.695275& 0.68904&0.054937 \\
%0.961287&0.452605&0.091129 \\
%0.75843 &0.343786&0.076406 \\
%0.868733& 0.77202&0.099975 \\
%0.612999&0.980354&0.015497 \\
%0.776785&0.526474&0.079069 \\
%0.749394&0.597553&0.098696 \\
%0.684606&0.444319&0.118719 \\
%0.418023&0.304587&0.107225 \\
$q_t$ & 0.095145&0.073975&0.008587 \\
\hline
\end{tabular}
\end{minipage}
\end{table}
}

{\footnotesize
\begin{table} \caption{Regression analysis: $q_t$ and $S_t$
model}    \centering
\begin{minipage}{5in}
\label{RegressionQS}

\centering \begin{tabular}{|c|c|c|c|c|}
\hline %\\[1ex]
%     &
SSE  &   DFE  &   MSE  &  RMSE &
RSQUARE \\
%Residual Error:&
8.369626  &   383&0.019374&0.139191&0.380658
\\ \hline \end{tabular}
 \\[1ex]

\centering
\begin{tabular}{|c|c|c|c|c|}
\hline %\\[1ex]
VARIABLE&     BETA&  STDERR&  TRATIO&   PROBT \\
$q_t$  & 0.003706&0.000828&4.477794&9.664E-6 \\
$S_t$ &  0.04367&0.004541&9.616853&       0 \\
\hline
\end{tabular}
 \\[1ex]

\centering
\begin{tabular}{|r|r|r|}\hline %\\[1ex]
\multicolumn{3}{|c|}{$\hat{\Gamma}_1$}
%\footnote{$X_1=(q_t, S_t);X_2=(S_{t-1}, \mbox{CF$_t$}).$}
\\
\hline
& $S_{t-1}$ & $\mbox{CF}_t$ \\
% 0.27901&0.015479 \\
% -0.1143&0.036574 \\
%-0.10349&0.033363 \\
%0.259695&0.050673 \\
%0.618845&-0.01929 \\
%0.068375&0.034984 \\
%0.155607&0.056166 \\
%0.040581&0.079866 \\
%0.058063&0.083501 \\
$q_t$ & 0.017864&0.003188 \\
$S_t$ & 0.589738&0.056753 \\
\hline
\end{tabular}
\end{minipage}
\end{table}
}

{\footnotesize
\begin{table} \caption{Regression analysis: $q_t$, $S_t$, and
$S_{t-1}$
model}  \label{RegressionQSLS} \centering
\begin{minipage}{5in}
\centering \begin{tabular}{|c|c|c|c|c|}\hline %\\[1ex]
%             &
SSE    & DFE &    MSE &   RMSE& RSQUARE
\\
%Residual Error:&
7.531083    & 382&0.017474&0.132187& 0.44271
\\ \hline
\end{tabular}
\\[1ex]

\centering
\begin{tabular}{|c|c|c|c|c|}\hline %\\[1ex]
VARIABLES &    BETA&  STDERR&  TRATIO&   PROBT \\
\hline
$q_t$   &0.004714&0.000799&5.897797&7.452E-9 \\
$S_t$ &0.076962&0.006457&11.91904&       0 \\
$S_{t-1}$&-0.05645&0.008149&-6.92744&1.57E-11 \\
\hline \end{tabular}
\\[1ex]

\centering
\begin{tabular}{|r|r|}
\hline
\multicolumn{2}{|c|}{$\hat{\Gamma}_1$}
%\footnote{$X_1=(q_t, S_t,S_{t-1}); X_2=$CF$_t$.}}
\\
\hline
& $\mbox{CF}_t $ \\
%0.044135    \\
%0.024834    \\
%0.022734    \\
%0.077345    \\
%0.044268    \\
%0.042007    \\
%0.072148    \\
%0.084034    \\
%0.089465    \\
$ q_t$ & 0.005022    \\
$S_t$ & 0.117323    \\
$S_{t-1}$ & -0.102710    \\
\hline
\end{tabular}
\end{minipage}
\end{table}
}

{\footnotesize
\begin{table} \caption{Regression analysis:  Full model}
\label{RegressionFull}
%     YSD=0.174854
\centering
     \begin{tabular}{|c|c|c|c|c|}\hline %\\[1ex]
% &
 SSE & DFE & MSE & RMSE
& RSQUARE \\
% Residual Error:&
7.066109 & 381& 0.016433&
0.128191&
0.477117 \\ \hline \end{tabular}
\\[1ex]
\centering
\begin{tabular}{|c|c|c|c|c|}\hline %\\[1ex]
 VARIABLES&BETA&STDERR& TRATIO& PROBT \\
\hline
$q_t$   & 0.003625 &0.000802 & 4.521252 &7.959E-6  \\
$S_t$ & 0.051514 &0.00788  & 6.537244 &1.78E-10  \\
$S_{t-1}$& -0.03418 &0.008944 &-3.821080 &.000152   \\
CF$_t$& 0.216901 &0.040776 & 5.319348 &1.679E-7  \\
\hline
\end{tabular}
\end{table}
}

{\footnotesize
\begin{table}
\caption{Overlap statistics, $q_t, S_t$ and $S_{t-1}$ model}
\label{overlapq}
\centering
\begin{tabular}{|cc|cc|}\hline\hline %\\[1ex]
 \multicolumn{2}{|c|}{N}       & \multicolumn{2}{c|}{443}
\\ \hline
%& Sum Wgts &      443   \\
 Mean    &  0.932002& Sum      & 412.8767   \\
 Std Dev &  0.069894& Variance & 0.004885   \\
 Skewness&  -4.42321& Kurtosis & 31.55166   \\
%USS     &  386.9609& CSS      & 2.159215   \\
%CV      &  7.499291& Std Mean & 0.003321   \\
%T:Mean=0&  280.6607& Prob$>|T|$ &   0.0      \\
%Sgn Rank&     49173& Prob$>|S|$ &   0.0001   \\
%Num = 0&       443&          &            \\
%W:Normal&  0.594224& Prob$<W$   &   0.0      \\
\hline\hline
\end{tabular}
\\[1ex]

            Quartiles

\begin{tabular}{|cc|}
\hline
  Max& 0.969835 \\
  Q3 & 0.969775 \\
  Med& 0.960839 \\
  Q1 & 0.919392 \\
  Min& 0.211606 \\
\hline
\end{tabular}
\end{table}
}

{\footnotesize
\begin{table}
\caption{Overlap statistics: $q_t$ and $S_t$ model}
\label{overlapqs}
\centering
\begin{tabular}{|c|c|c|c|}\hline\hline %\\[1ex]
 \multicolumn{2}{|c|}{N} & \multicolumn{2}{c|}{443} \\
\hline
%&Sum Wgts &      443  \\
 Mean     & 0.880007 &Sum      &  389.843  \\
 Std Dev  & 0.099071 &Variance & 0.009815  \\
 Skewness & -3.94582 &Kurtosis & 19.79354  \\
% USS      & 347.4028 &CSS      & 4.338295  \\
% CV       & 11.25803 &Std Mean & 0.004707  \\
% T:Mean=0 &  186.956 &Prob$>|T|$ &   0.0     \\
% Sgn Rank &    49173 &Prob$>|S|$ &   0.0001  \\
% Num = 0 &      443 &         &           \\
% W:Normal & 0.502279 &Prob$<W $  &   0.0     \\
\hline\hline
\end{tabular}
\\[1ex]

            Quartiles

    \begin{tabular}{|cc|}
\hline
 Max& 0.945768\\ Q3& 0.921175\\ Med&
0.921019\\
Q1& 0.885834\\ Min& 0.057009
\\ \hline
 \end{tabular} \end{table}
}

{\footnotesize
\begin{table}
\caption{Overlap statistics:  $q_t$ model}
\label{overlapqsls}
\centering
\begin{tabular}{|c|c|c|c|}\hline\hline %\\[1ex]
 N        &      443& Sum Wgts &      443  \\
 Mean     & 0.785133& Sum      & 347.8138  \\
 Std Dev  & 0.115624& Variance & 0.013369  \\
 Skewness & -3.11602& Kurtosis & 11.11508  \\
% USS      & 278.9891& CSS      & 5.909077  \\
% CV       &  14.7267& Std Mean & 0.005493  \\
% T:Mean=0 & 142.9211& Prob$>|T|$ &   0.0     \\
% Sgn Rank &    49173& Prob$>|S|$ &   0.0001  \\
% Num = 0 &      443&          &           \\
% W:Normal & 0.538343& Prob$<W$   &   0.0     \\
\hline \hline
\end{tabular}
\\[1ex]

      Quartiles

\begin{tabular}{|cc|}\hline %\\[1ex]
%\multicolumn{2}{|c|}{Quintiles}\\
  Max& 0.868775  \\
  Q3 & 0.837181  \\
  Med& 0.836919  \\
  Q1 & 0.788278  \\
  Min& 0.107075  \\
\hline
\end{tabular}
\end{table}
}

    The overlap statistics reported in
Tables~\ref{overlapq}--\ref{overlapqsls} suggest a similar conclusion.
Table~\ref{overlapq} compares confidence intervals based on the full
model and the full model without CF$_t$.  The average overlap is .93,
and the median is .96; 75\% of the observations yield an overlap greater
than .92.  These statistics suggest that the CF$_t$ variable
has a great effect on the confidence intervals for only a small number
of observations.

    On the substantive question of the importance of CF$_t$ for
investment
decisions, it should be noted that we did not examine the effects of
changing the deletion order of the variables.  Moreover, we have worked
only with those firms that paid low dividends; CF$_t$ might have appeared
more influential if we had included firms with more varied dividend
behavior.

\subsection{Money Demand Model}

    In this section we examine a money demand model in the spirit of the
work of Cooley and LeRoy \cite{CooleyLeRoy}.  The object of their
research is to examine the extent to which the coefficients of focus
variables (i.e.\ those in $X_1$) depend on the other variables, termed
the ``doubtful variables,'' included in the equation. They take
as their example money demand over the period 1952.2 to 1978.4.  The
dependent variable is the logarithm of real money minus the logarithm of
real GNP.  The focus variables are the Treasury bill rate and the
savings and loan passbook rate.  The doubtful variables are real GNP,
the inflation rate, the real value of credit card transactions, real
wealth, and lagged values of these variables.  We were not able to
duplicate completely the variables used by these authors and estimate
over a slightly different time period.  Definitions of the variables are
presented in Table~\ref{moneydata}.

{\footnotesize
\begin{table} \centering
\caption{Definitions of money demand data (All variables
in logs)}
\label{moneydata} \begin{tabular}{|l|l|}\hline\hline %\\[1ex]
\multicolumn{1}{|c|}{Variable} & \multicolumn{1}{c|}{Definition}  \\\hline
 %\\[1ex]
RTB & Treasury Bill Rate  \\
RSD & Savings Deposit Rate  \\
RSD$_{\minus 1}$ & RSD lagged 1 period \\
RSD$_{\minus 2}$ & RSD lagged 2 periods \\
GP & Real Gross National Product  \\
$W$ & Real Wealth  \\
VCC & Value of credit card transactions  \\
PP & Inflation rate  \\
RTB$_{\minus 1}$ & RTB lagged 1 period  \\
RTB$_{\minus 2}$ & RTB lagged 2 periods \\
\hline\hline
  \end{tabular}
\end{table}
}

{\footnotesize
\begin{table} \caption{Coefficients of focus variables}
\label{focus}           \centering
\begin{tabular}{|l||r|r||r|r|}
\hline\hline
%\multicolumn{5}{c}{}\\
\multicolumn{3}{|c||}{} %\hrulefill
&  \multicolumn{2}{c|}{Sum of} \\
%\multicolumn{3}{|c||}{} \hline \\
\multicolumn{1}{|c||}{Variables}&
\multicolumn{2}{c||}{Coefficients of} &
%\multicolumn{2}{c|}{Sum of coefficients$^a$ }
\multicolumn{2}{c|}{Coefficients$^a$}
\\
\multicolumn{1}{|c||}{Included} & \multicolumn{1}{c|}{RTB} &
\multicolumn{1}{c||}{RSD}  & \multicolumn{1}{c|}{RTB}
& \multicolumn{1}{c|}{RSD} \\
\hline % &&&&\\
RTB and RSD & .0159 & \mbox{}-.0318 & .0159 & \mbox{}-.0318
\\
\hline % & & & & \\
Above and RSD$_{\minus 1}$ & .0155 & \mbox{}-.0914 & .0155 &
\mbox{}-.0315
\\ Above and RSD$_{\minus 2}$ & .0154 & \mbox{}-.0242 & .0154 &
\mbox{}-.0314
\\ Above and GP & .0210 &  \mbox{}-.0114 & .0210 & \mbox{}-.0070 \\
\hline % & & & & \\
Above and $W$ & \mbox{}-.0008 & .0013 & \mbox{}-.0008 & \mbox{}-.0077
\\
\hline %& & & &\\
Above and VCC & \mbox{}-.0040 & .0061 & \mbox{}-.0040 & \mbox{}-.0068
\\
Above and PP & \mbox{}-.0047 & .0049 & \mbox{}-.0047 & \mbox{}-.0069
\\
Above and RTB$_{\minus 1}$ & .0034 & .0044 & \mbox{}-.0064 &
\mbox{}-.0060
\\ Above and RTB$_{\minus 2}$ & .0046 & .0046 & .0045 & \mbox{}-.0061
\\ \hline\hline
\multicolumn{5}{l}{a. Current and Lagged Values}
\end{tabular} \end{table}
}

    Like Cooley and LeRoy, we are interested in how coefficients change
as the specification changes.  Rather than consider all possible
regressions, however, we present results in which the variables are
deleted from the bottom up in the order of the list in
Table~\ref{moneydata}.  As may be seen in Figure~8 (page \pageref{fig8}),
the variables seem
to fall into three groups as doubtful variables are deleted; within each
group the GVR is approximately constant, but it varies from group to
group.  Lagged values of RSD and RTB constitute the first group; PP is
the second group; and VCC, $W$, and GP constitute the third.  The effects of
changing specifications on the coefficients of the focus variables are
presented in Table~\ref{focus}.  The only consistent sign is that of the
sum of the coefficients of RSD and its lagged values.

    Figure~8 reveals little change in the means of the predictive
densities as the specification is changed.  The first four variables
deleted (moving from the bottom up in Table~\ref{moneydata}) result in
changes that remain in the one-standard deviation range.  As $W$
and GP, RSD$_{\minus 2},$ and RSD$_{\minus 1}$ are deleted, the
differences in means remain in
the two-standard deviation band except for one observation.  Thus it is
clear that the means of the predictive densities are little affected by
the inclusion of the doubtful variables.  Table~\ref{cooleyleroyoverlap}
shows that the overlap proportions vary sharply between groups (due to
the changes in standard errors), but not very greatly within groups.  A
sharp break occurs when $W$ is omitted, dropping the average overlap
from .814 to .502.

{\footnotesize
\begin{table} \centering \caption{Overlap means and
standard deviations as variables are deleted}\label{cooleyleroyoverlap}
\begin{tabular}{|l||r|r|}\hline\hline %\\[1ex]
 \multicolumn{1}{|c||}{Variables} &
\multicolumn{1}{c|}{Mean} &
%\multicolumn{1}{c|}{Standard Deviation} \\
\multicolumn{1}{c|}{S.D.} \\
\multicolumn{1}{|c||}{Deleted} && \\
\hline %& & \\
RTB$_{\minus 2}$ & .975 & .020 \\
Above and RTB$_{\minus 1}$ & .908 & .065 \\
Above and PP & .904 & .069 \\
Above and VCC & .814 & .085 \\
Above and $W$ & .502 &.074 \\
Above and GP & .407 & .053\\
Above and RSD$_{\minus 1}$ & .404 & .056 \\
Above and RSD$_{\minus 2}$ & .404 & .058
\\ \hline\hline
\end{tabular} \end{table}
}

    In conclusion, although Table~\ref{focus} reveals rather large
changes in coefficients as variables are added, we find that the
predicted means are not greatly affected.  We thus agree with Cooley and
LeRoy that little can be said about values of the coefficients
of the focus variables
from these data.  It is, however, noteworthy that the length of
confidence intervals depends greatly on the doubtful variables because
the coefficient variances are quite sensitive to them.

    \section{Conclusions}

    The examples of the previous section suggest
that examination of changes in predicted means and of GVRs is a useful
supplement to other methods of investigating model specification.
Although these concepts arise naturally in the Bayesian view of
inference, they have sampling-theory interpretations and are related to
concepts that already appear in that literature.  The graphical tools we
have introduced, however, furnish useful information that does not
appear elsewhere.  In future work, we plan to investigate the usefulness
of the predictive approach to other aspects of model specification.

  \newpage
\baselineskip\single
% \bibliography{predict}
\begin{thebibliography}{10}

\bibitem{AitchisonDunsmore}
J.~Aitchison and I.~R. Dunsmore.
\newblock {\em Statistical Prediction Analysis}.
\newblock Cambridge University Press, Cambridge, 1975.

\bibitem{Amemiya}
Takeshi Amemiya.
\newblock Selection of regressors.
\newblock {\em International Economic Review}, 21(2):331--354, June 1980.

\bibitem{BKW}
David~A. Belsley, Edwin Kuh, and Roy~E. Welsch.
\newblock {\em Regression Diagnostics}.
\newblock John Wiley \& Sons, New York, 1980.

\bibitem{ClayGeisJenn}
Murray~K. Clayton, Seymour Geisser, and Dennis~E. Jennings.
\newblock A comparison of several model selection procedures.
\newblock In P.~Goel and A.~Zellner, editors, {\em Bayesian Inference and
  Decision Techniques}, chapter~27, pages 425--439. Elsevier Science Publishers
  B. V., Amsterdam, 1986.

\bibitem{CooleyLeRoy}
T.~F. Cooley and S.~F. LeRoy.
\newblock Identification and estimation of money demand.
\newblock {\em American Economic Review}, 71(5):825--844, December 1981.

\bibitem{FP}
Steven Fazzari and Bruce Petersen.
\newblock Investment smoothing with working capital: New evidence on the impact
  of financial constraints.
\newblock Washington University, 1992.

\bibitem{FHP}
Steven~M. Fazzari, R.~Glenn Hubbard, and Bruce~C. Petersen.
\newblock Financing constraints and corporate investment.
\newblock {\em Brookings Papers on Economic Activity}, (1):141--206, 1988.

\bibitem{GaverGeisel}
Kenneth~M. Gaver and Martin~S. Geisel.
\newblock Discriminating among alternative models: \mbox{B}ayesian and
  non-\mbox{B}ayesian methods.
\newblock In Paul Zarembka, editor, {\em Frontiers in Econometrics}, chapter
  Two, pages 49--77. Academic Press, New York, 1974.

\bibitem{Geis85}
S.~Geisser.
\newblock On the prediction of observables: A selective update.
\newblock In J.~M. Bernardo, M.~H. De\mbox{G}root, D.~V. Lindley, and A.~F.~M.
  Smith, editors, {\em Bayesian Statistics 2}, pages 203--230. Elsevier Science
  Publishers B. V. (North-Holland), New York, 1985.

\bibitem{GeisserEddy}
S.~Geisser and W.~F. Eddy.
\newblock A predictive approach to model selection.
\newblock {\em Journal of the American Statistical Association}, 74:153--160,
  1979.

\bibitem{JaynesWhat}
E.~T. Jaynes.
\newblock What is the question?
\newblock In R.~D. Rosenkrantz, editor, {\em E. T. Jaynes: Papers on
  Probability, Statistics and Statistical Physics}, pages 376--400. Kluwer
  Academic Publishers, Dordrecht, 1983.

\bibitem{JohnsGeis}
W.~Johnson and S.~Geisser.
\newblock A predictive view of the detection and characterization of
  influential observations in regression analysis.
\newblock {\em Journal of the American Statistical Association}, 78:137--144,
  1983.

\bibitem{Judgeetal}
George~G. Judge, W.~E. Griffiths, R.~Carter Hill, Helmut L\"{u}tkepohl,
  and Tsoung-Chao Lee.
\newblock {\em The Theory and Practice of Econometrics}.
\newblock John Wiley \& Sons, New York, second edition, 1985.

\bibitem{KL}
S.~Kullback and R.~A. Leibler.
\newblock On information and sufficiency.
\newblock {\em Annals of Mathematical Statistics}, 22:79--86, 1951.

\bibitem{Marriott}
John Marriott, Nalini Ravishanker, Alan Gelfand, and Jeffrey Pai.
\newblock Bayesian analysis of {ARMA} processes: Complete sampling based
  inference under full likelihoods.
\newblock \mbox{}.

\bibitem{s&s}
A.~F.~M. Smith and D.~J. Spiegelhalter.
\newblock Bayes factors and choice criteria for linear models.
\newblock {\em Journal of the Royal Statistical Society, B}, 42(2):213--220,
  1980.

\bibitem{Zellner}
Arnold Zellner.
\newblock {\em An Introduction to Bayesian Inference in Econometrics}.
\newblock John Wiley \& Sons, New York, 1971.

\bibitem{zellnerJBPO}
Arnold Zellner.
\newblock Jeffreys-{B}ayes posterior odds ratio and the {A}kaike information
  criterion for discriminating between models.
\newblock {\em Economics Letters}, 1:337--342, 1978.

\bibitem{zellnerPORF}
Arnold Zellner.
\newblock Posterior odds ratios for regression hypotheses: General
  considerations and some specific results.
\newblock In Arnold Zellner, editor, {\em Basic Issues in Econometrics},
  chapter 3.6, pages 275--305. University of Chicago Press, Chicago, 1984.

\bibitem{zellnersiowPORF}
Arnold Zellner and A.~Siow.
\newblock Posterior odds ratios for selected regression hypotheses.
\newblock In J.~M. Bernardo, M.~H. De{G}root, D.~V. Lindley, and A.~F.~M.
  Smith, editors, {\em Bayesian Statistics}, pages 585--603. University Press,
  Valencia, Spain, 1980.

\end{thebibliography}

\clearpage

%\end{document} % uncomment if you do not want/have the figures



%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% BEGIN figures with epsf macros                                         %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


\figspepsf

\epsfysize=6.3in \epsfbox{fig1.ps}
\label{fig1} \clearpage
\figspepsf

\epsfysize=6.3in \epsfbox{fig2.ps}
\label{fig2} \clearpage
\figspepsf

\epsfysize=6.3in \epsfbox{fig3.ps}
\label{fig3} \clearpage
\figspepsf

\epsfysize=6.3in \epsfbox{fig4.ps}
\label{fig4} \clearpage
\figspepsf

\epsfysize=6.3in \epsfbox{fig5.ps}
\label{fig5} \clearpage
\figspepsf

\epsfysize=6.3in \epsfbox{fig6.ps}
\label{fig6} \clearpage
\figspepsf

\epsfysize=6.3in \epsfbox{fig7.ps}
\label{fig7} \clearpage
\figspepsf

\epsfysize=6.3in \epsfbox{fig8.ps}
\label{fig8} \clearpage

  \end{document}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% END   figures with epsf macros                                         %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%



%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% BEGIN figures with psbox macros                                        %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\figsppsbox

$$\psboxscaled{750}{fig1.ps}$$
\label{fig1} \clearpage

\figsppsbox

$$\psboxscaled{750}{fig2.ps}$$
\label{fig2} \clearpage

\figsppsbox

$$\psboxscaled{750}{fig3.ps}$$
\label{fig3} \clearpage

\figsppsbox

$$\psboxscaled{750}{fig4.ps}$$
\label{fig4} \clearpage

\figsppsbox

$$\psboxscaled{750}{fig5.ps}$$
\label{fig5} \clearpage

\figsppsbox

$$\psboxscaled{750}{fig6.ps}$$
\label{fig6} \clearpage

\figsppsbox

$$\psboxscaled{750}{fig7.ps}$$
\label{fig7} \clearpage

\figsppsbox

$$\psboxscaled{750}{fig8.ps}$$
\label{fig8} \clearpage

\end{document}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% END   figures with psbox macros                                        %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% BEGIN figures with psfig macros                                        %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\vspace{2in}

\psfig{figure=fig1.ps,width=8.0in}

\label{fig1}
\clearpage

\vspace{2in}

\psfig{figure=fig2.ps,width=8.0in}
\label{fig2}
\clearpage

\vspace{2in}

\psfig{figure=fig3.ps,width=8.0in}
\label{fig3}
\clearpage

\vspace{2in}

\psfig{figure=fig4.ps,width=8.0in}
\label{fig4}
\clearpage

\vspace{2in}

\psfig{figure=fig5.ps,width=8.0in}
\label{fig5}
\clearpage

\vspace{2in}

\psfig{figure=fig6.ps,width=8.0in}
\label{fig6}
\clearpage

\vspace{2in}

\psfig{figure=fig7.ps,width=8.0in}
\label{fig7}
\clearpage

\vspace{2in}

\psfig{figure=fig8.ps,width=8.0in}
\label{fig8}

\end{document}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% END   figures with psfig macros                                        %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\end{document}
