Question Details

[answered] Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing and Prediction


Can someone help me with Exercise 1.4, please? I have attached the first chapter of the book here; the question is on page 16.

Exercise 1.4: (a) Use Equation (1.30) to verify Equation (1.31). (b) Use Equation (1.31) to verify Equation (1.24).
Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing and Prediction

Bradley Efron
Stanford University


Foreword

At the risk of drastic oversimplification, the history of statistics as a recognized discipline can be divided into three eras:

 

1  The age of Quetelet and his successors, in which huge census-level data sets were brought to bear on simple but important questions: Are there more male than female births? Is the rate of insanity rising?

2  The classical period of Pearson, Fisher, Neyman, Hotelling, and their successors, intellectual giants who developed a theory of optimal inference capable of wringing every drop of information out of a scientific experiment. The questions dealt with still tended to be simple (Is treatment A better than treatment B?) but the new methods were suited to the kinds of small data sets individual scientists might collect.

3  The era of scientific mass production, in which new technologies typified by the microarray allow a single team of scientists to produce data sets of a size Quetelet would envy. But now the flood of data is accompanied by a deluge of questions, perhaps thousands of estimates or hypothesis tests that the statistician is charged with answering together; not at all what the classical masters had in mind.

 

The response to this onslaught of data has been a tremendous burst of statistical methodology, impressively creative, showing an attractive ability to come to grips with changed circumstances, and at the same time highly speculative. There is plenty of methodology in what follows but that is not the main theme of the book. My primary goal has been to ground the methodology in familiar principles of statistical inference.

This is where the "empirical Bayes" in my subtitle comes into consideration. By their nature, empirical Bayes arguments combine frequentist and Bayesian elements in analyzing problems of repeated structure. Repeated structures are just what scientific mass production excels at, e.g., expression levels comparing sick and healthy subjects for thousands of genes at the same time by means of microarrays. At their best, the new methodologies are successful from both Bayes and frequentist viewpoints, which is what my empirical Bayes arguments are intended to show.

False discovery rates, Benjamini and Hochberg's seminal contribution, is the great success story of the new methodology. Much of what follows is an attempt to explain that success in empirical Bayes terms. FDR, indeed, has strong credentials in both the Bayesian and frequentist camps, always a good sign that we are on the right track, as well as a suggestion of fruitful empirical Bayes explication.

 

The later chapters are at pains to show the limitations of current large-scale statistical practice: Which cases should be combined in a single analysis? How do we account for notions of relevance between cases? What is the correct null hypothesis? How do we handle correlations? Some helpful theory is provided in answer but much of the argumentation is by example, with graphs and figures playing a major role. The examples are real ones, collected in a sometimes humbling decade of large-scale data analysis at the Stanford School of Medicine and Department of Statistics. (My examples here are mainly biomedical, but of course that has nothing to do with the basic ideas, which are presented with no prior medical or biological knowledge assumed.)

 

In moving beyond the confines of classical statistics, we are also moving outside its wall of protection. Fisher, Neyman, et al. fashioned an almost perfect inferential machine for small-scale estimation and testing problems. It is hard to go wrong using maximum likelihood estimation or a t-test on a typical small data set. I have found it very easy to go wrong with huge data sets and thousands of questions to answer at once. Without claiming a cure, I hope the various examples at least help identify the symptoms.

 

The classical era of statistics can itself be divided into two periods: the first half of the 20th century, during which basic theory was developed, and then a great methodological expansion of that theory in the second half. Empirical Bayes stands as a striking exception. Emerging in the 1950s in two branches identified with Charles Stein and Herbert Robbins, it represented a genuinely new initiative in statistical theory. The Stein branch concerned normal estimation theory while the Robbins branch was more general, being applicable to both estimation and hypothesis testing.

Typical large-scale applications have been more concerned with testing than estimation. If judged by chapter titles, the book seems to share this imbalance but that is misleading. Empirical Bayes blurs the line between testing and estimation as well as between frequentism and Bayesianism. Much of what follows is an attempt to say how well we can estimate a testing procedure, for example how accurately can a null distribution be estimated? The false discovery rate procedure itself strays far from the spirit of classical hypothesis testing, as discussed in Chapter 4.

 

About this book: it is written for readers with at least a second course in statistics as background. The mathematical level is not daunting (mainly multidimensional calculus, probability theory, and linear algebra), though certain parts are more intricate, particularly in Chapters 3 and 7 (which can be scanned or skipped at first reading). There are almost no asymptotics. Exercises are interspersed in the text as they arise (rather than being lumped together at the end of chapters), where they mostly take the place of statements like "It is easy to see ..." or "It can be shown ...". Citations are concentrated in the Notes section at the end of each chapter. There are two brief appendices, one listing basic facts about exponential families, the second concerning access to some of the programs and data sets featured in the text.

I have perhaps abused the "mono" in monograph by featuring methods from my own work of the past decade. This is not a survey or a textbook, though I hope it can be used for a graduate-level lecture course. In fact, I am not trying to sell any particular methodology, my main interest as stated above being how the methods mesh with basic statistical theory.

 

There are at least three excellent books for readers who wish to see different points of view. Working backwards in time, Dudoit and van der Laan's 2009 Multiple Testing Procedures with Applications to Genomics emphasizes the control of Type I error. It is a successor to Resampling-based Multiple Testing: Examples and Methods for p-Value Adjustment (Westfall and Young, 1993), which now looks far ahead of its time. Miller's classic text, Simultaneous Statistical Inference (1981), beautifully describes the development of multiple testing before the era of large-scale data sets, when "multiple" meant somewhere between 2 and 10 problems, not thousands.

I chose the adjective large-scale to describe massive data analysis problems rather than "multiple", "high-dimensional", or "simultaneous", because of its bland neutrality with regard to estimation, testing, or prediction, as well as its lack of identification with specific methodologies. My intention is not to have the last word here, and in fact I hope for and expect a healthy development of new ideas in dealing with the burgeoning statistical problems of the 21st century.


Acknowledgments

 

The Institute of Mathematical Statistics has begun an ambitious new monograph series in statistics, and I am grateful to the editors David Cox, Xiao-Li Meng, and Susan Holmes for their encouragement, and for letting me in on the ground floor. Diana Gillooly, the editor at Cambridge University Press (now in its fifth century!), has been supportive, encouraging, and gentle in correcting my literary abuses. My colleague Elizabeth Halloran has shown a sharp eye for faulty derivations and confused wording. Many of my Stanford colleagues and students have helped greatly in the book's final development, with Rob Tibshirani and Omkar Muralidharan deserving special thanks. Most of all, I am grateful to my associate Cindy Kirby for her tireless work in transforming my handwritten pages into the book you see here.

Bradley Efron
Department of Statistics
Stanford University

 

March 2010


Contents

Foreword and Acknowledgements

1  Empirical Bayes and the James-Stein Estimator
   1.1  Bayes Rule and Multivariate Normal Estimation
   1.2  Empirical Bayes Estimation
   1.3  Estimating the Individual Components
   1.4  Learning from the Experience of Others
   1.5  Empirical Bayes Confidence Intervals
   Notes

2  Large-Scale Hypothesis Testing
   2.1  A Microarray Example
   2.2  Bayesian Approach
   2.3  Empirical Bayes Estimates
   2.4  Fdr(Z) as a Point Estimate
   2.5  Independence versus Correlation
   2.6  Learning from the Experience of Others II
   Notes

3  Significance Testing Algorithms
   3.1  p-Values and z-Values
   3.2  Adjusted p-Values and the FWER
   3.3  Stepwise Algorithms
   3.4  Permutation Algorithms
   3.5  Other Control Criteria
   Notes

4  False Discovery Rate Control
   4.1  True and False Discoveries
   4.2  Benjamini and Hochberg's FDR Control Algorithm
   4.3  Empirical Bayes Interpretation
   4.4  Is FDR Control "Hypothesis Testing"?
   4.5  Variations on the Benjamini-Hochberg Algorithm
   4.6  Fdr and Simultaneous Tests of Correlation
   Notes

5  Local False Discovery Rates
   5.1  Estimating the Local False Discovery Rate
   5.2  Poisson Regression Estimates for f(z)
   5.3  Inference and Local False Discovery Rates
   5.4  Power Diagnostics
   Notes

6  Theoretical, Permutation and Empirical Null Distributions
   6.1  Four Examples
   6.2  Empirical Null Estimation
   6.3  The MLE Method for Empirical Null Estimation
   6.4  Why the Theoretical Null May Fail
   6.5  Permutation Null Distributions
   Notes

7  Estimation Accuracy
   7.1  Exact Covariance Formulas
   7.2  Rms Approximations
   7.3  Accuracy Calculations for General Statistics
   7.4  The Non-Null Distribution of z-Values
   7.5  Bootstrap Methods
   Notes

8  Correlation Questions
   8.1  Row and Column Correlations
   8.2  Estimating the Root Mean Square Correlation
   8.3  Are a Set of Microarrays Independent of Each Other?
   8.4  Multivariate Normal Calculations
   8.5  Count Correlations
   Notes

9  Sets of Cases (Enrichment)
   9.1  Randomization and Permutation
   9.2  Efficient Choice of a Scoring Function
   9.3  A Correlation Model
   9.4  Local Averaging
   Notes

10  Combination, Relevance, and Comparability
    10.1  The Multi-Class Model
    10.2  Small Subclasses and Enrichment
    10.3  Relevance
    10.4  Are Separate Analyses Legitimate?
    10.5  Comparability
    Notes

11  Prediction and Effect Size Estimation
    11.1  A Simple Model
    11.2  Bayes and Empirical Bayes Prediction Rules
    11.3  Prediction and Local False Discovery Rates
    11.4  Effect Size Estimation
    11.5  The Missing Species Problem
    Notes

Appendix A  Exponential Families
Appendix B  Data Sets and Programs
Bibliography
Index


1  Empirical Bayes and the James-Stein Estimator

 

Charles Stein shocked the statistical world in 1955 with his proof that maximum likelihood estimation methods for Gaussian models, in common use for more than a century, were inadmissible beyond simple one- or two-dimensional situations. These methods are still in use, for good reasons, but Stein-type estimators have pointed the way toward a radically different empirical Bayes approach to high-dimensional statistical inference. We will be using empirical Bayes ideas for estimation, testing, and prediction, beginning here with their path-breaking appearance in the James-Stein formulation.

Although the connection was not immediately recognized, Stein's work was half of an energetic post-war empirical Bayes initiative. The other half, explicitly named "empirical Bayes" by its principal developer Herbert Robbins, was less shocking but more general in scope, aiming to show how frequentists could achieve full Bayesian efficiency in large-scale parallel studies. Large-scale parallel studies were rare in the 1950s, however, and Robbins' theory did not have the applied impact of Stein's shrinkage estimators, which are useful in much smaller data sets.

All of this has changed in the 21st century. New scientific technologies, epitomized by the microarray, routinely produce studies of thousands of parallel cases (we will see several such studies in what follows), well-suited for the Robbins point of view. That view predominates in the succeeding chapters, though not explicitly invoking Robbins' methodology until the very last section of the book.

Stein's theory concerns estimation whereas the Robbins branch of empirical Bayes allows for hypothesis testing, that is, for situations where many or most of the true effects pile up at a specific point, usually called 0. Chapter 2 takes up large-scale hypothesis testing, where we will see, in Section 2.6, that the two branches are intertwined. Empirical Bayes theory blurs the distinction between estimation and testing as well as between frequentist and Bayesian methods. This becomes clear in Chapter 2, where we will undertake frequentist estimation of Bayesian hypothesis testing rules.


1.1 Bayes Rule and Multivariate Normal Estimation

 

This section provides a brief review of Bayes theorem as it applies to multivariate normal estimation. Bayes rule is one of those simple but profound ideas that underlie statistical thinking. We can state it clearly in terms of densities, though it applies just as well to discrete situations. An unknown parameter vector $\mu$ with prior density $g(\mu)$ gives rise to an observable data vector $z$ according to density $f_\mu(z)$,

$$\mu \sim g(\mu) \quad\text{and}\quad z \mid \mu \sim f_\mu(z). \tag{1.1}$$

Bayes rule is a formula for the conditional density of $\mu$ having observed $z$ (its posterior distribution),

$$g(\mu \mid z) = g(\mu) f_\mu(z) / f(z), \tag{1.2}$$

where $f(z)$ is the marginal distribution of $z$,

$$f(z) = \int g(\mu) f_\mu(z)\, d\mu, \tag{1.3}$$

the integral being over all values of $\mu$.

The hardest part of (1.2), calculating $f(z)$, is usually the least necessary. Most often it is sufficient to note that the posterior density $g(\mu \mid z)$ is proportional to $g(\mu) f_\mu(z)$, the product of the prior density $g(\mu)$ and the likelihood $f_\mu(z)$ of $\mu$ given $z$. For any two possible parameter values $\mu_1$ and $\mu_2$, (1.2) gives

$$\frac{g(\mu_1 \mid z)}{g(\mu_2 \mid z)} = \frac{g(\mu_1)}{g(\mu_2)} \cdot \frac{f_{\mu_1}(z)}{f_{\mu_2}(z)}, \tag{1.4}$$

that is, the posterior odds ratio is the prior odds ratio times the likelihood ratio. Formula (1.2) is no more than a statement of the rule of conditional probability but, as we will see, Bayes rule can have subtle and surprising consequences.

Exercise 1.1  Suppose $\mu$ has a normal prior distribution with mean 0 and variance $A$, while $z$ given $\mu$ is normal with mean $\mu$ and variance 1,

$$\mu \sim N(0, A) \quad\text{and}\quad z \mid \mu \sim N(\mu, 1). \tag{1.5}$$

Show that

$$\mu \mid z \sim N(Bz, B) \quad\text{where}\quad B = A/(A+1). \tag{1.6}$$
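As a quick numerical sanity check of Exercise 1.1 (not part of the book), the sketch below simulates model (1.5) and compares the conditional mean and variance of $\mu$ given $z \approx z_0$ with the closed-form answer (1.6). It assumes NumPy; the prior variance A, the conditioning point z0, the window half-width eps, and the random seed are arbitrary choices of mine.

    # Monte Carlo check of (1.6): mu | z ~ N(Bz, B), with B = A/(A+1)
    import numpy as np

    rng = np.random.default_rng(0)
    A, z0, eps = 2.0, 1.5, 0.05                        # prior variance, conditioning point, window

    mu = rng.normal(0.0, np.sqrt(A), size=2_000_000)   # mu ~ N(0, A)
    z = rng.normal(mu, 1.0)                            # z | mu ~ N(mu, 1)

    keep = np.abs(z - z0) < eps                        # crude conditioning on z near z0
    B = A / (A + 1.0)
    print("empirical E[mu|z]  :", mu[keep].mean(), "  theory:", B * z0)
    print("empirical Var[mu|z]:", mu[keep].var(),  "  theory:", B)

Both empirical values should land close to $Bz_0$ and $B$, up to Monte Carlo error from the finite sample and the finite conditioning window.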

 

Starting down the road to large-scale inference, suppose now we are dealing with many versions of (1.5),

$$\mu_i \sim N(0, A) \quad\text{and}\quad z_i \mid \mu_i \sim N(\mu_i, 1) \qquad [i = 1, 2, \dots, N], \tag{1.7}$$

the $(\mu_i, z_i)$ pairs being independent of each other. Letting $\mu = (\mu_1, \mu_2, \dots, \mu_N)'$ and $z = (z_1, z_2, \dots, z_N)'$, we can write this compactly using standard notation for the $N$-dimensional normal distribution,

$$\mu \sim N_N(0, AI) \tag{1.8}$$

and

$$z \mid \mu \sim N_N(\mu, I), \tag{1.9}$$

$I$ the $N \times N$ identity matrix. Then Bayes rule gives posterior distribution

$$\mu \mid z \sim N_N(Bz, BI) \qquad [B = A/(A+1)], \tag{1.10}$$

this being (1.6) applied component-wise.

Having observed $z$ we wish to estimate $\mu$ with some estimator $\hat\mu = t(z)$,

$$\hat\mu = (\hat\mu_1, \hat\mu_2, \dots, \hat\mu_N)'. \tag{1.11}$$

We use total squared error loss to measure the error of estimating $\mu$ by $\hat\mu$,

$$L(\mu, \hat\mu) = \|\hat\mu - \mu\|^2 = \sum_{i=1}^{N} (\hat\mu_i - \mu_i)^2, \tag{1.12}$$

with the corresponding risk function being the expected value of $L(\mu, \hat\mu)$ for a given $\mu$,

$$R(\mu) = E_\mu\{L(\mu, \hat\mu)\} = E_\mu\left\{\|t(z) - \mu\|^2\right\}, \tag{1.13}$$

$E_\mu$ indicating expectation with respect to $z \sim N_N(\mu, I)$, $\mu$ fixed.

The obvious estimator of $\mu$, the one used implicitly in every regression and ANOVA application, is $z$ itself,

$$\hat\mu^{(\mathrm{MLE})} = z, \tag{1.14}$$

the maximum likelihood estimator (MLE) of $\mu$ in model (1.9). This has risk

$$R^{(\mathrm{MLE})}(\mu) = N \tag{1.15}$$

for every choice of $\mu$; every point in the parameter space is treated equally by $\hat\mu^{(\mathrm{MLE})}$, which seems reasonable for general estimation purposes.

Suppose though we have prior belief (1.8) which says that $\mu$ lies more or less near the origin 0. According to (1.10), the Bayes estimator is

$$\hat\mu^{(\mathrm{Bayes})} = Bz = \left(1 - \frac{1}{A+1}\right) z, \tag{1.16}$$

this being the choice that minimizes the expected squared error given $z$. If $A = 1$, for instance, $\hat\mu^{(\mathrm{Bayes})}$ shrinks $\hat\mu^{(\mathrm{MLE})}$ halfway toward 0. It has risk

$$R^{(\mathrm{Bayes})}(\mu) = (1 - B)^2 \|\mu\|^2 + N B^2, \tag{1.17}$$

(1.13), and overall Bayes risk

$$R_A^{(\mathrm{Bayes})} = E_A\left\{R^{(\mathrm{Bayes})}(\mu)\right\} = N \frac{A}{A+1}, \tag{1.18}$$

$E_A$ indicating expectation with respect to $\mu \sim N_N(0, AI)$.

Exercise 1.2  Verify (1.17) and (1.18).

The corresponding Bayes risk for $\hat\mu^{(\mathrm{MLE})}$ is

$$R_A^{(\mathrm{MLE})} = N$$

according to (1.15). If prior (1.8) is correct then $\hat\mu^{(\mathrm{Bayes})}$ offers substantial savings,

$$R_A^{(\mathrm{MLE})} - R_A^{(\mathrm{Bayes})} = N/(A+1); \tag{1.19}$$

with $A = 1$, $\hat\mu^{(\mathrm{Bayes})}$ removes half the risk of $\hat\mu^{(\mathrm{MLE})}$.
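For readers who want the intermediate steps behind Exercise 1.2, here is one possible route (my own working sketch, not the book's solution); it only uses the decomposition $z = \mu + \varepsilon$ with $\varepsilon \sim N_N(0, I)$ independent of $\mu$, the chapter's notation, and $B = A/(A+1)$.

    % (1.17): split the error of the Bayes rule into a bias part and a noise part
    \hat\mu^{(\mathrm{Bayes})} - \mu = Bz - \mu = B(\mu + \varepsilon) - \mu
                                     = -(1-B)\mu + B\varepsilon,
    \quad\text{so}\quad
    E_\mu\|\hat\mu^{(\mathrm{Bayes})} - \mu\|^2
        = (1-B)^2\|\mu\|^2 + B^2 E\|\varepsilon\|^2
        = (1-B)^2\|\mu\|^2 + N B^2 ,
    % the cross term vanishing because E[\varepsilon] = 0.

    % (1.18): average (1.17) over the prior (1.8), where E_A\|\mu\|^2 = NA
    R_A^{(\mathrm{Bayes})} = (1-B)^2 NA + N B^2
                           = N\left[\frac{A}{(A+1)^2} + \frac{A^2}{(A+1)^2}\right]
                           = N\,\frac{A}{A+1}.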

 

1.2 Empirical Bayes Estimation

Suppose model (1.8) is correct but we don't know the value of $A$ so we can't use $\hat\mu^{(\mathrm{Bayes})}$. This is where empirical Bayes ideas make their appearance. Assumptions (1.8), (1.9) imply that the marginal distribution of $z$ (integrating $z \sim N_N(\mu, I)$ over $\mu \sim N_N(0, A \cdot I)$) is

$$z \sim N_N\left(0, (A+1)I\right). \tag{1.20}$$

The sum of squares $S = \|z\|^2$ has a scaled chi-square distribution with $N$ degrees of freedom,

$$S \sim (A+1)\chi^2_N, \tag{1.21}$$

so that

$$E\left\{\frac{N-2}{S}\right\} = \frac{1}{A+1}. \tag{1.22}$$

Exercise 1.3  Verify (1.22).

 

The James-Stein estimator is defined to be

$$\hat\mu^{(\mathrm{JS})} = \left(1 - \frac{N-2}{S}\right) z. \tag{1.23}$$

This is just $\hat\mu^{(\mathrm{Bayes})}$ with an unbiased estimator $(N-2)/S$ substituting for the unknown term $1/(A+1)$ in (1.16). The name "empirical Bayes" is satisfyingly apt for $\hat\mu^{(\mathrm{JS})}$: the Bayes estimator (1.16) is itself being empirically estimated from the data. This is only possible because we have $N$ similar problems, $z_i \sim N(\mu_i, 1)$ for $i = 1, 2, \dots, N$, under simultaneous consideration.
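A small simulation of my own (not from the book) makes the "empirical" part concrete: it draws the two-level model (1.7) once to form the James-Stein estimate (1.23), and repeats the draw many times to check numerically the unbiasedness claim (1.22). NumPy is assumed; N, A, the number of repetitions, and the seed are arbitrary choices.

    # Illustrative sketch: the James-Stein estimator (1.23) and the unbiasedness claim (1.22)
    import numpy as np

    rng = np.random.default_rng(1)
    N, A = 20, 1.0

    def james_stein(z):
        """Shrink z toward 0 as in (1.23): (1 - (N-2)/S) z, with S = ||z||^2."""
        S = np.sum(z ** 2)
        return (1.0 - (len(z) - 2) / S) * z

    # One realization of model (1.7) and its shrinkage factor versus the Bayes factor
    mu = rng.normal(0.0, np.sqrt(A), size=N)
    z = rng.normal(mu, 1.0)
    print("JS factor    1 - (N-2)/S :", 1 - (N - 2) / np.sum(z ** 2))
    print("Bayes factor 1 - 1/(A+1) :", 1 - 1 / (A + 1))
    print("JS estimate (first five) :", james_stein(z)[:5])

    # Monte Carlo check of (1.22): E[(N-2)/S] = 1/(A+1)
    reps = 200_000
    Z = rng.normal(rng.normal(0.0, np.sqrt(A), size=(reps, N)), 1.0)
    S = np.sum(Z ** 2, axis=1)
    print("E[(N-2)/S] approx:", np.mean((N - 2) / S), "  theory:", 1 / (A + 1))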

 

It is not difficult to show that the overall Bayes risk of the James-Stein estimator is

$$R_A^{(\mathrm{JS})} = N \frac{A}{A+1} + \frac{2}{A+1}. \tag{1.24}$$

Of course this is bigger than the true Bayes risk (1.18), but the penalty is surprisingly modest,

$$R_A^{(\mathrm{JS})} \Big/ R_A^{(\mathrm{Bayes})} = 1 + \frac{2}{NA}. \tag{1.25}$$

For $N = 10$ and $A = 1$, $R_A^{(\mathrm{JS})}$ is only 20% greater than the true Bayes risk.
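The claims (1.24) and (1.25) are easy to confirm numerically; the sketch below (my own check, not the book's, with N and A chosen to match the remark above) estimates the overall Bayes risk of the James-Stein rule by averaging the squared error over many draws of the two-level model and compares it with the formulas.

    # Monte Carlo check of the James-Stein Bayes risk (1.24) and the ratio (1.25)
    import numpy as np

    rng = np.random.default_rng(2)
    N, A, reps = 10, 1.0, 200_000

    mu = rng.normal(0.0, np.sqrt(A), size=(reps, N))   # mu_i ~ N(0, A)
    z = rng.normal(mu, 1.0)                            # z_i | mu_i ~ N(mu_i, 1)

    S = np.sum(z ** 2, axis=1, keepdims=True)
    mu_js = (1.0 - (N - 2) / S) * z                    # (1.23), applied row by row

    risk_js = np.mean(np.sum((mu_js - mu) ** 2, axis=1))
    risk_bayes = N * A / (A + 1)                       # (1.18)
    print("simulated R_A^(JS) :", risk_js)
    print("formula   (1.24)   :", N * A / (A + 1) + 2 / (A + 1))
    print("ratio vs Bayes risk:", risk_js / risk_bayes, "  formula (1.25):", 1 + 2 / (N * A))

For N = 10 and A = 1 the simulated risk should come out near 6 against a true Bayes risk of 5, i.e., a ratio near 1.2, matching the "only 20% greater" remark above.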

 

The shock the James-Stein estimator provided the statistical world didn't come from (1.24) or (1.25). These are based on the zero-centric Bayesian model (1.8), where the maximum likelihood estimator $\hat\mu^{(\mathrm{MLE})} = z$, which doesn't favor values of $\mu$ near 0, might be expected to be bested. The rude surprise came from the theorem proved by James and Stein in 1961 [1]:

Theorem  For $N \ge 3$, the James-Stein estimator everywhere dominates the MLE $\hat\mu^{(\mathrm{MLE})}$ in terms of expected total squared error; that is,

$$E_\mu\left\{\|\hat\mu^{(\mathrm{JS})} - \mu\|^2\right\} < E_\mu\left\{\|\hat\mu^{(\mathrm{MLE})} - \mu\|^2\right\} \tag{1.26}$$

for every choice of $\mu$.

Result (1.26) is frequentist rather than Bayesian: it implies the superiority of $\hat\mu^{(\mathrm{JS})}$ no matter what one's prior beliefs about $\mu$ may be. Since versions of $\hat\mu^{(\mathrm{MLE})}$ dominate popular statistical techniques such as linear regression, its apparent uniform inferiority was a cause for alarm. The fact that linear regression applications continue unabated reflects some virtues of $\hat\mu^{(\mathrm{MLE})}$ discussed later.

[1] Stein demonstrated in 1956 that $\hat\mu^{(\mathrm{MLE})}$ could be everywhere improved. The specific form (1.23) was developed with his student Willard James in 1961.

 

A quick proof of the theorem begins with the identity

$$(\hat\mu_i - \mu_i)^2 = (z_i - \hat\mu_i)^2 - (z_i - \mu_i)^2 + 2(\hat\mu_i - \mu_i)(z_i - \mu_i). \tag{1.27}$$

Summing (1.27) over $i = 1, 2, \dots, N$ and taking expectations gives

$$E_\mu\left\{\|\hat\mu - \mu\|^2\right\} = E_\mu\left\{\|z - \hat\mu\|^2\right\} - N + 2 \sum_{i=1}^{N} \mathrm{cov}_\mu(\hat\mu_i, z_i), \tag{1.28}$$

where $\mathrm{cov}_\mu$ indicates covariance under $z \sim N_N(\mu, I)$. Integration by parts involving the multivariate normal density function $f_\mu(z) = (2\pi)^{-N/2} \exp\{-\tfrac{1}{2} \sum (z_i - \mu_i)^2\}$ shows that

$$\mathrm{cov}_\mu(\hat\mu_i, z_i) = E_\mu\left\{\frac{\partial \hat\mu_i}{\partial z_i}\right\} \tag{1.29}$$

as long as $\hat\mu_i$ is continuously differentiable in $z$. This reduces (1.28) to

$$E_\mu\left\{\|\hat\mu - \mu\|^2\right\} = E_\mu\left\{\|z - \hat\mu\|^2\right\} - N + 2 \sum_{i=1}^{N} E_\mu\left\{\frac{\partial \hat\mu_i}{\partial z_i}\right\}. \tag{1.30}$$

Applying (1.30) to $\hat\mu^{(\mathrm{JS})}$ (1.23) gives

$$E_\mu\left\{\|\hat\mu^{(\mathrm{JS})} - \mu\|^2\right\} = N - E_\mu\left\{\frac{(N-2)^2}{S}\right\} \tag{1.31}$$

with $S = \sum z_i^2$ as before. The last term in (1.31) is positive if $N$ exceeds 2, proving the theorem.

Exercise 1.4  (a) Use (1.30) to verify (1.31). (b) Use (1.31) to verify (1.24).
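Since Exercise 1.4 is the one asked about at the top of this post, here is one possible route through it (my own working sketch in the chapter's notation, so check it against the text); part (a) uses only (1.30) plus differentiation of (1.23), and part (b) uses only (1.21)-(1.22).

    % (a) With \hat\mu_i^{(JS)} = (1 - (N-2)/S) z_i and S = \sum_j z_j^2, the residual term in (1.30) is
    \|z - \hat\mu^{(JS)}\|^2 = \frac{(N-2)^2}{S^2}\sum_i z_i^2 = \frac{(N-2)^2}{S},
    %     and since \partial S / \partial z_i = 2 z_i,
    \frac{\partial \hat\mu_i^{(JS)}}{\partial z_i}
        = 1 - \frac{N-2}{S} + \frac{2(N-2) z_i^2}{S^2}
    \;\Longrightarrow\;
    \sum_{i=1}^{N} \frac{\partial \hat\mu_i^{(JS)}}{\partial z_i}
        = N - \frac{N(N-2)}{S} + \frac{2(N-2)}{S}
        = N - \frac{(N-2)^2}{S}.
    %     Substituting both pieces into (1.30),
    E_\mu\|\hat\mu^{(JS)} - \mu\|^2
        = E_\mu\!\left\{\frac{(N-2)^2}{S}\right\} - N
          + 2\left(N - E_\mu\!\left\{\frac{(N-2)^2}{S}\right\}\right)
        = N - E_\mu\!\left\{\frac{(N-2)^2}{S}\right\},
    %     which is (1.31).

    % (b) Averaging (1.31) over the prior (1.8): marginally S ~ (A+1)\chi^2_N, so by (1.22)
    E\left\{\frac{(N-2)^2}{S}\right\} = \frac{N-2}{A+1},
    \quad\text{hence}\quad
    R_A^{(JS)} = N - \frac{N-2}{A+1} = \frac{NA + 2}{A+1}
               = N\frac{A}{A+1} + \frac{2}{A+1},
    %     which is (1.24).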

 

The James-Stein estimator (1.23) shrinks each observed value $z_i$ toward 0. We don't have to take 0 as the preferred shrinking point. A more general version of (1.8), (1.9) begins with

$$\mu_i \overset{\mathrm{ind}}{\sim} N(M, A) \quad\text{and}\quad z_i \mid \mu_i \overset{\mathrm{ind}}{\sim} N(\mu_i, \sigma_0^2) \tag{1.32}$$

for $i = 1, 2, \dots, N$, where $M$ and $A$ are the mean and variance of the prior distribution. Then (1.10) and (1.20) become

$$z_i \overset{\mathrm{ind}}{\sim} N\left(M, A + \sigma_0^2\right) \quad\text{and}\quad \mu_i \mid z_i \overset{\mathrm{ind}}{\sim} N\left(M + B(z_i - M), B\sigma_0^2\right) \tag{1.33}$$

for $i = 1, 2, \dots, N$, where

$$B = \frac{A}{A + \sigma_0^2}. \tag{1.34}$$

Now Bayes rule $\hat\mu_i^{(\mathrm{Bayes})} = M + B(z_i - M)$ has James-Stein empirical Bayes estimator

$$\hat\mu_i^{(\mathrm{JS})} = \bar z + \left(1 - \frac{(N-3)\sigma_0^2}{S}\right)(z_i - \bar z), \tag{1.35}$$

with $\bar z = \sum z_i / N$ and $S = \sum (z_i - \bar z)^2$. The theorem remains true as stated, except that we now require $N \ge 4$.

If the difference in (1.26) were tiny then $\hat\mu^{(\mathrm{JS})}$ would be no more than an interesting theoretical tidbit. In practice though, the gains from using $\hat\mu^{(\mathrm{JS})}$ can be substantial, and even, in favorable circumstances, enormous.

Table 1.1 illustrates one such circumstance. The batting averages $z_i$ (number of successful hits divided by the number of tries) are shown for 18 major league baseball players early in the 1970 season. The true values $\mu_i$ are taken to be their averages over the remainder of the season, comprising about 370 more "at bats" each. We can imagine trying to predict the true values from the early results, using either $\hat\mu_i^{(\mathrm{MLE})} = z_i$ or the James-Stein estimates (1.35) (with $\sigma_0^2$ equal the binomial estimate $\bar z(1 - \bar z)/45$, $\bar z = 0.265$ the grand average [2]). The ratio of prediction errors is

$$\sum_{i=1}^{18} \left(\hat\mu_i^{(\mathrm{JS})} - \mu_i\right)^2 \Bigg/ \sum_{i=1}^{18} \left(\hat\mu_i^{(\mathrm{MLE})} - \mu_i\right)^2 = 0.28, \tag{1.36}$$

indicating a tremendous advantage for the empirical Bayes estimates.

[2] The $z_i$ are binomial here, not normal, violating the conditions of the theorem, but the James-Stein effect is quite insensitive to the exact probabilistic model.

Table 1.1  Batting averages $z_i = \hat\mu_i^{(\mathrm{MLE})}$ for 18 major league players early in the 1970 season; $\mu_i$ values are averages over the remainder of the season. The James-Stein estimates $\hat\mu_i^{(\mathrm{JS})}$ (1.35) based on the $z_i$ values provide much more accurate overall predictions for the $\mu_i$ values. (By coincidence, $\hat\mu_i$ and $\mu_i$ both average 0.265; the average of $\hat\mu_i^{(\mathrm{JS})}$ must equal that of $\hat\mu_i^{(\mathrm{MLE})}$.)

Name            hits/AB   MLE     mu_i    JS
Clemente         18/45    .400    .346    .294
F Robinson       17/45    .378    .298    .289
F Howard         16/45    .356    .276    .285
Johnstone        15/45    .333    .222    .280
Berry            14/45    .311    .273    .275
Spencer          14/45    .311    .270    .275
Kessinger        13/45    .289    .263    .270
L Alvarado       12/45    .267    .210    .266
Santo            11/45    .244    .269    .261
Swoboda          11/45    .244    .230    .261
Unser            10/45    .222    .264    .256
Williams         10/45    .222    .256    .256
Scott            10/45    .222    .303    .256
Petrocelli       10/45    .222    .264    .256
E Rodriguez      10/45    .222    .226    .256
Campaneris        9/45    .200    .286    .252
Munson            8/45    .178    .316    .247
Alvis             7/45    .156    .200    .242
Grand Average             .265    .265    .265
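To see where the last column of Table 1.1 and the ratio (1.36) come from, the short script below (mine, not the book's) recomputes the James-Stein estimates (1.35) from the hits/AB counts and compares the two prediction errors. NumPy is assumed; the z_i and mu_i arrays are copied from Table 1.1, and small last-digit discrepancies are just rounding.

    # Reproduce the James-Stein column of Table 1.1 and the error ratio (1.36)
    import numpy as np

    hits = np.array([18, 17, 16, 15, 14, 14, 13, 12, 11, 11,
                     10, 10, 10, 10, 10, 9, 8, 7])
    z = hits / 45.0                                   # early-season averages, z_i = MLE
    mu = np.array([.346, .298, .276, .222, .273, .270, .263, .210, .269,
                   .230, .264, .256, .303, .264, .226, .286, .316, .200])  # rest-of-season averages

    N = len(z)
    zbar = z.mean()                                   # grand average, about 0.265
    sigma0_sq = zbar * (1 - zbar) / 45                # binomial variance estimate
    S = np.sum((z - zbar) ** 2)

    mu_js = zbar + (1 - (N - 3) * sigma0_sq / S) * (z - zbar)   # (1.35)

    ratio = np.sum((mu_js - mu) ** 2) / np.sum((z - mu) ** 2)   # (1.36)
    print(np.round(mu_js, 3))                         # close to the last column of Table 1.1
    print("prediction error ratio:", round(ratio, 2)) # about 0.28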

 

The initial reaction to the Stein phenomenon was a feeling of paradox: Clemente, at the top of the table, is performing independently of Munson, near the bottom. Why should Clemente's good performance increase our prediction for Munson? It does for $\hat\mu^{(\mathrm{JS})}$ (mainly by increasing $\bar z$ in (1.35)), but not for $\hat\mu^{(\mathrm{MLE})}$. There is an implication of indirect evidence lurking among the players, supplementing the direct evidence of each player's own average. Formal Bayesian theory supplies the extra evidence through a prior distribution. Things are more mysterious for empirical Bayes methods, where the prior may exist only as a motivational device.


1.3 Estimating the Individual Components

 

Why haven't James-Stein estimators displaced MLE's in common statistical practice? The simulation study of Table 1.2 offers one answer. Here $N = 10$, with the 10 $\mu_i$ values shown in the first column; $\mu_{10} = 4$ is much different than the others. One thousand simulations of $z \sim N_{10}(\mu, I)$ each gave estimates $\hat\mu^{(\mathrm{MLE})} = z$ and $\hat\mu^{(\mathrm{JS})}$ (1.23). Average squared errors for each $\mu_i$ are shown. For example, $(\hat\mu_1^{(\mathrm{MLE})} - \mu_1)^2$ averaged 0.95 over the 1000 simulations, comp...

 

