[answered] 12/15/2016 Statistics 506, Fall 2016

Question: R programming (futures package)

Please help me answer this question. It is about simulating "all subsets selection" in R using the AIC, BIC, or adjusted R^2 statistic, and using the "futures" package to run the fits concurrently. I have attached the problem set, along with my professor's lecture notes showing how he uses the future package to speed up a logistic regression simulation. I just don't know how to use the futures package for this "all subsets selection" problem. Please help me.

Statistics 506, Fall 2016: Computational methods and tools in statistics

Problem set 5

Due December 18

There is only one exercise to complete here:

Suppose we are considering a linear regression with a response variable y, and independent variables x1, x2, ..., xp. We can fit the regression using all p of the covariates, or we can fit the regression using any subset of the covariates. "All subsets selection" is a technique that fits every possible submodel, and selects the submodel with the best fit.

In this exercise you should implement all subsets selection using the R futures package to obtain the fits concurrently.

Hint on how to cycle through all possible models: Represent a particular submodel using a vector of p 0's and 1's, where a 0 indicates that the corresponding variable is not included and a 1 indicates that the variable is included. To visit all the models, view these binary vectors as representing integers in base 2 form. If you add 1 (in base 2) to each vector, you will obtain a new model, until you have visited all the models, at which point the process will repeat. For example, if there are three variables, the models are represented as triples (a, b, c), corresponding to the integer a*4 + b*2 + c. The first model is 0 = (0, 0, 0), meaning that none of the variables are included (always include the intercept, which R does by default). Adding 1 to this model in base 2 gives us 1 = (0, 0, 1); adding 1 again gives us 2 = (0, 1, 0), and so on.
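The hint above can be sketched in base R as follows. This is only an illustration, not the course's reference solution; the helper names `int_to_model` and `next_model` are hypothetical and do not come from the problem set or the lecture notes.

```r
# Convert an integer k (0 <= k < 2^p) to its p-digit 0/1 inclusion vector,
# most significant digit first, so (a, b, c) corresponds to a*4 + b*2 + c.
int_to_model <- function(k, p) {
  as.integer((k %/% 2^((p - 1):0)) %% 2)
}

# Add 1 in base 2 to a 0/1 vector; the all-ones model wraps back to all zeros.
next_model <- function(v) {
  i <- length(v)
  while (i >= 1 && v[i] == 1L) {
    v[i] <- 0L      # this digit overflows, carry to the left
    i <- i - 1
  }
  if (i >= 1) v[i] <- 1L
  v
}

p <- 3
# One row per submodel, rows ordered by the integer each vector represents.
models <- t(sapply(0:(2^p - 1), int_to_model, p = p))
print(models)
```

Either helper is enough on its own: `int_to_model` is convenient when the integers 0 to 2^p - 1 are handed out to concurrent workers, while `next_model` follows the "add 1" description literally.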

You can use any appropriate model selection statistic to define which model is "best". Appropriate choices would be AIC, BIC, or adjusted R^2, but not the unadjusted R^2. These are all immediately available in R.
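As a rough sketch of how these statistics can be obtained for one submodel: `AIC()`, `BIC()`, and `summary()$adj.r.squared` are standard R, while the function `score_model`, the data frame `toy`, and the response name `y` are assumptions made for illustration. The unadjusted R^2 is unsuitable because it never decreases when a variable is added, so it would always select the full model.

```r
# Score one submodel. `include` is a 0/1 vector over the candidate covariates
# (every column of `data` except the response).
score_model <- function(include, data, response = "y") {
  covariates <- setdiff(names(data), response)
  chosen <- covariates[include == 1]
  # With no covariates selected, fit the intercept-only model y ~ 1.
  rhs <- if (length(chosen) > 0) chosen else "1"
  fit <- lm(reformulate(rhs, response = response), data = data)
  c(AIC = AIC(fit), BIC = BIC(fit), adj_r2 = summary(fit)$adj.r.squared)
}

# Toy data (hypothetical): y depends on x1 but not on x2.
set.seed(1)
toy <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
toy$y <- 2 * toy$x1 + rnorm(50)

print(score_model(c(1, 0), toy))  # model with x1 only
print(score_model(c(0, 0), toy))  # intercept-only model
```

Lower AIC and BIC indicate a better model, whereas a higher adjusted R^2 does, so the comparison direction depends on which statistic is chosen.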

You should implement a configurable cap on the number of processes that can be run concurrently (as in the examples in the notes).

To demonstrate your code, simulate data from a single population of your choice. Run the procedure on the simulated data, and report the model that is selected and the time that was required for the computation to complete.
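One possible shape for the whole exercise is sketched below. It assumes that the "futures package" refers to the CRAN package `future`, and it uses that package's `plan()`, `multisession`, `future()`, and `value()`. The function `all_subsets`, its `max_workers` argument, the choice of AIC, and the simulated population are all illustrative assumptions rather than the approach from the lecture notes.

```r
library(future)

# Hypothetical helper: fit every submodel of `data` using at most
# `max_workers` concurrent R processes, and return the best model by AIC.
all_subsets <- function(data, response = "y", max_workers = 2) {
  old_plan <- plan(multisession, workers = max_workers)  # cap on concurrency
  on.exit(plan(old_plan), add = TRUE)                    # restore previous plan

  covariates <- setdiff(names(data), response)
  p <- length(covariates)
  ids <- 0:(2^p - 1)                       # one integer per submodel
  # Deal the model integers out into at most `max_workers` chunks.
  chunks <- split(ids, rep_len(seq_len(max_workers), length(ids)))

  # Launch one future per chunk; each returns the AIC of its models.
  jobs <- lapply(chunks, function(chunk) {
    future({
      sapply(chunk, function(k) {
        include <- (k %/% 2^((p - 1):0)) %% 2        # base-2 digits of k
        rhs <- if (any(include == 1)) covariates[include == 1] else "1"
        AIC(lm(reformulate(rhs, response = response), data = data))
      })
    })
  })

  aic <- unlist(lapply(jobs, value), use.names = FALSE)  # waits for each future
  k_best <- unlist(chunks, use.names = FALSE)[which.min(aic)]
  include <- (k_best %/% 2^((p - 1):0)) %% 2
  list(selected = covariates[include == 1], aic = min(aic),
       n_models = length(ids))
}

# Simulated population (an assumption): y depends on x1 and x3 only.
set.seed(506)
n <- 200
p <- 6
sim <- as.data.frame(matrix(rnorm(n * p), n, p))
names(sim) <- paste0("x", 1:p)
sim$y <- 1 + 2 * sim$x1 - 1.5 * sim$x3 + rnorm(n)

elapsed <- system.time(result <- all_subsets(sim, max_workers = 2))
print(result$selected)       # covariates in the selected model
print(elapsed["elapsed"])    # wall-clock seconds
```

Grouping the models into one chunk per worker keeps the number of futures small, since starting a future has some overhead compared with a single `lm` fit. Changing `max_workers` changes the cap, and rerunning with different values gives a simple timing comparison.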

© Kerby Shedden. Some rights reserved; please attribute properly and link back. http://dept.stat.lsa.umich.edu/~kshedden/Courses/Stat506/ps5/

Solution details:

This question was answered on: Sep 18, 2020
