
<?xml version="1.0" encoding="utf-8"?>
<!-- generator="FeedCreator 1.7.2-ppt DokuWiki" -->
<?xml-stylesheet href="https://www2.math.binghamton.edu/lib/exe/css.php?s=feed" type="text/css"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Department of Mathematics and Statistics, Binghamton University people:qiao:teach:448</title>
    <subtitle></subtitle>
    <link rel="alternate" type="text/html" href="https://www2.math.binghamton.edu/"/>
    <id>https://www2.math.binghamton.edu/</id>
    <updated>2026-04-12T17:38:04-04:00</updated>
    <generator>FeedCreator 1.7.2-ppt DokuWiki</generator>
<link rel="self" type="application/atom+xml" href="https://www2.math.binghamton.edu/feed.php" />
    <entry>
        <title>(Archive) Math 448 Computing Homework (Fall 2015)</title>
        <link rel="alternate" type="text/html" href="https://www2.math.binghamton.edu/p/people/qiao/teach/448/448_cp"/>
        <published>2016-01-24T18:45:17-04:00</published>
        <updated>2016-01-24T18:45:17-04:00</updated>
        <id>https://www2.math.binghamton.edu/p/people/qiao/teach/448/448_cp</id>
        <summary>
&lt;p&gt;
Read &lt;a href=&quot;https://www2.math.binghamton.edu/p/people/gang/cp_sol&quot; class=&quot;wikilink2&quot; title=&quot;people:gang:cp_sol&quot; rel=&quot;nofollow&quot;&gt;solutions to previous computing homework&lt;/a&gt;.
&lt;/p&gt;

&lt;h1 class=&quot;sectionedit1&quot; id=&quot;archive_math_448_computing_homework_fall_2015&quot;&gt;(Archive) Math 448 Computing Homework (Fall 2015)&lt;/h1&gt;
&lt;div class=&quot;level1&quot;&gt;

&lt;p&gt;
&lt;em class=&quot;u&quot;&gt;Due: 8 pm, November 5.&lt;/em&gt;
&lt;/p&gt;

&lt;p&gt;
Each of you will receive two rows of 0&amp;#039;s and 1&amp;#039;s. Each row has 31 observations (each either 0 or 1). It is assumed that the two rows correspond to two samples from two populations, and each observation follows a Bernoulli distribution with probability $p_1$ for Population 1 or $p_2$ for Population 2. We will conduct a hypothesis test to tell whether $p_1\neq p_2$. Define a new parameter $\theta=p_1-p_2$.
&lt;/p&gt;

&lt;p&gt;
The following R code extracts the two rows and saves them as two vectors.
&lt;/p&gt;
&lt;pre class=&quot;code&quot;&gt;  ####Set the R working directory###
   setwd(&amp;quot;C:/&amp;quot;) 
  ###read in the data file, make sure your data file is under the working directory## 
  dat = read.csv(&amp;#039;data_6.csv&amp;#039;,header=FALSE) 
  ### the &amp;quot;dat&amp;quot; you just read into R is a data frame, need to convert it into a matrix
  dat &amp;lt;- as.matrix(dat) ##dat now is a 2x31 matrix
  x1 &amp;lt;- dat[1,]  ##take the first row of this matrix as your sample 1
  x2 &amp;lt;- dat[2,]  ##take the second row of this matrix as your sample 2&lt;/pre&gt;
&lt;ol&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Suppose we use the test statistic introduced in class to build the test, namely $$TS=\frac{\hat \theta - \theta_0}{SE(\hat \theta)}.$$ What is the probability of making a Type I error if we use $|TS|&amp;gt;1$ as the rejection rule? Hint: what distribution does TS approximately have under the null hypothesis? Note that you do not need to make use of the data that you have received to answer this question.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Suppose we use $|TS|&amp;gt;t$ (where $t&amp;gt;0$) as the rejection rule. Determine the value of $t$ if we want to control the probability of making a Type I error at 0.10. Note that you do not need to make use of the data that you have received to answer this question.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Suppose we approximate the $SE(\hat \theta)$ by $\sqrt{\hat p_1(1-\hat p_1)/n_1+\hat p_2(1-\hat p_2)/n_2}$. Then calculate the observed value of the TS using the data given.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Make a decision/conclusion with significance level 0.10 using the TS whose denominator is approximated as in Question 3.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Suppose we approximate the $SE(\hat \theta)$ by $\sqrt{\hat p(1-\hat p)/n_1+\hat p(1-\hat p)/n_2}$, where $\hat p$ is the pooled sample proportion, defined as $\hat p=(Y_1+Y_2)/(n_1+n_2)$. Then calculate the observed value of the TS using the data given.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Make a decision/conclusion with significance level 0.10 using the TS whose denominator is approximated as in Question 5.&lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt;
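&lt;p&gt;
The calculations in Questions 1 to 6 can be sketched in R as follows. This is only an illustration under stated assumptions: the two samples are simulated (they are not the assigned data), $\theta_0=0$ is taken as the null value, the normal approximation to TS is used, and all variable names are made up for this sketch.
&lt;/p&gt;
&lt;pre class=&quot;code&quot;&gt;# Hypothetical data: two simulated Bernoulli samples (NOT the assigned data)
set.seed(448)
x1 = rbinom(31, 1, 0.6)
x2 = rbinom(31, 1, 0.4)
n1 = length(x1)
n2 = length(x2)
p1hat = mean(x1)
p2hat = mean(x2)
thetahat = p1hat - p2hat      # estimate of theta = p1 - p2 (theta_0 = 0 under H0)

# Type I error probability of the rule |TS| &amp;gt; 1 under the normal approximation
alpha1 = 2 * (1 - pnorm(1))   # about 0.3173
# Cutoff t that gives a Type I error probability of 0.10
tcut = qnorm(0.95)            # about 1.6449

# Unpooled standard error (as in Question 3)
se_unpooled = sqrt(p1hat*(1-p1hat)/n1 + p2hat*(1-p2hat)/n2)
# Pooled standard error (as in Question 5)
phat = (sum(x1) + sum(x2)) / (n1 + n2)
se_pooled = sqrt(phat*(1-phat)/n1 + phat*(1-phat)/n2)

ts_unpooled = thetahat / se_unpooled
ts_pooled = thetahat / se_pooled
# Reject H0 at level 0.10 when |TS| exceeds the cutoff
c(ts_unpooled, abs(ts_unpooled) &amp;gt; tcut, ts_pooled, abs(ts_pooled) &amp;gt; tcut)&lt;/pre&gt;

&lt;p&gt;
The two statistics share the same numerator and differ only through the standard error in the denominator.
&lt;/p&gt;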

&lt;/div&gt;
&lt;!-- EDIT1 SECTION &quot;(Archive) Math 448 Computing Homework (Fall 2015)&quot; [72-2490] --&gt;
&lt;h1 class=&quot;sectionedit2&quot; id=&quot;computing_homework_5&quot;&gt;Computing Homework 5&lt;/h1&gt;
&lt;div class=&quot;level1&quot;&gt;

&lt;p&gt;
&lt;em class=&quot;u&quot;&gt;Due: 8 pm, October 18.&lt;/em&gt;
&lt;/p&gt;

&lt;p&gt;
Each of you receives 20 numbers. They are known to come from an exponential distribution with mean $\theta$. Answer the following questions.
&lt;/p&gt;
&lt;ol&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Report the Method of Moments estimate of $\theta$ by matching the population &lt;em class=&quot;u&quot;&gt;first&lt;/em&gt; moment and sample &lt;em class=&quot;u&quot;&gt;first&lt;/em&gt; moment.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Report the Method of Moments estimate of $\theta$ by matching the population &lt;em class=&quot;u&quot;&gt;second&lt;/em&gt; moment and sample &lt;em class=&quot;u&quot;&gt;second&lt;/em&gt; moment.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Report the Minimum Variance Unbiased Estimate of $\theta$.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Now let&amp;#039;s consider a new parameter $\psi=\theta^2$. Report the Method of Moments estimate of $\psi$ by matching the population &lt;em class=&quot;u&quot;&gt;first&lt;/em&gt; moment and sample &lt;em class=&quot;u&quot;&gt;first&lt;/em&gt; moment.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Report the Method of Moments estimate of $\psi$ by matching the population &lt;em class=&quot;u&quot;&gt;second&lt;/em&gt; moment and sample &lt;em class=&quot;u&quot;&gt;second&lt;/em&gt; moment.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Report the Minimum Variance Unbiased Estimate of $\psi$.&lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
In Questions 3 and 6, you need to derive the sufficient statistic for the parameter ($\theta$ or $\psi$), calculate its bias, and, if necessary, correct the bias by applying a transformation to the sufficient statistic.
&lt;/p&gt;
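&lt;p&gt;
As a rough sketch of how the six estimates could be computed, the R code below uses simulated exponential data (not the assigned data) and made-up variable names. It relies on the standard facts that an exponential distribution with mean $\theta$ has $E[X]=\theta$ and $E[X^2]=2\theta^2$, and that $\sum X_i$ follows a Gamma distribution with shape $n$ and scale $\theta$.
&lt;/p&gt;
&lt;pre class=&quot;code&quot;&gt;# Hypothetical data: 20 simulated exponential draws with mean 3 (NOT the assigned data)
set.seed(448)
x = rexp(20, rate = 1/3)
n = length(x)

# Estimates of theta
theta_mom1 = mean(x)               # first-moment match: E[X] = theta
theta_mom2 = sqrt(mean(x^2) / 2)   # second-moment match: E[X^2] = 2*theta^2
theta_mvue = mean(x)               # unbiased function of the sufficient statistic sum(x)

# Estimates of psi = theta^2
psi_mom1 = mean(x)^2               # first-moment match: E[X] = sqrt(psi)
psi_mom2 = mean(x^2) / 2           # second-moment match: E[X^2] = 2*psi
# E[sum(x)^2] = n*(n+1)*theta^2, so dividing by n*(n+1) removes the bias
psi_mvue = sum(x)^2 / (n * (n + 1))

c(theta_mom1, theta_mom2, theta_mvue, psi_mom1, psi_mom2, psi_mvue)&lt;/pre&gt;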
&lt;!-- EDIT3 PLUGIN_WRAP_START [0-] --&gt;&lt;div class=&quot;wrap_center wrap_round wrap_important plugin_wrap&quot; style=&quot;width:60%;&quot;&gt;
&lt;p&gt;
Note: in practice, we discourage people from using the method in Questions 2 and 5 in favor of the one in Questions 1 and 4. This exercise merely illustrates why.
&lt;/p&gt;
&lt;/div&gt;&lt;!-- EDIT4 PLUGIN_WRAP_END [0-] --&gt;
&lt;/div&gt;
&lt;!-- EDIT2 SECTION &quot;Computing Homework 5&quot; [2491-3852] --&gt;
&lt;h1 class=&quot;sectionedit5&quot; id=&quot;computing_homework_4&quot;&gt;Computing Homework 4&lt;/h1&gt;
&lt;div class=&quot;level1&quot;&gt;

&lt;p&gt;
&lt;em class=&quot;u&quot;&gt;Due: 8 pm, September 27.&lt;/em&gt;
&lt;/p&gt;

&lt;p&gt;
Each of you receives 255 numbers, denoted as $X_1,\dots,X_{255}$, all of which follow a normal distribution with an &lt;strong&gt;unknown&lt;/strong&gt; mean $\mu$ and an &lt;strong&gt;unknown&lt;/strong&gt; variance. &lt;em class=&quot;u&quot;&gt;&lt;strong&gt;Please read the following questions carefully. Note that not all numbers will be used!!!&lt;/strong&gt;&lt;/em&gt;
&lt;/p&gt;

&lt;p&gt;
The goals include finding a point estimator and a confidence interval for $\mu$ with good accuracy.
&lt;/p&gt;
&lt;ol&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Choose the first 10 observations, $X_1,\dots,X_{10}$, as your sample. Report an estimate of the unknown population variance. In R, the following command extracts the first 10 elements of vector &lt;code&gt;x&lt;/code&gt; and saves them as a new vector &lt;code&gt;y&lt;/code&gt;.&lt;pre class=&quot;code&quot;&gt;y=x[1:10]&lt;/pre&gt;

&lt;p&gt;
Recall Computing Homework 2 on how to calculate the sample variance. Parts (2)-(4) in that homework gave you the numerator of the sample variance formula.
&lt;/p&gt;
&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level2&quot;&gt;&lt;div class=&quot;li&quot;&gt; &lt;span class=&quot;wrap_hi &quot;&gt;Pretend that 10 is a large number&lt;/span&gt; (although it is not that large, let&amp;#039;s pretend it that way for now.) Construct a 90% confidence interval using only the first 10 observations (the sample mean is based on the 10 observations, and the sample standard deviation is also based on the 10 observations.) &lt;span class=&quot;wrap_hi &quot;&gt;Report the two-sided confidence interval: lower bound in (2a) and upper bound in (2b).&lt;/span&gt; &lt;span class=&quot;wrap_hi &quot;&gt;For this question, ignore the materials in Section 8.8. That is, please use the formula in Section 8.6.&lt;/span&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level2&quot;&gt;&lt;div class=&quot;li&quot;&gt; The confidence interval obtained above does not provide a lot of information about $\mu$ since it is too wide. In addition, the sample mean based on only 10 observations is also unlikely to be accurate. &lt;span class=&quot;wrap_hi &quot;&gt;&lt;strong&gt;In this problem, we want to find an estimator (sample mean) whose error of estimation is no greater than 0.5 with a probability of 99%.&lt;/strong&gt;&lt;/span&gt; One way to achieve this goal is to increase your sample size. Your answer in Question 1 above (based on 10 data points) gives you an estimate of the true population variance of the random variables $X_i$. Now calculate the minimum sample size (i.e., the number of observations) needed to achieve the desired accuracy (that is, the error of estimation must be no greater than 0.5 with probability 0.99). Round your answer up to an integer. Denote the required sample size as $n$. &lt;span class=&quot;wrap_hi &quot;&gt;Report the total cost of collecting these $n$ observations&lt;/span&gt; (remember: each observation costs $\$$12.)&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level2&quot;&gt;&lt;div class=&quot;li&quot;&gt; Use the first $n$ data points in the data set that you have received, and use these $n$ observations to calculate an unbiased point estimate for $\mu$. Recall the command &lt;code&gt;x[1:n]&lt;/code&gt;.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level2&quot;&gt;&lt;div class=&quot;li&quot;&gt; Again using the first $n$ data points, provide a 90% confidence interval for $\mu$. Note that with $n&amp;gt;10$ observations in hand (for which you have paid $\$$12 each), you can get a more accurate estimate of the population variance. Remember to use the $n$ data points to calculate a new sample mean and a new standard error. &lt;span class=&quot;wrap_hi &quot;&gt;Report the 90% two-sided confidence interval: the lower bound in (5a) and the upper bound in (5b).&lt;/span&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt;
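&lt;p&gt;
One way to see the sample-size requirement in Question 3, sketched under the large-sample normal approximation with the pilot variance from Question 1 standing in for the unknown population variance. The symbols $s^2$ (pilot sample variance), $B=0.5$ (error bound), and $z_{0.005}$ (upper 0.005 point of the standard normal, i.e. &lt;code&gt;qnorm(0.995)&lt;/code&gt;) are notation introduced here for illustration: $$z_{0.005}\,\frac{s}{\sqrt{n}}\le B \quad\Longleftrightarrow\quad n\ge\left(\frac{z_{0.005}\,s}{B}\right)^2.$$ The minimum sample size is the right-hand side rounded up to the next integer.
&lt;/p&gt;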
&lt;!-- EDIT6 PLUGIN_WRAP_START [0-] --&gt;&lt;div class=&quot;wrap_center wrap_round wrap_info plugin_wrap&quot; style=&quot;width:70%;&quot;&gt;
&lt;p&gt;
To find the percentile point of a standard normal distribution, do not use the SOA normal table or Table 4 in the textbook. Instead, use the commands &lt;code&gt;z=qnorm(0.995)&lt;/code&gt;, &lt;code&gt;z=qnorm(0.975)&lt;/code&gt;, &lt;code&gt;z=qnorm(0.95)&lt;/code&gt;, etc., to find the percentile point. This request is to help us grade your answer numerically. For example, &lt;code&gt;qnorm(0.975)&lt;/code&gt; gives 1.959964, which corresponds to 1.96 in the normal tables. If you use 1.96 instead of 1.959964, your answer may be mistakenly graded as incorrect. Type &lt;code&gt;?qnorm&lt;/code&gt; in R for more information on the function &lt;code&gt;qnorm()&lt;/code&gt;.
&lt;/p&gt;
&lt;/div&gt;&lt;!-- EDIT7 PLUGIN_WRAP_END [0-] --&gt;
&lt;/div&gt;
&lt;!-- EDIT5 SECTION &quot;Computing Homework 4&quot; [3853-7441] --&gt;
&lt;h1 class=&quot;sectionedit8&quot; id=&quot;computing_homework_3&quot;&gt;Computing Homework 3&lt;/h1&gt;
&lt;div class=&quot;level1&quot;&gt;

&lt;p&gt;
Each of you receives 200 numbers, denoted as $X_1,\dots,X_{n}$ where $n=200$. It is known that $X_i\sim Unif(0,\theta)$ for $\theta$ unknown.
&lt;/p&gt;
&lt;ol&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; report the maximum of the 200 numbers.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; slightly adjust the maximum of the 200 numbers so that it becomes an unbiased estimator for $\theta$, then report the realized value based on the data.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; report the mean of the 200 numbers.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; slightly adjust the mean of the 200 numbers so that it becomes an unbiased estimator for $\theta$, then report the realized value based on the data.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; use the pivotal method to find a 95% confidence interval for $\theta$. The pivotal quantity is based on the maximum of the 200 numbers. See Ex. 8.43. Then report the confidence lower and upper limits as the answers to (5a) and (5b) in the Google Form.&lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt;
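&lt;p&gt;
The five steps above can be sketched in R as follows, using simulated uniform data (not the assigned data) and made-up variable names. The sketch assumes an equal-tailed interval, which is one common choice for the pivot $X_{(n)}/\theta$, whose distribution function is $u^n$ on $(0,1)$.
&lt;/p&gt;
&lt;pre class=&quot;code&quot;&gt;# Hypothetical data: 200 simulated Unif(0, 5) draws (NOT the assigned data)
set.seed(448)
x = runif(200, min = 0, max = 5)
n = length(x)
m = max(x)

# E[max] = n/(n+1) * theta, so rescaling the maximum removes the bias
theta_from_max = m * (n + 1) / n
# E[mean] = theta/2, so doubling the sample mean removes the bias
theta_from_mean = 2 * mean(x)

# P(0.025^(1/n) &amp;lt;= max/theta &amp;lt;= 0.975^(1/n)) = 0.95; solve for theta
lower = m / 0.975^(1/n)
upper = m / 0.025^(1/n)
c(m, theta_from_max, mean(x), theta_from_mean, lower, upper)&lt;/pre&gt;

&lt;p&gt;
Because $\theta$ can never be smaller than the observed maximum, the lower limit always lies just above &lt;code&gt;max(x)&lt;/code&gt;.
&lt;/p&gt;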

&lt;/div&gt;
&lt;!-- EDIT8 SECTION &quot;Computing Homework 3&quot; [7442-8268] --&gt;
&lt;h1 class=&quot;sectionedit9&quot; id=&quot;computing_homework_2&quot;&gt;Computing Homework 2&lt;/h1&gt;
&lt;div class=&quot;level1&quot;&gt;

&lt;p&gt;
Each of you receives 1000 numbers, denoted as $X_1,\dots,X_{n}$ where $n=1000$.
&lt;/p&gt;
&lt;ol&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; calculate the sum of the squares: $\sum_{i=1}^{n}X_i^2$&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; calculate $\sum_{i=1}^{n}(X_i-\bar X)^2$, where $\bar X$ is the sample mean.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; calculate $\sum_{i=1}^{n}X_i^2-n(\bar X)^2$&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; use R command &lt;code&gt;var(x)&lt;/code&gt; to calculate the following: &lt;code&gt;var(x)*(n-1)&lt;/code&gt;, where &lt;code&gt;x&lt;/code&gt; is the vector of your sample (the 1000 numbers in the data file).&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Suppose $X_1,\dots,X_{n}$ follow some distribution with mean $\mu$. &lt;/div&gt;
&lt;ol&gt;
&lt;li class=&quot;level2&quot;&gt;&lt;div class=&quot;li&quot;&gt; Based on your sample, give an unbiased estimate of $\mu$&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level2&quot;&gt;&lt;div class=&quot;li&quot;&gt; &lt;strong&gt;(Bonus points)&lt;/strong&gt; Based on your sample, provide the standard error of the estimator.&lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
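&lt;p&gt;
As a quick check of how Questions 2 to 4 relate, the following sketch uses simulated data (not the assigned data) and made-up variable names. It illustrates the identity $\sum(X_i-\bar X)^2=\sum X_i^2-n\bar X^2$ and the fact that &lt;code&gt;var(x)&lt;/code&gt; divides this quantity by $n-1$.
&lt;/p&gt;
&lt;pre class=&quot;code&quot;&gt;# Hypothetical data: 1000 simulated normal draws (NOT the assigned data)
set.seed(448)
x = rnorm(1000, mean = 10, sd = 2)
n = length(x)
xbar = mean(x)

ss_raw = sum(x^2)                     # Question 1: sum of squares
ss_centered = sum((x - xbar)^2)       # Question 2: centered sum of squares
ss_shortcut = sum(x^2) - n * xbar^2   # Question 3: shortcut formula
ss_from_var = var(x) * (n - 1)        # Question 4: sample variance times (n-1)

# Questions 2 to 4 agree up to floating-point rounding
all.equal(ss_centered, ss_shortcut)
all.equal(ss_centered, ss_from_var)

# Question 5: the sample mean and its estimated standard error
se_xbar = sd(x) / sqrt(n)
c(xbar, se_xbar)&lt;/pre&gt;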
&lt;!-- EDIT10 PLUGIN_WRAP_START [0-] --&gt;&lt;div class=&quot;wrap_center wrap_round wrap_info plugin_wrap&quot; style=&quot;width:60%;&quot;&gt;
&lt;p&gt;
Here is some R code that might help you.
&lt;/p&gt;
&lt;/div&gt;&lt;!-- EDIT11 PLUGIN_WRAP_END [0-] --&gt;
&lt;p&gt;
Please install R before the beginning of the semester. In addition to R, some may find RStudio to be handy.
Downloads:
&lt;/p&gt;
&lt;ul&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; &lt;a href=&quot;http://cran.cnr.berkeley.edu/&quot; class=&quot;urlextern&quot; title=&quot;http://cran.cnr.berkeley.edu/&quot;&gt;R&lt;/a&gt; - mirror hosted at UC Berkeley. For Windows machines, use the “base” binaries  for the time being.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; &lt;a href=&quot;http://www.rstudio.com/products/rstudio/download/&quot; class=&quot;urlextern&quot; title=&quot;http://www.rstudio.com/products/rstudio/download/&quot;&gt;RStudio&lt;/a&gt; - a more user-friendly platform for R.&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class=&quot;code&quot;&gt;####Set the R working directory###
 setwd(&amp;quot;C:/&amp;quot;) 
###read in the data file, make sure your data file is under the working directory## 
dat = read.csv(&amp;#039;data_2.csv&amp;#039;,header=FALSE) 
### the &amp;quot;dat&amp;quot; you just read into R is a data frame, need to convert it into a matrix
dat &amp;lt;- as.matrix(dat) ##dat now is a 1x1000 matrix (1 row and 1000 columns)
x &amp;lt;- dat[1,]  ##take the first row of this matrix as your sample&lt;/pre&gt;

&lt;p&gt;
At this point, you have loaded the data into R as a vector named “x”. You can start manipulating the data.
&lt;/p&gt;
&lt;pre class=&quot;code&quot;&gt;###Assign a value to a variable &amp;quot;n&amp;quot;
n &amp;lt;- 1000
###Find the length of a vector &amp;quot;x&amp;quot;
n &amp;lt;- length(x)
#### find the average of all elements in vector &amp;quot;x&amp;quot; and assign this value to &amp;quot;xbar&amp;quot; ######
xbar &amp;lt;- mean(x)
#### find the summation of all data points in vector &amp;quot;x&amp;quot; and assign this value to &amp;quot;Sx&amp;quot;
Sx &amp;lt;- sum(x)
#### define a new vector &amp;quot;y&amp;quot; whose elements are square of all elements in &amp;quot;x&amp;quot;
y &amp;lt;- x^2
#### subtract a number &amp;quot;z&amp;quot; from a vector x and define this new vector as &amp;quot;xsz&amp;quot;
z &amp;lt;- 10
xsz &amp;lt;- x-z
#####Compute the sample variance of data points in vector &amp;quot;x&amp;quot;
var(x)
#####multiply &amp;quot;*&amp;quot;
3*2&lt;/pre&gt;

&lt;p&gt;
&lt;strong&gt;For more detailed explanations, read
&lt;/strong&gt;&lt;a href=&quot;https://cran.r-project.org/doc/manuals/R-intro.pdf&quot; class=&quot;urlextern&quot; title=&quot;https://cran.r-project.org/doc/manuals/R-intro.pdf&quot;&gt;An Introduction to R&lt;/a&gt;  &lt;strong&gt;(Sections 2.1, 2.2, and 2.3 are most relevant to this homework).&lt;/strong&gt;
&lt;/p&gt;

&lt;/div&gt;
&lt;!-- EDIT9 SECTION &quot;Computing Homework 2&quot; [8269-] --&gt;</summary>
    </entry>
    <entry>
        <title>Computing assignments (Spring 2016)</title>
        <link rel="alternate" type="text/html" href="https://www2.math.binghamton.edu/p/people/qiao/teach/448/448_cp_sp2016"/>
        <published>2016-02-20T15:14:06-04:00</published>
        <updated>2016-02-20T15:14:06-04:00</updated>
        <id>https://www2.math.binghamton.edu/p/people/qiao/teach/448/448_cp_sp2016</id>
        <summary>
&lt;h2 class=&quot;sectionedit1&quot; id=&quot;computing_assignments_spring_2016&quot;&gt;Computing assignments (Spring 2016)&lt;/h2&gt;
&lt;div class=&quot;level2&quot;&gt;

&lt;/div&gt;
&lt;!-- EDIT1 SECTION &quot;Computing assignments (Spring 2016)&quot; [1-48] --&gt;
&lt;h3 class=&quot;sectionedit2&quot; id=&quot;homework_3&quot;&gt;Homework 3&lt;/h3&gt;
&lt;div class=&quot;level3&quot;&gt;

&lt;p&gt;
&lt;em class=&quot;u&quot;&gt;Due: 8 pm, Feb. 2&lt;/em&gt;
&lt;/p&gt;

&lt;p&gt;
Each of you receives 500 numbers, denoted as $X_1,\dots,X_{500}$, all of which follow a normal distribution with an &lt;strong&gt;unknown&lt;/strong&gt; mean $\mu$ and an &lt;strong&gt;unknown&lt;/strong&gt; variance. &lt;em class=&quot;u&quot;&gt;&lt;strong&gt;Please read the following questions carefully. Note that not all numbers will be used!!!&lt;/strong&gt;&lt;/em&gt;
&lt;/p&gt;

&lt;p&gt;
The goals include finding a point estimator and a confidence interval for $\mu$ with good accuracy.
&lt;/p&gt;
&lt;ol&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; (1pt) Choose the first 10 observations, $X_1,\dots,X_{10}$, as your sample. Treat this as a pilot sample (an experimental and preliminary sample). Report an estimate of the unknown population variance. In R, the following command extracts the first 10 elements of vector &lt;code&gt;x&lt;/code&gt; and saves them as a new vector &lt;code&gt;y&lt;/code&gt;.&lt;pre class=&quot;code&quot;&gt;y=x[1:10]&lt;/pre&gt;

&lt;p&gt;
Recall that &lt;code&gt;sum( (y-mean(y))^2 )/(length(y)-1)&lt;/code&gt; gives you the sample variance. So does &lt;code&gt;var(y)&lt;/code&gt;. Run both commands to check that they match.
&lt;/p&gt;
&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level2&quot;&gt;&lt;div class=&quot;li&quot;&gt; (2pts) &lt;span class=&quot;wrap_hi &quot;&gt;Pretend that 10 is a large number&lt;/span&gt; (although it is not that large, let&amp;#039;s pretend it that way for now.) Construct a 90% confidence interval using only the first 10 observations (the sample mean is based on the 10 observations, and the sample standard deviation is also based on the 10 observations.) &lt;span class=&quot;wrap_hi &quot;&gt;Report the two-sided confidence interval: lower bound in (2a) and upper bound in (2b).&lt;/span&gt; &lt;span class=&quot;wrap_hi &quot;&gt;For this question, ignore the materials in Section 8.8. That is, please use the formula in Section 8.6.&lt;/span&gt;&lt;!-- EDIT3 PLUGIN_WRAP_START [0-] --&gt;&lt;div class=&quot;wrap_center wrap_round wrap_info plugin_wrap&quot; style=&quot;width:70%;&quot;&gt;
&lt;p&gt;
To find the percentile point of a standard normal distribution, do not use the SOA normal table or Table 4 in the textbook. Instead, use the commands &lt;code&gt;z=qnorm(0.995)&lt;/code&gt;, &lt;code&gt;z=qnorm(0.975)&lt;/code&gt;, &lt;code&gt;z=qnorm(0.95)&lt;/code&gt;, etc., to find the percentile point. This request is to help us grade your answer numerically. For example, &lt;code&gt;qnorm(0.975)&lt;/code&gt; gives 1.959964, which corresponds to 1.96 in the normal tables. If you use 1.96 instead of 1.959964, your answer may be mistakenly graded as incorrect. Type &lt;code&gt;?qnorm&lt;/code&gt; in R for more information on the function &lt;code&gt;qnorm()&lt;/code&gt;.
&lt;/p&gt;
&lt;/div&gt;&lt;!-- EDIT4 PLUGIN_WRAP_END [0-] --&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level2&quot;&gt;&lt;div class=&quot;li&quot;&gt; (2pts) Redo the last question except that you use the formula in Section 8.8, that is, provide a 90% confidence interval for a small sample.&lt;!-- EDIT5 PLUGIN_WRAP_START [0-] --&gt;&lt;div class=&quot;wrap_center wrap_round wrap_info plugin_wrap&quot; style=&quot;width:70%;&quot;&gt;
&lt;p&gt;
To find the percentile point of a t distribution, do not use Table 5 in the textbook. Instead, use the commands &lt;code&gt;t=qt(0.995,9)&lt;/code&gt;, &lt;code&gt;t=qt(0.975,9)&lt;/code&gt;, &lt;code&gt;t=qt(0.95,9)&lt;/code&gt;, etc., to find the percentile point. The first argument is the left-tail (not right-tail) probability and the second argument is the degrees of freedom (which is 9 here). For example, &lt;code&gt;qt(0.995,9)&lt;/code&gt; gives 3.249836, which corresponds to 3.250 in Table 5, row 9, last column. Compare &lt;code&gt;qt(0.995,9)&lt;/code&gt;, &lt;code&gt;qt(0.99,9)&lt;/code&gt;, &lt;code&gt;qt(0.975,9)&lt;/code&gt;, or &lt;code&gt;qt(0.95,9)&lt;/code&gt; with row 9 of Table 5. Type &lt;code&gt;?qt&lt;/code&gt; in R for more information on the function &lt;code&gt;qt()&lt;/code&gt;.
&lt;/p&gt;
&lt;/div&gt;&lt;!-- EDIT6 PLUGIN_WRAP_END [0-] --&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level2&quot;&gt;&lt;div class=&quot;li&quot;&gt; (2pts) The confidence interval obtained above does not provide a lot of information about $\mu$ since it is too wide. In addition, the sample mean based on only 10 observations is also unlikely to be accurate. &lt;span class=&quot;wrap_hi &quot;&gt;&lt;strong&gt;In this problem, we want to find an estimator (sample mean) whose error of estimation is no greater than 0.5 with a probability of 99%.&lt;/strong&gt;&lt;/span&gt; One way to achieve this goal is to increase your sample size. Your answer in Question 1 above (based on 10 data points) gives you an estimate of the true population variance of the random variables $X_i$. Now calculate the minimum sample size (i.e., the number of observations) needed to achieve the desired accuracy (that is, the error of estimation must be no greater than 0.5 with probability 0.99). Round your answer up to an integer. Denote the required sample size as $n$. &lt;span class=&quot;wrap_hi &quot;&gt;Report the total cost of collecting these $n$ observations&lt;/span&gt; (remember: each observation costs 12 dollars.)&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level2&quot;&gt;&lt;div class=&quot;li&quot;&gt; (1pt) Use the first $n$ data points in the data set that you have received, and use these $n$ observations to calculate an unbiased point estimate for $\mu$. Recall the command &lt;code&gt;x[1:n]&lt;/code&gt;. Here $n$ is the minimum sample size that you obtained in the last question.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level2&quot;&gt;&lt;div class=&quot;li&quot;&gt; (2pts) Again using the first $n$ data points, provide a 90% confidence interval for $\mu$. Note that with $n&amp;gt;10$ observations in hand (for which you have paid $\$$12 each), you can get a more accurate estimate of the population variance. Remember to use the $n$ data points to calculate a new sample mean and a new standard error. &lt;span class=&quot;wrap_hi &quot;&gt;Report the 90% two-sided confidence interval: the lower bound in (6a) and the upper bound in (6b).&lt;/span&gt; &lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;/div&gt;

&lt;h4 id=&quot;answer_key&quot;&gt;Answer key&lt;/h4&gt;
&lt;div class=&quot;level4&quot;&gt;
&lt;pre class=&quot;code&quot;&gt;setwd(&amp;quot;C:/448wd&amp;quot;)
dat = read.csv(&amp;#039;data_3.txt&amp;#039;,header=FALSE) 
dat &amp;lt;- as.matrix(dat)
x &amp;lt;- dat[1,]

pilot = x[1:10]

ans1 = var(pilot)

ans2a = mean(pilot) - qnorm(0.95) * sqrt(ans1/10)
ans2b = mean(pilot) + qnorm(0.95) * sqrt(ans1/10)

ans3a = mean(pilot) - qt(0.95,9) * sqrt(ans1/10)
ans3b = mean(pilot) + qt(0.95,9) * sqrt(ans1/10)

nsize = ceiling( ( qnorm(0.995)/0.5*sqrt(ans1) )^2 )
# note: ceiling takes a single numeric argument x and returns a numeric vector 
#   containing the smallest integers not less than the corresponding elements of x.
ans4 &amp;lt;- nsize*12

newdata = x[1:nsize]
ans5 = mean(newdata)
ans6a = mean(newdata) - qnorm(0.95) * sd(newdata)/sqrt(nsize)
ans6b = mean(newdata) + qnorm(0.95) * sd(newdata)/sqrt(nsize)

print( c(ans1,ans2a,ans2b,ans3a,ans3b,ans4,ans5,ans6a,ans6b) )&lt;/pre&gt;

&lt;/div&gt;
&lt;!-- EDIT2 SECTION &quot;Homework 3&quot; [49-5402] --&gt;
&lt;h3 class=&quot;sectionedit7&quot; id=&quot;homework_2&quot;&gt;Homework 2&lt;/h3&gt;
&lt;div class=&quot;level3&quot;&gt;

&lt;p&gt;
&lt;strong&gt;Round to at least 3 decimal places unless otherwise stated.&lt;/strong&gt;
&lt;/p&gt;

&lt;p&gt;
Each of you receives 225 numbers, denoted as $X_1,\dots,X_{n}$, where $n=225$. It is known that $X_i\sim Unif(0,\theta)$ independently with $\theta$ unknown.
&lt;/p&gt;
&lt;ol&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; (1pt) report the sample mean of these 225 numbers.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; (1pt) correct the sample mean above so that it becomes unbiased for the purpose of estimating $\theta$.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; (2pts) derive the standard error of the unbiased estimator above (not the sample mean in Question 1, but the corrected one in Question 2!!!) as a function of $\theta$, and then report the “2-standard-error bound” on the error of estimation by replacing the unknown $\theta$ in the standard error by the unbiased estimate obtained in Question 2. Round to 5 decimal places for this question.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; (1pt) An alternative way to approximate the standard error is to estimate the (population) standard deviation of $X_1$ directly by the sample standard deviation based on the data; the standard error can then be approximated by the sample standard deviation divided by $\sqrt{n}$. Please report the “2-standard-error bound” on the error of estimation obtained in this way. Hint: You may use &lt;code&gt;sd(x)&lt;/code&gt; to find the sample standard deviation, but you may want to use &lt;code&gt;sqrt(sum( (x-mean(x) )^2)/(length(x)-1))&lt;/code&gt; to familiarize yourself with the calculation (they should give you the same answer). Moreover, the “2-standard-error bound” you find here should be reasonably close to that in the last question. Round to 5 decimal places for this question.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; (1pt) report the maximum of the 225 numbers.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; (2pts) correct the maximum of the 225 numbers so that it becomes an unbiased estimator for $\theta$, then report the observed value of this unbiased estimator based on the given data.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; (1pt + 1pt) use the pivotal method to find a 95% confidence interval for $\theta$. The pivotal quantity is transformed from the maximum of the 225 numbers. See Ex. 8.43. Then report the confidence lower and upper limits as the answers to (7a) and (7b) in the Google Form. Round to 5 decimal places for this question.&lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;/div&gt;

&lt;h4 id=&quot;answer_key1&quot;&gt;Answer key&lt;/h4&gt;
&lt;div class=&quot;level4&quot;&gt;
&lt;pre class=&quot;code&quot;&gt;setwd(&amp;quot;C:/448wd&amp;quot;)
dat = read.csv(&amp;#039;data_2.txt&amp;#039;,header=FALSE) 
dat &amp;lt;- as.matrix(dat)
x &amp;lt;- dat[1,]

ans1 = mean(x)
ans2 = 2*mean(x)
ans3 = 2*2*mean(x)/sqrt(12)/sqrt(length(x))
ans4 = 2*sd(x)/sqrt(length(x))
ans5 = max(x)
ans6 = max(x)*(length(x)+1)/length(x)
ans7a = max(x)/(0.975^(1/length(x)))
ans7b = max(x)/(0.025^(1/length(x)))

print( c(ans1,ans2,ans3,ans4,ans5,ans6,ans7a,ans7b) )&lt;/pre&gt;

&lt;/div&gt;
&lt;!-- EDIT7 SECTION &quot;Homework 2&quot; [5403-7951] --&gt;
&lt;h3 class=&quot;sectionedit8&quot; id=&quot;homework_1&quot;&gt;Homework 1&lt;/h3&gt;
&lt;div class=&quot;level3&quot;&gt;

&lt;p&gt;
Each of you is given a different data set of 64 observations drawn from an unknown distribution. Please submit your answers to &lt;a href=&quot;https://docs.google.com/a/binghamton.edu/forms/d/16jhF5eUpPY7pXXmzaAHAgiFZY7S8JypBYKcEfSUcBjE/viewform&quot; class=&quot;urlextern&quot; title=&quot;https://docs.google.com/a/binghamton.edu/forms/d/16jhF5eUpPY7pXXmzaAHAgiFZY7S8JypBYKcEfSUcBjE/viewform&quot;&gt;https://docs.google.com/a/binghamton.edu/forms/d/16jhF5eUpPY7pXXmzaAHAgiFZY7S8JypBYKcEfSUcBjE/viewform&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
Note that you need to log in to your Bmail account to submit the answers. Please do this by 7 pm on Feb. 2.
&lt;/p&gt;
&lt;ol&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Find an unbiased estimate of the population mean based on the data set you are given.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Find the maximum of the observations that you receive.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Find the minimum of the observations that you receive.&lt;/div&gt;
&lt;/li&gt;
&lt;li class=&quot;level1&quot;&gt;&lt;div class=&quot;li&quot;&gt; Suppose we are interested in the population proportion of those observations which are greater than 4. Find an unbiased estimate of this population proportion.&lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
The following R code may be able to help you get started. Copy each line to the console of R and press “enter”.
&lt;/p&gt;
&lt;pre class=&quot;code&quot;&gt;##### Assume that you have a Windows machine. First create a folder called &amp;quot;448wd&amp;quot; under C drive. 
##### I trust that you can do this on your own. If not, search for a solution on Google or YouTube.
##### Set the R working directory
setwd(&amp;quot;C:/448wd&amp;quot;)

### Read the data file. Make sure that your data file has been copied to the folder.
dat = read.csv(&amp;#039;data_1.txt&amp;#039;,header=FALSE) 

### The variable named &amp;quot;dat&amp;quot; that you just read into R is a data frame.
### We need to convert it to a matrix

dat &amp;lt;- as.matrix(dat) ##dat now is a 1x64 matrix (1 row and 64 columns)

x &amp;lt;- dat[1,]  ## Take the first row of this matrix as your sample

# Try the following.
mean(x)  # sample mean
median(x) # sample median
max(x)  # maximum
min(x)  # minimum
y = (x &amp;gt; 2)
y   # We can see that y is a logical vector of TRUE and FALSE.
### We can operate directly on a logical vector with the convention that TRUE = 1 and FALSE = 0. For example
mean(y)
sum(y)

## Ok. You are ready to answer the questions.&lt;/pre&gt;

&lt;/div&gt;

&lt;h4 id=&quot;answer_key2&quot;&gt;Answer key&lt;/h4&gt;
&lt;div class=&quot;level4&quot;&gt;
&lt;pre class=&quot;code&quot;&gt;setwd(&amp;quot;C:/448wd&amp;quot;)
dat = read.csv(&amp;#039;data_1.txt&amp;#039;,header=FALSE) 
dat &amp;lt;- as.matrix(dat)
x &amp;lt;- dat[1,]
ans1 &amp;lt;- mean(x)
ans2 &amp;lt;- max(x)
ans3 &amp;lt;- min(x)
ans4 &amp;lt;- mean( x &amp;gt; 4 )
print( c(ans1,ans2,ans3,ans4) )&lt;/pre&gt;

&lt;/div&gt;
&lt;!-- EDIT8 SECTION &quot;Homework 1&quot; [7952-] --&gt;</summary>
    </entry>
</feed>
