
Date: 2023-08-13 10:36

Assignment 1 MAST90125: Bayesian Statistical Learning

There are places in this assignment where R code will be required. Therefore, set the random seed so that the assignment is reproducible.

set.seed(123456) #Please change random seed to your student id number.

Please save this R markdown document and write your answers in it. Between your answers to each question, ensure there is sufficient space for the marker's comments by using the command

\newpage

Question One (5 marks)

In some cases, the data-generating model, e.g., g(θ), is a black box and the likelihood function cannot be obtained. Assume that there is only one parameter θ in the generative model, and that we have two observations, y1 = 33 and y2 = 54, that are i.i.d. given the generative model.

a) In such cases, if we want to analyze the posterior of θ, how could we obtain it? Please write down the procedure step by step. Hint: Approximate Bayesian Computation.

b) Could we estimate the posteriors Pr(θ|y1) and Pr(θ|y2) separately and then obtain the joint posterior by Pr(θ|y1, y2) = Pr(θ|y1) Pr(θ|y2)? Please justify your answer using the definition of conditional probability.

c) Based on your result in b), please answer the question: if we want to obtain the posterior distribution of the parameters of interest in complex situations (many parameters and many observations), is the Approximate Bayesian Computation method suitable given limited computing resources? Briefly justify your answer.
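For part a), the general shape of a rejection-based Approximate Bayesian Computation sampler can be sketched as follows. This is only an illustration under stated assumptions: the simulator `g`, the Uniform(0, 100) prior, the distance on sorted values and the tolerance `eps` are all hypothetical stand-ins, since the assignment leaves the generative model unspecified.

```r
set.seed(123456)

# Hypothetical black-box simulator: stands in for the unknown g(theta).
g <- function(theta, n) rpois(n, lambda = theta)

y_obs <- c(33, 54)      # the two observations from the question
n_draws <- 50000        # number of prior draws (assumption)
eps <- 5                # acceptance tolerance (assumption)

# Step 1: draw candidate parameters from an assumed prior.
theta_prior <- runif(n_draws, min = 0, max = 100)

# Step 2: simulate a data set of the same size for each candidate.
# Step 3: measure the distance between simulated and observed data
#         (sorted, because i.i.d. observations have no meaningful order).
dist_to_obs <- vapply(theta_prior, function(theta) {
  y_sim <- g(theta, length(y_obs))
  sqrt(sum((sort(y_sim) - sort(y_obs))^2))
}, numeric(1))

# Step 4: keep the candidates whose simulated data fall within eps.
theta_post <- theta_prior[dist_to_obs <= eps]
```

The accepted draws in `theta_post` approximate the posterior, and `mean(dist_to_obs <= eps)` gives the acceptance rate, which indicates how much simulation effort is spent per retained draw.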

Question Two (5 marks)

Medical researchers wish to investigate the performance of a diagnostic test. Prior studies suggest the underlying probability of disease (event A) is a. To determine the effectiveness of the diagnostic test (event B = testing positive), a case-control study was undertaken. Both cases and controls were added to the study until d1 cases tested positive and d2 controls tested negative.

a) Identify an appropriate distribution for the likelihood of $n_{\bar{B}|A}$, the number of cases testing negative, and $n_{B|\bar{A}}$, the number of controls testing positive, including the parameter(s) of these probability mass functions.

b) Identify a suitable conjugate prior for the parameters determined in a). Hint: Each of the priors will depend on two hyper-parameters.

c) Determine the posterior distribution for the parameters identified in a).


Question Three (5 marks)

As part of an investigation into traffic flows, a study was proposed to count the number of vehicles passing through an intersection each minute between 5 pm and 6 pm for one week. The researchers have decided to assume the resulting counts are i.i.d. both within and between days.

a) Specifying an appropriate likelihood for the situation above, calculate Jeffreys' prior.

b) In this example, is Jeffreys' prior improper? Justify your answer.

c) In this example, does Jeffreys' prior satisfy the criterion

Posterior ∝ Likelihood?

Justify your answer.
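As a reminder of the general recipe behind part a), Jeffreys' prior is built from the Fisher information. The sketch below assumes a Poisson likelihood with rate $\lambda$ for the $n$ per-minute counts, which is one natural choice for count data but is not stated in the question.

```latex
\begin{align*}
p(y_1,\dots,y_n \mid \lambda) &= \prod_{i=1}^{n} \frac{\lambda^{y_i} e^{-\lambda}}{y_i!}, \\
\log p(y \mid \lambda) &= \Big(\sum_{i=1}^{n} y_i\Big)\log\lambda - n\lambda - \sum_{i=1}^{n}\log y_i!, \\
I(\lambda) &= -\mathrm{E}\!\left[\frac{\partial^2 \log p(y\mid\lambda)}{\partial\lambda^2}\right]
            = \frac{\mathrm{E}\big[\sum_{i} y_i\big]}{\lambda^2} = \frac{n}{\lambda}, \\
p_J(\lambda) &\propto \sqrt{I(\lambda)} \propto \lambda^{-1/2}.
\end{align*}
```

Under this assumed likelihood, parts b) and c) come down to examining the integral of $\lambda^{-1/2}$ over $(0, \infty)$ and the form of the resulting posterior.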

Question Four (11 marks)

To comply with food labelling regulations, a manufacturer of a certain kind of product must prove that 99 % of its products weighing more than 250 g are within 7 g of the stated weight.

Company A knows that the average weight of its products of this kind is equal to the stated weight. It also knows that the machinery is designed such that the respective weights of the products are normally distributed with mean equal to the stated weight and constant variance. To test whether the machinery is working to specification, the company randomly selected 100 products (each with a stated weight of more than 250 g) from the production lines and calculated the residual weight $(y_{i,\text{product } j} - \mu_{\text{product } j})$. They reported that the sum of squared residuals $\mathrm{SSR} = \sum_{i=1}^{n} (y_{i,\text{product } j} - \mu_{\text{product } j})^2$ was 572.78.

a) Identify a parameter θ on which to make inference, whose value will allow you to answer the question about the precision of manufacturing. By appropriate manipulation of the likelihood, demonstrate that SSR is the sufficient statistic.

b) By choosing an appropriate one-to-one transformation f(θ) of the parameter identified in a), write down a conjugate prior for this problem.

Hint: The prior will be defined by two parameters.

c) Determine the posterior distribution Pr(f(θ)|y1, . . . , yn). Substituting 1 for both prior parameters, determine the 95 % central credible interval for θ.

d) For the posterior distribution in c), determine the 95 % highest posterior density interval for θ, and compare it to the 95 % central credible interval.

e) Do you believe, based on posterior inference, that the machinery used by Company A satisfies the requirement that 99 % of products weighing more than 250 g are within 7 g of the stated weight?
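Parts c) to e) can be checked numerically. The sketch below is one possible route rather than the intended solution: it assumes θ is the variance and that the precision 1/θ is given a Gamma(shape, rate) prior, so that the posterior is Gamma(shape + n/2, rate + SSR/2). The HPD interval is found by searching for the shortest interval with 95 % coverage, which is valid here because the posterior of θ is unimodal. The names `q_theta` and `width` are hypothetical helpers.

```r
n <- 100
ssr <- 572.78
a0 <- 1; b0 <- 1                 # prior shape and rate, both set to 1 as in part c)

# Assumed conjugate update for the precision 1 / theta.
a_post <- a0 + n / 2
b_post <- b0 + ssr / 2

# Quantile function of theta = 1 / precision (quantiles flip under the reciprocal).
q_theta <- function(p) 1 / qgamma(1 - p, shape = a_post, rate = b_post)

# 95 % central credible interval for theta.
central <- q_theta(c(0.025, 0.975))

# 95 % HPD interval: the shortest interval of the form [q(p), q(p + 0.95)].
width <- function(p) q_theta(p + 0.95) - q_theta(p)
p_low <- optimize(width, interval = c(1e-6, 0.05 - 1e-6))$minimum
hpd <- q_theta(c(p_low, p_low + 0.95))

# For each posterior draw of theta, the probability that a product is within 7 g.
set.seed(123456)
theta_draws <- 1 / rgamma(10000, shape = a_post, rate = b_post)
p_within <- 2 * pnorm(7 / sqrt(theta_draws)) - 1
```

Here `mean(p_within >= 0.99)` estimates the posterior probability that the 99 % requirement in part e) is met, under the assumptions above.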

Question Five (9 marks)

There are N = 112 students enrolled in the Master of Science in the School of Mathematics and Statistics. At the end of the semester, n = 35 responses were received to an experience survey sent to Master's students. Among the questions asked was whether they felt adequate support was provided by the School. In y = 17 of the responses, the answer was yes.

a) If the responding students are regarded as an example of sampling with replacement, write down an appropriate single-parameter distribution for the likelihood Pr(y|θ).

b) Identify a prior distribution p(θ) that is conjugate with the likelihood chosen in a). Hint: The prior will depend on two hyper-parameters.

c) Determine the posterior distribution p(θ|y) based on your choices of likelihood and prior in a) and b).

d) Determine the posterior predictive distribution $p(\tilde{y} \mid y)$.
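For parts c) and d), a numerical sketch may help fix ideas. It assumes a binomial likelihood with a Beta(a, b) prior, which gives a Beta(a + y, b + n - y) posterior and a beta-binomial posterior predictive. The uniform choice a = b = 1 and the future sample size `m` are illustrative assumptions, not part of the question.

```r
n <- 35; y <- 17
a <- 1; b <- 1            # assumed uniform Beta(1, 1) prior
m <- 10                   # hypothetical number of future responses

# Conjugate update of the Beta prior.
a_post <- a + y
b_post <- b + n - y

# Posterior predictive pmf of y_new in 0..m (beta-binomial),
# computed on the log scale for numerical stability.
y_new <- 0:m
log_pred <- lchoose(m, y_new) +
  lbeta(a_post + y_new, b_post + m - y_new) -
  lbeta(a_post, b_post)
pred <- exp(log_pred)
```

The vector `pred` holds the predictive probability of each possible number of "yes" answers among `m` new responses and should sum to one.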

Some useful density functions

• Normal distribution

