Forensic Statistics: Probability & Hypothesis Testing
Part 1: Probability
(i)    In 1993 a team of scientists from Johns Hopkins University and the University of Helsinki
reported in Science [1993, vol. 260, p. 751] the discovery of a genetic marker for so-called familial cancer of
the colon. The scientists estimated that one person in 200 carries the defective gene, that 95% of people
with the gene will develop cancer, and that of those who get cancer, 60% will get cancer of the colon.
(a)    From these figures, what percentage of people will develop cancer of the colon through this
mechanism?
Some people are considered to be at increased risk of developing cancer of the colon because of a
strong family history of the disease. It is believed that 75% of these will find that they do not have the
genetic marker and that these people bear only the average risk of developing cancer of the colon,
which is 1 chance in 20.
(b)    What proportion of those with a strong family history will get cancer of the colon?
(c)    What proportion of those who get cancer of the colon carry the defective gene?
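The arithmetic behind parts (a) to (c) can be sketched as follows. This is a minimal sketch rather than a model answer. The variable names are illustrative, a marker carrier's risk of colon cancer is assumed to be 0.95 x 0.60, and part (c) is assumed to refer to people with a strong family history, which the question does not state explicitly.

```python
from fractions import Fraction

# Figures quoted in the question.
p_gene = Fraction(1, 200)                 # carries the defective gene
p_cancer_given_gene = Fraction(95, 100)   # develops cancer, given the gene
p_colon_given_cancer = Fraction(60, 100)  # cancer is of the colon, given cancer

# (a) Colon cancer through the genetic mechanism, whole population.
p_colon_given_gene = p_cancer_given_gene * p_colon_given_cancer
p_colon_via_gene = p_gene * p_colon_given_gene

# (b) Law of total probability within the strong-family-history group.
p_no_marker = Fraction(75, 100)           # family history but no marker
p_colon_average = Fraction(1, 20)         # average risk for non-carriers
p_colon_family = (p_no_marker * p_colon_average
                  + (1 - p_no_marker) * p_colon_given_gene)

# (c) Bayes' theorem, ASSUMING the question is about the same group.
p_gene_given_colon = (1 - p_no_marker) * p_colon_given_gene / p_colon_family

print(f"(a) {float(p_colon_via_gene):.3%}")
print(f"(b) {float(p_colon_family):.2%}")
print(f"(c) {float(p_gene_given_colon):.2%}")
```

Exact fractions are used so that rounding does not hide any step of the calculation.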
(ii)    You are suspicious about a coin but it is not in your hands; you cannot look at it. You think it
may be two-headed or it may be a fair coin with a head and a tail. Suppose there is an equal chance of
either of these and there is no other possibility.
(a)    Calculate the prior odds for the coin being two-headed.
(b)    You watch the coin being tossed 10 times and 10 heads come up. Calculate the likelihood ratio
for this evidence and hence the posterior odds.
(c)    Use Tables 11.3 and 11.4 in Lucy to give a verbal interpretation of the result.
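The odds form of Bayes' theorem used in this question (posterior odds = likelihood ratio x odds before the evidence) can be sketched as below. The variable names are illustrative, and the sketch assumes the tosses are independent. The verbal scale from Lucy's tables is not reproduced here.

```python
from fractions import Fraction

# Equal chance of a two-headed coin and a fair coin, as stated.
p_two_headed = Fraction(1, 2)
prior_odds = p_two_headed / (1 - p_two_headed)

# Evidence: 10 heads in 10 tosses (tosses assumed independent).
n_heads = 10
p_evidence_two_headed = Fraction(1)            # heads is certain every time
p_evidence_fair = Fraction(1, 2) ** n_heads    # (1/2)^10

likelihood_ratio = p_evidence_two_headed / p_evidence_fair
posterior_odds = prior_odds * likelihood_ratio
posterior_probability = posterior_odds / (1 + posterior_odds)

print("prior odds:", prior_odds)
print("likelihood ratio:", likelihood_ratio)
print("posterior odds:", posterior_odds)
print("posterior probability:", float(posterior_probability))
```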
Part 2: Hypothesis testing
(i)    The frequencies of three blood types A, AB and B among 151 children from parents whose blood
types are both AB are shown in the following table:
Blood type    A    AB    B    Total
Number observed    39    70    42    151
A law of genetics postulates that the ratios of A:AB:B are 1:2:1. Do the observations support the law?
Carry out a hypothesis test to answer this question. Ensure that you include all the steps for hypothesis
testing.
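One common way to test a postulated ratio is a chi-square goodness-of-fit test. The sketch below assumes that test is the intended one. It covers only the test statistic and p-value, so the hypotheses, significance level and conclusion still need to be written out. The closed-form p-value is valid only because there are 2 degrees of freedom here.

```python
import math

observed = [39, 70, 42]   # counts for A, AB, B from the table
ratios = [1, 2, 1]        # ratio A:AB:B postulated under the null hypothesis

n = sum(observed)
expected = [n * r / sum(ratios) for r in ratios]

# Pearson chi-square statistic: sum of (O - E)^2 / E over categories.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# For df = 2 ONLY, the chi-square upper tail probability is exp(-x / 2).
assert df == 2
p_value = math.exp(-chi_sq / 2)

print("expected counts:", expected)
print(f"chi-square = {chi_sq:.3f}, df = {df}, p = {p_value:.3f}")
```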
(ii)    A statistical model was built for predicting reconviction based on three years of post-prison
follow-up of 347 men who had been imprisoned for crimes against persons or property. It was possible to
classify each prisoner as having a low, medium or high risk of re-offending. To see how useful
the classification was, a further 225 prisoners were studied on leaving prison. The results are shown in the
table below.
Risk group    Low    Medium    High    Total
Reconvicted    23    50    53    126
Not reconvicted    52    25    22    99
Total    75    75    75    225
Is there an association between the risk group and whether the prisoner is reconvicted? Carry out a
hypothesis test to answer this question. Ensure that you include all the steps for hypothesis testing.
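An association in a two-way table of counts is often assessed with a chi-square test of independence. The sketch below assumes that test is the intended one and computes only the expected counts, the statistic and the p-value. The variable names are illustrative, and the closed-form p-value again relies on the table having exactly 2 degrees of freedom.

```python
import math

# Rows: reconvicted, not reconvicted. Columns: low, medium, high risk.
table = [
    [23, 50, 53],
    [52, 25, 22],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Under independence, expected count = row total * column total / n.
chi_sq = 0.0
for i, row in enumerate(table):
    for j, observed_count in enumerate(row):
        expected_count = row_totals[i] * col_totals[j] / n
        chi_sq += (observed_count - expected_count) ** 2 / expected_count

df = (len(table) - 1) * (len(table[0]) - 1)

# For df = 2 ONLY, the chi-square upper tail probability is exp(-x / 2).
assert df == 2
p_value = math.exp(-chi_sq / 2)

print(f"chi-square = {chi_sq:.3f}, df = {df}, p = {p_value:.2e}")
```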
(iii)    A small random sample of cannabis seizures in Australia over a few years is given in the
assignment data in the tab labelled Cannabis. It contains the weight of the cannabis seized in grams and
the seizure type: either a mail item or another method of entry into the country.
(a)    Draw a histogram of the weight of cannabis seized by mail and by other; that is, draw two
histograms.
(b)    Describe any problems that you see.
(c)    What one measure of location and one measure of spread would you use to describe these two
data sets?
(d)    Transform the weight variable to log(weight).
(e)    Redraw the histograms using log(weight) and comment on any differences you see between
these histograms and the ones drawn in (a).
(f)    Assess the Normality of the two data sets, for both weight and log(weight).
(g)    Perform a hypothesis test to determine whether the weight of mail item seizures differs from
that of other seizures. Pay careful attention to the distributional assumptions.
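Steps (d) and (g) can be sketched as below, under assumptions that the question leaves open. The helper names log_weights and welch_t are hypothetical, the natural logarithm is assumed, and a Welch two-sample t-test on log(weight) is only one reasonable choice if the log-weights look roughly Normal. If they do not, a rank-based test such as the Mann-Whitney test may be more appropriate. No data values are included, because they live in the assignment spreadsheet.

```python
import math
import statistics


def log_weights(weights_g):
    """Natural log of each weight in grams (weights must be positive)."""
    return [math.log(w) for w in weights_g]


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom.

    Does not assume equal variances in the two groups.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = ((statistics.mean(sample_a) - statistics.mean(sample_b))
         / math.sqrt(se2_a + se2_b))
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df
```

To use it, pass the two groups of weights through log_weights, call welch_t on the results, and compare the statistic with a t distribution on the returned degrees of freedom.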