Answer:

Given:

  1. \(R_{i}\): event that it rains on day \(i\)
  2. \(R_{i}^{c}\): event that it does not rain on day \(i\)
  3. \(P\left ( R_{0}\right ) =p\): probability of rain on day \(0\)
  4. \(P\left ( R_{i}|R_{i-1}\right ) =\alpha \): probability of rain on day \(i\) given that it rained on day \(i-1\)
  5. \(P\left ( R_{i}^{c}|R_{i-1}^{c}\right ) =\beta \): probability of no rain on day \(i\) given that it did not rain on day \(i-1\)
  6. Only today’s weather is relevant to predicting tomorrow’s rain

Find:

Probability of rain in \(n\) days, and what happens as \(n\rightarrow \infty \)

Solution:

Consider the experiment that generates today’s weather. The possible outcomes divide into two disjoint events: rain and no rain (a day is either rainy or not, so this division covers all possible outcomes).

Hence

\[ \Omega =\left \{ R_{0},R_{0}^{c}\right \} \]

Now using the law of total probability, we write

\begin{equation} P\left ( R_{1}\right ) =P\left ( R_{1}|R_{0}\right ) P\left ( R_{0}\right ) +P\left ( R_{1}|R_{0}^{c}\right ) P\left ( R_{0}^{c}\right ) \tag {1}\end{equation}

But

\begin{align} P\left ( R_{1}|R_{0}^{c}\right ) & =1-P\left ( R_{1}^{c}|R_{0}^{c}\right ) \nonumber \\ & =1-\beta \ \tag {2}\end{align}

Note: To prove the above, recall that conditional probabilities given the same conditioning event sum to one, i.e. \(P\left ( R_{1}|R_{0}^{c}\right ) +P\left ( R_{1}^{c}|R_{0}^{c}\right ) =1\). A simple state transition diagram also makes this clear.

Now, substitute (2) into (1) and given that \(P\left ( R_{1}|R_{0}\right ) =\alpha \) and \(P\left ( R_{0}\right ) =p\) and \(P\left ( R_{0}^{c}\right ) =1-p\), then (1) becomes

\begin{equation} P\left ( R_{1}\right ) =\alpha p+\left ( 1-\beta \right ) \left ( 1-p\right ) \tag {3}\end{equation}

Now we can recursively apply the above to find the probability of rain on the day after tomorrow. Replacing \(R_{0}\rightarrow R_{1}\) and \(R_{1}\rightarrow R_{2}\), (1) becomes

\begin{equation} P\left ( R_{2}\right ) =P\left ( R_{2}|R_{1}\right ) P\left ( R_{1}\right ) +P\left ( R_{2}|R_{1}^{c}\right ) P\left ( R_{1}^{c}\right ) \tag {4}\end{equation}

Now using (3) for \(P\left ( R_{1}\right ) ,\) and given that \(P\left ( R_{2}|R_{1}\right ) =\alpha \) (This probability does not change, since we are told only today’s weather is relevant), and given that \(P\left ( R_{1}^{c}\right ) =\left ( 1-P\left ( R_{1}\right ) \right ) \) and that \(P\left ( R_{2}|R_{1}^{c}\right ) =\left ( 1-\beta \right ) \), then (4) becomes

\begin{align*} P\left ( R_{2}\right ) & =\alpha \overset {P(R_{1})}{\overbrace {\left [ \alpha p+\left ( 1-\beta \right ) \left ( 1-p\right ) \right ] }}+\left ( 1-\beta \right ) \overset {P\left ( R_{1}^{c}\right ) }{\overbrace {\left ( 1-\left [ \alpha p+\left ( 1-\beta \right ) \left ( 1-p\right ) \right ] \right ) }}\\ & =p+\alpha +\beta -2p\alpha -2p\beta -\beta ^{2}-\alpha \beta +p\alpha ^{2}+p\beta ^{2}+2p\allowbreak \alpha \beta \\ & =p\left ( 1-2\alpha -2\beta +2\alpha \beta \right ) +\alpha +\beta -\alpha \beta +(p\alpha ^{2}+p\beta ^{2}-\beta ^{2})\\ & =p\left ( 1-2\alpha -2\beta +2\alpha \beta \right ) +\alpha +\beta -\alpha \beta +\left [ \text {terms with higher powers in }\alpha \text { and }\beta \right ] \end{align*}

We see that as we continue this process, terms are generated of the form (coefficient)\(\times \alpha ^{m}\) and (coefficient)\(\times \beta ^{r}\), where the powers \(m,r\) grow as \(n\) grows. Since \(\alpha ,\beta <1\), these higher-power terms become negligible, so we keep only the terms that are at most first order in \(\alpha \) and in \(\beta \) (including the cross term \(\alpha \beta \))

Hence the above reduces to

\[ P\left ( R_{2}\right ) \approx p\left ( 1-2\alpha -2\beta +2\alpha \beta \right ) +\alpha +\beta -\alpha \beta \]

There is a pattern here. To see it more clearly, I generated more \(P\left ( R_{i}\right ) \) for \(i=3,4,5,6\) using a small piece of code, removed all higher-power terms in \(\alpha ,\beta \) as described above, and obtained the following table

\[ \begin {array}{ll} i & P\left ( R_{i}\right ) \\ 0 & p\\ 1 & 1-\beta +p\left ( -1+\alpha +\beta \right ) \\ 2 & \alpha +\beta -\alpha \beta +p\left ( 1-2\alpha -2\beta +2\alpha \beta \right ) \\ 3 & 1-\alpha -2\beta +3\alpha \beta +p\left ( -1+3\alpha +3\beta -6\alpha \beta \right ) \\ 4 & 2\alpha +2\beta -6\alpha \beta +p\left ( 1-4\alpha -4\beta +12\alpha \beta \right ) \\ 5 & 1-2\alpha -3\beta +10\alpha \beta +p\left ( -1+5\alpha +5\beta -20\alpha \beta \right ) \\ 6 & 3\alpha +3\beta -15\alpha \beta +p\left ( 1-6\alpha -6\beta +30\alpha \beta \right ) \end {array} \]
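The table above can be reproduced by a short script along the following lines (a sketch of the idea, not the original code: each \(P\left ( R_{i}\right ) \) is kept as a polynomial in \(p,\alpha ,\beta \), and any term containing \(\alpha ^{2}\), \(\beta ^{2}\) or higher is dropped after each step, as described above):

```python
from itertools import product

# A polynomial in (p, alpha, beta) is a dict mapping exponent triples to coefficients.
def poly_mul(f, g):
    out = {}
    for (e1, c1), (e2, c2) in product(f.items(), g.items()):
        e = (e1[0] + e2[0], e1[1] + e2[1], e1[2] + e2[2])
        out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def poly_add(f, g):
    out = dict(f)
    for e, c in g.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def truncate(f):
    # Drop any term with alpha^2, beta^2 or higher, as in the text.
    return {e: c for e, c in f.items() if e[1] <= 1 and e[2] <= 1}

# Recursion: P(R_i) = alpha*P(R_{i-1}) + (1-beta)*(1-P(R_{i-1}))
#                   = (alpha + beta - 1)*P(R_{i-1}) + (1 - beta)
step = {(0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): -1}   # alpha + beta - 1
const = {(0, 0, 0): 1, (0, 0, 1): -1}                # 1 - beta

r = {(1, 0, 0): 1}  # P(R_0) = p
table = [r]
for i in range(1, 7):
    r = truncate(poly_add(poly_mul(step, r), const))
    table.append(r)

# table[2], for example, holds the coefficients of
# alpha + beta - alpha*beta + p*(1 - 2*alpha - 2*beta + 2*alpha*beta)
```

The exponent triples follow the order \(\left ( p,\alpha ,\beta \right ) \); running this reproduces the rows shown in the table.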

Hence the pattern can be seen as the following

\begin{align*} P\left ( R_{n}\right ) & =\operatorname {mod}\left ( n,2\right ) +\left ( -1\right ) ^{\left ( n\right ) }\left \lfloor \frac {n}{2}\right \rfloor \alpha +\left ( -1\right ) ^{\left ( n\right ) }\left \lceil \frac {n}{2}\right \rceil \beta +\left ( -1\right ) ^{\left ( n+1\right ) }\left ( \sum _{i=1}^{n-1}i\right ) \alpha \beta +\\ & p\left ( \left ( -1\right ) ^{n}+\left ( -1\right ) ^{n+1}n\alpha +\left ( -1\right ) ^{n+1}n\beta +\left ( -1\right ) ^{n}\left [ n^{2}-n\right ] \alpha \beta \right ) \end{align*}

Where \(\operatorname {mod}\left ( n,2\right ) =0\) for even \(n\) and \(1\) for odd \(n\), \(\left \lfloor \frac {n}{2}\right \rfloor \) denotes rounding down to the nearest integer (floor), and \(\left \lceil \frac {n}{2}\right \rceil \) denotes rounding up (ceiling).

The above is valid for very large \(n\).

As \(n\rightarrow \infty \), \(P\left ( R_{n}\right ) \) reaches a fixed value (I first thought it would always go to \(1\), but that turned out not to be the case). I could not find an exact expression for \(P\left ( R_{n}\right ) \) as \(n\rightarrow \infty \), but I wrote a small program which simulates the above. For \(\alpha =0.3,\beta =0.6,p=0.4\), \(P\left ( R_{n}\right ) \) fluctuates up and down from one day to the next as it converges to its limit.
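The simulation amounts to iterating the recursion \(P\left ( R_{n}\right ) =\alpha P\left ( R_{n-1}\right ) +\left ( 1-\beta \right ) \left ( 1-P\left ( R_{n-1}\right ) \right ) \); a minimal sketch for the values above (\(\alpha =0.3,\beta =0.6,p=0.4\)):

```python
alpha, beta, p = 0.3, 0.6, 0.4

r = p  # P(R_0)
history = [r]
for n in range(1, 31):
    # P(R_n) = alpha*P(R_{n-1}) + (1 - beta)*(1 - P(R_{n-1}))
    r = alpha * r + (1 - beta) * (1 - r)
    history.append(r)

limit = (1 - beta) / (2 - alpha - beta)  # fixed point of the affine update
print(history[:4], limit)
```

One observation from the code: the update is affine with slope \(\alpha +\beta -1\), so when \(\left \vert \alpha +\beta -1\right \vert <1\) the iterates contract toward the fixed point \(\left ( 1-\beta \right ) /\left ( 2-\alpha -\beta \right ) \); here the slope is \(-0.1\), which is why the values alternate above and below the limit, matching the fluctuation noted above.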

Given: Conditional probabilities exist

Show: \(P\left ( A_{1}\cap A_{2}\cap A_{3}\cap \cdots \cap A_{n}\right ) =P\left ( A_{1}\right ) P\left ( A_{2}|A_{1}\right ) P\left ( A_{3}|A_{1}\cap A_{2}\right ) \cdots P\left ( A_{n}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right ) \)

Solution:

Since conditional probabilities exist, we know that the following holds

\[ P\left ( X\cap Y\right ) =P\left ( X|Y\right ) P\left ( Y\right ) \]

Let \(X=A_{n}\) and \(Y=A_{1}\cap A_{2}\cap A_{3}\cap \cdots \cap A_{n-1}\) hence the above becomes

\[ P\left ( A_{1}\cap A_{2}\cap \cdots A_{n}\right ) =P\left ( A_{n}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right ) P\left ( A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right ) \]

Now apply the same idea to the last term above. In other words, we write

\[ P\left ( A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right ) =P\left ( A_{n-1}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-2}\right ) P\left ( A_{1}\cap A_{2}\cap \cdots \cap A_{n-2}\right ) \]

We repeat the process until we obtain \(P\left ( A_{1}\cap A_{2}\right ) =P\left ( A_{2}|A_{1}\right ) P\left ( A_{1}\right ) \)

Hence, putting all the above together, we write

\begin{align*} P\left ( A_{1}\cap A_{2}\cap \cdots \cap A_{n}\right ) & =P\left ( A_{n}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right ) P\left ( A_{n-1}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-2}\right ) \\ & \quad \times P\left ( A_{n-2}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-3}\right ) \cdots P\left ( A_{2}|A_{1}\right ) P\left ( A_{1}\right ) \end{align*}

The above is what was required to show (the terms are just written in reverse order from the problem statement). Rearranging, we obtain

\[ P\left ( A_{1}\cap A_{2}\cap \cdots A_{n}\right ) =P\left ( A_{1}\right ) P\left ( A_{2}|A_{1}\right ) \cdots P\left ( A_{n-1}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-2}\right ) P\left ( A_{n}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right ) \]
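As a sanity check, the multiplication rule can be verified exactly on a small finite sample space (a sketch; the uniform 12-outcome space and the particular events are arbitrary choices):

```python
from fractions import Fraction

# Uniform sample space of 12 equally likely outcomes; three arbitrary events.
omega = set(range(12))
A1 = {0, 1, 2, 3, 4, 5, 6, 7}
A2 = {0, 1, 2, 3, 4, 9}
A3 = {0, 1, 2, 10}

def P(event):
    return Fraction(len(event), len(omega))

def P_cond(X, Y):
    # P(X|Y) = P(X ∩ Y) / P(Y), defined whenever P(Y) > 0
    return P(X & Y) / P(Y)

lhs = P(A1 & A2 & A3)
rhs = P(A1) * P_cond(A2, A1) * P_cond(A3, A1 & A2)
assert lhs == rhs  # the two sides agree exactly
```

Using `Fraction` keeps the arithmetic exact, so the equality check is not subject to floating-point error.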

Given:

Axioms of probability:

  1. \(P\left ( \Omega \right ) =1\)
  2. if \(A\subset \Omega \) then \(P\left ( A\right ) \geq 0\)
  3. if \(A_{1},A_{2},\ldots ,A_{n}\) are pairwise disjoint events (i.e. \(A_{i}\cap A_{j}=\varnothing \) for \(i\neq j\)) then \(P\left ( A_{1}\cup A_{2}\cup A_{3}\cup \cdots \cup A_{n}\right ) =P\left ( A_{1}\right ) +P\left ( A_{2}\right ) +\cdots +P\left ( A_{n}\right ) \)

Show that \(P\left ( A\cup B\right ) \leq P\left ( A\right ) +P\left ( B\right ) \)

Solution:

There are 4 possible cases.

  1. \(A,B\) are disjoint
  2. \(A\subset B\)
  3. \(B\subset A\)
  4. \(A,B\) have some outcomes in common, i.e. \(A\cap B\neq \varnothing \), with neither contained in the other

Case 1: If \(A,B\) are disjoint then \(A\cup B=A+B\) by set theory, where \(+\) denotes the union of disjoint sets. Applying the probability operator to both sides, we obtain

\[ P\left ( A\cup B\right ) =P\left ( A+B\right ) \]

Now, by Axiom 3, \(P\left ( A+B\right ) =P\left ( A\right ) +P\left ( B\right ) \) hence the above becomes

\[ P\left ( A\cup B\right ) =P\left ( A\right ) +P\left ( B\right ) \]

Case 2: If \(A\subset B\) then \(A\cup B=B\) by set theory. Applying the probability operator to both sides, we obtain

\[ P\left ( A\cup B\right ) =P\left ( B\right ) \]

But \(P\left ( B\right ) \leq P\left ( B\right ) +P\left ( A\right ) \) since \(A\subset \Omega \) and so \(P\left ( A\right ) \geq 0\) by axiom 2. Hence the above becomes

\begin{equation} P\left ( A\cup B\right ) \leq P\left ( B\right ) +P\left ( A\right ) \tag {0}\end{equation}

Case 3: This is the same as case 2, just exchange \(A\) and \(B\)

case 4: Since, by set theory

\[ A=A\cap B+A\cap B^{c}\]

Applying the probability operator to both sides

\[ P\left ( A\right ) =P(A\cap B+A\cap B^{c}) \]

But by set theory \(A\cap B\) is disjoint from \(A\cap B^{c}\), then by axiom 3 the above becomes

\begin{equation} P\left ( A\right ) =P\left ( A\cap B\right ) +P\left ( A\cap B^{c}\right ) \tag {1}\end{equation}

Similarly, by set theory

\[ B=B\cap A+B\cap A^{c}\]

Applying the probability operator to both sides

\[ P\left ( B\right ) =P\left ( B\cap A+B\cap A^{c}\right ) \]

But \(B\cap A\) is disjoint from \(B\cap A^{c}\), by set theory, then by axiom 3 the above becomes

\begin{equation} P\left ( B\right ) =P\left ( B\cap A\right ) +P\left ( B\cap A^{c}\right ) \tag {2}\end{equation}

Now by set theory

\[ A\cup B=A\cap B+A\cap B^{c}+B\cap A^{c}\]

Apply the probability operator on the above

\[ P\left ( A\cup B\right ) =P\left ( A\cap B+A\cap B^{c}+B\cap A^{c}\right ) \]

But \(A\cap B,A\cap B^{c},\) and \(B\cap A^{c}\) are disjoint by set theory, then above can be written using axiom 3 as

\begin{equation} P\left ( A\cup B\right ) =P\left ( A\cap B\right ) +P\left ( A\cap B^{c}\right ) +P\left ( B\cap A^{c}\right ) \tag {3}\end{equation}

Add (1)+(2)

\[ P\left ( A\right ) +P\left ( B\right ) =P\left ( A\cap B\right ) +P\left ( A\cap B^{c}\right ) +P\left ( B\cap A\right ) +P\left ( B\cap A^{c}\right ) \]

subtract the above from (3)

\begin{align*} P\left ( A\cup B\right ) -\left [ P\left ( A\right ) +P\left ( B\right ) \right ] & =\left [ P\left ( A\cap B\right ) +P\left ( A\cap B^{c}\right ) +P\left ( B\cap A^{c}\right ) \right ] -\\ & \left [ P\left ( A\cap B\right ) +P\left ( A\cap B^{c}\right ) +P\left ( B\cap A\right ) +P\left ( B\cap A^{c}\right ) \right ] \end{align*}

Cancel terms (Arithmetic)

\[ P\left ( A\cup B\right ) -\left [ P\left ( A\right ) +P\left ( B\right ) \right ] =-P\left ( B\cap A\right ) \]

or (algebra)

\[ P\left ( A\cup B\right ) =P\left ( A\right ) +P\left ( B\right ) -P\left ( B\cap A\right ) \]

Since \(B\cap A\) is an event in \(\Omega \) then \(P\left ( B\cap A\right ) \geq 0\) by axiom 2, hence the above can be written as

\begin{equation} P\left ( A\cup B\right ) \leq P\left ( A\right ) +P\left ( B\right ) \tag {4}\end{equation}

Conclusion: We have examined all 4 possible cases and found that either \(P\left ( A\cup B\right ) =P\left ( A\right ) +P\left ( B\right ) \) or \(P\left ( A\cup B\right ) \leq P\left ( A\right ) +P\left ( B\right ) \); hence in all cases \(P\left ( A\cup B\right ) \leq P\left ( A\right ) +P\left ( B\right ) \)
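The four cases can also be checked exhaustively on a small finite sample space (a sketch; the single-die space is an arbitrary choice):

```python
from fractions import Fraction
from itertools import combinations

# Uniform sample space: one roll of a fair die.
omega = frozenset(range(1, 7))

def P(event):
    return Fraction(len(event), len(omega))

def subsets(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# Verify P(A ∪ B) <= P(A) + P(B) for every pair of events,
# with equality exactly when A and B are disjoint.
for A in subsets(omega):
    for B in subsets(omega):
        assert P(A | B) <= P(A) + P(B)
        if not (A & B):
            assert P(A | B) == P(A) + P(B)
```

This enumerates all \(64\times 64\) pairs of events over the six outcomes, covering every one of the four cases above.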

Note: I tried, really tried, to find a method that would use the hint given in the problem (that if \(A\subset B\), then \(P\left ( A\right ) \leq P\left ( B\right ) \)), but I did not need that relationship in the above. I still show a proof of this identity below

Given: \(A\subset B\,\ \), Show \(P\left ( A\right ) \leq P\left ( B\right ) \)

proof:

\(B=A\cup \left ( B\cap A^{c}\right ) \) by set theory, since \(A\subset B\)

\(P\left ( B\right ) =P\left ( A\cup \left ( B\cap A^{c}\right ) \right ) \) by applying the probability operator to each side.

But \(A\) and \(B\cap A^{c}\) are disjoint by set theory, hence \(P\left ( A\cup \left ( B\cap A^{c}\right ) \right ) =P\left ( A\right ) +P\left ( B\cap A^{c}\right ) \) by axiom 3.

Hence \(P(B)=P\left ( A\right ) +P\left ( B\cap A^{c}\right ) \), or \(P\left ( A\right ) =P\left ( B\right ) -P\left ( B\cap A^{c}\right ) \)

But by axiom 2, \(P\left ( B\cap A^{c}\right ) \geq 0\), hence \(P\left ( A\right ) \leq P\left ( B\right ) \), QED

Given: \(X\) binomial r.v., i.e. \(P\left ( X=k\right ) =\begin {pmatrix} n\\ k \end {pmatrix} p^{k}\left ( 1-p\right ) ^{n-k},\) Find the mode. This is the value \(k\) for which \(P\left ( X=k\right ) \) is maximum

The mode is the value of \(k\) at which \(P\left ( X=k\right ) \) is maximum. Consider two consecutive terms, \(X=k\) and \(X=k-1\): the pmf is increasing at \(k\) when \(\frac {P\left ( X=k\right ) }{P\left ( X=k-1\right ) }>1\)

But

\[ P\left ( X=k-1\right ) =\begin {pmatrix} n\\ k-1 \end {pmatrix} p^{\left ( k-1\right ) }\left ( 1-p\right ) ^{n-\left ( k-1\right ) }\]

Hence

\begin{align*} \frac {P\left ( X=k\right ) }{P\left ( X=k-1\right ) } & =\frac {\begin {pmatrix} n\\ k \end {pmatrix} p^{k}\left ( 1-p\right ) ^{n-k}}{\begin {pmatrix} n\\ k-1 \end {pmatrix} p^{k-1}\left ( 1-p\right ) ^{n-k+1}}=\frac {\frac {n!}{\left ( n-k\right ) !\ k!}\ p^{k}\left ( 1-p\right ) ^{n-k}}{\frac {n!}{\left ( n-k+1\right ) !\ \left ( k-1\right ) !}\ p^{k-1}\left ( 1-p\right ) ^{n-k+1}}\\ & =\frac {\left ( n-k+1\right ) !\ \left ( k-1\right ) !\ }{\left ( n-k\right ) !\ k!}\frac {p}{1-p}\\ & =\frac {n-k+1}{k}\frac {p}{1-p}\end{align*}

so the pmf is increasing when \(\frac {n-k+1}{k}\frac {p}{1-p}>1\) or

\begin{align*} \left ( n-k+1\right ) p & >k\left ( 1-p\right ) \\ np-kp+p & >k-kp\\ np+p & >k\\ p\left ( 1+n\right ) & >k \end{align*}

So as long as \(k<p\left ( 1+n\right ) \), the pmf is increasing. Since \(k\) is an integer, the mode is the largest integer less than \(p\left ( 1+n\right ) \), hence

\[ k=\left \lfloor p\left ( 1+n\right ) \right \rfloor \]
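The result \(k=\left \lfloor p\left ( 1+n\right ) \right \rfloor \) can be checked against a brute-force search over the pmf (a sketch; the test values are arbitrary, chosen so that \(p\left ( n+1\right ) \) is not an integer, since an integer \(p\left ( n+1\right ) \) gives two modes):

```python
from math import comb, floor

def binom_pmf(n, p, k):
    # P(X = k) for X ~ Binomial(n, p)
    return comb(n, k) * p**k * (1 - p)**(n - k)

def mode_brute(n, p):
    # Find the k maximizing the pmf by direct search.
    return max(range(n + 1), key=lambda k: binom_pmf(n, p, k))

for n, p in [(10, 0.3), (20, 0.5), (7, 0.9), (15, 0.2)]:
    assert mode_brute(n, p) == floor(p * (n + 1))
```

For example \(n=10,p=0.3\) gives \(p\left ( n+1\right ) =3.3\), so the mode is \(k=3\), which the direct search confirms.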

Given:

\(P\left ( D\right ) =1/1000\)

members are affected independently

Find: probability that 2 individuals are affected in a population of size 100,000

part(a)

With a binomial random variable we ask: how many successes occur in a sequence of \(n\) trials, given that the probability of success in each trial is \(p\)? Here each trial is the testing of one individual; count it a ’hit’ if the individual is infected. The number of trials is \(n=100{,}000\), and \(p=1/1000\).

Therefore, let \(X\) be the number infected in a population of \(100{,}000\).

Hence the probability of getting \(k=2\) hits, using the binomial r.v., is

\[ P\left ( X=2\right ) =\begin {pmatrix} n\\ k \end {pmatrix}p^{k}\left ( 1-p\right ) ^{n-k}\]

or numerically

\[ P\left ( X=2\right ) =\begin {pmatrix} 100000\\ 2 \end {pmatrix} 0.001^{2}\left ( 1-0.001\right ) ^{100000-2}\]

(b) Using a Poisson r.v.: the Poisson distribution is the limiting case of the binomial, where \(X\) is the number of successes as the number of trials grows without bound while the probability of success in each trial goes to zero in such a way that \(np=\lambda \) stays fixed. We compute \(p(X=k)=\frac {\lambda ^{k}}{k!}e^{-\lambda }\), \(k=0,1,2,\ldots \)

Hence here \(X\) is the number infected, with \(n\) very large and \(p\), the probability of infection for each individual, very small, in such a way that \(np\) is kept fixed at the parameter \(\lambda \). Since here \(n\) is large and \(p\) is small, we approximate the binomial by the Poisson using \(\lambda =np=100000\times 0.001=100\)

Hence

\[ p\left ( X=2\right ) =\frac {100^{2}}{2!}e^{-100}\]

P.S. Computing numerical values for the above shows that both models give a \(P\left ( X=2\right ) \) on the order of \(10^{-40}\), with the binomial and Poisson answers in close agreement.

At first these values seem far too small: do they really mean there is almost no chance of finding 2 infected individuals in a population of 100,000? On reflection, this is correct: the expected number of infected individuals is \(np=100\), so observing exactly \(2\) infected is an event far out in the lower tail of the distribution, and its probability is accordingly tiny.
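Both expressions can be evaluated directly (a minimal sketch):

```python
from math import comb, exp, factorial

n, p, k = 100_000, 0.001, 2
lam = n * p  # expected number infected: 100.0

binom = comb(n, k) * p**k * (1 - p)**(n - k)
poisson = lam**k / factorial(k) * exp(-lam)

# Both come out on the order of 1e-40: with 100 infections expected,
# seeing exactly 2 is an extreme lower-tail event.
print(binom, poisson)
```

The two models agree to within a few percent, as expected for \(n\) this large and \(p\) this small.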