Answer:

Given:

  1. \(R_{i}\): event that it rains on day \(i\)
  2. \(R_{i}^{c}\): event that it does not rain on day \(i\)
  3. \(P\left ( R_{0}\right ) =p\): probability of rain on day \(0\)
  4. \(P\left ( R_{i}|R_{i-1}\right ) =\alpha \): probability of rain on day \(i\) given that it rained on day \(i-1\)
  5. \(P\left ( R_{i}^{c}|R_{i-1}^{c}\right ) =\beta \): probability of no rain on day \(i\) given that it did not rain on day \(i-1\)
  6. Only today’s weather is relevant to predicting tomorrow’s rain

Find:

Probability of rain in \(n\) days, and what happens as \(n\rightarrow \infty \)

Solution:

Consider the experiment that generates today’s weather. The possible outcomes divide into two disjoint events: rain and no rain (a day is either rainy or not, so this division covers all possible outcomes).

Hence

\[ \Omega =\left \{ R_{0},R_{0}^{c}\right \} \]

Now using the law of total probability, we write

\begin{equation} P\left ( R_{1}\right ) =P\left ( R_{1}|R_{0}\right ) P\left ( R_{0}\right ) +P\left ( R_{1}|R_{0}^{c}\right ) P\left ( R_{0}^{c}\right ) \tag {1}\end{equation}

But

\begin{align} P\left ( R_{1}|R_{0}^{c}\right ) & =1-P\left ( R_{1}^{c}|R_{0}^{c}\right ) \nonumber \\ & =1-\beta \ \tag {2}\end{align}

Note: To prove the above, recall that conditional probabilities given the same conditioning event sum to one, i.e. \(P\left ( R_{1}|R_{0}^{c}\right ) +P\left ( R_{1}^{c}|R_{0}^{c}\right ) =1\). A simple state transition diagram also makes this clear.

Now, substitute (2) into (1) and given that \(P\left ( R_{1}|R_{0}\right ) =\alpha \) and \(P\left ( R_{0}\right ) =p\) and \(P\left ( R_{0}^{c}\right ) =1-p\), then (1) becomes

\begin{equation} P\left ( R_{1}\right ) =\alpha p+\left ( 1-\beta \right ) \left ( 1-p\right ) \tag {3}\end{equation}

Now we can recursively apply the above to find the probability of rain on the day after tomorrow. Replacing \(R_{0}\rightarrow R_{1}\) and \(R_{1}\rightarrow R_{2}\), (1) becomes

\begin{equation} P\left ( R_{2}\right ) =P\left ( R_{2}|R_{1}\right ) P\left ( R_{1}\right ) +P\left ( R_{2}|R_{1}^{c}\right ) P\left ( R_{1}^{c}\right ) \tag {4}\end{equation}

Now using (3) for \(P\left ( R_{1}\right ) ,\) and given that \(P\left ( R_{2}|R_{1}\right ) =\alpha \) (This probability does not change, since we are told only today’s weather is relevant), and given that \(P\left ( R_{1}^{c}\right ) =\left ( 1-P\left ( R_{1}\right ) \right ) \) and that \(P\left ( R_{2}|R_{1}^{c}\right ) =\left ( 1-\beta \right ) \), then (4) becomes

\begin{align*} P\left ( R_{2}\right ) & =\alpha \overset {P(R_{1})}{\overbrace {\left [ \alpha p+\left ( 1-\beta \right ) \left ( 1-p\right ) \right ] }}+\left ( 1-\beta \right ) \overset {P\left ( R_{1}^{c}\right ) }{\overbrace {\left ( 1-\left [ \alpha p+\left ( 1-\beta \right ) \left ( 1-p\right ) \right ] \right ) }}\\ & =p+\alpha +\beta -2p\alpha -2p\beta -\beta ^{2}-\alpha \beta +p\alpha ^{2}+p\beta ^{2}+2p\allowbreak \alpha \beta \\ & =p\left ( 1-2\alpha -2\beta +2\alpha \beta \right ) +\alpha +\beta -\alpha \beta +(p\alpha ^{2}+p\beta ^{2}-\beta ^{2})\\ & =p\left ( 1-2\alpha -2\beta +2\alpha \beta \right ) +\alpha +\beta -\alpha \beta +\left [ \text {terms with higher powers in }\alpha \text { and }\beta \right ] \end{align*}

We see that as we continue this process, terms are generated of the form (coefficient)\(\times \alpha ^{m}\) and (coefficient)\(\times \beta ^{r}\), where the powers \(m,r\) grow as \(n\) grows. Since \(\alpha ,\beta <1\), these higher-power terms become negligible, so we keep only the terms that are at most first order in \(\alpha \) and in \(\beta \) (including the cross term \(\alpha \beta \))

Hence the above reduces to

\[ P\left ( R_{2}\right ) \approx p\left ( 1-2\alpha -2\beta +2\alpha \beta \right ) +\alpha +\beta -\alpha \beta \]

There is a pattern here. To see it more clearly, I generated more \(P\left ( R_{i}\right ) \) for \(i=3,4,5,6\) using a small piece of code, removed all higher-power terms in \(\alpha ,\beta \) as described above, and obtained the following table

\[ \begin {array}{ll} i & P\left ( R_{i}\right ) \\ 0 & p\\ 1 & 1-\beta +p\left ( -1+\alpha +\beta \right ) \\ 2 & \alpha +\beta -\alpha \beta +p\left ( 1-2\alpha -2\beta +2\alpha \beta \right ) \\ 3 & 1-\alpha -2\beta +3\alpha \beta +p\left ( -1+3\alpha +3\beta -6\alpha \beta \right ) \\ 4 & 2\alpha +2\beta -6\alpha \beta +p\left ( 1-4\alpha -4\beta +12\alpha \beta \right ) \\ 5 & 1-2\alpha -3\beta +10\alpha \beta +p\left ( -1+5\alpha +5\beta -20\alpha \beta \right ) \\ 6 & 3\alpha +3\beta -15\alpha \beta +p\left ( 1-6\alpha -6\beta +30\alpha \beta \right ) \end {array} \]
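The table above can be reproduced by a short script along the following lines (a sketch of the idea, not the original code: each \(P\left ( R_{i}\right ) \) is kept as a polynomial in \(p,\alpha ,\beta \), and any term containing \(\alpha ^{2}\), \(\beta ^{2}\) or higher is dropped after each step, as described above):

```python
from itertools import product

# A polynomial in (p, alpha, beta) is a dict mapping exponent triples to coefficients.
def poly_mul(f, g):
    out = {}
    for (e1, c1), (e2, c2) in product(f.items(), g.items()):
        e = (e1[0] + e2[0], e1[1] + e2[1], e1[2] + e2[2])
        out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def poly_add(f, g):
    out = dict(f)
    for e, c in g.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def truncate(f):
    # Drop any term with alpha^2, beta^2 or higher, as in the text.
    return {e: c for e, c in f.items() if e[1] <= 1 and e[2] <= 1}

# Recursion: P(R_i) = alpha*P(R_{i-1}) + (1-beta)*(1-P(R_{i-1}))
#                   = (alpha + beta - 1)*P(R_{i-1}) + (1 - beta)
step = {(0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): -1}   # alpha + beta - 1
const = {(0, 0, 0): 1, (0, 0, 1): -1}                # 1 - beta

r = {(1, 0, 0): 1}  # P(R_0) = p
table = [r]
for i in range(1, 7):
    r = truncate(poly_add(poly_mul(step, r), const))
    table.append(r)

# table[2], for example, holds the coefficients of
# alpha + beta - alpha*beta + p*(1 - 2*alpha - 2*beta + 2*alpha*beta)
```

The exponent triples follow the order \(\left ( p,\alpha ,\beta \right ) \); running this reproduces the rows shown in the table.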

Hence the pattern can be seen as the following

\begin{align*} P\left ( R_{n}\right ) & =\operatorname {mod}\left ( n,2\right ) +\left ( -1\right ) ^{\left ( n\right ) }\left \lfloor \frac {n}{2}\right \rfloor \alpha +\left ( -1\right ) ^{\left ( n\right ) }\left \lceil \frac {n}{2}\right \rceil \beta +\left ( -1\right ) ^{\left ( n+1\right ) }\left ( \sum _{i=1}^{n-1}i\right ) \alpha \beta +\\ & p\left ( \left ( -1\right ) ^{n}+\left ( -1\right ) ^{n+1}n\alpha +\left ( -1\right ) ^{n+1}n\beta +\left ( -1\right ) ^{n}\left [ n^{2}-n\right ] \alpha \beta \right ) \end{align*}

Where \(\operatorname {mod}\left ( n,2\right ) =0\) for even \(n\) and \(1\) for odd \(n\), \(\left \lfloor \frac {n}{2}\right \rfloor \) denotes rounding down to the nearest integer (floor), and \(\left \lceil \frac {n}{2}\right \rceil \) denotes rounding up (ceiling).

The above is valid for very large \(n\).

As \(n\rightarrow \infty \), \(P\left ( R_{n}\right ) \) reaches a fixed value (I first thought it would always go to \(1\), but that turned out not to be the case). I could not find an exact expression for \(P\left ( R_{n}\right ) \) as \(n\rightarrow \infty \), but I wrote a small program which simulates the above. For \(\alpha =0.3,\beta =0.6,p=0.4\), \(P\left ( R_{n}\right ) \) fluctuates up and down from one day to the next as it converges to its limit.
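The simulation amounts to iterating the recursion \(P\left ( R_{n}\right ) =\alpha P\left ( R_{n-1}\right ) +\left ( 1-\beta \right ) \left ( 1-P\left ( R_{n-1}\right ) \right ) \); a minimal sketch for the values above (\(\alpha =0.3,\beta =0.6,p=0.4\)):

```python
alpha, beta, p = 0.3, 0.6, 0.4

r = p  # P(R_0)
history = [r]
for n in range(1, 31):
    # P(R_n) = alpha*P(R_{n-1}) + (1 - beta)*(1 - P(R_{n-1}))
    r = alpha * r + (1 - beta) * (1 - r)
    history.append(r)

limit = (1 - beta) / (2 - alpha - beta)  # fixed point of the affine update
print(history[:4], limit)
```

One observation from the code: the update is affine with slope \(\alpha +\beta -1\), so when \(\left \vert \alpha +\beta -1\right \vert <1\) the iterates contract toward the fixed point \(\left ( 1-\beta \right ) /\left ( 2-\alpha -\beta \right ) \); here the slope is \(-0.1\), which is why the values alternate above and below the limit, matching the fluctuation noted above.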

Given: Conditional probabilities exist

Show: \(P\left ( A_{1}\cap A_{2}\cap A_{3}\cap \cdots \cap A_{n}\right ) =P\left ( A_{1}\right ) P\left ( A_{2}|A_{1}\right ) P\left ( A_{3}|A_{1}\cap A_{2}\right ) \cdots P\left ( A_{n}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right ) \)

Solution:

Since conditional probabilities exist, we know that the following holds

\[ P\left ( X\cap Y\right ) =P\left ( X|Y\right ) P\left ( Y\right ) \]

Let \(X=A_{n}\) and \(Y=A_{1}\cap A_{2}\cap A_{3}\cap \cdots \cap A_{n-1}\) hence the above becomes

\[ P\left ( A_{1}\cap A_{2}\cap \cdots A_{n}\right ) =P\left ( A_{n}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right ) P\left ( A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right ) \]

Now apply the same idea to the last term above. In other words, we write

\[ P\left ( A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right ) =P\left ( A_{n-1}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-2}\right ) P\left ( A_{1}\cap A_{2}\cap \cdots \cap A_{n-2}\right ) \]

We repeat the process until we obtain \(P\left ( A_{1}\cap A_{2}\right ) =P\left ( A_{2}|A_{1}\right ) P\left ( A_{1}\right ) \)

Hence, putting all the above together, we write

\begin{align*} P\left ( A_{1}\cap A_{2}\cap \cdots \cap A_{n}\right ) & =P\left ( A_{n}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right ) P\left ( A_{n-1}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-2}\right ) \\ & \quad \times P\left ( A_{n-2}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-3}\right ) \cdots P\left ( A_{2}|A_{1}\right ) P\left ( A_{1}\right ) \end{align*}

The above is what was required to show (the terms are just written in reverse order from the problem statement). Rearranging, we obtain

\[ P\left ( A_{1}\cap A_{2}\cap \cdots A_{n}\right ) =P\left ( A_{1}\right ) P\left ( A_{2}|A_{1}\right ) \cdots P\left ( A_{n-1}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-2}\right ) P\left ( A_{n}|A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right ) \]
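As a sanity check, the multiplication rule can be verified exactly on a small finite sample space (a sketch; the uniform 12-outcome space and the particular events are arbitrary choices):

```python
from fractions import Fraction

# Uniform sample space of 12 equally likely outcomes; three arbitrary events.
omega = set(range(12))
A1 = {0, 1, 2, 3, 4, 5, 6, 7}
A2 = {0, 1, 2, 3, 4, 9}
A3 = {0, 1, 2, 10}

def P(event):
    return Fraction(len(event), len(omega))

def P_cond(X, Y):
    # P(X|Y) = P(X ∩ Y) / P(Y), defined whenever P(Y) > 0
    return P(X & Y) / P(Y)

lhs = P(A1 & A2 & A3)
rhs = P(A1) * P_cond(A2, A1) * P_cond(A3, A1 & A2)
assert lhs == rhs  # the two sides agree exactly
```

Using `Fraction` keeps the arithmetic exact, so the equality check is not subject to floating-point error.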

Given:

Axioms of probability:

  1. \(P\left ( \Omega \right ) =1\)
  2. if \(A\subset \Omega \) then \(P\left ( A\right ) \geq 0\)
  3. if \(A_{1},A_{2},\ldots ,A_{n}\) are pairwise disjoint events (i.e. \(A_{i}\cap A_{j}=\varnothing \) for \(i\neq j\)) then \(P\left ( A_{1}\cup A_{2}\cup A_{3}\cup \cdots \cup A_{n}\right ) =P\left ( A_{1}\right ) +P\left ( A_{2}\right ) +\cdots +P\left ( A_{n}\right ) \)

Show that \(P\left ( A\cup B\right ) \leq P\left ( A\right ) +P\left ( B\right ) \)

Solution:

There are 4 possible cases.

  1. \(A,B\) are disjoint
  2. \(A\subset B\)
  3. \(B\subset A\)
  4. \(A,B\) have some outcomes in common, i.e. \(A\cap B\neq \varnothing \), with neither contained in the other

Case 1: If \(A,B\) are disjoint then \(A\cup B=A+B\) by set theory, where \(+\) denotes the union of disjoint sets. Applying the probability operator to both sides, we obtain

\[ P\left ( A\cup B\right ) =P\left ( A+B\right ) \]

Now, by Axiom 3, \(P\left ( A+B\right ) =P\left ( A\right ) +P\left ( B\right ) \) hence the above becomes

\[ P\left ( A\cup B\right ) =P\left ( A\right ) +P\left ( B\right ) \]

Case 2: If \(A\subset B\) then \(A\cup B=B\) by set theory. Applying the probability operator to both sides, we obtain

\[ P\left ( A\cup B\right ) =P\left ( B\right ) \]

But \(P\left ( B\right ) \leq P\left ( B\right ) +P\left ( A\right ) \) since \(A\subset \Omega \) and so \(P\left ( A\right ) \geq 0\) by axiom 2. Hence the above becomes

\begin{equation} P\left ( A\cup B\right ) \leq P\left ( B\right ) +P\left ( A\right ) \tag {0}\end{equation}

Case 3: This is the same as case 2, just exchange \(A\) and \(B\)

case 4: Since, by set theory

\[ A=A\cap B+A\cap B^{c}\]

Applying the probability operator to both sides

\[ P\left ( A\right ) =P(A\cap B+A\cap B^{c}) \]

But by set theory \(A\cap B\) is disjoint from \(A\cap B^{c}\), then by axiom 3 the above becomes

\begin{equation} P\left ( A\right ) =P\left ( A\cap B\right ) +P\left ( A\cap B^{c}\right ) \tag {1}\end{equation}

Similarly, by set theory

\[ B=B\cap A+B\cap A^{c}\]

Applying the probability operator to both sides

\[ P\left ( B\right ) =P\left ( B\cap A+B\cap A^{c}\right ) \]

But \(B\cap A\) is disjoint from \(B\cap A^{c}\), by set theory, then by axiom 3 the above becomes

\begin{equation} P\left ( B\right ) =P\left ( B\cap A\right ) +P\left ( B\cap A^{c}\right ) \tag {2}\end{equation}

Now by set theory

\[ A\cup B=A\cap B+A\cap B^{c}+B\cap A^{c}\]

Apply the probability operator on the above

\[ P\left ( A\cup B\right ) =P\left ( A\cap B+A\cap B^{c}+B\cap A^{c}\right ) \]

But \(A\cap B,A\cap B^{c},\) and \(B\cap A^{c}\) are disjoint by set theory, then above can be written using axiom 3 as

\begin{equation} P\left ( A\cup B\right ) =P\left ( A\cap B\right ) +P\left ( A\cap B^{c}\right ) +P\left ( B\cap A^{c}\right ) \tag {3}\end{equation}

Add (1)+(2)

\[ P\left ( A\right ) +P\left ( B\right ) =P\left ( A\cap B\right ) +P\left ( A\cap B^{c}\right ) +P\left ( B\cap A\right ) +P\left ( B\cap A^{c}\right ) \]

subtract the above from (3)

\begin{align*} P\left ( A\cup B\right ) -\left [ P\left ( A\right ) +P\left ( B\right ) \right ] & =\left [ P\left ( A\cap B\right ) +P\left ( A\cap B^{c}\right ) +P\left ( B\cap A^{c}\right ) \right ] -\\ & \left [ P\left ( A\cap B\right ) +P\left ( A\cap B^{c}\right ) +P\left ( B\cap A\right ) +P\left ( B\cap A^{c}\right ) \right ] \end{align*}

Cancel terms (Arithmetic)

\[ P\left ( A\cup B\right ) -\left [ P\left ( A\right ) +P\left ( B\right ) \right ] =-P\left ( B\cap A\right ) \]

or (algebra)

\[ P\left ( A\cup B\right ) =P\left ( A\right ) +P\left ( B\right ) -P\left ( B\cap A\right ) \]

Since \(B\cap A\) is an event in \(\Omega \) then \(P\left ( B\cap A\right ) \geq 0\) by axiom 2, hence the above can be written as

\begin{equation} P\left ( A\cup B\right ) \leq P\left ( A\right ) +P\left ( B\right ) \tag {4}\end{equation}

Conclusion: We have examined all 4 possible cases and found that either \(P\left ( A\cup B\right ) =P\left ( A\right ) +P\left ( B\right ) \) or \(P\left ( A\cup B\right ) \leq P\left ( A\right ) +P\left ( B\right ) \); hence in all cases \(P\left ( A\cup B\right ) \leq P\left ( A\right ) +P\left ( B\right ) \)
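The four cases can also be checked exhaustively on a small finite sample space (a sketch; the single-die space is an arbitrary choice):

```python
from fractions import Fraction
from itertools import combinations

# Uniform sample space: one roll of a fair die.
omega = frozenset(range(1, 7))

def P(event):
    return Fraction(len(event), len(omega))

def subsets(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# Verify P(A ∪ B) <= P(A) + P(B) for every pair of events,
# with equality exactly when A and B are disjoint.
for A in subsets(omega):
    for B in subsets(omega):
        assert P(A | B) <= P(A) + P(B)
        if not (A & B):
            assert P(A | B) == P(A) + P(B)
```

This enumerates all \(64\times 64\) pairs of events over the six outcomes, covering every one of the four cases above.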

Note: I tried, really tried, to find a method that would use the hint given in the problem (that if \(A\subset B\), then \(P\left ( A\right ) \leq P\left ( B\right ) \)), but I did not need that relationship in the above. I still show a proof of this identity below

Given: \(A\subset B\,\ \), Show \(P\left ( A\right ) \leq P\left ( B\right ) \)

proof:

\(B=A\cup \left ( B\cap A^{c}\right ) \) by set theory, since \(A\subset B\)

\(P\left ( B\right ) =P\left ( A\cup \left ( B\cap A^{c}\right ) \right ) \) by applying the probability operator to each side.

But \(A\) and \(B\cap A^{c}\) are disjoint by set theory, hence \(P\left ( A\cup \left ( B\cap A^{c}\right ) \right ) =P\left ( A\right ) +P\left ( B\cap A^{c}\right ) \) by axiom 3.

Hence \(P(B)=P\left ( A\right ) +P\left ( B\cap A^{c}\right ) \), or \(P\left ( A\right ) =P\left ( B\right ) -P\left ( B\cap A^{c}\right ) \)

But by axiom 2, \(P\left ( B\cap A^{c}\right ) \geq 0\), hence \(P\left ( A\right ) \leq P\left ( B\right ) \), QED

Given: \(X\) binomial r.v., i.e. \(P\left ( X=k\right ) =\begin {pmatrix} n\\ k \end {pmatrix} p^{k}\left ( 1-p\right ) ^{n-k},\) Find the mode. This is the value \(k\) for which \(P\left ( X=k\right ) \) is maximum

The mode is the value of \(k\) at which \(P\left ( X=k\right ) \) is maximum. Consider two consecutive terms, \(X=k\) and \(X=k-1\): the pmf is increasing at \(k\) when \(\frac {P\left ( X=k\right ) }{P\left ( X=k-1\right ) }>1\)

But

\[ P\left ( X=k-1\right ) =\begin {pmatrix} n\\ k-1 \end {pmatrix} p^{\left ( k-1\right ) }\left ( 1-p\right ) ^{n-\left ( k-1\right ) }\]

Hence

\begin{align*} \frac {P\left ( X=k\right ) }{P\left ( X=k-1\right ) } & =\frac {\begin {pmatrix} n\\ k \end {pmatrix} p^{k}\left ( 1-p\right ) ^{n-k}}{\begin {pmatrix} n\\ k-1 \end {pmatrix} p^{k-1}\left ( 1-p\right ) ^{n-k+1}}=\frac {\frac {n!}{\left ( n-k\right ) !\ k!}\ p^{k}\left ( 1-p\right ) ^{n-k}}{\frac {n!}{\left ( n-k+1\right ) !\ \left ( k-1\right ) !}\ p^{k-1}\left ( 1-p\right ) ^{n-k+1}}\\ & =\frac {\left ( n-k+1\right ) !\ \left ( k-1\right ) !\ }{\left ( n-k\right ) !\ k!}\frac {p}{1-p}\\ & =\frac {n-k+1}{k}\frac {p}{1-p}\end{align*}

so the pmf is increasing when \(\frac {n-k+1}{k}\frac {p}{1-p}>1\) or

\begin{align*} \left ( n-k+1\right ) p & >k\left ( 1-p\right ) \\ np-kp+p & >k-kp\\ np+p & >k\\ p\left ( 1+n\right ) & >k \end{align*}

So as long as \(k<p\left ( 1+n\right ) \), the pmf is increasing. Since \(k\) is an integer, the mode is the largest integer less than \(p\left ( 1+n\right ) \), hence

\[ k=\left \lfloor p\left ( 1+n\right ) \right \rfloor \]
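The result \(k=\left \lfloor p\left ( 1+n\right ) \right \rfloor \) can be checked against a brute-force search over the pmf (a sketch; the test values are arbitrary, chosen so that \(p\left ( n+1\right ) \) is not an integer, since an integer \(p\left ( n+1\right ) \) gives two modes):

```python
from math import comb, floor

def binom_pmf(n, p, k):
    # P(X = k) for X ~ Binomial(n, p)
    return comb(n, k) * p**k * (1 - p)**(n - k)

def mode_brute(n, p):
    # Find the k maximizing the pmf by direct search.
    return max(range(n + 1), key=lambda k: binom_pmf(n, p, k))

for n, p in [(10, 0.3), (20, 0.5), (7, 0.9), (15, 0.2)]:
    assert mode_brute(n, p) == floor(p * (n + 1))
```

For example \(n=10,p=0.3\) gives \(p\left ( n+1\right ) =3.3\), so the mode is \(k=3\), which the direct search confirms.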

Given:

\(P\left ( D\right ) =1/1000\)

members are affected independently

Find: probability that 2 individuals are affected in a population of size 100,000

part(a)

With a binomial random variable we ask: how many successes occur in a sequence of \(n\) trials, given that the probability of success in each trial is \(p\)? Here each trial is the testing of one individual; count it a ’hit’ if the individual is infected. The number of trials is \(n=100{,}000\), and \(p=1/1000\).

Therefore, let \(X\) be the number infected in a population of \(100{,}000\).

Hence the probability of getting \(k=2\) hits, using the binomial r.v., is

\[ P\left ( X=2\right ) =\begin {pmatrix} n\\ k \end {pmatrix}p^{k}\left ( 1-p\right ) ^{n-k}\]

or numerically

\[ P\left ( X=2\right ) =\begin {pmatrix} 100000\\ 2 \end {pmatrix} 0.001^{2}\left ( 1-0.001\right ) ^{100000-2}\]

(b) Using a Poisson r.v.: the Poisson distribution is the limiting case of the binomial, where \(X\) is the number of successes as the number of trials grows without bound while the probability of success in each trial goes to zero in such a way that \(np=\lambda \) stays fixed. We compute \(p(X=k)=\frac {\lambda ^{k}}{k!}e^{-\lambda }\), \(k=0,1,2,\ldots \)

Hence here \(X\) is the number infected, with \(n\) very large and \(p\), the probability of infection for each individual, very small, in such a way that \(np\) is kept fixed at the parameter \(\lambda \). Since here \(n\) is large and \(p\) is small, we approximate the binomial by the Poisson using \(\lambda =np=100000\times 0.001=100\)

Hence

\[ p\left ( X=2\right ) =\frac {100^{2}}{2!}e^{-100}\]

P.S. Computing numerical values for the above shows that both models give a \(P\left ( X=2\right ) \) on the order of \(10^{-40}\), with the binomial and Poisson answers in close agreement.

At first these values seem far too small: do they really mean there is almost no chance of finding 2 infected individuals in a population of 100,000? On reflection, this is correct: the expected number of infected individuals is \(np=100\), so observing exactly \(2\) infected is an event far out in the lower tail of the distribution, and its probability is accordingly tiny.
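Both expressions can be evaluated directly (a minimal sketch):

```python
from math import comb, exp, factorial

n, p, k = 100_000, 0.001, 2
lam = n * p  # expected number infected: 100.0

binom = comb(n, k) * p**k * (1 - p)**(n - k)
poisson = lam**k / factorial(k) * exp(-lam)

# Both come out on the order of 1e-40: with 100 infections expected,
# seeing exactly 2 is an extreme lower-tail event.
print(binom, poisson)
```

The two models agree to within a few percent, as expected for \(n\) this large and \(p\) this small.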