Jan 22nd, 2021

By the end of class you should understand the notion of probabilistic independence.

https://www.gradescope.com/courses/226051/assignments/956842

**Q:** why is the chain rule P(E | F) * P(F) ?

**A1: ** It is actually just an algebraic rearrangement of the definition of conditional probability, P(E | F) = P(EF) / P(F). Multiply both sides by P(F) and you get the chain rule: P(EF) = P(E | F) * P(F).

**Q:** what exactly is the normalization constant?

**A1: ** Could you specify the context you are referring to? In probability, a normalization constant usually refers to a term that you multiply some other term by to ensure the result is a valid probability (between 0 and 1 inclusive).
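
One common setting is turning nonnegative weights into probabilities. The sketch below is only an illustration: the `weights` values are made up and not from lecture, and the constant is taken to be 1 divided by the sum of the weights.

```python
from fractions import Fraction

# Hypothetical unnormalized weights for three outcomes (illustrative only).
weights = {"a": Fraction(2), "b": Fraction(3), "c": Fraction(5)}

# The normalization constant here is 1 / (sum of all weights).
normalizer = 1 / sum(weights.values())

# Multiplying each weight by the constant gives values in [0, 1] that sum to 1.
probs = {outcome: w * normalizer for outcome, w in weights.items()}
print(probs)
```

After scaling, the values keep their relative sizes but now add up to exactly 1.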

**Q:** why can't we consider P(E3|E1 or E3|E2) and not just P(E3|E2E1)?

**A1: ** Using multiple conditional probability bars is an ill-defined concept as far as we are concerned.

**A2: ** It's subtle, but the conditional probability bar "|" is not a set operator. That is, it is not the same "kind" of thing as union "∪", intersection "∩", etc. P(A|B) is DEFINED to be P(A,B) / P(B).

**A3: ** could you please explain why that is?

**Q:** why is the generalized chain rule true?

**A1: ** Consider how the intersection of several events is itself an event. If you have P(A,B,C), you could also think of it as P(A,D) where D is A

**A2: ** So the generalized chain rule applies as a logical extension of the rule with just two events.
In our example, we would have P(A,D) = P(A|D)P(D). If we expand out D, we will have P(A,B,C) = P(A|B,C)P(B,C). We can apply the chain rule again to the P(B,C) term to end up with P(A,B,C) = P(A|B,C)P(B|C)P(C)

**A3: ** *whoop accidentally hit submit*
where D is B∩C, the same rules apply
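
The three-event identity P(A,B,C) = P(A|B,C)P(B|C)P(C) can be checked numerically on a small sample space. This is a minimal sketch, assuming two fair dice; the events `A`, `B`, `C` and the helpers `prob` and `cond` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event), where event is a predicate on an outcome (d1, d2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def cond(event, given):
    """P(event | given) = P(event, given) / P(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)

# Hypothetical events, chosen only for illustration.
A = lambda o: o[0] % 2 == 0      # first die is even
B = lambda o: o[0] + o[1] >= 7   # sum is at least 7
C = lambda o: o[1] > 2           # second die shows more than 2

# D = B and C, treated as a single event, as in the answer above.
D = lambda o: B(o) and C(o)

lhs = prob(lambda o: A(o) and B(o) and C(o))   # P(A,B,C)
rhs = cond(A, D) * cond(B, C) * prob(C)        # P(A|B,C) P(B|C) P(C)
print(lhs == rhs)
```

Using `Fraction` keeps the arithmetic exact, so the two sides can be compared with `==` without worrying about floating-point rounding.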

**Q:** is P(A, B, C) the same as P(ABC) if not how are they different?

**A1: ** They are the same - just a notational difference! You can also see P(A∩B∩C) (I used the intersection unicode symbol, hopefully they render for you)

**Q:** so is P(EF) = P(FE) = P(F|E)P(E) = P(E|F)P(F)?

**A1: ** Yes!

**Q:** what does P(E|F) look like in a Venn diagram?

**A1: ** We consider the event F as our "new universe". It is like a new (smaller) sample space. We now ask "how probable is E in this new universe?"
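
The "new universe" picture can be sketched by literally throwing away every outcome outside F and counting E inside what remains. The example assumes two fair dice, and the events `E` and `F` are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def E(o):
    return o[0] == 6            # first die shows a 6

def F(o):
    return o[0] + o[1] >= 10    # sum is at least 10

# The "new universe": keep only the outcomes where F happened.
universe_F = [o for o in outcomes if F(o)]

# How probable is E inside that smaller universe?
p_E_given_F = Fraction(sum(1 for o in universe_F if E(o)), len(universe_F))

# The same number via the definition P(E|F) = P(EF) / P(F).
p_EF = Fraction(sum(1 for o in outcomes if E(o) and F(o)), len(outcomes))
p_F = Fraction(len(universe_F), len(outcomes))

print(p_E_given_F, p_EF / p_F)  # 1/2 1/2
```

Counting inside the restricted universe works here because all outcomes are equally likely; the definition P(EF) / P(F) is the general form.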

**Q:** could you explain the mathematical independence again? I didn't understand the Venn diagram

**A1: ** Check out slide 22. The second diagram is an example of independence, but you could have a more complicated overlapping of events that was still independent.

**A2: ** Thinking about independence in terms of Venn diagrams is not always straightforward. If E and F are independent, then E will have the same probability whether we consider the whole sample space or only the universe of "F".

**Q:** so if you have four you would need to consider all pairs and all threes

**A1: ** Saw your other question. It depends on what you are trying to prove. There is a concept called "mutual independence" and there is a concept called "pairwise independence" and they are not the same thing.

**A2: ** Could you elaborate on what you mean about "having four"?
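
To make the difference between pairwise and mutual independence concrete, here is a small sketch using a standard two-coin example (not taken from the slides). The helper names `prob` and `subsets_that_factor` are hypothetical. Mutual independence is checked by requiring the product rule for every subset of size 2 or more, which for four events would mean all pairs, all triples, and the full set of four.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

# Sample space: two fair coin flips, all four outcomes equally likely.
outcomes = list(product("HT", repeat=2))

def prob(events):
    """Probability that every event in `events` occurs."""
    hits = sum(1 for o in outcomes if all(e(o) for e in events))
    return Fraction(hits, len(outcomes))

def subsets_that_factor(events, size):
    """True if every subset of `size` events satisfies the product rule."""
    return all(
        prob(sub) == prod(prob([e]) for e in sub)
        for sub in combinations(events, size)
    )

E1 = lambda o: o[0] == "H"    # first flip is heads
E2 = lambda o: o[1] == "H"    # second flip is heads
E3 = lambda o: o[0] != o[1]   # the two flips differ
events = [E1, E2, E3]

pairwise = subsets_that_factor(events, 2)
mutual = all(subsets_that_factor(events, k) for k in range(2, len(events) + 1))
print(pairwise, mutual)  # True False
```

Every pair factors, but the triple does not: E1 and E2 together force the flips to match, so P(E1,E2,E3) = 0 while the product of the three probabilities is 1/8.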

**Q:** Is there a situation where P(E1,E2,E3) = P(E1)P(E2)P(E3) but P(E1,E2) != P(E1)P(E2)?

**A1: ** I believe the former implies the latter, but the latter need not imply the former.

**A2: ** I think a silly example could be when P(E3) = 0? Then we're saying that P(E1, E2, E3) = P(E1) * P(E2) * 0 = 0, but E1 and E2 don't need to be independent

**A3: ** Oh yeah that's a good point. Thanks!
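
The P(E3) = 0 counterexample can be checked by enumeration. This sketch assumes two fair dice; the particular events are hypothetical choices, with `E3` picked to be impossible.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(*events):
    """Probability that all of the given events occur together."""
    hits = sum(1 for o in outcomes if all(e(o) for e in events))
    return Fraction(hits, len(outcomes))

E1 = lambda o: o[0] == 6            # first die shows a 6
E2 = lambda o: o[0] + o[1] >= 10    # sum is at least 10
E3 = lambda o: o[0] + o[1] == 13    # impossible, so P(E3) = 0

# The three-way product rule holds trivially because both sides are 0...
triple_factors = prob(E1, E2, E3) == prob(E1) * prob(E2) * prob(E3)

# ...but E1 and E2 are dependent: P(E1,E2) = 1/12 while P(E1)P(E2) = 1/36.
pair_factors = prob(E1, E2) == prob(E1) * prob(E2)

print(triple_factors, pair_factors)  # True False
```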

**Q:** like E F G H

**A1: ** Please comment on your previous question for future reference!

**Q:** back to Jerry's previous slide

**A1: ** Please comment on your previous question for future reference!

**Q:** so we can determine the D1 + D2 = 7 rolls are independent because the probabilities are all 1/36, which shows that it’s the same as rolling two dice? and this wouldn’t be the case for D1 + D2 = 5?

**A1: ** got it! and if the probability of rolling a certain sum from two dice is 1/36, could we say it’s independent?

**A2: ** The general principle is that the event of rolling a 7 does not tell us any information about what an individual die rolled because they are equally likely (knowing that the sum was 7 makes it no more likely that the first die was a 1 or a 2 or a 3 etc.)

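**A3: ** Yes, for D1+D2=5, rolling a 6 on the first die would make it impossible for the sum to be 5, so the sum does tell you something about the first die and the events are not independent.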
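
The dice claim can be verified by brute force. In this sketch the function name `sum_independent_of_first_die` is hypothetical; it checks, for a given target sum, whether P(D1 = k, D1 + D2 = total) equals P(D1 = k) * P(D1 + D2 = total) for every value k of the first die.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes (D1, D2) of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event), where event is a predicate on an outcome (d1, d2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def sum_independent_of_first_die(total):
    """True if {D1 + D2 == total} is independent of {D1 == k} for every k."""
    p_sum = prob(lambda o: o[0] + o[1] == total)
    return all(
        prob(lambda o: o[0] == k and o[0] + o[1] == total)
        == prob(lambda o: o[0] == k) * p_sum
        for k in range(1, 7)
    )

print(sum_independent_of_first_die(7))  # True
print(sum_independent_of_first_die(5))  # False
```

For a sum of 7, every value of the first die leaves exactly one matching value for the second die, so the joint probability is always 1/36 = (1/6)(1/6). For a sum of 5, a first roll of 5 or 6 makes the sum impossible, so the product rule fails.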

**Q:** could you re-explain the network reliability slide? I'm a bit confused

**A1: ** live answered

**Q:** Is the pset 2 on next week's topics too or just this week's?

**A1: ** (Answered by Jerry) get back to you on that - You should be in good shape to start!