Tuesday, December 27, 2011

30% Chance of Rain

We are all familiar with weather forecasts. They are usually accompanied by symbols that match the prediction. For example, if the chance of rain is 40%, perhaps you will get a cloud picture with a few rain drops underneath. For 20%, you get a cloud picture with no rain drops depicted.

Snow pictures can be fun too.  For 30% chance of snow you may see a cloud with 3 different sized snowflakes underneath. For 60% or more chance, you see the exact same picture except the snowflakes are a lot bigger!  The larger snowflakes symbolically represent an increased chance of snow.  The symbolic representation does not imply the size of the snowflakes, nor does it imply the amount of snow that will fall.

The symbols may or may not be helpful to you when you are looking at the weather at a glance; more information is contained in the precipitation percentage.

So what does it mean when the weather forecast says that the chance of precipitation is 30% tomorrow? Does that mean it will rain 30% of the time tomorrow? Or does it mean that 30% of the area covered by the forecast will have rain? In fact it means neither. The forecast is interpreted as follows: Given the same weather conditions that exist at this moment, 30% of the time rain occurred at one or more points within the forecast area on the day following those conditions. This also means 70% of the time rain did not occur anywhere at any time on the day following the weather conditions that exist now. The chance of not raining is more revealing. The chance of rain forecast does not include information on how long it will rain or the amount of rain we will have if it does in fact rain.

By nature, we attempt to simplify the probability information we are given by translating this uncertain information into more deterministic information.  We want to know whether we need to carry an umbrella tomorrow. 

Now to make matters worse, the weather forecast is continually changing on an hourly or more frequent basis. Our 30% chance of rain may suddenly change to no chance of rain an hour later. Why? Because the meteorologists have more recent meteorological data that is fed into the computer model, resulting in a revised prediction.

The probability of precipitation (PoP) is calculated using the following formula:
PoP = C X A, where "C" is the confidence that precipitation will occur somewhere in the forecast area and "A" is the percent of the forecast area that will receive measurable precipitation, if it does occur.

So if the meteorologist is 50% confident that it will rain tomorrow in Dallas, and scattered showers are expected covering 20% of the Dallas area, then the chance that you will have rain on your head sometime tomorrow is:

PoP = 50% X 20% = 10%

If the weather center is 100% confident that it's going to rain somewhere in Dallas tomorrow with the same scattered showers, our forecast is:

PoP = 100% X 20% = 20%

Note, we get exactly the same chance of rain if the forecaster is only 20% sure that rain will occur tomorrow, but if it does occur the entire forecast area will have rain:

PoP = 20% X 100% = 20%
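
The three examples above can be checked with a few lines of Python. This is only a sketch of the PoP = C X A formula as stated in this post; the function name pop is made up for illustration, and both inputs are assumed to be fractions between 0 and 1.

```python
def pop(confidence, area_fraction):
    """Probability of precipitation: PoP = C x A, both given as fractions (0 to 1)."""
    return confidence * area_fraction


# The three Dallas examples from the text: (C, A)
for c, a in [(0.5, 0.2), (1.0, 0.2), (0.2, 1.0)]:
    print(f"C={c:.0%}, A={a:.0%} -> PoP={pop(c, a):.0%}")
```

The last two lines of output show the same PoP coming from two very different weather situations, which is why the single percentage cannot tell you how widespread the rain will be.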

Most weather forecasts include hourly forecasts extended from the present time up to 24 hours in the future.  When looking at this information, the chance of rain prediction applies to the time interval of the hour specified as opposed to the entire day for the daily forecasts.

The smaller the forecast area, the more useful the prediction. If we can discern chance of rain differences for smaller areas, then we can create separate forecasts.  If our coverage area is initially 10 square miles with a chance of rain of 36% over 80% of the area, but we know the chance of rain in a particular 2 square mile area is 80% over 100% of the area, then 2 separate forecasts  can be made:

Area 1 (2 square miles): PoP = 80% X 100% = 80%
Area 2 (8 square miles): PoP = 20% X 80% = 16%

Weighted by each area's share of the total (20% and 80%), those 2 forecasts combine to the PoP forecast for the larger area (36% X 80% = 28.8%):
Total = 80% X 20% + 16% X 80% = 28.8%, or about 29%

So if you are living in forecast Area 1, you would find the PoP for your specific area (80%) more accurate and useful than for the combined areas (29%).
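
Here is a small check of that arithmetic, written as a sketch. It assumes the two sub-area forecasts are combined by weighting each one by its share of the total 10 square miles; the variable names are illustrative only.

```python
# (share of total area, PoP for that sub-area), numbers from the example above
areas = [
    (2 / 10, 0.80 * 1.00),  # Area 1: 2 of 10 square miles, PoP = 80% x 100%
    (8 / 10, 0.20 * 0.80),  # Area 2: 8 of 10 square miles, PoP = 20% x 80%
]

# Area-weighted average of the two sub-area forecasts
combined = sum(share * p for share, p in areas)

# Original single forecast for the whole area: 36% confidence over 80% of it
whole_area = 0.36 * 0.80

print(f"combined = {combined:.1%}, whole area = {whole_area:.1%}")
```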

The following assumptions are used when calculating the PoP:
  1. The amount of rain that will fall if it does rain is at least 0.01"
  2. The measurement applies to the liquid precipitation (snow is converted to its equivalent liquid form)
  3. The probability is for the specified time (hourly, today, this afternoon, tonight, Wednesday, etc.)
  4. The forecast applies to any point within the forecast area.

Predicting rain is a bit trickier than predicting the temperature (it depends on temperature and humidity predictions, among other things). If the weather forecast is for a high of 75 tomorrow and it turns out the high was only 70, then our prediction was off by 5 degrees. But if rain is predicted and none occurs, how do we then calculate the accuracy of the forecast? Here's a way to forecast the chance of rain with 70% accuracy: every day, simply state that the chance of rain today is 0%. On average it only rains or snows on any given day 30% of the time, so you will be right the other 70% of the time.

The less it rains on average in a particular area, the more accurate your formulaic predictions can be.  If you live in a desert area that gets rain rarely during the summer months, you could predict no rain every day from June 1st to September 1st. If by chance it rains once during these 3 months, then you were wrong once over a 92 day period, giving you an accuracy of 98.9%. Great job! Non-precipitation is the weather expectation.
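
The accuracy figures for the "always say no rain" forecaster are easy to reproduce. The helper below is a hypothetical illustration that simply counts the dry days as correct calls.

```python
def no_rain_accuracy(rainy_days, total_days):
    """Share of days on which a constant 'no rain' forecast was right."""
    return (total_days - rainy_days) / total_days


print(f"{no_rain_accuracy(1, 92):.1%}")    # desert summer: 1 wet day out of 92
print(f"{no_rain_accuracy(30, 100):.1%}")  # rain or snow on 30 days out of 100
```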

It may help in your planning to consider the complement of PoP. When the forecaster says there is a 10% chance of rain tomorrow, simply rephrase that to: There is a 90% chance that it will not rain where I am at any time tomorrow.

So do I need that umbrella?  Maybe.... Maybe not.

Monday, December 5, 2011

The Painted Die

Let's play a simple dice game, no numbers - just the colors red and green. We take a die and paint 4 of the faces green and 2 of the faces red. The die is fair so any one of the 6 sides is equally likely to appear. Now we roll the die 20 times and record the sequence of red and green events. But before the die is rolled 20 times you get to choose one of the following sequences:

Sequence 1: RGRRR
Sequence 2: GRGRRR
Sequence 3: GRRRRR

If your chosen sequence appears anywhere within the sequence of 20 events, you will receive $25.  So which sequence would you pick?

Researchers conducted this experiment with volunteers.  It turned out that 60% of the volunteers picked Sequence 2. Surprisingly, the scenario most likely to occur is actually Sequence 1!  Why?

Look carefully at Sequence 1 and you will notice that it is a subset of Sequence 2. I.e. anytime Sequence 2 occurs, Sequence 1 will ALWAYS occur. But when Sequence 1 occurs, Sequence 2 may or may not occur.

We are drawn to Sequence 2 since it contains 2 green results whereas the other sequences contain just 1 green result. And since green is more likely than red, we choose Sequence 2.

Our intuition, which guides us in determining the perceived likelihood of events, can mislead us. Once we learn the basics of probability we will be better equipped to challenge our intuitive assumptions and better able to determine the correct results.
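
For readers who want to check the claim rather than take it on faith, the sketch below computes the exact chance that each sequence shows up somewhere in 20 rolls. It assumes green has probability 2/3 and red 1/3 on every roll, as described above. The function name prob_contains and the bookkeeping approach (tracking only the most recent few outcomes) are my own choices for illustration, not something from the original experiment.

```python
from fractions import Fraction

P_COLOR = {"G": Fraction(2, 3), "R": Fraction(1, 3)}


def prob_contains(pattern, n_rolls=20):
    """Exact probability that `pattern` appears somewhere in n_rolls rolls."""
    keep = len(pattern) - 1
    # Maps the most recent `keep` outcomes to the probability of having
    # reached them without seeing the pattern so far.
    states = {"": Fraction(1)}
    found = Fraction(0)
    for _ in range(n_rolls):
        next_states = {}
        for suffix, p in states.items():
            for color, p_color in P_COLOR.items():
                recent = suffix + color
                if recent.endswith(pattern):
                    found += p * p_color  # pattern completed on this roll
                else:
                    key = recent[-keep:] if keep else ""
                    next_states[key] = next_states.get(key, 0) + p * p_color
        states = next_states
    return found


for name, seq in [("Sequence 1", "RGRRR"), ("Sequence 2", "GRGRRR"), ("Sequence 3", "GRRRRR")]:
    print(name, seq, round(float(prob_contains(seq)), 4))
```

Sequence 1 comes out ahead of Sequence 2, as it must, since every appearance of GRGRRR already contains RGRRR.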

Friday, December 2, 2011

One Third Chance

Let's say you are given a bag with 3 otherwise identical marbles inside. One of the marbles is white and the other 2 are black. If you can't see the marbles and you select one from the bag, what are the chances the marble is white?

Since only 1 out of 3 is white, the probability is 1/3.

Ok, now let's say I give you 3 chances to win. You get to select a marble 3 times, putting the marble back in the bag after each attempt. What are the chances that after 3 attempts of choosing a marble, at least one of the times you correctly select a white marble?

Well, if your chances are 1/3 each time, wouldn't the answer be 1/3 for the first time, 1/3 for the second time and 1/3 for the third time? Add these chances up and you get 1/3+1/3+1/3 = 1. That is, after 3 tries you are guaranteed to choose a white marble even though each attempt is independent and random.

Our intuition does not serve us well in this case. The probability of 1 is incorrect.  Let's take it step-by-step.

For our first attempt the chance of picking a white marble is 1/3.  So far so good.  Now for the second attempt, we need to consider the case in which we did not select a white marble in the first attempt.  That is, 2/3 of the time in the first case we will not select a white marble.  So given that we do not select a white marble in the first attempt, selecting a white marble in the second attempt is 2/3*1/3, which gives us 2/9. 

Now for the third attempt, we need to consider the case in which our first and second attempts were not successful. 1/3 probability for the first attempt plus 2/9 probability for the second attempt gives us 5/9 probability that we will be successful on the first or second attempt. Thus we are not successful on either of the first 2 attempts 1-5/9 or 4/9 of the time. For the third attempt we have 1/3 chance of being successful given the first and second attempts fail. This gives us 1/3*4/9=4/27.

Now all we have to do is add these 3 numbers together: 1/3+2/9+4/27 = 19/27.

We have 19/27 or 70.4% chance of drawing at least one white marble from the bag after 3 attempts.  Whew!

A simpler way is to think of the probability of NOT getting a white marble. 2/3 chance for the first attempt and 2/3 chance for the second attempt and 2/3 chance for the third attempt. Since each attempt is independent, we can simply multiply the 3 probabilities together: (2/3)*(2/3)*(2/3)=8/27. We have an 8/27 probability of NOT getting a white marble. The probability of getting at least one white marble is 1 minus this probability or 1-8/27=19/27, giving us the same answer as above.
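
Both routes to the answer can be checked with exact fractions. This is a minimal sketch that assumes, as above, three independent attempts with a 1/3 chance of white each time; the variable names are just for illustration.

```python
from fractions import Fraction

p_white = Fraction(1, 3)
p_black = 1 - p_white

# Step by step: first success on attempt 1, or on attempt 2, or on attempt 3
step_by_step = p_white + p_black * p_white + p_black**2 * p_white

# Shortcut: 1 minus the chance of three black marbles in a row
shortcut = 1 - p_black**3

print(step_by_step, shortcut)
```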

Ok, but you may say why doesn't multiplying 1/3 * 1/3 * 1/3 work? When we multiply 1/3 3 times, what we are really saying is: What is the probability I get a white marble the first time and get a white marble the second time and get a white marble the third time? The chance of getting 3 white marbles in a row is 1/3*1/3*1/3 = 1/27. We would expect this probability to be different and smaller than the probability of getting at least one white marble after 3 attempts.

Finally, you may ask how come adding 1/3+1/3+1/3 = 1 does not work? This is a common logical fallacy that most of us are guilty of at first glance. Adding probabilities together is only valid when the events are mutually exclusive, a subject to be covered later.

We can also obtain the answer by considering all of the possible scenarios and the probabilities of each scenario where B represents a black marble and W represents a white marble for an attempt. We can list all possible outcomes of the 3 attempts:  

1. BBB = (2/3)(2/3)(2/3)=8/27
2. BBW = (2/3)(2/3)(1/3)=4/27
3. BWB = (2/3)(1/3)(2/3)=4/27
4. BWW = (2/3)(1/3)(1/3)=2/27
5. WBB = (1/3)(2/3)(2/3)=4/27
6. WBW = (1/3)(2/3)(1/3)=2/27
7. WWB = (1/3)(1/3)(2/3)=2/27
8. WWW = (1/3)(1/3)(1/3)=1/27

Now add up the probabilities for all of the scenarios in which you get at least one white marble, i.e. all of the scenarios except the first one: 4/27+4/27+2/27+4/27+2/27+2/27+1/27=19/27, which is the same as we obtained earlier.
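
The same table can be generated rather than typed out. The sketch below builds all 8 scenarios with itertools.product and adds up the ones containing at least one W; the dictionary names are illustrative.

```python
from fractions import Fraction
from itertools import product

P_MARBLE = {"W": Fraction(1, 3), "B": Fraction(2, 3)}

# Probability of every 3-draw scenario, e.g. "BWB" -> 4/27
scenarios = {}
for draws in product("BW", repeat=3):
    prob = Fraction(1)
    for marble in draws:
        prob *= P_MARBLE[marble]
    scenarios["".join(draws)] = prob

at_least_one_white = sum(p for s, p in scenarios.items() if "W" in s)

for s, p in scenarios.items():
    print(s, p)
print("At least one white:", at_least_one_white)
```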

Note how a seemingly simple problem can easily lead us astray.  More on the way!

Tuesday, November 22, 2011

Pick a Number

Here's an easy game. Pick a random number between 1 and 10. No, not the rating of your favorite TV show. Just a random number. Run the numbers through your head and pick one. Got it? Don't read further until you settle on a number.

Your number is most likely 3 or 7.

We don't want to pick 1 or 10 since those we feel are outliers. And 5 is definitely not random since it is right in the middle. So some number greater than 1 and less than 5, or a number greater than 5 and less than 10, sounds like a reasonably random number. 2, 4, 6, or 9 may not be random enough since they are so close to the middle and extreme values. So we pick 3 or 7.

Of course in reality ALL numbers are equally random.

We humans generally do not do a very good job of picking random numbers, or of ascertaining probabilities for anything beyond the simplest scenarios, when we use our intuition or common sense alone.

Thus many probability problems that apply to very common scenarios become paradoxical and fun to explore!

We will explore more of these paradoxes in this blog.

Friday, November 18, 2011

Pick a Card Part 2

Continuation of our puzzle from yesterday.

Let's solve this problem using brute force. We do this by looking at all of the possibilities. Let's label the cards as follows:
R1 - the first red card
R2 - the second red card
B1 - the first black card
B2 - the second black card

Next, list all of the possibilities of the cards from left to right:

1. R1 R2 B1 B2
2. R1 R2 B2 B1
3. R2 R1 B1 B2
4. R2 R1 B2 B1
5. R1 B1 R2 B2
6. R1 B1 B2 R2
7. B1 R1 R2 B2
8. B1 R1 B2 R2
9. B1 B2 R1 R2
10. B1 B2 R2 R1
11. B2 B1 R1 R2
12. B2 B1 R2 R1
13. B2 R2 R1 B1
14. B2 R2 B1 R1
15. R2 B2 R1 B1
16. R2 B2 B1 R1
17. R1 B2 R2 B1
18. R1 B2 B1 R2
19. B2 R1 R2 B1
20. B2 R1 B1 R2
21. R2 B1 R1 B2
22. R2 B1 B2 R1
23. B1 R2 R1 B2
24. B1 R2 B2 R1

We have exactly 24 different ways these cards can be dealt (4 X 3 X 2 X 1). Check it for yourself.

Now since these cards are dealt face down, we don't know the card values. Let's say we pick the left 2 cards. How many of the 24 possibilities give us either 2 red cards or 2 black cards? Just check the list above and count them. We get 8.

Thus our probability of getting a pair of cards of the same color is 8/24 or 1/3!

Is that our final answer? Yes it is! :) Not 2/3, nor 1/2 as we might have expected intuitively.
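
If listing the deals by hand feels error prone, a few lines of Python can do the brute force for us. This sketch uses the same R1, R2, B1, B2 labels and assumes we always pick the left 2 cards, as above.

```python
from fractions import Fraction
from itertools import permutations

cards = ["R1", "R2", "B1", "B2"]
deals = list(permutations(cards))  # every left-to-right ordering of the 4 cards

# The left two cards match when their color letters (R or B) are equal
matches = [d for d in deals if d[0][0] == d[1][0]]

print(len(deals), len(matches), Fraction(len(matches), len(deals)))
```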

Next time we will see if we can devise some shortcuts and see if we can extend this to more general problems.








Thursday, November 17, 2011

Pick a card, any card

Pick a card, any card. Better yet, pick 2! 

Let's say I have a small deck of 4 cards: 2 black and 2 red. I shuffle this deck, place the cards face down, and ask you to pick any 2 cards. What is the probability that these cards are the same color?

One person may say there are 3 possibilities. Both are black, both are red, or both are different colors. Therefore, the probability is 2/3.

Hold on! says another person. That isn't right. They can both be black, both be red, the first card you picked is black and the other is red, or the first card you picked is red and the other is black. Thus, the cards match in 2 out of 4 cases, giving us a probability of 1/2.

In fact, both are wrong!

What is the true probability and why?

Wednesday, November 9, 2011

Dice Paradox and Conditional Probability

Let's look at another example.

The probability of rolling a 7 with 2 dice is 1/6. We can verify this by first listing all possible combinations of the 2 dice:

(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3), (2,4), (2,5), (2,6), (3,1), (3,2), (3,3), (3,4), (3,5), (3,6), (4,1), (4,2), (4,3), (4,4), (4,5), (4,6), (5,1), (5,2), (5,3), (5,4), (5,5), (5,6), (6,1), (6,2), (6,3), (6,4), (6,5), (6,6).

Now, we scan through this list of 36 possible events and count all events in which the sum of the 2 dice is 7. The number of events in which this is true is 6. Check for yourself. Thus the probability of obtaining a 7 is 6/36 or 1/6. This is true regardless of whether you throw one die followed by the second die or throw them both at the same time.

If I bet on 7 for any random throw of the dice, I would win 1/6 of the time. Now, being an astute gambler, I would like to improve my odds of winning. Let's have an impartial observer take a look at the dice after they are rolled, but not allow me to see the result. Let's say our favorite number is '4', so anytime the observer sees a '4' on at least one of the dice thrown, he shouts out "I see a four!" With this extra bit of information provided before we make our bet, we have improved our odds of winning from 1 in 6 to 2 in 11! We have improved our odds in spite of the fact that we have not changed the probability of the resulting roll of the dice.

How can that be? Well, we just need to go back to our list of all possible events and count the number of events in which a '4' appears on at least one of the dice. 11 of the 36 combinations have at least one '4' appear. Now of those 11, how many have a sum equal to 7? Exactly 2. Since we only consider betting on those dice rolls in which the number '4' is called out, we have only 11 possibilities and of those only 2 can sum up to 7.
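
The counting argument translates directly into code. The sketch below lists all 36 rolls, then narrows them down to the ones the observer would call out; the list names are only for illustration.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # all 36 (first die, second die) pairs

sevens = [r for r in rolls if sum(r) == 7]
with_four = [r for r in rolls if 4 in r]             # observer shouts "I see a four!"
sevens_with_four = [r for r in with_four if sum(r) == 7]

print("P(7) =", Fraction(len(sevens), len(rolls)))
print("P(7 given a four was called) =", Fraction(len(sevens_with_four), len(with_four)))
```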

This is an example of conditional probability. This extra bit of information we receive can affect the original probability. The conditional probability theorem gives us a shortcut method to obtain the probability without having to count all combinations and subset of combinations as we did in this example. We will explore the formulation of this conditional probability equation next time.

To be continued...

Sunday, October 23, 2011

A Game Of Dice Continued

Recall that in our previous example, rolling a single die a number of times, we obtained an impossible probability: a value greater than one. To help understand the flaw in our reasoning let's make this even simpler. What is the probability of rolling at least one '4' after 2 tosses of a single die?

Intuitively we might say 1/6 probability of rolling a '4' on the first toss plus 1/6 probability of rolling a '4' on the second toss gives us 1/3. So is 1/3 the right answer?

What about the case where we roll a '4' on the first toss and a '4' on the second toss?

If we make a list of all possible combinations of the value of the first roll of the die and the value of the second roll of the die, we will end up with 36 combinations:
(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3), (2,4), (2,5), (2,6), (3,1), (3,2), (3,3), (3,4), (3,5), (3,6), (4,1), (4,2), (4,3), (4,4), (4,5), (4,6), (5,1), (5,2), (5,3), (5,4), (5,5), (5,6), (6,1), (6,2), (6,3), (6,4), (6,5), (6,6).

Recall our original question: What is the probability of rolling at least one '4' after two rolls of the die?

So all we have to do is to count the number of times we see at least one '4' show up in the 36 possible combinations. From the list we see eleven instances in which at least one '4' is present.

To find the probability we take the ratio of the number of combinations in which at least one '4' is present and the total possible combinations we can have.

Our answer is 11/36.

We have an 11/36 probability of rolling at least one '4' after 2 tosses of a die.
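
To see exactly where the 1/6 + 1/6 reasoning slips, it helps to count the (4,4) roll separately. This sketch enumerates the 36 combinations and shows that adding the two groups of six counts (4,4) twice; the list names are illustrative.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))

first_is_four = [r for r in rolls if r[0] == 4]   # 6 rolls
second_is_four = [r for r in rolls if r[1] == 4]  # 6 rolls
both_fours = [r for r in rolls if r == (4, 4)]    # 1 roll, present in both lists above
at_least_one = [r for r in rolls if 4 in r]

# Adding 6 + 6 counts (4, 4) twice; subtracting it once gives 11
assert len(first_is_four) + len(second_is_four) - len(both_fours) == len(at_least_one)

print(Fraction(len(at_least_one), len(rolls)))
```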

Mysterious? Counterintuitive? Absolutely! But with some straightforward logical reasoning we have come up with the true probability.

To be continued...




Saturday, October 22, 2011

A Taste of Monte Carlo Simulation

There are many areas in which we want to predict the future.

For example what will happen to my retirement account in five years? What will the weather be like next week? What is the inflation rate going to be over the next few years? How likely is it that my iPhone will fail or break within a year?

In traditional forecasts, we create a model that projects outcomes based on certain inputs or variables. These are entered into a model which relates the inputs mathematically to produce outputs. These inputs are single-point estimates or our "best guess." Since the inputs are point estimates, the outputs will be single point estimates as well. In other words, we have to have complete confidence in the accuracy of our inputs (e.g. inflation rate, failure rates, investment rate of return, GDP, etc.) if we want to believe the output is exactly correct.

Since forecast inputs are effectively predicting the future, for any real world phenomena the actual values are not known with absolute certainty.

To incorporate this uncertainty, each single-point input estimate can be replaced by a probability distribution that more accurately reflects the range of possibilities for that input.

The output will then be a range of possibilities or a probability distribution. To obtain this probability distribution on the output, we sample the input. The input is sampled based on the probability distribution associated with this input. For each sample we obtain one output value.

Using Monte Carlo simulation we repeat this process over and over multiple times to get a range of outputs based on a range of inputs.
This range of outputs represents the output probability distribution which gives us a more realistic set of possibilities for the future.

With this information we can begin to answer questions such as:
How likely is it that I will achieve millionaire status when I retire?
How likely is it that 100 of these units will fail within the next 10 years?
What is the most important contributing factor to sales growth?
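
To make the idea concrete, here is a toy Monte Carlo sketch for the retirement question. Every number in it is a made-up assumption for illustration only: the starting balance, the yearly deposit, and especially the choice to draw each year's return from a normal distribution with a 6% mean and 15% standard deviation. The function names are hypothetical as well.

```python
import random


def simulate_balance(rng, years=30, start=100_000, yearly_deposit=10_000,
                     mean_return=0.06, stdev_return=0.15):
    """One possible future: each year's return is sampled from a normal distribution."""
    balance = start
    for _ in range(years):
        yearly_return = rng.gauss(mean_return, stdev_return)  # sample the uncertain input
        balance = balance * (1 + yearly_return) + yearly_deposit
    return balance


def chance_of_reaching(target, trials=10_000, seed=1):
    """Fraction of simulated futures whose final balance meets the target."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(simulate_balance(rng) >= target for _ in range(trials))
    return hits / trials


print(f"Estimated chance of reaching $1,000,000: {chance_of_reaching(1_000_000):.0%}")
```

Each call to simulate_balance produces one output value from one set of sampled inputs; repeating it thousands of times is what turns a single-point estimate into a distribution of outcomes.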

Thursday, October 20, 2011

A Game of Dice

Let’s play a game. Take one die and roll it. Remember the value of the roll. Then roll it again. What is the probability that after two rolls ‘4’ appears at least once?

Ok, we reason, for the first roll of the die we have a 1/6 chance of a ‘4’ since the die has 6 possible values. And since the second roll of the die is not influenced by the value we obtained on the first roll, we assign a 1/6 chance of a ‘4’ for the second roll as well. Now since we have 2 chances of getting a ‘4’, our odds should double. So we add the two probabilities together: 1/6 + 1/6 = 1/3. We expect a 1/3 probability of seeing at least one ‘4’ after rolling the die twice.

Now I ask, “What is the probability that after 10 rolls of the die, a ‘4’ appears at least once?” Ok, simple, let’s add them up. A 1/6 probability for each roll of the die times the number of times we roll the die, i.e. 10, should give us the number. 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 10/6. “10/6 probability of getting a ‘4’,” we proudly say.

Wait a minute! How come you have a probability that exceeds 1? I thought a probability value can only range from 0 (absolutely impossible) to 1 (happens always every time). Obtaining 10/6 or 167% must mean that we are absolutely certain this will happen after 10 rolls and just to make sure we’ve added 67% padding on top for good measure!

Well, something must be wrong in our calculations or thinking. We know a probability of any event or series of events can never exceed 1. Where did we go wrong?
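
One way to convince ourselves that the addition really is off is to let the computer play the game many times and count. This is a rough simulation sketch, not a proof; the function name, the number of trials, and the fixed seed are arbitrary choices for illustration.

```python
import random


def estimate_at_least_one_four(n_rolls, trials=50_000, seed=0):
    """Play the game `trials` times and return the share of games with at least one 4."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if any(rng.randint(1, 6) == 4 for _ in range(n_rolls)):
            hits += 1
    return hits / trials


for n in (2, 10):
    naive = n / 6  # the "just add them up" answer
    print(f"{n} rolls: naive sum = {naive:.3f}, simulated = {estimate_at_least_one_four(n):.3f}")
```

The simulated share can never exceed 1, since it is a count of games divided by the number of games played.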

Politics In Organizations

Why is it that so many of us are disappointed with the performance of politicians and Wall Street bankers? Some answers may be found in the new book "The Dictator's Handbook."