Tuesday, December 27, 2011

30% Chance of Rain

We are all familiar with weather forecasts.  They are usually accompanied by symbols that match the prediction. For example, if the chance of rain is 40%, perhaps you will get a cloud picture with a few rain drops underneath.  For 20%, you get a cloud picture with no rain drops depicted.

Snow pictures can be fun too.  For 30% chance of snow you may see a cloud with 3 different sized snowflakes underneath. For 60% or more chance, you see the exact same picture except the snowflakes are a lot bigger!  The larger snowflakes symbolically represent an increased chance of snow.  The symbolic representation does not imply the size of the snowflakes, nor does it imply the amount of snow that will fall.

The symbols may or may not be helpful to you when you are looking at the weather at a glance; more information is contained in the precipitation percentage.

So what does it mean when the weather forecast says that the chance of precipitation is 30% tomorrow? Does that mean it will rain 30% of the time tomorrow?  Or does it mean that 30% of the area covered by the forecast will have rain? In fact it means neither.  The forecast is interpreted as follows: Given the same weather conditions that exist at this moment, 30% of the time measurable rain fell at a given point within the forecast area on the following day.  This also means that 70% of the time no measurable rain fell at that point at any time on the following day.  The chance of not raining is more revealing.  The chance of rain forecast does not include information on how long it will rain or how much rain we will have if it does in fact rain.

By nature, we attempt to simplify the probability information we are given by translating this uncertain information into more deterministic information.  We want to know whether we need to carry an umbrella tomorrow. 

Now to make matters worse, the weather forecast is continually changing on an hourly or more frequent basis.  Our 30% chance of rain may suddenly change to no chance of rain an hour later. Why?  Because the meteorologists have fed more recent meteorological data into the computer model, resulting in a revised prediction.

The probability of precipitation (PoP) is calculated using the following formula:
PoP = C X A, where "C" is the confidence that precipitation will occur somewhere in the forecast area and "A" is the percent of the forecast area that will receive measurable precipitation, if it does occur.

So if the meteorologist is 50% confident that it will rain tomorrow in Dallas, and scattered showers are expected covering 20% of the Dallas area, then the chance that you will have rain on your head sometime tomorrow is:

PoP = 50% X 20% = 10%

If the weather center is 100% confident that it's going to rain somewhere in Dallas tomorrow with the same scattered showers, our forecast is:

PoP = 100% X 20% = 20%

Note, we get exactly the same chance of rain if the forecaster is only 20% sure that rain will occur tomorrow, but if it does occur the entire forecast area will have rain:

PoP = 20% X 100% = 20%
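
The three Dallas calculations above can be reproduced with a few lines of Python. This is only a minimal sketch of the PoP = C X A formula; the function name pop and its parameter names are made up for illustration, and both inputs are assumed to be fractions between 0 and 1.

```python
def pop(confidence, coverage):
    """Probability of precipitation at a point: PoP = C x A.

    confidence: chance that rain occurs somewhere in the area (0..1)
    coverage:   fraction of the area that gets rain if it occurs (0..1)
    """
    if not (0 <= confidence <= 1 and 0 <= coverage <= 1):
        raise ValueError("confidence and coverage must be between 0 and 1")
    return confidence * coverage


# The three Dallas examples from the text.
for c, a in [(0.5, 0.2), (1.0, 0.2), (0.2, 1.0)]:
    print(f"C={c:.0%}, A={a:.0%} -> PoP={pop(c, a):.0%}")
```

The last two lines of output are identical, which is the point of the note above: very different forecasts can share the same PoP.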

Most weather forecasts include hourly forecasts extended from the present time up to 24 hours in the future.  When looking at this information, the chance of rain prediction applies to the time interval of the hour specified as opposed to the entire day for the daily forecasts.

The smaller the forecast area, the more useful the prediction. If we can discern chance of rain differences for smaller areas, then we can create separate forecasts.  If our coverage area is initially 10 square miles with a 36% confidence of rain over 80% of the area, but we know that for a particular 2 square mile area the confidence is 80% over 100% of that area, then 2 separate forecasts can be made:

Area 1 (2 square miles): PoP = 80% X 100% = 80%
Area 2 (8 square miles): PoP = 20% X 80% = 16%

And those 2 forecasts, each weighted by its share of the total area (20% and 80%), combine to give the PoP forecast for the larger area (36% X 80% = 28.8%):
Total = 80% X 20% + 16% X 80% = 28.8%, or about 29%

So if you are living in forecast Area 1, you would find the PoP for your specific area (80%) more accurate and useful than for the combined areas (29%).
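
The combination of the two sub-area forecasts is an area-weighted average. The sketch below shows that arithmetic under the assumption that each sub-area is described by its size in square miles and its own PoP; the function name combined_pop is invented for this example.

```python
def combined_pop(subareas):
    """Area-weighted PoP for a region made of several sub-areas.

    subareas: list of (area_in_square_miles, pop) pairs, pop as a fraction.
    """
    total_area = sum(area for area, _ in subareas)
    return sum(area * p for area, p in subareas) / total_area


area_1 = (2, 0.80 * 1.00)  # 2 square miles, PoP = 80% X 100%
area_2 = (8, 0.20 * 0.80)  # 8 square miles, PoP = 20% X 80%

total = combined_pop([area_1, area_2])
print(f"Combined PoP: {total:.1%}")
```

The result agrees with the 36% X 80% figure for the full 10 square miles.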

The following assumptions are used when calculating the PoP:
  1. The amount of rain that will fall if it does rain is at least 0.01"
  2. The measurement applies to the liquid precipitation (snow is converted to its equivalent liquid form)
  3. The probability is for the specified time (hourly, today, this afternoon, tonight, Wednesday, etc.)
  4. The forecast applies to any point within the forecast area.

Predicting rain is a bit trickier than predicting the temperature (it depends on temperature and humidity predictions, among other things).  If the weather forecast is for a high of 75 tomorrow and it turns out the high was only 70, then our prediction was off by 5 degrees.  But if rain is predicted and none occurs, how do we then calculate the accuracy of the forecast?  Here's a way to forecast the chance of rain with 70% accuracy: every day, simply state that the chance of rain today is 0%.  This works because, on average, it rains or snows on only 30% of days.

The less it rains on average in a particular area, the more accurate your formulaic predictions can be.  If you live in a desert area that rarely gets rain during the summer months, you could predict no rain every day from June 1st through August 31st. If by chance it rains once during these 3 months, then you were wrong once over a 92-day period, giving you an accuracy of 98.9%. Great job! Non-precipitation is the weather expectation.
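
The accuracy of the "always predict no rain" strategy is simply the share of dry days. A small sketch of that arithmetic follows; the function name always_dry_accuracy is hypothetical, and a day counts as a correct forecast when no precipitation was recorded.

```python
def always_dry_accuracy(wet_days, total_days):
    """Fraction of days on which a constant 'no rain' forecast is correct."""
    if not 0 <= wet_days <= total_days or total_days == 0:
        raise ValueError("need 0 <= wet_days <= total_days and total_days > 0")
    return (total_days - wet_days) / total_days


# Rain or snow on 30% of days: the lazy forecast is right 70% of the time.
print(f"{always_dry_accuracy(30, 100):.1%}")

# Desert summer: one wet day out of 92.
print(f"{always_dry_accuracy(1, 92):.1%}")
```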

It may help in your planning to consider the complement of PoP.  When the forecaster says there is a 10% chance of rain tomorrow, simply rephrase that as: There is a 90% chance that it will not rain where I am at any time tomorrow.

So do I need that umbrella?  Maybe.... Maybe not.

Monday, December 5, 2011

The Painted Die

Let's play a simple dice game, no numbers - just the colors red and green.  We take a die and paint four of the faces green and two of the faces red.  The die is fair, so any one of the 6 sides is equally likely to appear.  Now we roll the die 20 times and record the sequence of red and green events. But before the die is rolled 20 times you get to choose one of the following sequences:

Sequence 1: RGRRR
Sequence 2: GRGRRR
Sequence 3: GRRRRR

If your chosen sequence appears anywhere within the sequence of 20 events, you will receive $25.  So which sequence would you pick?

Researchers conducted this experiment with volunteers.  It turned out that 60% of the volunteers picked Sequence 2. Surprisingly, the sequence most likely to occur is actually Sequence 1!  Why?

Look carefully at Sequence 1 and you will notice that it is contained within Sequence 2: drop the leading G from Sequence 2 and Sequence 1 is what remains. That is, any time Sequence 2 occurs, Sequence 1 will ALWAYS occur. But when Sequence 1 occurs, Sequence 2 may or may not occur.

We are drawn to Sequence 2 since it contains 2 green results whereas the other sequences contain just 1 green result. And since green is more likely than red, we choose Sequence 2.

Our intuition, which guides us in judging the likelihood of events, can mislead us. Once we learn the basics of probability we will be better equipped to challenge our intuitive assumptions and better able to determine the correct results.
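
For readers who want to check the claim about the three sequences, here is one way to compute the exact probability that each appears somewhere in 20 rolls. It is a sketch, not the researchers' method: it tracks how much of the chosen sequence is currently matched after each roll, and the names next_state and prob_appears are invented for this example.

```python
from fractions import Fraction

# Four green faces and two red faces on a fair die.
P = {"G": Fraction(2, 3), "R": Fraction(1, 3)}


def next_state(pattern, matched, color):
    """Length of the longest prefix of pattern that ends pattern[:matched] + color."""
    text = pattern[:matched] + color
    for k in range(min(len(text), len(pattern)), 0, -1):
        if text.endswith(pattern[:k]):
            return k
    return 0


def prob_appears(pattern, rolls=20):
    """Exact probability that pattern shows up somewhere in the given number of rolls."""
    n = len(pattern)
    # state[k]: probability that k letters are matched and the pattern has not appeared yet
    state = [Fraction(0)] * n
    state[0] = Fraction(1)
    hit = Fraction(0)
    for _ in range(rolls):
        new_state = [Fraction(0)] * n
        for k, p in enumerate(state):
            for color, p_color in P.items():
                j = next_state(pattern, k, color)
                if j == n:
                    hit += p * p_color  # pattern completed on this roll
                else:
                    new_state[j] += p * p_color
        state = new_state
    return hit


for name, seq in [("Sequence 1", "RGRRR"), ("Sequence 2", "GRGRRR"), ("Sequence 3", "GRRRRR")]:
    print(name, seq, f"{float(prob_appears(seq)):.4f}")
```

Running it shows Sequence 1 with the highest probability of the three, as the containment argument predicts.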

Friday, December 2, 2011

One Third Chance

Let's say you are given a bag with 3 marbles inside, identical except for color. One of the marbles is white and the other 2 are black.  If you can't see the marbles and you select one from the bag, what are the chances the marble is white?

Since only 1 out of 3 is white, the probability is 1/3.

Ok, now let's say I give you 3 chances to win.  You get to select a marble 3 times, putting the marble back in the bag after each draw.  What are the chances that after 3 attempts of choosing a marble, at least one of the times you correctly select a white marble?

Well, if your chances are 1/3 each time, wouldn't the answer be 1/3 for the first time, 1/3 for the second time and 1/3 for the third time?  Add these chances up and you get 1/3+1/3+1/3 = 1.  That is, after 3 tries you are guaranteed to choose a white marble, even though each attempt is independent and random.

Our intuition does not serve us well in this case. The probability of 1 is incorrect.  Let's take it step-by-step.

For our first attempt the chance of picking a white marble is 1/3.  So far so good.  Now for the second attempt, we need to consider the case in which we did not select a white marble in the first attempt, which happens 2/3 of the time.  So the probability of missing on the first attempt and then selecting a white marble on the second attempt is 2/3*1/3, which gives us 2/9.

Now for the third attempt, we need to consider the case in which both our first and second attempts were unsuccessful. The 1/3 probability for the first attempt plus the 2/9 probability for the second attempt gives us a 5/9 probability that we will be successful on the first or second attempt.  Thus both the first and second attempts fail 1-5/9, or 4/9, of the time. On the third attempt we have a 1/3 chance of being successful, so the probability of failing twice and then succeeding is 1/3*4/9=4/27.

Now all we have to do is add these 3 numbers together: 1/3+2/9+4/27 = 19/27.

We have 19/27 or 70.4% chance of drawing at least one white marble from the bag after 3 attempts.  Whew!

A simpler way is to think of the probability of NOT getting a white marble.  There is a 2/3 chance of this on the first attempt, a 2/3 chance on the second attempt and a 2/3 chance on the third attempt.  Since each attempt is independent, we can simply multiply the 3 probabilities together: (2/3)*(2/3)*(2/3)=8/27.  We have an 8/27 probability of NOT getting a white marble in any of the 3 attempts.  The probability of getting at least one white marble is 1 minus this probability, or 1-8/27, giving us the same answer as above.
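
Both routes to the answer, the step-by-step sum and the shortcut through the probability of never drawing white, can be checked with exact fractions. The sketch below assumes the marble is returned to the bag after each draw, so every attempt has the same 1/3 chance; the function names are invented for this example.

```python
from fractions import Fraction


def at_least_one_success(p, attempts):
    """Shortcut: 1 minus the probability that every attempt fails."""
    return 1 - (1 - p) ** attempts


def first_success_sum(p, attempts):
    """Step by step: fail k times, then succeed, summed over k = 0, 1, 2, ..."""
    return sum((1 - p) ** k * p for k in range(attempts))


p_white = Fraction(1, 3)
shortcut = at_least_one_success(p_white, 3)
step_by_step = first_success_sum(p_white, 3)

print(shortcut, step_by_step)
print(f"{float(shortcut):.1%}")
```

The two functions return the same fraction, and the naive sum 1/3+1/3+1/3 does not match either.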

Ok, but you may ask why multiplying 1/3 * 1/3 * 1/3 doesn't work.  When we multiply 1/3 by itself 3 times, what we are really asking is: What is the probability I get a white marble the first time and get a white marble the second time and get a white marble the third time?  The chance of getting 3 white marbles in a row is 1/3*1/3*1/3 = 1/27.  We would expect this probability to be different from, and smaller than, the probability of getting at least one white marble after 3 attempts.

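Finally, you may ask why adding 1/3+1/3+1/3 = 1 does not work.  This is a common logical fallacy that most of us are guilty of at first glance.  Adding probabilities is valid only when the events are mutually exclusive, a subject to be covered later. Here a white marble can turn up on more than one attempt, so the simple sum counts those cases more than once.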

We can also obtain the answer by considering all of the possible scenarios and the probabilities of each scenario where B represents a black marble and W represents a white marble for an attempt. We can list all possible outcomes of the 3 attempts:  

1. BBB = (2/3)(2/3)(2/3)=8/27
2. BBW = (2/3)(2/3)(1/3)=4/27
3. BWB = (2/3)(1/3)(2/3)=4/27
4. BWW = (2/3)(1/3)(1/3)=2/27
5. WBB = (1/3)(2/3)(2/3)=4/27
6. WBW = (1/3)(2/3)(1/3)=2/27
7. WWB = (1/3)(1/3)(2/3)=2/27
8. WWW = (1/3)(1/3)(1/3)=1/27

Now add up the probabilities for all of the scenarios in which you get at least one white marble, i.e. all of the scenarios except the first one: 4/27+4/27+2/27+4/27+2/27+2/27+1/27=19/27, which is the same as we obtained earlier.
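
The table of 8 scenarios can also be generated rather than written out by hand. This is a small sketch that enumerates every B/W sequence of 3 draws with replacement and adds up the ones containing a white marble; the names P and outcome_probabilities are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Chance of each color on a single draw (the marble is put back each time).
P = {"B": Fraction(2, 3), "W": Fraction(1, 3)}


def outcome_probabilities(attempts=3):
    """Map each sequence such as 'BWB' to its probability."""
    table = {}
    for outcome in product("BW", repeat=attempts):
        prob = Fraction(1)
        for marble in outcome:
            prob *= P[marble]
        table["".join(outcome)] = prob
    return table


table = outcome_probabilities()
for outcome, prob in table.items():
    print(outcome, prob)

at_least_one_white = sum(p for outcome, p in table.items() if "W" in outcome)
print("At least one white:", at_least_one_white)
```

Because the 8 scenarios are mutually exclusive and cover every possibility, their probabilities add up to exactly 1, which is why adding them is legitimate here.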

Note how a seemingly simple problem can easily lead us astray.  More on the way!