This is the San Diego Section 703 Baseball Sampling Stats Web Page.

 


A Look Behind the Umpire's Mask: Probabilities in Sampling©


By John Haury, Ph.D., CQM, CQE
Senior Consultant, PQC Consulting, Inc. www.pqcconsulting.com

Copyright J. Haury, 2009. All rights reserved.

 

 

Introduction

It is important to know how sampling works. To do so, let's examine probabilities. We'll use a ‘lot’ of baseballs contaminated with some foul balls (defectives) as our running example. Balls and strikes will play a part as well.

Baseballs can fail their quality requirements in as many ways as there are quality attributes (e.g. roundness, color, color consistency and surface blemishes). Failure to meet a requirement means a defect is found. When a defect is significant, it causes the baseball to be declared defective (unacceptable). A minor-‘league’ blemish may not disqualify the baseball, but too many blemishes, or a single important defect, cause the ball to be thrown out (defective).

Consider the quality of a pitch with no swing by the batter: The umpire calls it a 'ball' or a 'strike.' The umpire is the 'infallible inspector.' His two choices are the binomial results (literally two names: ball or strike); the same situation applies when counting defectives. Word pairs such as pass/fail or accept/reject describe inspection results using the binomial. For the 2-choice situation (e.g. counting defectives), the binomial distribution is in effect.

Complicated things (e.g. automobiles, bulk newsprint, or loads of cement) are inspected for defects, and only when that count is extreme do we use the word "defective" for such complex items. If the maximum count cannot be greater than the number inspected, the count is binomial. If the maximum count can be greater than the number inspected (an unlimited count due to multiple defects per unit), the count is usually modeled with a Poisson distribution.

“Let’s play binomial ball.”



How probabilities work for “accept-if-zero-defectives” plans

How do we check a batch of balls prior to use? The most conservative method is to examine every ball. This may be necessary if bad balls cause big problems. If we had 100% confidence in our production, no inspection would be needed. A middle ground is to inspect a sample as disaster-avoidance. Making decisions based on inspecting samples is called acceptance sampling.

Without hitting the details of sampling and inspecting lots of baseballs, we recognize that our inspection sample should 'represent' the lot. Each ball should have an equal chance of being picked for inspection.

Let's examine the most restrictive acceptance sampling plan: The inspection sample must show zero defectives to accept the lot. Such a plan will vary in its 'strictness' based only on the number of baseballs examined: Inspecting 10 balls will be easier to pass than inspecting 100. The opportunity for failure is 10-fold greater in the latter case.

Although it is intuitively true that the probability of lot acceptance improves with lot quality, it is probability that makes it true. As the number of bad balls increases in the batch, the chance of finding one goes up. This is true when inspecting 10 or 100 or 1000. If the lot has only perfect balls, no defective balls can ever be found.

Of extreme importance is the ability of the acceptance sampling plan to detect low levels of defectives. If we want to detect lots with a 2% failure rate, what plan will we use? If we limit our choice to plans that only accept when no defective balls are found in the inspection sample, we can only change the sample size.

The math behind detecting 'bad' lots using 'accept-only-if-zero-defectives' is as follows: Each item drawn into the inspection sample has a chance (ranging from 0 to 1) of being defective. One minus the probability of being defective is the probability of being non-defective. What if we examine 10 items from a batch that has 2% defective balls? Being 2% defective means the lot is 98% perfect. Each draw from the batch has 0.98 probability of being non-defective. A sample size of one has a 98% chance of not detecting a defective ball. Each additional ball multiplies that chance by another 0.98. For a sample of 10, the math is 0.98 multiplied by itself 10 times (0.98^10), which is approximately 82%. Thus sampling 10 and accepting only-if-zero-defectives has an 82% chance of accepting a lot with 2% defectives. Placing that math (as represented by Eq. I) into a spreadsheet, one can search for the number of trials (n) which will bring that 82% chance of acceptance down to only 10%, which will give a 90% chance of detecting lots with 2% defectives.

P(a) = P(non-defective ball)^n                                    Equation I

 

Read Eq. I as follows: “The probability of lot-acceptance equals the probability of catching/finding a non-defective ball in said lot, raised to the nth power, where n is the sample size, i.e. the number of catches from that lot.” Eq. I only works if you 'accept-the-lot-only-if-zero-defectives' are found in the catch sample.
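As a quick check of Eq. I, here is a minimal Python sketch. The function name `prob_accept` and its parameter names are illustrative choices, not something from the article.

```python
def prob_accept(defect_rate: float, n: int) -> float:
    """Eq. I: chance that a sample of n shows zero defectives (lot accepted)."""
    if not 0.0 <= defect_rate <= 1.0:
        raise ValueError("defect_rate must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # P(non-defective ball) raised to the nth power
    return (1.0 - defect_rate) ** n


# The 10-ball sample from a lot with 2% defectives
print(round(prob_accept(0.02, 10), 3))  # 0.817, the "approx 82%" in the text
```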

Examples


What sample size is needed to detect 2% defectives (at 90% probability of detection)? Trying different 'n' values in Eq. I, with P(non-defective ball) set to 0.98, gives n=114 as the smallest sample size that brings P(a) down to 0.10. This result means that a 114-sample plan (accept only if zero defectives) has a 90% probability of detecting a 2% defective rate.
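The trial-and-error search can also be done with a short loop instead of a spreadsheet. This is only a sketch: `smallest_sample_size`, `max_accept_prob` and `n_limit` are hypothetical names chosen for illustration.

```python
def smallest_sample_size(defect_rate: float,
                         max_accept_prob: float = 0.10,
                         n_limit: int = 100_000) -> int:
    """Smallest n for which Eq. I gives P(a) <= max_accept_prob."""
    good = 1.0 - defect_rate          # P(non-defective ball)
    for n in range(1, n_limit + 1):
        if good ** n <= max_accept_prob:
            return n
    raise ValueError("no sample size up to n_limit meets the target")


print(smallest_sample_size(0.02))  # 114
```

The cutoff here is strict (P(a) must be at or below 0.10), so a plan whose P(a) lands a hair above 0.10 is not returned.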

Using Eq. I, let's examine two specific plans and their ability to detect defectives with 90% confidence. The following two plans (accepting only if zero defectives) can detect 25% and 10% failure rates respectively: Use n=8 to detect a 25% failure rate (0.75^8 is approximately 0.100) and n=22 to detect a 10% failure rate (0.90^22 is approximately 0.098). For comparison, recall that n=114 is required to detect 2% defectives (all at 90% probability of detection). Let me suggest that you never sample fewer than 8, because 8 will have a high probability of detecting lots with 25% failures. The justification for being insensitive to poor quality until it reaches 10% or 25% defectives is best stated as follows: "Our manufacturing process with its controls, control charts, feedback systems, continuous process monitoring, fail-safe-Six-Sigma procedures and periodic audits, requires acceptance sampling for product-release only as a disaster check. Furthermore, breakdowns in our process, while rare, cause large numbers of defectives. Therefore detecting a 10% or 25% failure rate is sufficient to protect against process breakdown."
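Eq. I can also be solved for n directly by taking logarithms: n = ln(P(a)) / ln(P(non-defective ball)). The sketch below (the helper name `exact_n` is an assumption for illustration) shows the unrounded sample sizes behind the three plans.

```python
import math


def exact_n(defect_rate: float, accept_prob: float = 0.10) -> float:
    """Unrounded n at which Eq. I gives exactly accept_prob."""
    return math.log(accept_prob) / math.log(1.0 - defect_rate)


for rate in (0.25, 0.10, 0.02):
    print(f"{rate:.0%} defectives: n = {exact_n(rate):.2f}")
# roughly 8.00, 21.85 and 113.97
```

Note that the 25% case comes out just above 8, so n=8 gives a detection probability a whisker under 90%; a strict rounding-up rule would ask for 9.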



Cautions (Read these only when ready for the Major Leagues)

Using ‘accept-only-if-zero-defectives’ plans may be stricter than you think. This article has focused on detecting low levels of defectives while ignoring the fact that sampling variability may cause a false reject even when defectives are rare. Accept-only-if-zero plans tend to have more false rejects than plans that accept-on-one-or-more-defectives. The (low but real) risk of calling a strike when the pitch is outside the strike zone (a false reject) is the umpire’s “acceptable quality level.” Accept-if-zero plans can produce false rejects even at surprisingly low levels of true defectives. Major leaguers (the pros) must know about operating characteristic curves, a topic for future discussions.
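To put numbers on that caution, the sketch below applies Eq. I to the 114-sample plan when the lot is actually quite good. The defect rates of 0.1%, 0.5% and 1% are hypothetical values picked for illustration, and `prob_reject` is an assumed helper name.

```python
def prob_reject(defect_rate: float, n: int) -> float:
    """Chance an accept-on-zero plan rejects: 1 minus Eq. I."""
    return 1.0 - (1.0 - defect_rate) ** n


# Hypothetical 'good' lots inspected with the 114-sample plan
for rate in (0.001, 0.005, 0.01):
    print(f"{rate:.1%} defectives: P(reject) = {prob_reject(rate, 114):.1%}")
```

Under these assumptions the plan rejects roughly 11%, 44% and 68% of such lots, which is the beginning of an operating characteristic curve for the plan.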


Eq. I is a simplification of the binomial equation and works only for zero defectives in the inspection sample. If a count of 3 were acceptable, then we would add the probabilities of 0, 1, 2 and 3 counts which, of course, are all acceptable. That requires the binomial distribution, a topic for my next article. Perhaps the general topic of distributions should be discussed as well.
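For readers who want a preview, here is a minimal sketch of that sum using the standard binomial formula. The function name `prob_accept_c` and the parameter `c` (the largest acceptable count of defectives) are illustrative assumptions.

```python
from math import comb


def prob_accept_c(defect_rate: float, n: int, c: int) -> float:
    """Binomial P(accept): chance of finding c or fewer defectives in n."""
    p = defect_rate
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(c + 1))


# With c = 0 the sum collapses to Eq. I
print(round(prob_accept_c(0.02, 10, 0), 3))   # 0.817
# Allowing up to 3 defectives makes the same sample far easier to pass
print(prob_accept_c(0.02, 10, 3) > prob_accept_c(0.02, 10, 0))  # True
```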

Furthermore, some qualities, such as baseball roundness, are measured. This yields continuous variables. Acceptance sampling for variables typically assumes a normal distribution and needs fewer samples to give the same level of detection as attributes plans.

Finally, the inspection sample should be no larger than 10% of the lot. Small samples from large lots will not materially impact the distribution of defectives in the lot. Large samples from small lots impact the probabilities and require the use of the hypergeometric distribution.
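The effect of a large sample from a small lot can be sketched with the standard hypergeometric formula for zero defectives in the sample. The lot of 50 balls with 5 defectives is a hypothetical example, and `prob_accept_hypergeom` is an assumed helper name.

```python
from math import comb


def prob_accept_hypergeom(lot_size: int, defectives_in_lot: int, n: int) -> float:
    """Chance that n balls drawn without replacement are all non-defective."""
    if n > lot_size:
        raise ValueError("sample cannot be larger than the lot")
    good = lot_size - defectives_in_lot
    # ways to pick n good balls / ways to pick any n balls
    return comb(good, n) / comb(lot_size, n)


# Hypothetical small lot: 50 balls, 5 defective (10%), sample of 22
print(round(prob_accept_hypergeom(50, 5, 22), 3))  # 0.046
print(round(0.90 ** 22, 3))                        # 0.098 from Eq. I
```

In this made-up case the sample is 44% of the lot, and Eq. I overstates the chance of acceptance by about a factor of two. For a very large lot the two calculations agree closely.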

 

Take me out to the ballgame

 

Baseball prides itself on tracking every statistic and factual datum possible. Now let’s use acceptance sampling in a ball game. If each batter’s batting average is 0.2 (batting 200 in baseball lingo), how many at-bats (walks do not count) will occur in today’s game until we have a 90% chance of a hit? This is analogous to a lot with 20% defectives. We need to turn baseball upside-down: For this game, a hit is a ‘defective.’ Each at-bat from said lot has an 80% chance of being non-defective (an out). We multiply 0.8 by itself X times until we find an integer, X, which allows the result to be less than 10% (0.1). For X=2 at-bats P=0.64, not near the required 0.1. Testing X=3, 4, etc. we arrive at X=11 as shown in Equation II.

 

0.8^11 = 0.8 x 0.8 x 0.8 x 0.8 x 0.8 x 0.8 x 0.8 x 0.8 x 0.8 x 0.8 x 0.8 = 0.086   Equation II

 

In our hypothetical ballgame, by the end of the 11th at-bat (no walks) the probability that at least one hit has occurred is greater than 90% (1-0.086 [from Eq. II] = 0.914 = 91.4%). Please realize that batter number 11 does NOT have an increased probability of a hit because the prior 10 at-bats did not produce a hit. Individual probabilities of a hit remain 20%. We are estimating probabilities of events over many replications of the same situation. At the end of the season looking back, we would see that hitless streaks of 10 to 12 at-bats were reasonably rare (under our strict assumption of each hitter maintaining a 200 batting average against all pitchers that they faced). Table I summarizes the 3 accept-on-zero-defectives plans discussed in this article.
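The "many replications" idea can be tried out with a small simulation. This is a sketch under the article's strict assumption that every at-bat is an independent 20% chance of a hit; the function name, the number of replications and the random seed are arbitrary choices for illustration.

```python
import random


def simulate_hit_within(at_bats: int = 11, batting_avg: float = 0.2,
                        games: int = 100_000, seed: int = 2009) -> float:
    """Fraction of simulated runs of at-bats that contain at least one hit."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    with_hit = 0
    for _ in range(games):
        # each at-bat is an independent draw; stop at the first hit
        if any(rng.random() < batting_avg for _ in range(at_bats)):
            with_hit += 1
    return with_hit / games


print(simulate_hit_within())  # should land close to 0.914
```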

 

Table I:  Ability of 3 Plans to Detect Given Levels of Defectives (@90% probability)

Defectives Sampling Plan    Can detect this @90% prob    Lot Tol. % Defective*
Sample 8, accept if 0       25% defectives rate          25.01%
Sample 11, accept if 0      20% defectives rate          18.89%
Sample 22, accept if 0      10% defectives rate           9.94%

*Lot Tolerance % Defective (LTPD) is the lot-quality that has 90% chance of being detected and rejected. That is why the % values in cols 2 & 3 are nearly the same.
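The LTPD column can be reproduced by solving Eq. I for the defective rate: LTPD = 1 - 0.10^(1/n). A minimal sketch follows, with `ltpd` as an assumed helper name.

```python
def ltpd(n: int, accept_prob: float = 0.10) -> float:
    """Defective rate that an accept-on-zero plan of size n accepts
    with probability accept_prob (10% by default)."""
    return 1.0 - accept_prob ** (1.0 / n)


for n in (8, 11, 22):
    print(f"Sample {n:>2}, accept if 0: LTPD = {ltpd(n):.2%}")
# 25.01%, 18.89% and 9.94%, matching Table I
```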

 

One may be tempted to use the batting average itself (20% probability of a hit) to estimate how-many-batters-before-a-hit. The idea that 20% times 20% means that it takes only 2 consecutive batters (=0.04 or 4%) for a high chance of a hit (96%) is wrong. Oops! That strategy instead yields the probability of 2 consecutive batters each getting a hit, when their batting average is 200. The question was: How many at-bats before the probability of a hit gets to at least 90%?
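The two calculations are easy to put side by side. In this sketch the variable names are illustrative only.

```python
batting_avg = 0.2

# The tempting (wrong) calculation: both of 2 consecutive batters get a hit
both_hit = batting_avg * batting_avg

# The relevant calculation: at least one hit in 2 at-bats
at_least_one = 1.0 - (1.0 - batting_avg) ** 2

print(round(both_hit, 2))      # 0.04
print(round(at_least_one, 2))  # 0.36, nowhere near 96%
```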

 

In sum, and in fun

 

Baseball is a fine place to learn and review probabilities: In this case, probabilities associated with attributes acceptance sampling. As mentioned earlier, there are more topics to discuss (with or without baseballs) such as the binomial, the Poisson and the hypergeometric distributions. The probability of a hit based on the number of at-bats has another dimension: The variability of when that hit comes! Each of the aforementioned distributions has variability. Without variability, the games we play would be boring. For example, with zero variability the batter averaging 200 would go hitless for 4 at-bats, then get a hit on the 5th at-bat, every time. Games are fun, partly due to the fact that ‘Chance’ is playing along with us too.
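How variable is "when that hit comes"? Under the same assumption of independent at-bats with a 20% chance of a hit, the at-bat of the first hit follows what is usually called the geometric distribution (a name the article does not use). A short sketch, with `first_hit_pmf` as an assumed helper name:

```python
import math


def first_hit_pmf(k: int, batting_avg: float = 0.2) -> float:
    """Chance that the first hit comes on exactly the k-th at-bat."""
    return (1.0 - batting_avg) ** (k - 1) * batting_avg


p = 0.2
mean_at_bats = 1.0 / p                 # average wait: 5 at-bats
spread = math.sqrt(1.0 - p) / p        # standard deviation of the wait

print(round(first_hit_pmf(5), 4))      # 0.0819
print(mean_at_bats, round(spread, 2))  # 5.0 4.47
```

So the first hit arrives on exactly the 5th at-bat only about 8% of the time, even though 5 is the average wait.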

 

Remember:

Predictability is boring when at play.

Predictability is excellent when at work (e.g. safety and efficiency).

 


ASQ San Diego

P.O. Box 928457
San Diego, CA 92192-8457

Page Last Updated:

  Monday December 21, 2009 

 
