Wednesday, January 28, 2009

Understand how statistical distributions affect simulation behaviour

 
IBM Certification Test 992.3 - Simulation
 
 

Specifying distributions

Repeated measurements of any variable, even the same variable on the same subject, produce different outcomes. The pattern of these outcomes is called the distribution, and it can be described both mathematically and graphically. The distribution describes the relative number of times each possible outcome occurs over a number of trials.
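To make "the relative number of times each possible outcome occurs" concrete, here is a minimal sketch in Python. The dice scenario, the function name relative_frequencies, and the seed are illustrative assumptions and are not part of the IBM material.

```python
import random
from collections import Counter

def relative_frequencies(outcomes):
    """Return the fraction of trials in which each outcome occurred."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {value: count / total for value, count in sorted(counts.items())}

# Hypothetical trials: 10,000 rolls of two six-sided dice.
rng = random.Random(42)  # fixed seed so the run is repeatable
rolls = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(10_000)]
freqs = relative_frequencies(rolls)

# Print a crude text histogram: outcome, relative frequency, bar.
for outcome, share in freqs.items():
    print(f"{outcome:2d}  {share:.3f}  {'#' * round(share * 100)}")
```

The printed bars give the graphical description of the distribution, and the dictionary of fractions gives the numerical one.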
 
To specify a distribution, complete the following steps:
  1. Select Distribution.
  2. Select a type of distribution from the drop-down list. The settings that you can specify depend on the type of distribution you select, as shown in the following table:
    Distribution type: Beta
    Settings: A, B
    Notes: Useful for Bayesian statistical models, which represent degrees of belief. For example, you could use this distribution to model the degree to which the results of a medical test can be believed to be a valid result rather than a false positive.
    Parameter constraints: A and B must be greater than 0.

    Distribution type: Continuous
    Settings: Value, Probability
    Notes: Useful for specifying ranges of values and a probability for each range. Within each range that you define, values generated are evenly distributed as in a Uniform distribution. Click Add one or more times to add new values. For each value that you add, specify a number and assign a probability. Each probability that you specify is the probability that the distribution result will occur in the range between the current value and the next lowest value. The lowest value that you specify should have a probability of 0, unless you want the distribution to generate results exactly equal to the lowest value. To remove a value and its probability, select it and click Remove.
    Parameter constraints: Each probability must be a value between 0 and 100. The values can be anything but must be in increasing order.

    Distribution type: Erlang
    Settings: Exp mean, K
    Notes: Useful for representing waiting times in queueing systems, such as waiting times of customers in a call center.
    Parameter constraints: Both parameters must be greater than 0.

    Distribution type: Exponential
    Settings: Mean
    Notes: Useful for characterizing random variables that can take only positive values. Completely determined by its mean. A distribution that fits time-series data, such as arrival times, where you expect arrivals at a constant rate.
    Parameter constraints: Mean must be greater than 0.

    Distribution type: Gamma
    Settings: Alpha, Beta
    Notes: Useful for continuous random variables that are constrained to be greater than or equal to 0. Characterized by the Shape (Alpha) and Scale (Beta) parameters. Can be used relative to waiting times.
    Parameter constraints: Alpha and Beta must be greater than 0.

    Distribution type: Johnson
    Settings: Gamma, Delta, Lambda, Xi, and one of the following types:
    • sn (Normal form)
    • sb (Bounded form)
    • su (Unbounded form)
    • sl (Lognormal form)
    Notes: Also known as best fit distribution. Useful for defining a distribution based on available data. This distribution allows a high degree of control in adjusting the distribution curve to match the available data.
    Parameter constraints: Delta must not be 0.

    Distribution type: Lognormal
    Settings: Log mean, Log standard
    Notes: Useful for random variables constrained to be greater than 0. A positively skewed distribution that can take on various shapes. Can be used to describe returns calculated over periods of a year or more.
    Parameter constraints: The standard deviation must be greater than 0.

    Distribution type: Normal
    Settings: Mean, Standard Deviation
    Notes: The well-known bell curve. Useful in characterizing a large variety and type of data. Also called Gaussian distribution.
    Parameter constraints: No restrictions.

    Distribution type: Poisson
    Settings: Mean
    Notes: Useful in characterizing discrete events occurring independently of one another in time. Used to count a number of events across time or over an area. Often used when the probability of an event is small and the number of opportunities for the event is large.
    Parameter constraints: Mean must be greater than 0.

    Distribution type: Random list
    Settings: List of values
    Notes: Provides a list of values, any of which can be selected with equal probability. Click Add one or more times to create list items. For each list item, assign a value. To remove a list item, select it and click Remove.
    Parameter constraints: Each element can be any value.

    Distribution type: Triangular
    Settings: Minimum, Maximum, Mode
    Notes: Useful for approximate modeling of time required to complete a task when no real-world results are available. The mode defines the most likely value.
    Parameter constraints: Minimum must be less than Maximum.

    Distribution type: Uniform
    Settings: Minimum, Maximum
    Notes: Distributes values evenly over a range. Used for data ranging between two defined limits, where each possible value is equally likely.
    Parameter constraints: Any value, provided Minimum is less than Maximum.

    Distribution type: Weibull
    Settings: Alpha, Beta
    Notes: Useful in modeling reliability, failure rates, and natural phenomena such as wind speeds in a particular location.
    Parameter constraints: Beta must not be 0.

    Distribution type: Weighted list
    Settings: Value, Probability
    Notes: Provides a weighted probability for each value you define. Click Add one or more times to create list items. For each list item, assign a value and a probability. To remove a list item, select it and click Remove.
    Parameter constraints: The size of Value must be equal to the size of Probability. Each Probability must be greater than 0.
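
The Continuous entry is the hardest one to picture from the description alone, so the following Python sketch shows one way its rules could be read. It is not the product's implementation. The function name sample_continuous and the example settings are hypothetical, and the sketch assumes the probabilities are relative weights (the description does not say whether they must add up to 100).

```python
import random

def sample_continuous(points, rng=random):
    """Draw one value from a list of (value, probability) pairs.

    Values must be in increasing order and probabilities are percentages.
    The probability attached to a value applies to the range between that
    value and the next lower one. A nonzero probability on the lowest value
    is treated as the chance of returning exactly that value.
    """
    values = [value for value, _ in points]
    weights = [probability for _, probability in points]
    if values != sorted(values):
        raise ValueError("values must be in increasing order")
    if any(p < 0 or p > 100 for p in weights):
        raise ValueError("each probability must be between 0 and 100")
    # Pick which range the result falls in (assumption: relative weights).
    index = rng.choices(range(len(points)), weights=weights, k=1)[0]
    if index == 0:
        return values[0]
    # Inside the chosen range, values are spread evenly, as in Uniform.
    return rng.uniform(values[index - 1], values[index])

# Hypothetical settings: the lowest value carries a probability of 0.
points = [(0, 0), (10, 50), (30, 30), (60, 20)]
rng = random.Random(1)
draws = [sample_continuous(points, rng) for _ in range(5000)]
print(min(draws), max(draws))
```

With these settings, roughly half of the draws should land between 0 and 10, about 30 percent between 10 and 30, and the rest between 30 and 60.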
 

Evidence and changing beliefs 

 
Bayesian inference is statistical inference in which evidence or observations are used to update or to newly infer the probability that a hypothesis may be true.
 

Bayesian inference uses aspects of the scientific method, which involves collecting evidence that is meant to be consistent or inconsistent with a given hypothesis. As evidence accumulates, the degree of belief in a hypothesis ought to change. With enough evidence, it should become very high or very low. Thus, proponents of Bayesian inference say that it can be used to discriminate between conflicting hypotheses: hypotheses with very high support should be accepted as true and those with very low support should be rejected as false. However, detractors say that this inference method may be biased due to initial beliefs that one holds before any evidence is ever collected.
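
A small worked example may help show how accumulating evidence changes a degree of belief, and why the starting belief matters less as evidence grows. The sketch below uses the standard Beta update for yes/no observations: a Beta(A, B) belief becomes Beta(A + successes, B + failures). The scenario, the function names, and the numbers are hypothetical.

```python
def update_beta(a, b, successes, failures):
    """Return the Beta parameters after observing yes/no evidence."""
    if a <= 0 or b <= 0:
        raise ValueError("A and B must be greater than 0")
    return a + successes, b + failures

def beta_mean(a, b):
    """Mean of a Beta(A, B) distribution, read here as a degree of belief."""
    return a / (a + b)

# Hypothetical scenario: two analysts disagree about how often a positive
# medical test result is genuine, then both see the same 100 confirmed cases.
priors = {"skeptic": (1, 4), "believer": (4, 1)}
evidence = {"successes": 90, "failures": 10}

posteriors = {}
for name, (a, b) in priors.items():
    posteriors[name] = update_beta(a, b, **evidence)
    before = beta_mean(a, b)
    after = beta_mean(*posteriors[name])
    print(f"{name}: belief {before:.3f} -> {after:.3f}")
```

The two analysts start at 0.2 and 0.8 and end at about 0.87 and 0.90. The gap that remains is the influence of the initial beliefs that detractors point to, and it shrinks as more evidence is added.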

 
 
Probability distributions can be assigned to:
  • Token creation
  • Task completion times
  • Task costs
  • Task revenue
  • Decision paths
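
As a rough illustration of how distributions on these elements drive a simulation run, the sketch below assigns one distribution to each of several items in the list using Python's standard random module. It is not how the modeling tool works internally. The task, the parameter values, and the function names are hypothetical.

```python
import random

def simulate_token(rng):
    """Return (duration, cost, path) for one token passing through a task."""
    # Task completion time: Triangular with Minimum 5, Maximum 30, Mode 10.
    # random.triangular takes its arguments in the order (low, high, mode).
    duration = rng.triangular(5.0, 30.0, 10.0)
    # Task cost: Uniform between 20 and 40.
    cost = rng.uniform(20.0, 40.0)
    # Decision path: Weighted list of outgoing paths.
    path = rng.choices(["approve", "reject", "escalate"],
                       weights=[70, 20, 10], k=1)[0]
    return duration, cost, path

def simulate(n_tokens, mean_interarrival, rng):
    """Create tokens with exponentially distributed gaps between arrivals."""
    clock = 0.0
    results = []
    for _ in range(n_tokens):
        # Token creation: Exponential with the given mean (rate = 1 / mean).
        clock += rng.expovariate(1.0 / mean_interarrival)
        duration, cost, path = simulate_token(rng)
        results.append({"created_at": clock, "duration": duration,
                        "cost": cost, "path": path})
    return results

results = simulate(2000, 4.0, random.Random(7))
average_duration = sum(r["duration"] for r in results) / len(results)
approved = sum(r["path"] == "approve" for r in results) / len(results)
print(f"average task time: {average_duration:.1f}")
print(f"share approved: {approved:.2f}")
```

Changing any one distribution (for example, widening the Triangular range) shifts the averages and the spread of the results, which is the sense in which distributions affect simulation behavior.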
 
