It can be frustrating when the same probability distribution has two different parameterizations, but such is the life of a statistical programmer. I previously wrote an article about the gamma distribution, which has two common parameterizations: one that uses a scale parameter (β) and another that uses a rate parameter
Author
A genetic algorithm (GA) is a heuristic optimization technique. The method tries to mimic natural selection and evolution by starting with a population of random candidates. Candidates are evaluated for "fitness" by plugging them into the objective function. The characteristics of the better candidates are combined to create a new
This article uses an example to introduce to genetic algorithms (GAs) for optimization. It discusses two operators (mutation and crossover) that are important in implementing a genetic algorithm. It discusses choices that you must make when you implement these operations. Some programmers love using genetic algorithms. Genetic algorithms are heuristic
Sometimes we can learn as much from our mistakes as we do from our successes. Recently, I needed to solve an optimization problem for which the solution vector was a binary vector subject to a constraint. I was in a hurry. Without thinking much about what I was doing, I
Many optimization problems in statistics and machine learning involve continuous parameters. For example, maximum likelihood estimation involves optimizing a log-likelihood function over a continuous domain, possibly with constraints. Recently, however, I had to solve an optimization problem for which the solution vector was a 0/1 binary variable. To solve the
In a matrix-vector language such as SAS/IML, it is useful to always remember that the fundamental objects are matrices and that all operations are designed to work on matrices. (And vectors, which are matrices that have only one row or one column.) By using matrix operations, you can often eliminate
A reader asked whether it is possible to find a bootstrap sample that has some desirable properties. I am using the term "bootstrap sample" to refer to the result of randomly resampling with replacement from a data set. Specifically, he wanted to find a bootstrap sample that has a specific
Graphing data is almost always more informative than displaying a table of summary statistics. In a recent article about "dynamite plots," I briefly mentioned that graphs such as box plots and strip plots are better at showing data than graphs that merely show the mean and standard deviation. This article
The field of probability and statistics is full of asymptotic results. The Law of Large Numbers and the Central Limit Theorem are two famous examples. An asymptotic result can be both a blessing and a curse. For example, consider a result that says that the distribution of some statistic converges
The SAS/IML language supports lists, which are containers that store other objects, such as matrices and other lists. A primary use of lists is to pack objects of various types into a single symbol that can be passed to and from modules. A useful feature of using lists is that