I was sitting in a model railroad club meeting when one of our more enthusiastic young members said, "Wouldn't it be cool if we could make a computer simulation, with trains going between stations and all. We could have cars and engines assigned to each train and timetables and…" So,
Search Results: simulation (462)
Recently I wrote about how to compute the Kolmogorov D statistic, which is used to determine whether a sample has a particular distribution. One of the beautiful facts about modern computational statistics is that if you can compute a statistic, you can use simulation to estimate the sampling distribution of
In SAS/IML programs, a common task is to write values in a matrix to a SAS data set. For some programs, the values you want to write are in a matrix and you use the CREATE FROM/APPEND FROM syntax to create the data set, as follows: proc iml; X =
Here's a simulation tip: When you simulate a fixed-effect generalized linear regression model, don't add a random normal error to the linear predictor. Only the response variable should be random. This tip applies to models that apply a link function to a linear predictor, including logistic regression, Poisson regression, and
SAS Viya for Learners offers free access to AI and machine learning software for higher education teaching and learning, hosted in a new learning portal by SAS.
I think every course in exploratory data analysis should begin by studying Anscombe's quartet. Anscombe's quartet is a set of four data sets (N=11) that have nearly identical descriptive statistics but different graphical properties. They are a great reminder of why you should graph your data. You can read about
Which topics and priorities will be the focus for chief risk officers in 2019? A range of experts and representatives of major banking institutions addressed these questions at a recent event, CRO and CFO Banking Agenda 2019, organised by CeTIF, the Research Center on Technologies, Innovation and Financial Services, in collaboration
I've previously written about how to deal with nonconvergence when fitting generalized linear regression models. Most generalized linear and mixed models use an iterative optimization process, such as maximum likelihood estimation, to fit parameters. The optimization might not converge, either because the initial guess is poor or because the model
Many SAS procedures support the BY statement, which enables you to perform an analysis for subgroups of the data set. Although the SAS/IML language does not have a built-in "BY statement," there are various techniques that enable you to perform a BY-group analysis. The two I use most often are
The sales manager cannot feel as secure about the expected annual result as he thought. Hans Huber from our call center is more likely to resign than Petra Hafner from Controlling. The transaction histories of customers 42911, 85022 and 91294 do not match their
In simulation studies, sometimes you need to simulate outliers. For example, in a simulation study of regression techniques, you might want to generate outliers in the explanatory variables to see how the technique handles high-leverage points. This article shows how to generate outliers in multivariate normal data that are a
"It's best to put the most important things right in the introduction! Some readers may not read on to the main part at all." I had a similar thought with my current book, Applying Data Science – Business Case Studies Using SAS. There, the introduction already lists the benefits that
Insurers are working hard to renew their business models. A modernized actuarial department plays a key role in this. Why? I put that question to my colleague Diego Rivas, a trained actuary. From the outside, the insurance business looks like a long, calm river. Are appearances deceptive? Today, clearly yes. The market has long been saturated, and
This episode covers one of the greatest challenges in Dutch data science: how to distribute €43 billion (no, that’s not a typo) among all Dutch health care insurers in a fair, equal and transparent way. To learn more, I visited the biggest health insurer of the country, Zilveren Kruis, and
Machine learning differs from classical statistics in the way it assesses and compares competing models. In classical statistics, you use all the data to fit each model. You choose between models by using a statistic (such as AIC, AICC, SBC, ...) that measures both the goodness of fit and the
This article shows how to use SAS to simulate data that fits a linear regression model that has categorical regressors (also called explanatory or CLASS variables). Simulating data is a useful skill for both researchers and statistical programmers. You can use simulation for answering research questions, but you can also
IFRS 9 has been in force since January 1, 2018. Today it is clear that implementation took longer than expected, and the long-term effects cannot yet be foreseen. As described in my previous blog post, however, banks have by and large managed the process quite well. Given the immense costs,
"Hello, Mr. Kaiser!" Do you still remember him? From the 1970s to the early 2000s, he came into our living rooms before every Tagesschau. As an insurance agent, he embodied trust, closeness and fairness. Whether accident, property, liability or motor insurance: Günter Kaiser knew what was what. He was the best-known advertising face in Germany.
Recently I was asked to explain the result of an ANOVA analysis that I posted to a statistical discussion forum. My program included some simulated data for an ANOVA model and a call to the GLM procedure to estimate the parameters. I was asked why the parameter estimates from PROC
AI seems to be mentioned everywhere these days. But how can AI be used in day-to-day work? Here, Katherine Taylor explains an example of "practical AI" in banking using SAS Visual Data Mining and Machine Learning. She'll explore more business problems and industries in future posts.
This article describes best practices and techniques that every data analyst should know before bootstrapping in SAS. The bootstrap method is a powerful statistical technique, but it can be a challenge to implement it efficiently. An inefficient bootstrap program can take hours to run, whereas a well-written program can give
I must confess: I am a passionate gamer. You could also say a "nerd." I love computer games, and not just playing them: I also want to know how they are made, how they work and where development will lead in the future. I am especially interested in how artificial intelligence in games
If you want to bootstrap the parameters in a statistical regression model, you have two primary choices. The first, case resampling, is discussed in a previous article. This article describes the second choice, which is resampling residuals (also called model-based resampling). This article shows how to implement residual resampling in
If you want to bootstrap the parameters in a statistical regression model, you have two primary choices. The first is case resampling, which is also called resampling observations or resampling pairs. In case resampling, you create the bootstrap sample by randomly selecting observations (with replacement) from the original data. The
What if you could automatically detect supply chain anomalies as they happen, or even predict them in advance? You'd be able to take timely corrective action and help maximize revenue, margins, customer satisfaction and shareholder value. There's no question: Supply chain planning and execution is complex. From design and sourcing, to
By using a format, you can change the tick values and create values that range from 100 to 50 to 100 to display the probable outcome of a sporting event.
Deep learning (DL) is a subset of neural networks, which have been around since the 1960s. Limited computing resources and the need for large amounts of training data were the limiting factors for neural networks. But with the growing availability of computing resources such as multi-core machines, graphics processing units
Every day I’m shufflin'. Shufflin', shufflin'. -- "Party Rock Anthem," LMFAO The most popular way to mix a deck of cards is the riffle shuffle, which separates the deck into two pieces and interleaves the cards from each piece. Besides being popular with card players, the riffle shuffle is
How can you use analytics to design better biopsies and improve outcomes? This high school student has some ideas, and she presented them at Analytics Experience 2018.