@philsimon on what we can learn about data quality from Jeff Bezos's behemoth.
Following up on my introduction to the big data world and my trip into the past with “in-memory,” my curiosity has been piqued. What about the other technologies that are currently revolutionizing our world? Let's start by looking at “event stream processing” (ESP). A topic that is currently
Last week Robert Allison showed how to download NBA data into SAS and create graphs such as the location where Stephen Curry took shots in the 2015-16 season to date. The graph at left shows the kind of graphs that Robert created. I've reversed the colors from Robert's version, so
The new book Business Forecasting: Practical Problems and Solutions contains a large section of recent articles on forecasting performance evaluation and reporting. Among the contributing authors is Rob Hyndman, Professor of Statistics at Monash University in Australia. To anyone needing an introduction, Hyndman's credentials include: Editor-in-chief of International Journal of
As the big data era continues to evolve, Hadoop remains the workhorse for distributed computing environments. MapReduce has been the dominant workload in Hadoop, but Spark -- due to its superior in-memory performance -- is seeing rapid acceptance and growing adoption. As the Hadoop ecosystem matures, users need the flexibility to use either traditional MapReduce
This guest post from Accantec is about data protection in the big data environment. Accantec will have its own booth at this year's SAS Forum in Bonn (April 28). From here on, we'll let Gero Hentschel of Accantec speak. Big data is no longer a passing fad, but in many companies is by now
Many of us here at SAS are hustling to prepare for SAS Global Forum set to begin April 18 in Las Vegas. This year’s agenda is shaping up to be a “can’t-miss” event. Monday, before SAS Global Forum begins, there are two SAS Certification testing events. Then after the conference
In an upcoming paper for SAS Global Forum, several of us from the SAS Text Analytics team explore shifting the context of our underlying representation from documents to the sentences that are within the documents. We then look at how this shift can allow us to answer new text mining
At a recent TDWI conference, I was strolling the exhibition floor when I noticed an interesting phenomenon. A surprising percentage of the exhibiting vendors fell into one of two product categories. One group was selling cloud-based or hosted data warehousing and/or analytics services. The other group was selling data integration products. Of
There are several ways to simulate multinomial data in SAS. In the SAS/IML matrix language, you can use the RANDMULTINOMIAL function to generate samples from the multinomial distribution. If you don't have a SAS/IML license, I have previously written about how to use the SAS DATA step or PROC SURVEYSELECT
Nothing works today without efficient data management, and the insurance business is no exception. A standard data model can be an important component of it. This article explains why. “Make or Buy”? This question has been raised very often by insurance companies planning to introduce a consistent data structure – a
Streaming analytics is a red hot topic in many industries. As the Internet of Things continues to grow, the ability to process and analyze data from new sources like sensors, mobile phones, and web clickstreams will set you apart from your competition. Event stream processing is a popular way to
In my SAS Press book Business Statistics Made Easy in SAS® I place a strong focus on the skill of extrapolating analytics/statistical outcomes to key business implications (similar techniques can be used to link statistics to other key societal outcomes). Unfortunately, business analytics often stops short of defining the impact
Math lovers, do you know what day it is? It's Pi Day, which we celebrate every year on March 14 because the date 3-14 matches the first three digits of pi, 3.14. This year, I'm celebrating with poetry, combining my love of math with my love of language. Word Spy explains that a pi-ku is
People have always been fascinated by sports statistics, and with the recent popularity of fantasy sports there is an increased demand for custom analyses of the sports data. With those folks in mind, I have created a simple example that SAS programmers can use as a starting point for analyzing NBA
In the past, we've always protected our data to create an integrated environment for reporting and analytics. And we tried to protect people from themselves when using and accessing data, which sometimes could have been considered a bottleneck in the process. We instituted guidelines and procedures around: Certification of the data
Today is March 14th, which is annually celebrated as Pi Day. Today's date, written as 3/14/16, represents the best five-digit approximation of pi. On Pi Day, many people blog about how to approximate pi. This article uses a Monte Carlo simulation to estimate pi, in spite of the fact that
I was answering questions about SAS in a forum the other day, and it struck me how much easier it is to help folks if they can provide a snippet of data to go along with their program when asking others to help troubleshoot. This makes it easy to run
Administering a SAS environment: We get the best out of your SAS environment. That is our motto. Are you facing an extensive SAS metadata security review? A SAS migration or a move to a new infrastructure? Do you keep putting off long-overdue activities because you
How many of us have used the phrases “It’s a piece of cake,” “Anyone can do it,” “It’s as easy as ABC,” or “I could do it with my eyes shut”? When it comes to business intelligence, it should be “easy peasy,” but for many organizations it can still be a
The importance of data analytics in the UK public sector and wider society was in the spotlight earlier this year, following a report from Policy Exchange. It called for elected mayors to set up an Office of Data Analytics. If enacted, these teams of experts will have one central aim:
When you spend long enough writing and working in any industry, you inevitably see trends emerge and reach varying levels of maturity. Data governance is one such trend, as you can see from the following Google Trends chart:
Do you like a good horror story? Then may I suggest “Future Crimes” by Marc Goodman. When it comes to this genre, Wes Craven, John Carpenter and Stephen King have got nothing on Goodman, primarily because Goodman’s story is non-fiction. Scene 1: The present – Your workstation or data center Whether
You can use histograms to visualize the distribution of data. A comparative histogram enables you to compare two or more distributions, which usually represent subpopulations in the data. Common subpopulations include males versus females or a control group versus an experimental group. There are two common ways to construct a
Our colleagues at the SAS office in Korea recently had the opportunity to interview two customers from KT, one of the biggest telecommunications companies in Korea, about getting SAS certified. Sung-chul Hwang and Gyu-seob Lee both have four SAS certifications – Base Programmer, Advanced Programmer, Statistical Business Analyst and Predictive
We recently had a flooding event at Jordan Lake where the water rose almost 20 feet above normal. This blog details that flooding event in both photos and graphs. If you're intrigued by weather, boats, or lakes then this blog's for you! In NC's Research Triangle Park area, there are basically two
.@philsimon lists the gravest data-quality errors.
Retailers and the retail trade today have access to an enormous amount of data, and with it the basis for the personalized approach that customers now expect. Used correctly, analytics can be the key to all kinds of business benefits, whether the goal is a better online experience for the customer
Most SAS regression procedures support the "stars and bars" operators, which enable you to create models that include main effects and all higher-order interaction effects. You can also easily create models that include all n-way interactions up to a specified value of n. However, it can be a challenge to
In previous articles, I've shared tips about how you can work with SAS and ZIP files without requiring an external tool like WinZip, gzip, or 7-Zip. I've covered: how to create ZIP files with ODS PACKAGE ZIP (available since SAS 9.2); how to "unzip" and read ZIP files using FILENAME