One of the first and most important steps in analyzing data, whether for descriptive or inferential statistical tasks, is to check for possible errors in your data. In my book, Cody's Data Cleaning Techniques Using SAS, Third Edition, I describe a macro called %Auto_Outliers. This macro allows you to search
Tag: data management
In part 1 of this post, we looked at setting up Spark jobs from Cloud Analytic Services (CAS) to load and save data to and from Hadoop. Now we are moving on to the next step in the analytic cycle, scoring data in Hadoop and executing SAS code as a
This article is not a tutorial on Hadoop, Spark, or big data. At the same time, no prerequisite knowledge of these technologies is required for understanding. We’ll give you enough background prior to diving into the details. In simplest terms, the Hadoop framework maintains the data and Spark controls and
In the first post of a two-part series, Phil Simon plays point-counterpoint with himself.
One of my favorite parts of summer is a relaxing weekend by the pool. Summer is the time I get to finally catch up on my reading list, which has been building over the year. So, if expanding your knowledge is a goal of yours this summer, SAS Press has
What's the impact of using data governance and analytics for the business side of education? It's an interesting question, and during a video interview, Dale Pietrzak, Ed.D., former Director of Institutional Effectiveness and Accreditation (IEA) at the University of Idaho, shared details on the results they're realizing from using SAS for
I am obsessed with jigsaw puzzles. Specifically, 1000-piece mystery puzzles, entertaining not just for their pictorial humor, but also for the challenge. Unlike traditional puzzles, you don't know what you are putting together because the completed puzzle isn't pictured on the box. Mystery puzzles are constructed so that you must
Recently, I worked on a cybersecurity project that entailed processing a staggering number of raw text files about web traffic. Millions of rows had to be read and parsed to extract variable values. The problem was complicated by the varying record composition. Each external raw file was a collection of
My New Year's resolution is “Unclutter your life,” and I hope this post will help you do the same. Here I share with you a data preparation approach and SAS coding technique that will significantly simplify, unclutter and streamline your SAS programming life by using data templates. Dictionary.com defines template as
Do you already know (or can you already do) “DevOps”? Do you work with SAS? Then a fresh look at the combination may be worthwhile! More and more companies are trying to put their production operations in the hands of software developers as well (2 out of 3, according to Jenkins), especially in analytics, and in particular in agile modeling and the refinement
By now you’ve seen the headlines and the hype proclaiming data as the new oil. The well-meaning intent of these proclamations is to cast data in the role of primary economic driver for the 21st century, just as oil was for the 20th century. As analogies go, it’s not too
Data management gets lost in the enthusiasm around artificial intelligence (AI) and machine learning (ML). That's not surprising when it's an algorithm that decides what search results to show you, guides the self-driving cars on the roads, and powers the anti-fraud bots that monitor every credit card transaction we make. Charles
This blog post outlines how to create your own CAS functions using the CAS Language. It also includes a partial list of both CASL built-in and common functions for reference.
SAS Technical Support has had several requests from customers who want to use SAS® software to help download their files from a website when there is no application programming interface (API) to do it. This post shows how to automate downloads using PROC HTTP and DATA step, and how to use the HTTP DEBUG statement.
The objectives that the leaders of an organization's business areas must achieve are as varied as they are important: improving the customer experience, raising profitability, reducing costs while making the production chain more efficient, or complying with the regulations that govern
With SAS Data Preparation and SAS Decision Manager, you can perform out-of-the-box column and row transformations to increase your data quality and build the foundations for data-driven innovation. This blog will discuss how you can leverage SAS Decision Manager to enrich data when preparing it through SAS Data Preparation.
The assertion that data is the new […] is gaining ground, especially in a new business scenario in which the digital economy is unstoppable. Organizations around the world are increasingly aware of this and are taking full advantage of technological innovations
SAS Viya is our latest extension of the SAS Platform and is interoperable with SAS® 9.4. There were a number of SAS Viya presentations at SAS Global Forum 2018. In this series, we will review several of the most interesting talks. This post reviews Hadley Christoffels’ talk, A Need For Speed: Loading Data via the Cloud.
The European Union’s General Data Protection Regulation (GDPR), taking effect on 25 May 2018, pertains not only to organizations located within the EU; it applies to all companies processing and holding the personal data of data subjects residing in the European Union, regardless of the company’s location. Here are four selected SAS tools for GDPR that allow you to protect personal data in SAS reports by suppressing counts in small demographic group reports.
Although the term Big Data has been in use worldwide for only 20 years, SAS has already spent more than four decades inspiring the extraordinary, a challenge that many companies and employees have already taken on by incorporating analytics solutions into processes as incredible as the exploration of the sun, the protection of the
The release of SAS Viya 3.3 has brought some nice data quality features. In addition to the visual applications like Data Studio or Data Explorer that are part of the Data Preparation offering, one can leverage data quality capabilities from a programming perspective. Here is an overview of SAS Data Quality 3.3 programming capabilities.
With SAS Viya 3.3, a new data transfer mechanism, Multi Node Data Transfer, has been introduced to move data between the data source and SAS Cloud Analytic Services. Learn more about this feature.
This is a continuation of my previous blog post on SAS Data Studio and the Code transform. In this post, I will review some additional examples of using the Code transform in a SAS Data Studio data plan to help you prepare your data for analytic reports and/or models. Create
SAS Data Studio is a new application in SAS Viya 3.3 that provides a mechanism for performing simple, self-service data preparation tasks to prepare data for use in SAS Visual Analytics or other applications. It is accessed via the Prepare Data menu item or tile on SAS Home. Note: A
When you travel, go shopping, or stay at a hotel, you expect first-class service, friendly treatment, and delivery of what you were promised. That is what we demand of the companies and organizations we deal with: an experience that meets our
Much has been written about the value that North Carolina’s Criminal Justice Law Enforcement Automated Data Services (CJLEADS) system has brought the state’s court personnel and law enforcement officers. CJLEADS integrates dozens of NC criminal justice and law enforcement data sets, a vast improvement over the state’s legacy processes. Law
Here are some new tips for masking. The new EU General Data Protection Regulation (GDPR) requires your company to implement (quote) all necessary technical and organizational measures and to take into consideration the available technology at the time of the processing and technological developments. So, how can you comply with
You work with data. Data about your customers. It's likely that your customers' identity could be determined from the data you’ve collected. Starting in May 2018, a new data protection law will be in effect. This means you’re going to have to document which technical measures you’ve implemented to prevent your
When using conventional methods to access and analyze data sets from Teradata tables, SAS brings all the rows from a Teradata table to SAS Workspace Server. As the number of rows in the table grows over time, it adds to the network latency to fetch the data from a database
What changes will 2018 bring to data management? I asked experts for their opinions on the technology trends of 2018 and compared them with my own expectations. Five major trends have emerged that, in my view, will accompany us in data management this year: 1. Data movement becomes more important. Cloud providers have already shown how easy it