Learn how to fit a decision tree and use your decision tree model to score new data. In Part 6 of this series we took our Home Equity data saved in Part 4 and fit a logistic regression to it. In this post we will use the same data and
Tag: data management
Learn how to fit a logistic regression and use your model to score new data. In part 4 of this series, we created and saved our modeling data set with all our updates from imputing missing values and assigning rows to training and validation data. Now we will use this
Learn how to fit a linear regression and use your model to score new data. In part 4 of this series, we created our modeling dataset by including a column to identify the rows to be used for training and validating our model. Here, we will create our first model
Learn how to split your data into a training and validation data set to be used for modeling. In part 3 of this series, we replaced the missing values with imputed values. Our final step in preparing the data for modeling is to split the data into a training and
In part 1 of this series, we examined our data before building any models. Among the discoveries were missing values in some of our columns. Missing values are an inevitable part of data analysis. Whether it's due to a faulty sensor, human error, or simply the absence of information, missing
In part 1 of this series, we examined our data before building any models. Among the discoveries was a column that seemed to contain a SAS date value. Here, we will discuss what exactly is meant by a 'SAS date', how to format it correctly, and how to create a
Welcome to my series on getting started with Python integration to SAS Viya for predictive modeling. Exploring Data - Learn how to explore the data before fitting a model Working with Dates - Learn how to format a SAS Date and calculate a new column Imputing Missing Values - Learn
Welcome to the first post in my series Getting Started with Python Integration to SAS Viya for Predictive Modeling. I'm going to dive right into the content assuming you have minimal knowledge on SAS Cloud Analytic Services (CAS), CAS Actions and Python. For some background on these subjects, refer to
My recent work has focused heavily on migration, especially onto the SAS Viya platform and cloud more generally. Rather unexpectedly during this process, we have found that data observability is becoming increasingly important to customers. They start simply by looking at tracing files, but soon find that it has a
Here at SAS, we understand the importance of having access to cutting-edge professional resources. That’s why, for more than 40 years, we’ve provided individuals in programming, data management and analytics fields with low-cost and no-cost materials that promote success in their educational and professional journeys. And today, as the demand
SAS' Mark Jordan shows you how to modify data using PROC SQL, PROC DATASETS and SAS macros.
Desde hace algún tiempo venimos escuchando que los datos son la nueva moneda de cambio. El nuevo oro, dicen muchos. En un mundo cada vez más digital e informado como el que tenemos esta afirmación en realidad se queda corta: son el oxígeno que nos permite pensar y movernos en
¿Qué tienen en común una empresa norteamericana que desarrolla la próxima generación de automóviles sin conductor; una fintech que otorga préstamos en Colombia sin historial crediticio y una farmacéutica global que desarrolla un tratamiento para virus como el Covid-19? Los datos. Considerados como el “nuevo petróleo” o el motor del
SAS' Leonid Batkhan presents an implementation of parallel processing by spawning multiple SAS sessions using SYSTASK statements with subsequent synchronization.
It’s safe to say that SAS Global Forum is a conference designed for users, by users. As your conference chair, I am excited by this year’s top-notch user sessions. More than 150 sessions are available, many by SAS users just like you. Wherever you work or whatever you do, you’ll
SAS Global Forum 2021 will be jam-packed with inspiring content. Register today to ensure you don't miss a second of this year's event.
The people, the energy, the quality of the content, the demos, the networking opportunities…whew, all of these things combine to make SAS Global Forum great every year. And that is no exception this year. Preparations are in full swing for an unforgettable conference. I hope you’ve seen the notifications that
Find out the most popular SAS Users YouTube channel how to tutorials, and learn a thing or two!
In my previous blog post, I talked about using PROC CAS to accomplish various data preparation tasks. Since then, my colleague Todd Braswell and I worked through some interesting challenges implementing an Extract, Transform, Load (ETL) process that continuously updates data in CAS. (Todd is really the brains behind getting
If you’re like me and the rest of the conference team, you’ve probably attended more virtual events this year than you ever thought possible. You can see the general evolution of virtual events by watching the early ones from April or May and compare them to the recent ones. We
SAS' Leonid Batkhan explains the data cleansing task of removing unwanted repeated characters in SAS character variables.
SAS' Leonid Batkhan reviews SAS functionality related to the character strings quoting/unquoting, then dives deep into unquoting SAS character variables.
Analytics is playing an increasingly strategic role in the ongoing digital transformation of organizations today. However, to succeed and scale your digital transformation efforts, it is critical to enable analytics skills at all tiers of your organization. In a recent blog post covering 4 principles of analytics you cannot ignore,
SAS' Leonid Batkhan reveals how to change lengths for all character variables in a data set and all data sets in a data library to facilitate data migration to Unicode encoding environment.
One of the first and most important steps in analyzing data, whether for descriptive or inferential statistical tasks, is to check for possible errors in your data. In my book, Cody's Data Cleaning Techniques Using SAS, Third Edition, I describe a macro called %Auto_Outliers. This macro allows you to search
In part 1 of this post, we looked at setting up Spark jobs from Cloud Analytics Services (CAS) to load and save data to and from Hadoop. Now we are moving on to the next step in the analytic cycle, scoring data in Hadoop and executing SAS code as a
This article is not a tutorial on Hadoop, Spark, or big data. At the same time, no prerequisite knowledge of these technologies is required for understanding. We’ll give you enough background prior to diving into the details. In simplest terms, the Hadoop framework maintains the data and Spark controls and
In the first post of a two-part series, Phil Simon plays point-counterpoint with himself.
One of my favorite parts of summer is a relaxing weekend by the pool. Summer is the time I get to finally catch up on my reading list, which has been building over the year. So, if expanding your knowledge is a goal of yours this summer, SAS Press has