This article discusses how to use SAS to filter variables in a dataset based on the percentage of missing values or duplicate values. The missing value statistics can be implemented by either DATA step programming on your own or reusing the existing powerful PROC FREQ.
Parts 1 and 2 of this blog post discussed exploring and preparing your data using SASPy. To recap, Part 1 discussed how to explore data using the SASPy interface with Python. Part 2 continued with an explanation of how to prepare your data to use it with a machine-learning model.
Bringing the power of SAS to your Python scripts can be a game changer. An easy way to do that is by using SASPy, a Python interface to SAS allowing Python developers to use SAS® procedures within Python. However, not all SAS procedures are included in the SASPy library. So,
The DATA step remains a popular way to create and manipulate SAS data sets. Whether you are reshaping a data set entirely or simply assigning values to a new variable, there are numerous tips and tricks that you can use to save time and keystrokes.
SAS' Leonid Batkhan shows you how to compare SAS data sets that include common and uncommon columns. You'll learn how to check mark commonalities and color-code differences in data tables side-by-side columns and add a comments field to see greater detail.