Customer Intelligence
Sharad Saxena
Assessing classifier performance based on a profit matrix in SAS Viya

The ultimate objective of a churn model is preventing churn by making a retention offer. To determine reasonable values for profit and loss information, consider the outcomes and the actions that you would take given knowledge of these outcomes. For example, the marketing department of a telecommunications company wants to offer a discount to people who are no longer on a fixed-term contract. To prevent churn, the company is willing to make an offer in exchange for a one-year contract extension.
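
As a rough illustration of how a profit matrix turns a churn probability into a decision, here is a minimal Python sketch. The dollar amounts, the `PROFIT` layout, and the function names are hypothetical and are not taken from the article or from SAS Viya. It also assumes, for simplicity, that an accepted offer always retains the customer.

```python
# Hypothetical profit matrix: PROFIT[actual outcome][action taken].
# All amounts are illustrative assumptions, not figures from the article.
PROFIT = {
    "churn": {"offer": 80.0, "no_offer": -100.0},  # would-be churner
    "stay": {"offer": -20.0, "no_offer": 0.0},     # loyal customer
}


def expected_profit(p_churn, action):
    """Expected profit of one action given the predicted churn probability."""
    return (p_churn * PROFIT["churn"][action]
            + (1.0 - p_churn) * PROFIT["stay"][action])


def best_action(p_churn):
    """Choose the action with the higher expected profit."""
    return max(("offer", "no_offer"),
               key=lambda action: expected_profit(p_churn, action))


# Score a few customers with made-up churn probabilities.
for p in (0.05, 0.30, 0.80):
    action = best_action(p)
    print(p, action, round(expected_profit(p, action), 2))
```

With these assumed values, the offer pays off only when the predicted churn probability exceeds 0.1, so the decision threshold comes from the profit matrix rather than from a default cutoff of 0.5.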

Advanced Analytics | Machine Learning
Kevin Scott
SAS® Fast-KPCA: An efficient and innovative nonlinear principal components method

The SAS® Fast-KPCA implementation bypasses the limitations of exact KPCA methods. Internally, SAS® uses k-means to find a representative sample of points. This row reduction method has the advantage that the c centroids are chosen to minimize the variation of the points nearest to each centroid and to maximize the variation between cluster centroids. In some cases, running k-means before computing the SVD increases numerical stability and improves downstream clustering, discrimination, and classification.
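
The general idea of reducing rows with k-means before the decomposition can be sketched in Python with NumPy. This is not the SAS implementation. It is a minimal approximation that assumes an RBF kernel, a plain Lloyd's k-means, and an eigendecomposition of the centered c x c centroid kernel. The function names and the `gamma` parameter are hypothetical.

```python
import numpy as np


def kmeans(X, c, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns an array of c centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=c, replace=False)]  # copy of c rows
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(c):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def rbf(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)


def approx_kpca(X, c=10, n_components=2, gamma=1.0):
    """Approximate kernel PCA scores using c k-means centroids."""
    C = kmeans(X, c)
    K_cc = rbf(C, C, gamma)                      # c x c instead of n x n
    one_c = np.full((c, c), 1.0 / c)
    K_centered = K_cc - one_c @ K_cc - K_cc @ one_c + one_c @ K_cc @ one_c
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    top = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[top], 1e-12)        # guard against tiny values
    V = eigvecs[:, top]
    # Project every original row onto the components found on the centroids.
    K_xc = rbf(X, C, gamma)                      # n x c
    one_n = np.full((len(X), c), 1.0 / c)
    K_xc_centered = (K_xc - one_n @ K_cc - K_xc @ one_c
                     + one_n @ K_cc @ one_c)
    return K_xc_centered @ V / np.sqrt(lam)


X = np.random.default_rng(1).normal(size=(200, 3))
scores = approx_kpca(X, c=10, n_components=2)
print(scores.shape)
```

The point of the sketch is the cost profile: the eigendecomposition runs on a c x c matrix, and the only n-sized object is the n x c kernel between the rows and the centroids.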

Advanced Analytics | Analytics | Data Management
Estelle Wang
Find duplicates and near-duplicates in a corpus with Natural Language Processing

To find exact duplicates, comparing every pair of strings is the simplest approach, but it is neither efficient nor sufficient. Hashing each document with MD5 or SHA-1 gives the correct result faster, yet near-duplicates still go undetected. Text similarity is useful for finding files that look alike. There are various approaches, and each has its own definition of which documents count as duplicates. That definition, in turn, affects the type of processing required and the results produced. Below are some of the options. Using SAS Visual Text Analytics, you can customize and accomplish this task during your corpus analysis either with the Python SWAT package or with PROC SQL in SAS.
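
The difference between exact and near-duplicate detection can be shown with a short, standard-library Python sketch. It does not use SAS Visual Text Analytics or SWAT. The sample documents, the function names, and the choice of Jaccard similarity over three-word shingles are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def exact_duplicates(docs):
    """Group document ids whose text has the same MD5 digest."""
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        digest = hashlib.md5(text.encode("utf-8")).hexdigest()
        groups[digest].append(doc_id)
    return [ids for ids in groups.values() if len(ids) > 1]


def shingles(text, k=3):
    """Set of lowercased k-word shingles for one document."""
    words = text.lower().split()
    return {" ".join(words[i:i + k])
            for i in range(max(len(words) - k + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity between the shingle sets of two texts."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


# Made-up corpus: "a" and "b" are identical, "c" differs by one word.
docs = {
    "a": "The quick brown fox jumps over the lazy dog",
    "b": "The quick brown fox jumps over the lazy dog",
    "c": "The quick brown fox jumped over the lazy dog",
}

print(exact_duplicates(docs))                    # hashing finds a and b only
print(round(jaccard(docs["a"], docs["c"]), 2))   # similarity score for a vs c
```

Hashing groups only the byte-identical pair, while the similarity score gives a graded measure that a threshold can turn into a near-duplicate decision. Where that threshold sits is exactly the kind of definition the paragraph above says must be chosen.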

