Learn how to fit a random forest and use your model to score new data. In Part 6 and Part 7 of this series, we fit a logistic regression and decision tree to the Home Equity data we saved in Part 4. In this post we will fit a Random
Tag: Python
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. Given the exciting developments around SAS & Snowflake, I'm eager to demonstrate how to effortlessly connect Snowflake to the massively parallel processing CAS server in SAS Viya with the Python SWAT package. If you're interested
Comparing Logistic Regression and Decision Tree - Which of our models is better at predicting our outcome? Learn how to compare models using misclassification, area under the curve (ROC) charts, and lift charts with validation data. In part 6 and part 7 of this series we fit a logistic regression
Learn how to fit a logistic regression and use your model to score new data. In part 4 of this series, we created and saved our modeling data set with all our updates from imputing missing values and assigning rows to training and validation data. Now we will use this
Learn how to fit a linear regression and use your model to score new data. In part 4 of this series, we created our modeling dataset by including a column to identify the rows to be used for training and validating our model. Here, we will create our first model
Learn how to split your data into a training and validation data set to be used for modeling. In part 3 of this series, we replaced the missing values with imputed values. Our final step in preparing the data for modeling is to split the data into a training and
In part 1 of this series, we examined our data before building any models. Among the discoveries were missing values in some of our columns. Missing values are an inevitable part of data analysis. Whether it's due to a faulty sensor, human error, or simply the absence of information, missing
Welcome back to my SAS Users blog series CAS Action! - a series on fundamentals. In this post, I'll show how to create user defined functions (UDFs) for the distributed CAS server using SAS and CASL code. Once the UDF is created, you can use it on the CAS server with programming
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In this post I'll show how to create user defined functions (UDFs) for the distributed CAS server using the SWAT package. Once the UDF is created you can use it on the CAS server with programming
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In this post I'll show how to impute missing values in a distributed CAS table using the fillna method from the Pandas API in the SWAT package and the impute CAS action. Load and prepare data
SAS Model Manager and the sasctl packages aim to create a seamless ModelOps and MLOps process for Python and R models. Python and R models are not second-class citizens within SAS Model Manager. SAS, Python, and R models can be easily managed using our no-code/low-code interface. This is an interface that can be extended to support a variety of use cases.
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In this post I'll discuss how to remove duplicate rows from a distributed CAS table using the both the Pandas API in the SWAT package and the native CAS action. The Pandas API drop_duplicates method was
SAS expert Leonid Batkhan presents the %embed macro function as a way to embed both “foreign” and SAS native code from a file into a SAS program, preventing clutter in your code.
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In this post I'll discuss how to load multiple CSV files into memory as a single table using the loadTable action. Load and prepare data on the CAS server To start, we need to create multiple
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In this post I'll discuss how to update rows in a distributed CAS table. Load and prepare data in the CAS server I created a script to load and prepare data in the CAS server. This
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In this post I'll discuss saving CAS tables to a caslib's data source as a file. This is similar to saving pandas DataFrames using to_ methods. Load and preview the CAS table First, I imported the
In part 1 of this series, we examined our data before building any models. Among the discoveries was a column that seemed to contain a SAS date value. Here, we will discuss what exactly is meant by a 'SAS date', how to format it correctly, and how to create a
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In this post I'll discuss how to execute SQL with the Python SWAT package in the distributed CAS server. Prepare and load data to the CAS server I created a Python function named createDemoData to prepare
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In this post I'll discuss how to count missing values in a CAS table using the Python SWAT package. Load and prepare data First, I connect my Python client to the distributed CAS server and named
Welcome to my series on getting started with Python integration to SAS Viya for predictive modeling. Exploring Data - Learn how to explore the data before fitting a model Working with Dates - Learn how to format a SAS Date and calculate a new column Imputing Missing Values - Learn
Welcome to the first post in my series Getting Started with Python Integration to SAS Viya for Predictive Modeling. I'm going to dive right into the content assuming you have minimal knowledge on SAS Cloud Analytic Services (CAS), CAS Actions and Python. For some background on these subjects, refer to
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In this post I'll discuss how to bring a distributed CAS table back to your Python client as a DataFrame. In this example, I'm using Python on my laptop (Python client) to connect to the
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In previous posts, I discussed how to connect to the CAS server, how to execute CAS actions, and how your data is organized on the CAS server. In this post I'll discuss loading client-side CSV files into
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In previous posts, I discussed how to connect to the CAS server, working with CAS actions and CASResults objects, and how to summarize columns. Now it's time to focus on how to get the count of unique values
The addition of the PYTHON procedure and Python editor in SAS Viya enables users to execute Python code in SAS Studio. This new capability in SAS Viya adds another tool to SAS's existing collection. With this addition I thought, how can I utilize this new found power? In this example,
I will show you how to deploy multi-stage deep learning (DL) models in SAS Event Stream Processing (ESP) and leverage ESP on Edge via Docker containers to identify events of interest.
Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In previous posts, I discussed how to connect to the CAS server, how to execute CAS actions, and how to summarize columns. Now it's time to focus on how to rename columns in CAS tables. Load and explore data
Group and aggregate CAS tables Welcome to the continuation of my series Getting Started with Python Integration to SAS Viya. In previous posts, I discussed how to connect to the CAS server, how to execute CAS actions, and how to summarize columns. Now it's time to focus on how to group and aggregate CAS
Using SAS Viya in combination with open-source capabilities, we were able to develop an automated solution for logo detection that does not require any manual data labeling.
こんにちは!SAS Institute Japanの堀内です。今回も自然言語処理について紹介いたします。 前回の投稿では、実際にSASを使って日本語の文章を扱う自然言語処理の例を解説しました。 最終回の本投稿ではその応用編として、自然言語処理の代表的なタスクとSASによる実装方法を紹介します。なお、ここでいうタスクとは「定式化され一般に共有された課題」といった意味になります。自然言語処理には複数のタスクがあり、タスクごとに、共通する部分はあるとはいえ、問題解決のアプローチ方法は基本的に大きく異なります。SASには各タスクごとに専用のアクションセット1が容易されています。 要約タスク その名の通り文章を要約するタスクです。SASではtextSummarizeアクションセットで対応可能です。 ここでは、NHKのニュース解説記事「気になる頭痛・めまい 天気が影響?対処法は?」(https://www.nhk.or.jp/kaisetsu-blog/700/471220.html) の本文を5センテンスで要約してみましょう。 import swat conn = swat.CAS('mycashost.com', 5570, 'username', 'password') conn.builtins.loadActionSet(actionSet='textSummarization') conn.textSummarization.textSummarize(addEllipses=False, corpusSummaries=dict(name='corpusSummaries', compress=False, replace=True), documentSummaries=dict(name='documentSummaries', compress=False, replace=True), id='Id', numberOfSentences=5, table={'name':CFG.in_cas_table_name}, text='text', useTerms=True, language='JAPANESE') conn.table.fetch(table={'name': 'corpusSummaries'}) numberOfSentencesで要約文のセンテンス数を指定しています。結果は以下の通りです。 'まず体調の変化や天気、気温・湿度・気圧などの日記をつけ、本当に天気が影響しているのか、どういうときに不調になるのかパターンを把握すると役立ちます。 気温・湿度以外にも、気圧が、体調の悪化や、ときに病気の引き金になることもあります。 私たちの体は、いつも耳の奥にある内耳にあると言われている気圧センサーで、気圧の変化を調整しています。 ただ、天気の体への影響を研究している愛知医科大学佐藤客員教授にお話ししを伺ったところ、「台風最接近の前、つまり、気圧が大きく低下する前に、頭が痛いなど体調が悪くなる人は多い」ということです。 内耳が敏感な人は、わずかな気圧の変化で過剰に反応し、脳にその情報を伝えるので、脳がストレスを感じ、体のバランスを整える自律神経が乱れ、血管が収縮したり、筋肉が緊張するなどして、その結果、頭痛・めまいなどの体に様々な不調につながっているのです。' 重要なセンテンスが抽出されていることが分かります。 テキスト分類タスク 文章をいくつかのカテゴリに分類するタスクです。その内、文章の印象がポジティブなのかネガティブなのか分類するものをセンチメント分析と呼びます。ここでは日本語の有価証券報告書の文章をポジティブかネガティブか判定してみます。使用するデータセットは以下になります。 https://github.com/chakki-works/chABSA-dataset (なお、こちらのデータセットには文章ごとにポジティブかネガティブかを示す教師ラベルは元々付与されておりませんが、文章内の特定のフレーズごとに付与されているスコアを合算することで教師ラベルを合成しております。その結果、ポジティブ文章は1670文章、ネガティブ文章は1143文章、合計2813文章になりました。教師ラベルの合成方法詳細はこちらのブログをご覧ください。) pandasデータフレームにデータを格納した状態を確認してみましょう。 df = pd.read_csv(CFG.local_input_file_path) display(df)