Blogs

Tag: Hadoop

Advanced Analytics | Analytics | Data Management | Internet of Things | Learn SAS

Oguz Bayrakdar | June 22, 2020

Behind the scenes: Technical questions an analytics architect is frequently asked

Technical architecture is a varied subject, but some questions come up again and again. The whole broad spectrum, covering infrastructure, open source policy, IoT data management, and stakeholder engagement, indicates that the role played by enterprise architects is becoming increasingly important. In this post, the ten most frequently asked questions of the past 12 months and, to my customers

Analytics | Data Management | Learn SAS | Programming Tips

Kumar Thangamuthu | September 26, 2019

Data and Analytics Innovation using SAS & Spark - part 2

In part 1 of this post, we looked at setting up Spark jobs from Cloud Analytic Services (CAS) to load and save data to and from Hadoop. Now we are moving on to the next step in the analytic cycle: scoring data in Hadoop and executing SAS code as a

Data Management | Learn SAS | Programming Tips

Kumar Thangamuthu | August 27, 2019

Data and Analytics Innovation using SAS & Spark - part 1

This article is not a tutorial on Hadoop, Spark, or big data. At the same time, no prerequisite knowledge of these technologies is required for understanding. We’ll give you enough background prior to diving into the details. In simplest terms, the Hadoop framework maintains the data and Spark controls and

SAS Administrators

Joe Furbee | August 21, 2019

Accessing Databases in the Cloud – SAS Data Connectors and Amazon Web Services

Editor’s note: This is the first article in a series by Conor Hogan, a Solutions Architect at SAS, on SAS and database and storage options on cloud technologies. This article covers the SAS offerings available to connect to and interact with the various database options available in Amazon Web Services.

Analytics | Data Management

Frederik Vandenberghe | October 31, 2018

How to deploy your models with SAS Model Manager to Hadoop

How do you deploy your model so that business processes can make use of it? This post explores how SAS Viya applications can directly add models to a model repository, and specifically focuses on how to deploy them with SAS Model Manager to Hadoop.

Analytics | Data Management

Andrew Kramer | September 13, 2018

Publishing and running models in Hadoop on SAS Viya

This post walks through the steps to build complex models in SAS Model Studio, publish them to Hadoop, and run them there with SAS Viya.

SAS Administrators

Stuart Rogers | April 12, 2018

SAS Viya 3.3 Kerberos Delegation from SAS 9.4M5

You can now enable Kerberos delegation across the SAS Platform, using a single strong authentication mechanism across that single platform. As always when configuring Kerberos authentication, the prerequisites (Service Principal Names, service accounts, delegation settings, and keytabs) are important for success.

Advanced Analytics | SAS Administrators

Stuart Rogers | February 14, 2018

SAS Viya 3.3 - Some Kerberos principles

In this article, I will set out clear principles for how SAS Viya 3.3 will interoperate with Kerberos. My aim is to present some overview concepts for how we can use Kerberos authentication with SAS Viya 3.3. We will look at both SAS Viya 3.3 clients and SAS 9.4M5 clients.

Data Management

Todd Wright | July 18, 2017

Data management makes Hadoop easier

Todd Wright says Hadoop can be difficult – but data management can help.

Data Management

Joyce Norris-Montanari | July 10, 2017

Does Hadoop equal big data? What else does big data mean?

Joyce Norris-Montanari explains why it's so important to pick the right tools to manage your big data.

Data Management

David Loshin | June 29, 2017

Understanding big data identity resolution

David Loshin discusses big data identity resolution in a programming and execution environment.

Data Management

David Loshin | June 21, 2017

Challenges in identity resolution on a big data platform

David Loshin says simple approaches to identity resolution may not scale on a big data platform as data volumes increase.

Data Management

Jim Harris | June 15, 2017

On the way to data lakes and Hadoop – Part 2

In part 2, Jim Harris explains more about why you should address data quality and governance issues on the way to data lakes and Hadoop.

Data Management

Jim Harris | June 5, 2017

On the way to data lakes and Hadoop – Part 1

Jim Harris advocates addressing data quality and governance issues on the way to data lakes and Hadoop.

Data Management

Clark Bradley | May 31, 2017

Four ways SAS makes Hadoop easy

Clark Bradley explains how SAS can make Hadoop approachable and accessible.

Data Management

Phil Simon | May 24, 2017

Hadoop's evolution, part 2: Thoughts on the past, present and the future

Phil Simon chimes in on the last five years of Hadoop with an eye toward the future.

Advanced Analytics | Data Management

David Loshin | May 18, 2017

Transitioning to a hybrid data environment

David Loshin explores considerations for organizations gradually making the transition to Hadoop.

Data Management

Phil Simon | May 16, 2017

Hadoop's evolution, part 1: Lessons from AWS

Phil Simon looks at AWS's evolution before making some predictions about the future of Hadoop.

Data Management

David Loshin | May 8, 2017

Challenges of moving Hadoop into production

David Loshin discusses two common roadblocks in moving Hadoop from proof-of-concept to production.

Data Management

Joyce Norris-Montanari | May 4, 2017

Is big data just a source? Or is Hadoop ready for MDM?

Joyce Norris-Montanari poses the question: Is Hadoop/big data technology actually ready for MDM?

Advanced Analytics | Analytics | Data Management

Michael Herrmann | April 27, 2017

Data management for analytics: Close integration of IT and data science is crucial

I recently discussed here with my colleague Gerhard Svolba what role data quality and data governance play in data management for analytics. But what exactly defines modern data management, and what role do new technologies such as Hadoop and the like play in it? And what does the future collaboration

Banking | Insurance | Manufacturing

Analytics

Paul Gittins | March 3, 2017

How to apply design thinking to your analytics architecture

There aren’t many things that keep me awake at night, but let me share a recent example with you. I’ve been grappling with how to help a local SAS team respond to a customer’s request for a “generic enterprise analytics architecture.” As background, this customer organization had recently embarked on

Advanced Analytics

SAS Poland | January 4, 2017

Business Intelligence: 4 key development trends

Four dominant trends can be distinguished in the development of Business Intelligence tools and in the way they are used in modern organisations. These trends will shape the direction in which the tools develop, changing their role in supporting decision processes and building competitive advantages. Trend 1: Self-service models The

Data Management

Joyce Norris-Montanari | December 20, 2016

Importance of metadata – Bridging the gap (Part 3: operational metadata usage)

As I discussed in the first two blogs of this series, metadata is useful in a variety of ways. Its importance starts at the source system, and continues through the data movement and transformation processes and into operations. Operational metadata, in particular, gives us information about the execution and completion

Data Management

Joyce Norris-Montanari | December 9, 2016

Importance of metadata – Bridging the gap (Part 2: transformation and movement)

In the first blog of this four-part series, we discussed traditional data management and how we can apply these principles to our big data platforms. We also discussed how metadata can help bridge the gap of understanding the data as we move to newer technologies. Part 2 will focus on

Data Management

Joyce Norris-Montanari | December 2, 2016

Importance of metadata – Bridging the gap (Part 1: source system)

Traditional data management includes all the disciplines required to manage data resources. More specifically, data management usually includes: Architectures that encompass data, process and infrastructure. Policies and governance surrounding data privacy, data quality and data usage. Procedures that manage a data life cycle from creation of the data to sunset

Data Management

小林泉 | December 1, 2016

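Why Hadoop in particular calls for self-service, and the coming era of adaptive data management

2014

We have been rolling out SAS on Hadoop solutions in earnest since around 2014. In the context of that period, I think that around 2014 the use of Hadoop was still limited to companies whose line of business tends to produce very large data and for which using that data is itself the source of competitive advantage. At that time, most cases involved providing SAS's in-memory analytics engine to customers who already had Hadoop. After that, I feel Hadoop quickly became a commodity.

2015

By around 2015, the maturity of big data analytics had risen outside those lines of business as well. Helped by advances in data acquisition technology, companies began aiming to use kinds and volumes of data they had not used before for competitive advantage, and the choice of Hadoop as the means of storage and processing accelerated. Around this time, more companies skipped the step of evaluating Hadoop itself, a step that had always been present a few years earlier. Considering data volume, processing scale, scalability, and cost efficiency, they concluded that Hadoop was the appropriate technology. Big data is not only about data size, but according to statistics the author gathered through legwork, companies and organizations at the time seemed to treat roughly 10 TB as the tipping point between staying with conventional technology and adopting Hadoop. In this period, patterns for applying Hadoop as an alternative to conventional technology became visible.

An environment for new data: adopting Hadoop, with cost efficiency in mind, as a somewhat independent environment in which to accumulate, for the time being, new data that was previously discarded or has newly become obtainable, and to start something new.

Adding value to an existing data warehouse (often an evolution of the pattern above): processing new data on Hadoop and adding columns to the analytics base table to improve the accuracy of analytics; offloading ETL processing load and data storage to Hadoop.

A dedicated platform for BI and analytics: leaving only SQL-based applications on the RDBMS and running all other processing that SQL is poorly suited to, such as machine learning and visualization, on Hadoop. In most cases this is combined with an in-memory analytics engine.

Data lake (in the author's opinion): storing all data, including data with no immediate use, with priority on speed at the moment you decide to use new data. A common case is that when you want to use new data, it has not yet been accumulated, so there is a time lag before you can start. Some companies have in fact, since that time, treated that loss of time, that is, lost profit, as important and adopted this policy.

2016

These had been seen in overseas cases for several years, but in 2016 the following trends appear in Japan as well: scaling out an existing Hadoop in line with its original concept; operating multiple Hadoop clusters hierarchically as a global data platform; using Hadoop, driven by the AI and machine learning boom, as the environment for accumulating data for machine learning; and combining it, driven by IoT, with streaming processing (in SAS terms, the product called SAS Event Streaming Processing). I think the era in which Hadoop truly becomes the data platform has arrived. As evidence, SAS on Hadoop solutions are used in Japan, too, in almost every industry, including finance, retail, telecommunications, services, manufacturing, and pharmaceuticals.

The purpose of Hadoop is analytics, not conventional BI and reporting

Within this trend, one firm characteristic of Hadoop adoption has emerged. Of course, there is a bias here, in that our company makes the creation of business value, not merely the deployment of an IT system, the goal of what we deliver.

In most cases, Hadoop is introduced for analytics that creates business value. Consequently, in most cases the data stored in Hadoop is accessed mainly by end users in a purpose-oriented way, from an analytics perspective.

In other words, even though it is an IT system of some scale, the data stored in Hadoop is accessed only in a way driven by analytics purposes. The main users are analysts and data scientists. They need to be able to access the data the moment they think, "I want to use this." Requests like these from the user side are not requests with fixed requirements definitions, as in conventional BI, that is, reporting, so the conventional approach of aligning requirements with the IT department each time and asking IT to do the work does not hold up. The lead time of days or weeks delays decision-making and hurts business performance, or else it wears out the IT staff. In short, in analytics, analysts and data scientists need to be able to access the data on Hadoop themselves and freely prepare it so they can obtain it at the quality, in the format, and at the speed they need.

This topic is also covered in the following, so please take a look. [ITmedia series] Introduction to Analytics for IT Departments: Part 2, "Finally understood: Why Hadoop is used for big data analytics"; Part 3, "Data management for succeeding with data analysis, and the new role of the IT department". [Related blog] "Key points of data management for maximizing the effect of analytics".

This is why self-service data management (data preparation) tools are indispensable on Hadoop. As an analytics software vendor, SAS develops and provides a tool that lets even analysts and data scientists without strong IT skills freely obtain data on Hadoop by themselves. That tool is SAS Data Loader for Hadoop. SAS Data Loader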

Advanced Analytics

Muhammad Asif Abbasi | December 1, 2016

SAS integration with Hadoop - one success story

Nearly every organization has to deal with big data, and that often means dealing with big data problems. For some organizations, especially government agencies, addressing these problems provides more than a competitive advantage: it helps them ensure public confidence in their work or meet standards mandated by law. In this

Data Management

Matthew Magne | November 16, 2016

3 Thanksgiving lessons about data warehouses, Hadoop and self-service data prep

It's that time of year again when almost 50 million Americans travel home for Thanksgiving. We'll share a smorgasbord of turkey, stuffing and vegetables and discuss fun political topics, all to celebrate the ironic friendship between colonists and Native Americans. Being part Italian, my family augments the 20-pound turkey with pasta –

Analytics

Torsten Beck | October 17, 2016

Hadoop and data lakes: Use cases, benefits, and limits

The current BARC study reveals how companies view modern data management using Hadoop and data lake concepts. The user survey offers an interesting look at the current status of Hadoop and data lakes in Europe and North America. Where is the ecosystem used, what benefit is hoped for, and where are the limits, in order to
