All hail the data lake, destroyer of enterprise data warehouses and the solution to all our enterprise data access problems! Ok – well, maybe not. In part four of this series I want to talk about the confusion in the market I am seeing around the data lake phrase, including
Tag: Hadoop
Having spent a quarter of a century working on databases and on database-related technologies, I have developed an aura of skepticism toward any new product that hits the market presented as the best thing we have ever seen. It’s not that I love to revel in “I told you
In this post we dig deeper into the fourth recommended practice for securing the SAS-Hadoop environment through Kerberos authentication: When configuring SAS and Hadoop jointly in a high-performance environment, ensure that all SAS servers are recognized by Kerberos. Before explaining the complex steps in connecting to secure Hadoop within a
I recently caught up with Dr. Tom Davenport, analytics thought-leader and author of Big Data @ Work, in Dublin, where we talked about big data, the Internet of Things and Hadoop. I'll be sharing the conversation here with you in two parts. You'll find part one below, and you can check
Working out where Hadoop might fit alongside existing IT architectures, or where it might replace components of them, is a question on the mind of every organization drawn toward the promises of Hadoop. That is the main focus of this blog, along with discussions of some of the reasons they
In previous posts, we’ve shared the importance of understanding the fundamentals of Kerberos authentication and how we can simplify processes by placing SAS and Hadoop in the same realm. For SAS applications to interact with a secure Hadoop environment, we must address the third key practice: Ensure Kerberos prerequisites are met
Imagine being able to get into your car and say “Take me to work.” Then, it automatically drives as you read the morning paper. We’re not there yet. But we’re closer than you think. Google has already developed a prototype for a driverless car in the U.S. Driverless cars are
In the last 5 years, the buzzword "big data" has spread like wildfire. One could argue big data had been around prior, but during this time media outlets such as the Wall Street Journal and C-level executives started to take a keen interest in this topic. No longer a problem
SAS has been developing "secret sauce" technology for more than 38 years. Whether it has to do with being platform independent, processing in-database, running across a grid, or analyzing data in-memory like our SAS LASR Analytic Server or our High Performance Analytics offerings, secret sauce makes everything taste or, in
At most banks, data is stored in separate databases and data warehouses. Customer data is stored in marketing databases, fraud analyses are done on transactional data, and risk data is stored in risk data warehouses. Oftentimes even liquidity, credit, market, and operational risk data is stored separately as well. Bringing
In the first installment of this series on Hadoop, I shared a little of Hadoop's genesis, framing it within four phases of connectivity that we are moving through. I also stated my belief that Hadoop has already arrived in the mainstream, and we are currently moving from phase three, connecting people, to phase four
So, you've heard the Hadoop hype and you are looking into – or have already invested in – Hadoop. Maybe you have also realized some benefits from the Hadoop ecosystem. But now you want to maximize those benefits by using advanced analytics, or you might have heard about algorithms or machine learning libraries available
So, with the simple introduction in Understanding Hadoop security, configuring Kerberos with Hadoop alone looks relatively straightforward. Your Hadoop environment sits in isolation within a separate, independent Kerberos realm with its own Kerberos Key Distribution Center. End users can happily type commands as they log into a machine hosting the
The panel moderator looks out over the audience. It’s a large crowd. For the first time ever, Big Data, Hadoop, and the Internet of Things are appearing on stage together. The conversation has just begun, so let’s listen in for a minute. Big Data: “…and people have been trying to
In the world of IT, very few new technologies emerge that are not built on what came before, combined with a new, emerging need or idea. The history of Hadoop is no exception. To understand how Hadoop came to be, we therefore need to understand what went before Hadoop that led to its creation. To understand
A challenge for you – do a Google search for “Hadoop Security” and see what types of results you get. You’ll find a number of vendor-specific pages talking about a range of projects and products attempting to address the issue of Hadoop security. What you’ll soon learn is that security
Scalability is the key objective of high-performance software solutions. “Scaling out” means adding more server machines to a solution so that multiple processes can run concurrently in dedicated environments. This blog post will briefly touch on several scalability concepts that affect SAS.
Okay, let's say your data is in Hadoop. The distributed, open source framework is configured as it should be across low-cost servers and your data is sitting in those clusters. It's been a meaningful effort to get to this point, but how does it benefit your organization? If it's not doing something
Have you already met the little yellow elephant? Hadoop is changing the world right now – at least in IT. Some experts predict that within the next three years more than half of the world's data will be stored in Hadoop. The fact is: even today, the average cost per
For Hadoop to be successful as part of the modern data architecture, it needs to integrate with existing tools. This integration allows you to reuse existing resources (licenses and personnel) and is typically 60% of the evaluation criteria for integration of Hadoop into the data center. One of the most
Even though it sounds like something you hear on a Montessori school playground, this theme “Share your cluster” echoes across many modern Apache Hadoop deployments. Data architects are plotting to assemble all their big data in one system – something that is now achievable thanks to the economics of modern
Perhaps you're a big data expert who is fluent in Pig, Hive, MapR and all the technologies associated with the open source big data framework. Or maybe your role hasn't yet been touched by the increasingly popular Hadoop system. It's certainly worth a few minutes of your time (or perhaps
SAS In-Memory Statistics for Hadoop is a single interactive programming environment for analytics on Hadoop that integrates analytical data preparation, exploration, modeling and deployment. Its principal components are the IMSTAT procedure (PROC IMSTAT) and the SAS LASR Analytic Engine (or SASIOLA engine for input-output with LASR). Within the SAS In-Memory Statistics
“One does not discover new lands without consenting to lose sight of the shore for a very long time.” - André Gide Ever heard of OpenOffice, Hadoop, Android, Firefox or MySQL? If so, can you identify the common denominator between these software tools and applications? If you answered, “They’re all
“If you build it, he will come.” – From the movie “Field of Dreams” “Build it and they will come” is a popular quote often attributed to the movie Field of Dreams. But guess what? This quote is not from the movie; it’s actually a misquote. [See the actual quote
“I have travelled the length and breadth of this country and talked with the best people, and I can assure you that data processing is a fad that won’t last out the year.” (Editor in charge of business books for Prentice Hall, 1957) Whereby the Analytics Isle tends to be a
We asked Lars George, EMEA Chief Architect at Cloudera, to share his opinion about Hadoop, Big Data and future market trends in Business Analytics. For all those who want to know more about Hadoop, we recommend this TDWI whitepaper and how to apply Big Data Analytics. The last few years
There was something a while back, you remember, hoodie journalism?! In this debate over digital versus print, young versus long-established, #hoodie vs. #schlipsy, over what counts as premium and what is up close to the action, over what looks modern, contemporary and innovative and what pays the rent at the end of the month, one point
It was just a couple of years ago that folks were skeptical about the term "data scientist". It seemed like a simple re-branding of an established job role that carried titles such as "business analyst", "data manager", or "reporting specialist". But today, it seems that the definition of the "Data
Big data and the cloud are becoming inseparable: “cloud resources are needed for the storage and execution of big data projects, and big data gives companies a good opportunity to move to the cloud.” We could say that big data and