The missing link in your data strategy – Part 1

"I skate to where the puck is going to be, not where it has been."

-  Wayne Gretzky

I love this quote from Wayne Gretzky. It sums up how most organizations approach data strategy.

Data strategy typically starts with a strategic plan laid down by the board. The CEO will have a grand vision for a future business model or direction for the organization and this is subsequently broken down into strategic plans for each major function within the company. So the whole organization (in theory) tries to "skate to where the puck is going to be" by aligning people, process and technology toward this future vision, the classic "PPT."

As data professionals, we mostly get involved with the 'T' aspect of this acronym because data strategy typically involves some form of IT transformation.

Perhaps your business model is adapting to include wholesale and retail consumers, or maybe the CEO wants to extend the model to service B2B customers as well as your traditional B2C base. Whatever the change, systems and information will undoubtedly need to be extended or replaced to cope with the grand new vision.

A transformation plan will be drawn up – and somewhere, buried deep within the work breakdown structure, is a step marked system migration.

SAS Grid Manager for Hadoop nicely tied into YARN (Part 1)

If you'd like to extend your investment in the Hadoop infrastructure, SAS Grid Manager for Hadoop can help by enabling you to colocate SAS Grid jobs on your Hadoop data nodes. It works because SAS Grid Manager for Hadoop – which is Cloudera certified – is integrated with the native components of your Hadoop ecosystem, specifically YARN and Oozie.

How does it work? The SAS Grid Manager for Hadoop conceptual architecture diagram shown below illustrates the various tiers in a complete SAS deployment. Note that the Cloudera Hadoop cluster and the SAS Grid components are colocated on the same hardware, making this both a data and a server tier. Kerberos is also a required component in this environment.

[Diagram: SAS Grid Manager for Hadoop conceptual architecture]

What does an effective data strategy look like?

In the previous post, I played devil's advocate over the very concept of a data strategy. I discussed the limitations of all business strategies, including and especially data ones.

Today I'll turn off the cynical side of my brain. At a high level, I'll explore the two necessary components of a contemporary data strategy.

Self-service data preparation transforms data professionals into data rock stars

When my band first started and was in need of a sound system, we bought a pair of cheap yet indestructible Peavey speakers, some Radio Shack microphones and a power mixer. The result? We sounded awful and often split our eardrums with high-pitched feedback and raw, untrained vocals. It took us years of practicing and playing out before we stepped up to the right gear – a Mackie 808s power mixer, a pair of 15" Community speakers and a Shure Beta 58 microphone. The solution was lightweight, enriched our vocals and suppressed feedback. It made playing out a pleasure and fans actually started coming to gigs. Whether you're singing karaoke, ripping a guitar solo live to hundreds of fans or cleaning up your customer data on Hadoop, having the right gear is essential.

Data professionals, including business analysts and data scientists, face similar struggles. IT lacks the agility to respond quickly to their data requests. The data is often raw, noisy and needs to be enriched. And these data professionals lack the skills to cleanse, manage and transform that data on Hadoop. As a result, more time is spent preparing data than generating insight.

Do organizations need to adopt formal data strategies?

If data and its "big" counterpart are so important, then it stands to reason that all organizations need to adopt formal data strategies to be successful.

Or do they?

In this four-part series, I'll examine the question in depth.

I'll start today by playing devil's advocate. Do organizations really need to formalize and follow a data strategy?

Generally speaking, there are several main problems with promulgating specific business strategies – and data is no exception to this rule.

As the calendar turns – Part 2

In this two-part series, which posts as the calendar turns to a new year, I revisit the top data management topics of 2015 (Part 1) and then try to predict a few of the data management trends of 2016 (Part 2).

Data management in 2016

The Internet of Things (IoT) made significant strides in 2015. It made progress in health care, manufacturing, energy, retail and other industries – as well as the mass market – in the form of consumer electronics and household appliances enabled by embedded software and sensors to collect and exchange data. With Gartner predicting 5.5 million new things will get connected every day in the new year, a lot of data management trends in 2016 will be connected to IoT-related initiatives.

The connection I will be looking for throughout the new year, however, is whether the data emanating from IoT can be integrated. IoT will produce a lot of new data sources related to data already being managed by existing applications, systems and processes. For example, sensors added to track the shipping and inventory of manufactured parts and products have to be connected to supply chain management systems. In a previous post I discussed why, without integration, the immense potential of IoT cannot be actualized. To be valuable, IoT data needs to be processed, analyzed and shared, not only between devices but also with other systems and people. This is why I think that 2016 will not only be about connecting things to the Internet, but also the year we ask if everything is connected.

What say you?

Where do you see your data management efforts focused in 2016? Please share your perspectives by posting a comment below.

Watch a demo to find out how SAS can help you manage data beyond boundaries.

As the calendar turns – Part 1

In this two-part series, which posts as the calendar prepares to turn 2015 into 2016, I revisit the top data management topics of 2015 (Part 1) and then try to predict a few of the data management trends of 2016 (Part 2).

Data management in 2015

Big data continued to make data management headlines in 2015, but there was a noticeable shift in attention to its quality, with a growing realization that data quality still matters in large data sets. Big data, however, continued to challenge (and rightfully so) perspectives about how much quality data needs in order to be useful. As big data was put to more and more business uses, it also reinforced the importance of metadata, especially since a lot of big data is externally created, which makes big data governance such a challenge.

The externality of the majority of big data (e.g., open data, streaming data, cloud data) also influenced the 2015 prioritization of managing data beyond organizational boundaries, what David Loshin refers to as extra-enterprise data management. In 2015 more and more organizations embraced the benefits of managing data where it is – including minimizing data movement, improving productivity, reusing data management techniques, improving data governance and sharing valuable skills such as data stewardship.

In 2015 there was also a lot of effort focused on integrating big data into more traditional and business-critical applications, such as master data management, where identity, relevancy and privacy remain big challenges.

What say you?

Where were your data management efforts focused in 2015? Did any of your data management priorities shift throughout the year? Please share your experiences and perspectives by posting a comment below.

Agility in data availability

In my recent posts, I've been exploring the issues of integrating data that originates from beyond the organization. But this post looks at a different facet of extra-enterprise data management: data availability. In many organizations, there's a growing trend of making internal analytical data accessible to external consumers. I can point to some simple examples that we barely even consider as publication of analytical data:

  • Financial account comparisons. Financial services firms often provide online access to customers enabling them to review their own accounts, conduct internal research about other potential investments, and get information about how their investments compare to other, similar customers. They may also be able to see what customers with greater returns on investment are including in their portfolios. This data is intended to positively influence customer investment choices.
  • Utility reporting. Power providers share data with residential customers that compares their energy utilization with other similar households. At the same time, they suggest ways to save energy and reduce costs. This information is intended to help lower overall energy spend.
  • Maintenance service providers. Organizations that broker maintenance services – such as those that provide cleaning services or supplies – share information with customers that compares their overall maintenance spend against other customers' of similar size and revenues. This information is intended to generate ideas about how to be more effective, because it helps show how maintenance activities correlate with financial success.
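
The peer comparisons in these examples can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not how any particular provider does it: the function name `compare_to_peers`, the use of the median as the peer benchmark and the usage figures are all hypothetical.

```python
from statistics import median

def compare_to_peers(customer_value, peer_values):
    """Compare one customer's figure with the median of similar customers.

    Returns the peer median and the customer's percentage difference from it.
    The median is an assumed benchmark; a provider could use another measure.
    """
    peer_median = median(peer_values)
    diff_pct = (customer_value - peer_median) / peer_median * 100
    return peer_median, round(diff_pct, 1)

# Hypothetical monthly energy use (kWh) for one household and five similar ones.
peer_median, diff_pct = compare_to_peers(900, [600, 750, 800, 850, 1000])
print(f"Peer median: {peer_median} kWh, difference: {diff_pct:+.1f}%")
```

The same shape would apply to investment returns or maintenance spend; only the definition of "similar customers" changes.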

Beyond the boundaries of structured data: Part two

How many times have you gone onto a website, put a few things in a shopping cart, and then exited the Internet? I do it all the time. Sometimes when I log on to that site during my next visit, those same items are still in my cart – ready for purchase. I find that very interesting since I logged on as a "guest."

On the Internet, web logs collect information from every page you navigate to and every action you take on those pages. These logs are analyzed to help determine:

  • Navigation trends. Let’s say that 30% of the people who navigate to this page exit the website. That could be an indicator that the page is not very friendly. It might be time to consider making some changes to this page.
  • Purchase trends. I use the Internet to purchase items (especially for the dogs) quite a bit. Web logs show what I buy and when I purchase, so that information could be used to offer me similar products or discounts on the items I usually purchase.
  • Wish lists. Many sites let you keep a list of the products or services you'd like to purchase in the future. Again, based on this information, the site may offer similar products or discounts. But (trust me), they will never forget that you put these items on your wish list.
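
As a rough illustration of the navigation-trend analysis, the sketch below computes an exit rate per page: the share of views of a page that were the last page in a visit. It assumes the web logs have already been grouped into sessions (one ordered list of pages per visitor); the function name `exit_rates` and the sample sessions are hypothetical.

```python
from collections import defaultdict

def exit_rates(sessions):
    """Return {page: fraction of that page's views that ended a session}."""
    views = defaultdict(int)
    exits = defaultdict(int)
    for pages in sessions:
        for page in pages:
            views[page] += 1
        if pages:
            # The last page viewed in a session counts as the exit page.
            exits[pages[-1]] += 1
    return {page: exits[page] / views[page] for page in views}

# Hypothetical sessions: the ordered pages each visitor viewed.
sessions = [
    ["/home", "/cart"],
    ["/home", "/product", "/cart"],
    ["/home", "/product"],
    ["/product", "/home"],
]
rates = exit_rates(sessions)
for page, rate in sorted(rates.items()):
    print(f"{page}: {rate:.0%} of views ended the visit")
```

A page with an unusually high exit rate is a candidate for the kind of redesign described in the first bullet.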

Storing web data for analysis requires disk space. It's important to decide up front how you're going to prepare, retain and use the data. You may want to consider using Hadoop to analyze the data, and return valuable information that you could use for future sales. You may also consider merging this enhanced information with your existing customer data so you can run future analysis on combined data stores.
