Beyond “back-to-school”: Giving everyone the Power to Know® through education

“Back-to-school” is a common theme this time of year, but learning isn’t something that is relegated to a certain point on the calendar or even a particular point in life – it’s a lifelong journey.

Whether you are in early education using mobile technology for learning, a student or adult learner looking for free SAS® software, or an educator looking for new ways to teach, SAS has something to offer. We support education because it is an investment in the future, not just for our company, but for the world.

Learn how SAS can help you on your lifelong education journey (click image)

The Internet of Things and the continued growth of big data will create millions of jobs for data scientists in the coming years. But before that data scientist comes knocking at our door – or the doors of our customers – he or she will have had to take college-level courses in computer science and/or analytics.

Before even getting into college, though, kids need to be comfortable and familiar with subjects all throughout K-12. We can and we must support learning at all levels.

There are many SAS education programs and initiatives making a difference in K-12 and higher education. More than 26,000 teachers and students signed up in August to use SAS Curriculum Pathways free digital learning resources, bringing the total number of users to more than 550,000 around the world. More than 350,000 professors, students and independent learners are taking advantage of the free SAS software and training offered through SAS Analytics U. I encourage you to check those out and think about how they can help you and/or your children along their journey.

The accompanying infographic shows how those programs and others map to a lifelong commitment to learning.

I find it particularly rewarding to bring students and teachers to SAS to learn about how we can support their interests and goals.

Last week, I met with three budding data scientists, ages 10-11, who used analytics to learn more about their passions. Two focused on sports, one on pet adoption. We arranged a special day where they presented their research and met with a sports analytics expert, as well as with experts who analyze data on service dog breeding and endangered species preservation.


Kids can get excited about data and analytics if we can help them understand the relevance to their lives. This can put them on a path to rewarding careers in analytics or other STEM disciplines.

SAS hosts other events such as the annual Math Summit for teachers, various STEM days for students and numerous professor trainings where we help them integrate SAS into their instruction.

Next week, we will co-host a Data & Analytics Summit with Achieving the Dream, where more than 160 community and technical college representatives will learn how data and analytics can improve enrollment, course availability and student outcomes. Those schools are critical to building a talented workforce with the skills employers need.

From pre-K to the workforce and beyond, we should never stop learning. What’s the next step in your journey?

Why you need thick data and thin data

I’ve been running across the term “thick data” lately and even came across a definition earlier this week from Word Spy, an online glossary that highlights new pop culture terms before they’re cool.

So, what is thick data and why does it matter? It’s what we’ve traditionally thought of as qualitative data, and it could become even more important in the future as a balancing point to the thin data at the edges of the Internet of Things. Why? Because thick data can provide deeper meaning – the context, if you will.

After all, you can’t learn everything from 1s and 0s. Quantitative data can help you a lot, but how do you incorporate the softer stuff too? The human stuff that holds the meaning behind the numbers – answering questions such as why the numbers are what they are, and all the other stuff that’s not obvious from hard numbers alone.

A more practical example: Mobile marketing

Now that you have an understanding of thick data, why should you care about it for your business? Let’s look at mobile marketing. What I so often see as a consumer are brands targeting me with things that are absolutely ridiculous.

I can understand how they’re making assumptions about me based on the limited data they have access to, but it doesn’t make sense if you really know me. There are all kinds of possibilities when looking at that limited scope of data, but I might fall out on the wrong side of a decision tree, and rot there on the ground.

How can brands advance traditional data mining and statistical analysis to improve some of these digital promotions? Maybe you look at marrying structured data with unstructured data, including both qualitative and quantitative insights – creating a soup of both thick and thin data, thus using different types of data to improve the way you practice your art.

Thin data might be a burst of information with limited context. It’s not all the information that may affect a given scenario, like what offer might be relevant to me when I’m in my local grocery store late at night on a weekday. It may just be one piece of the equation. Look at iBeacon data streams, for example. This geo-targeted data knows when consumers are close by, or inside, a specific location. It’s proximity-based data and it’s valuable, but there isn’t a lot of marketing insight that can be derived just from that dimension of data, except that you’re there. Or, your device was there. We have to combine that information with something else that’s more robust to make smarter use of the data.

Thin data is valuable to collect and explore in large quantities, but the danger in using thin data alone comes with making direct offers without context. As marketers, let’s not blow it. Let’s recognize that, while all this stuff is coming at us, we still have to be good stewards of the data.

Make sure you know something about who you’re talking to before you send a message to the consumer. Learn to listen to all the data before you speak or initiate a conversation. If initially you come out with an offer just because you have a device that’s within a geo-fenced area, that communication might come back to haunt you. And, naturally, make sure you’re not invading privacy or misusing the data you have.
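To make the "listen before you speak" idea concrete, here is a minimal sketch of gating an offer on context. Everything in it is an assumption for illustration: the `ProximityEvent` and `CustomerContext` types, the `choose_offer` function and the sample offer are hypothetical and do not describe any particular marketing system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProximityEvent:
    """Thin data: a device was seen at a location, and little else."""
    device_id: str
    location: str


@dataclass
class CustomerContext:
    """Thick data: what is known about the person behind the device."""
    opted_in: bool = False
    interests: set = field(default_factory=set)


def choose_offer(event: ProximityEvent,
                 context: Optional[CustomerContext],
                 offers: dict) -> Optional[str]:
    """Return an offer only when thin and thick data together support it."""
    # No context or no consent: proximity alone is not a reason to speak.
    if context is None or not context.opted_in:
        return None
    # Look for an offer that matches both the place and a known interest.
    for interest in sorted(context.interests):
        offer = offers.get((event.location, interest))
        if offer is not None:
            return offer
    return None


# Hypothetical sample data.
offers = {("grocery", "coffee"): "10% off coffee beans"}
event = ProximityEvent(device_id="dev-1", location="grocery")
print(choose_offer(event, CustomerContext(True, {"coffee"}), offers))
print(choose_offer(event, None, offers))
```

The design choice worth noting is the default: when context is missing, the function stays silent rather than guessing.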

Data streams generated in the Internet of Things are giving us access to more and more data every day, but our thick data from posted videos, photos, notifications and conversations is growing too. How can we use both the thick and the thin to benefit the consumer and the brand?  If you can get that right, your opportunity to innovate and approach your marketing campaigns differently will be huge.

3Q 2015 Intelligence Quarterly: How to lead the digital transformation

There it is, staring us in the face: the answer that will forever transform banking as we know it and propel the industry into the digital age.

As a leader of the bank, you now have two options:

  1. The first is to put big data into the liability or even the risk bucket, and fight it with primitive tools such as costly, rigid and complex core banking and RDBMS systems.
  2. The second is to embrace big data and come out in front of the regulatory-driven compliance tsunami.

Embracing big data will empower the front office staff with the controls they need to make decisions at the point of the transaction and, at the same time, eradicate the complexity fueled by silo-based point solutions.

What is your choice?

To date, 90 percent of your colleagues are choosing option one and only dipping their toes into option two out of plain fear. The latest issue of Intelligence Quarterly focuses on the 10 percent who are boldly embracing option two.

How are they doing it? Many are taking a factory approach to big data and analytics, constantly trying out new ideas in a big data lab, and then taking what works and repeating it with precision. In particular, the factory approach processes information and turns the raw material (data) into something useful.

More often than not, banking executives will point out that they have big data projects started, and they are working to cut through the complexity. With very few exceptions, however, they struggle to embrace the digital age and attempt to solve growing data-intensive problems with yesterday’s tools and approaches.

When it comes to big data, who is in charge?

Any given person, department or function alone cannot change the bank. To really change the bank, it will take the efforts of IT, risk, retail and other departments all working together.

IT does automation and infrastructure, not optimization, nor does it industrialize the model management. It’s not unusual to find dozens of custom-built solutions with millions of incompatible rules and hundreds of copies of the data floating around in the bank. Combine the short-term focus with these silo-based point solutions, and we have a picture of true complexity.

Find out how banks are moving away from this level of complexity and shifting to a strategic state. Part of this process involves moving away from vendor consolidation and asking vendors to take on more by providing software as a service or even as an appliance. Instead of asking the vendors that contributed to the problems in the first place to do more of the same, banking leaders are asking: What will it take for you to replicate results to larger parts of our increasingly complex value chain?

The way that banks look at IT is changing. Indeed, a major banker I respect recently asked, “If I spend $300 million annually on AML alone, is it not reasonable to think that I should be receiving AML as an appliance?”

How can the risk department help lead the change? Throwing manpower at the problem will not break the growing wave of regulatory-driven compliance. Instead, we must let go of our investigative warehouses and the idea of offshore back offices. For an example, read how the chief model risk officer of Discover Financial Services embraced an analytical factory concept to cut through CCAR requirements with little effort. Other banks are using the analytical factory approach to empower front office staff to take ownership and responsibility of decisions at the point of the transaction.

Most heads of retail banking also are automating their customer interactions instead of optimizing the customer experience across touch points and channels. Why embrace hyperfragmentation through rigid, channel-based structures and systems when you could drive consistency across channels? Instead, you could use analytics to calculate risk-adjusted performance per client while creating personalized experiences and taking the costs out of the system.

Cut through the complexity

The best way to improve client performance while personalizing services to each client is to embrace the technology required to cut through the complexity. Unlike the rules-based technology of the past, ownership of the analytical factory calls for a new set of skills, be it social, demographic, economic or any other faculty conceivable. For example, SAS is working with one global, systemically important bank (G-SIB) that hired a team of astronauts to help find the extreme outliers in financial investigation data.

My favorite case study is about a UK bank that empowered the front office staff to make credit decisions. It asked, why not equip the customer-facing banker to make the credit decision that can be better for both the client and the bank? It makes sense, especially compared to the alternative of reducing the time it takes the risk department to process an internal request.

Why, then, are other banks not copying this approach? My guess: They applied the wrong skills to the job. The answer to too many databases isn’t to create another and call it an enterprise warehouse. The answer is a factory approach, as one insurance firm discovered when it separated risk models from the data, through the creation of a model factory. What do you know about your customer? Probably much less than this major insurer, since they now model customer behavior and analyze client perceptions to improve the customer experience and measure risk-weighted performance at a reduced cost.

Who should lead the digital transformation? It touches all aspects of the bank and needs a true champion at the top. The CEO should personally lead the way and change the bank — because the future of the bank is at stake.

And the best way for the CEO to lead the bank into the digital age is with a factory approach that automates, scales and manages data to support collaboration between departments and streamline analytics projects throughout the bank.

Cybersecurity and the doomsday case for analytics

Technology has brought the world a great deal of good, but the downside is that we’re increasingly vulnerable to some seriously scary stuff:

  • Terrorists taking control of airplanes through the in-flight entertainment system.
  • Governments breaking into secure systems and stealing identities.
  • Thugs messing with the steering of self-driving cars.

When everything is connected, everything can be attacked. I’m talking about the unique brand of mayhem caused by the really bad guys – the kind of people who want to bring down a stock exchange, tamper with nuclear weapons or spoil a city’s water supply. If it sounds like the plot of a James Bond movie, it should, because cybercriminals can and do create chaos on a cinematic level.

Whenever I hear these stories, I have mixed emotions. As a citizen concerned with protecting everything I hold dear, I share the very same concerns that you do. But I also have a front-seat view of developments in the field of cybersecurity, so I understand the amazing power of analytics to address what is surely one of the most complicated computer science challenges of our times. And that makes me optimistic.

Cybersecurity to the rescue

The way that we will thwart the evil masterminds is analytics. We’ll fight technology with technology.

The key to winning is prevention. The nature of the cyberthreat means it’s no longer enough to reinforce the perimeter. Stopping data breaches means assuming that criminals are already inside. In this new reality, analytics serve not as a barricade to keep criminals out, but an alarm that sounds when the virus they’ve implanted awakes.

Today the combination of event stream processing, Hadoop, in-memory analytics and visual analytics makes it possible to react in near real time, helping you spot the bad guys and foil their attempts. SAS has recently unveiled a cybersecurity solution to do just that, which will be available this fall. It works by searching hundreds of thousands of records per second and billions per day to spot the inevitable threats.

Plenty of companies try to build fences to keep people out, but we don’t do that. As soon as the virus wakes up and begins to do its thing, SAS will find it immediately and alert security teams – before the doomsday scenario plays out. The thing about villains is, they never seem to rest. They’re wily, they’re malicious, and they’re going to keep coming at us. The stakes are very high.

When it comes to foiling intricate plots, we’re going to need serious brainpower and cutting-edge analytics. That’s why I’m so passionate about bringing SAS’ expertise in cybersecurity to customers around the world.

Disaster relief efforts show promise of analytics and seemingly unrelated data sources

As monsoon season begins, many Nepal earthquake victims have shelter over their heads thanks in part to an unlikely intersection of two SAS global development projects.

The first project is with the International Organization for Migration (IOM). IOM is the first responder to any crisis that displaces people. IOM provides temporary shelter and helps coordinate the efforts of other relief agencies that provide food, clean water, medical care and security.

IOM is currently assisting thousands of victims in the earthquake-ravaged areas of Nepal. SAS is helping IOM analyze shelter data to help better allocate resources, based on the work we did with them following Typhoon Haiyan in the Philippines.

Using SAS Visual Analytics, IOM can see where the high-risk shelters are, based on factors such as:

  • A dangerous mix of overcrowding, unsafe drinking water and solid waste disposal problems.
  • High numbers of families still living in makeshift shelters.
  • Rapid growth of certain vulnerable populations in a short amount of time.
  • Higher concentrations of diarrhea, fever and skin disease among older people.
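As a rough illustration of how factors like these might be turned into a ranked list of sites, here is a minimal sketch. The field names, thresholds and sample records are hypothetical, chosen only for the example; they are not IOM's criteria and not how SAS Visual Analytics computes anything.

```python
def risk_flags(site):
    """Return the list of risk factors present at one shelter site.

    Assumes each site dict has positive 'capacity' and 'families' values.
    All thresholds below are illustrative assumptions.
    """
    flags = []
    overcrowded = site["people"] / site["capacity"] > 1.0
    if overcrowded and not site["safe_water"]:
        flags.append("overcrowded without safe water")
    if site["makeshift_families"] / site["families"] > 0.5:
        flags.append("majority of families in makeshift shelter")
    return flags


def rank_sites(sites):
    """Order sites so those with the most risk factors come first."""
    return sorted(sites, key=lambda s: len(risk_flags(s)), reverse=True)


# Hypothetical sample records.
sites = [
    {"name": "Site A", "people": 80, "capacity": 100, "safe_water": True,
     "families": 20, "makeshift_families": 2},
    {"name": "Site B", "people": 150, "capacity": 100, "safe_water": False,
     "families": 30, "makeshift_families": 20},
]
for site in rank_sites(sites):
    print(site["name"], risk_flags(site))
```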

As new data comes in, new insights are revealed. As you would expect, Kathmandu is the focus of the bulk of relief efforts. However, after visualizing data on young children, it was revealed that a nearby district had more small children, ages 1-5, and in particular, five times the number of infant girls as Kathmandu. This smaller district had a larger need for diapers, formula, children’s medicine and other supplies for nursing moms. These were quick, but important, insights to guide relief efforts.

Concentration of females under age 1 at IOM evacuation centers (click to enlarge)

How can global trade data inform disaster relief?

There’s another side to the Nepal data story, though. In April, SAS announced the launch of SAS Visual Analytics for UN Comtrade, which made 27 years of international trade data available using data visualization software. How is this helping with the Nepal earthquake response?

IOM is building temporary shelters for displaced people in Nepal and needed to understand where and how to quickly procure sheet metal roofing (CGI) before monsoon season. People are sleeping out in the open due to the fear that more aftershocks will bring buildings down on them, so protection during monsoon season is a big concern.

Using UN Comtrade, we were able to show IOM a graphic of the top exporters of CGI. Some of the findings include:

  • Neighboring India is the world’s largest producer of CGI roofing sheets that are wider than 24 inches, but India rarely sells it to Nepal.
  • Nepal is actually the world’s 7th largest producer so historically there’s good capacity for CGI fabrication in Nepal. Consequently, some of the supply can be sourced locally.
  • There are other potential sellers in the region like China (2nd), Thailand (8th) and Vietnam (9th).

    Top exporters of CGI roofing sheets, via SAS Visual Analytics for UN Comtrade (click to enlarge)
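A ranking like the one above boils down to summing export values per country for one commodity and sorting. The sketch below shows that aggregation in plain Python. The record layout, the `top_exporters` function and the placeholder countries and values are all assumptions for illustration; they are not UN Comtrade's schema or real trade figures.

```python
from collections import defaultdict


def top_exporters(records, commodity, n=3):
    """Sum export value per exporter for one commodity; return the top n."""
    totals = defaultdict(float)
    for record in records:
        if record["commodity"] == commodity:
            totals[record["exporter"]] += record["value_usd"]
    # Largest total first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Placeholder records, not real trade data.
records = [
    {"exporter": "Country A", "commodity": "roofing", "value_usd": 5.0},
    {"exporter": "Country B", "commodity": "roofing", "value_usd": 7.0},
    {"exporter": "Country A", "commodity": "roofing", "value_usd": 3.0},
    {"exporter": "Country C", "commodity": "roofing", "value_usd": 1.0},
    {"exporter": "Country C", "commodity": "cement", "value_usd": 9.0},
]
print(top_exporters(records, "roofing", n=2))
```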

Brian Kelly, who is leading the Nepal response for IOM, shared his thoughts. “Shelter is so important to helping the 63,000 displaced families create a level of stability and protection, especially with monsoon season upon us. With the UN Comtrade information, we were able to secure materials more quickly and, literally, put roofs over peoples’ heads.”

A new era of data-for-good

These projects just scratch the surface of what’s possible when new data, and those who know how to use it, are applied to humanitarian needs. Organizations such as DataKind and INFORMS, through its new Pro Bono Analytics program, are rallying data scientists to lend their time and expertise to helping people around the world. And there are many more data sets out there that could help with relief and other humanitarian efforts.

It’s an exciting time to be in the world of big data and analytics. We’re just beginning to understand how technology can tackle society’s “grand challenges.” Please share your ideas on what unlikely data sources might help with disaster relief. And, how can we bring the world’s analytics talent to bear on these challenges?


Bringing Hadoop into the mainstream

Jim Goodnight, Mike Olson, Herb Cunitz and Jim Davis discuss Hadoop.

Remember when the morning talk show hosts started talking about Twitter? That was weird at first. But now, even your small, home-town news stations have a Twitter handle, and so does your boss, most likely.

“Big data” took a similar route into the mainstream vernacular. At first, we heard pundits saying that only the banks had big data. Or only big government needed to worry about big data. But then, before we knew it, 60 Minutes and The Atlantic were running regular features discussing big data.

I’m not sure if Hadoop will ever hit that level of mainstream attention, but it has become an everyday topic with the leaders I talk to at conferences and customer events.  And the Hadoop naysayers are getting harder to find.

Why is that? Four reasons:

  1. Organizations are seeing Hadoop as more than just a dumping ground for their data. They’re approaching it with strategic business problems and learning how to treat it as an analytics platform.
  2. The early adopters who took the risks with the platform are seeing real results, and now everyone else is realizing it’s time to catch up.
  3. Today’s data volumes make it impossible to ignore Hadoop. We talked about this when discussing the Internet of Things, which is an undeniably huge and growing source of big data.
  4. Hadoop is becoming enterprise hardened and easier to implement and maintain. Vendors like Cloudera and Hortonworks are developing ecosystems around Hadoop that improve its stability and offer layers of governance and security that make it a viable option for even the most conservative companies.

Recently at the SAS Global Forum Executive Conference I discussed some of these topics on a panel with SAS CEO Jim Goodnight, Cloudera co-founder Mike Olson and Hortonworks president Herb Cunitz.

Jim was the first to say he's seeing more customers use Hadoop for analytics, and the other panelists agreed, mentioning Hadoop use cases from MasterCard and the Financial Industry Regulatory Authority. Herb and Mike both talked about how the technology that started out in many IT shops is now catching the attention of business leaders too.

Watch the video below to hear us discuss the growing use of Hadoop in the cloud, and learn one thing that Jim says is stupid to do with Hadoop (hint: it involves a straw). Fair warning, if you stop watching too early, you won’t hear why boards of directors are suddenly paying attention to Hadoop now too.

US Senate takes up fight against patent trolls

With the recent introduction of the Protecting American Talent and Entrepreneurship (PATENT) Act, the US Senate set aside partisan politics to take on a problem that plagues all industries, but especially high-tech.

In front of Congress, in the media and in a previous blog post, I have decried the current patent litigation landscape in the US. Simply put, patent trolls produce nothing, employ practically no one, and yet they threaten US innovation and economic growth.

The number of patent lawsuits is at historic levels, and promising to increase again in 2015. According to United for Patent Reform, an organization of like-minded companies and trade associations of which SAS is a member, “The number of patent lawsuits filed in the first quarter of 2015 was up 30% over the number filed in the fourth quarter of 2014. The percentage of those suits filed by patent trolls was also higher in this quarter than in the last (62% vs. 57%).”

Having been the target of such wasteful and frivolous suits, SAS is on the front lines of the battle both in the courts, and in the legislatures seeking a solution to this legalized extortion. I applaud the senators who have taken up this fight.

The legislation, S. 1137, is sponsored by Senate Judiciary Committee Chairman Chuck Grassley (R-IA), Ranking Member Patrick Leahy (D-VT), Senate Majority Whip John Cornyn (R-TX), Sen. Chuck Schumer (D-NY), Sen. Orrin Hatch (R-UT), Sen. Amy Klobuchar (D-MN) and Sen. Mike Lee (R-UT).

The introduction of this legislation demonstrates their leadership in the complex and critical area of patent reform. The bill attempts to protect defendants and consumers from frivolous and damaging lawsuits by clarifying the litigation process, increasing transparency and adding more risk for the plaintiffs, while recognizing concerns raised by other patent stakeholders. The proposed changes will make great strides in protecting American job creators from patent trolls while reaffirming America’s commitment to innovation, entrepreneurship and consumer welfare.

Now that there is meaningful legislation pending in both the Senate and the House of Representatives, I encourage Congress to act quickly in moving the legislation forward to bring common sense back to patent litigation.

What's your defense against cyberattacks?

Sophisticated cyberattacks are on the rise. And cybersecurity professionals are in demand. There’s a real shortage of talent in both the public and private sector, with a recent Booz Allen report recommending an increase in skills to protect government networks. Likewise, new IDC research sponsored by SAS recommends integrating analytics into the core of your cyber detection efforts.

How is your business tackling this problem from both the human and technology sides? Do you have people and processes in place to protect your assets and your reputation? What's your strategy to detect intrusions in real time?

The way I see it, cybersecurity requires the ability to store tremendous amounts of data, apply advanced analytics to determine when threats are happening in real time, and then immediately take action to take those activities offline.

Let’s look more closely at the problem – and the solution.

Today’s cyber criminals can gain access to your entire network from any single computer or entry point on the network. They can even come in through a contractor who has connected temporarily to your network or through a remotely managed system on the network.

Then, once they’re inside your network, the hackers quietly and methodically work their way through your systems – often for months – before their presence is even detected.

This isn’t a farfetched scenario taking place at some dodgy company in the bad part of town. It’s happening today on the networks of your favorite brands in almost every industry.

The solution is no longer a matter of identifying weak entry points, reinforcing security at the perimeter and stopping the cyber criminals before they get in. You have to assume they're already there.

But how can you find them hiding away in your network? And how can you stop them? You have to analyze the network traffic, compare it to normal traffic patterns and investigate any anomalies. That sounds simple enough, right? Just look for anything out of the ordinary.

The problem is that even an average sized company today sees 100,000 network transactions PER SECOND. Most companies aren't set up to monitor that much traffic, let alone store and analyze it all in real time.

But now you can. With low cost storage options like Hadoop, storing that much data is within reach. And with event stream processing, analyzing all of your network activity on the fly – not after the fact –  is possible too. Finally, with in-memory and visual analytics capabilities, you can see the unusual network patterns and react immediately. Of course, the system also sends out alerts and connects with your existing perimeter defense systems to notify security experts immediately.
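The "compare it to normal traffic patterns" step can be sketched as a rolling baseline check on per-second transaction counts. This is a deliberately simple illustration using a z-score over a sliding window; the `RateMonitor` class, its window size and its threshold are assumptions for the example and say nothing about how any SAS product detects threats.

```python
from collections import deque
from statistics import mean, pstdev


class RateMonitor:
    """Flag counts that deviate sharply from a rolling baseline."""

    def __init__(self, window=60, threshold=3.0, min_samples=10):
        self.history = deque(maxlen=window)  # recent "normal" counts
        self.threshold = threshold           # z-score cutoff (assumed)
        self.min_samples = min_samples       # warm-up before flagging

    def observe(self, count):
        """Return True if count is anomalous versus the recent baseline."""
        anomalous = False
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            # With zero variance there is no scale to judge against,
            # so nothing is flagged in that case.
            if sigma > 0 and abs(count - mu) / sigma > self.threshold:
                anomalous = True
        # Keep anomalies out of the baseline so they do not become "normal".
        if not anomalous:
            self.history.append(count)
        return anomalous


monitor = RateMonitor()
for count in [99, 101] * 10:   # steady traffic
    monitor.observe(count)
print(monitor.observe(500))    # sudden spike
print(monitor.observe(100))    # back to normal
```

A real deployment would look at far richer signals than a single count, but the shape is the same: learn what normal looks like, then alert on departures from it as the data streams in.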

This might sound like science fiction, but it’s happening now. In fact, it was unveiled this week at the RSA conference in San Francisco and will be available in the fall.

You can learn more about the evolving nature of cyber threats in this interview with Security Intelligence expert Stu Bradley.

2Q 2015 Intelligence Quarterly: Digital marketing in the modern era

How well do you truly know your customers? Maybe you can identify them on multiple channels, and you know how to cross-sell products in various situations. But do you know your customers better than your competitors do? And do you know them well enough to keep their data safe?

Truly knowing your customers is about more than identity. It’s also about preferences, needs and the contextual knowledge of time and place. In the digital world, customers expect to be known as individuals with distinct preferences, not just as one member of a segment. Understanding digital habits and demographic data is just the beginning.

If you think digitization starts and ends with amassing data, think again. The world is shifting from data being king to knowledge being king. Knowing your customer is powered 10 percent by your internal data and 90 percent by your ability to model the behavior of your customer.

It should therefore come as no surprise that knowing your customer starts with analytics. An analytics factory can help you industrialize the process of customer knowledge, starting with a decision hub that can be used to open up consistent dialogue with customers across all your touch points.

We’ve packed this issue of Intelligence Quarterly with examples of companies approaching it the right way – the customer-centric way. For instance:

  • Trade Me, New Zealand’s largest online marketplace, leads a paradigm shift in how we think about advertising. And data is required.
  • Telenor, a telecommunications leader in Norway, changes its sales model by analyzing mobile data.
  • FANCL, a large Japanese retailer, uses analytics to change the way it communicates with customers.
  • Dustin Group, one of the leading Nordic resellers of IT business solutions, doubles conversion rates with analytics.

If you can’t add your name to this list of companies, it’s time to ask why. What are you doing to please your customers? And how are you gaining knowledge to do that? The knowledge that you derive from customer data allows you to treat customers differently, to anticipate market changes and to meet global expectations.

‘Know your customer’ regulations

As you build your customer programs, also consider the regulatory importance of customer data. Especially in the banking industry, “know your customer” (KYC) initiatives are becoming crucial to verifying and protecting the identity of each customer.

The same data you use to improve digital marketing programs can also be used to prevent identity theft, financial fraud, money laundering and terrorist financing. When considered from this perspective, knowing your customer moves beyond marketing to become a broader corporate initiative. Customer knowledge moves beyond helping the customer find the best product. It can also help all customers and employees feel safe that their data is secure and their investments are protected.

To accomplish KYC, consider whether your customer view is consistent across the house, allowing you to take preventative measures in the front office. Is your data secure, and how much of your IT budget goes into moving and storing transactional data for operational purposes?

Without analytics and data management, your data can be a liability as much as an asset. It is what you do with the data that can determine the risk-weighted performance of the information flowing into your organization. If your company isn’t embracing technology transformation, you won’t be able to address the increased demands from clients or deal with the challenges and disruptions of the digital marketplace.

Following the advice in this issue – and future issues – of Intelligence Quarterly will help you gain knowledge of your customers to simultaneously meet customer needs, ensure that customers are who they say they are, and fully protect the use of customer data. It all starts with analytics.

Is Hadoop a storage platform or an analytics architecture?

Hadoop is everywhere. It’s changing the way we store and analyze data – and it’s changing the IT landscape.

One of the analysts we work with in the big data space says he’s feeling like it’s 1995 all over again. Why? Because Hadoop is so cheap that people are starting to replicate data again. It reminds him of the early days of the data warehousing craze. Remember that? When replicating data for different purposes was the norm?

Then we hit the 2010 time frame and we were talking about having too much data to store. And definitely too much to copy and collect in multiple systems. Now, with Hadoop entering the mainstream, you can just spin up a cluster, grab the data, make a copy and store it for later.

If you don’t think Hadoop is important to your company or your industry, think again. This is incredibly important to you. There is an opportunity to store tons and tons of data in Hadoop at a fraction of the cost compared to what you’re paying with relational database systems.

Loading data into Hadoop

What did you do in the past if you wanted to capture a bunch of data? You had to call IT, ask to allocate a few terabytes of storage, take these data sources and load them up. Next, you had to return to IT to request access to the data. What’s wrong with that? It’s expensive and time consuming, and it increases license fees for storage and data use.

The alternative is a Hadoop data loading system that makes it easy for data scientists to gain access to data and prep it without an IT request for data management support. Data scientists can play an important role here in reducing the workload for IT and gaining self-service access to Hadoop.

Developing an analytic architecture

What other opportunities does Hadoop create? And how can you make Hadoop successful in your environment?

You need to think of Hadoop as more than a simple storage container. Instead, look at Hadoop as a modern analytic architecture where you can:

  • Load and store data without limiting it to a tabular format.
  • Use visual analytics to explore the data in the location where new data is continually ingested and available for analysis.
  • Persist high-performance analytics right inside your Hadoop clusters.
  • Conduct analytic procedures inside the clusters using in-memory capabilities.

With these options, you can use Hadoop more strategically – and without learning a new programming language. You don’t just store data there and pull it out when it’s time to analyze. Instead, you can send processing requests down to the Hadoop cluster and use in-memory capabilities to analyze the data that is stored there.
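The difference between pulling data out and sending processing down to it can be shown with a small sketch. It uses plain Python lists to stand in for data partitions and calls no Hadoop API; the function names and sample values are hypothetical.

```python
def partial_stats(partition):
    """Runs where the data lives; returns a tiny summary, not the rows."""
    return (sum(partition), len(partition))


def combine(partials):
    """Merge per-partition summaries into one overall mean."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


# Each inner list stands in for a block of data stored on a different node.
partitions = [[1, 2, 3], [4, 5], [6]]

# Only one (sum, count) pair per partition travels back to the caller.
summaries = [partial_stats(p) for p in partitions]
print(combine(summaries))
```

The point of the pattern is that the volume moved across the network depends on the number of partitions, not the number of rows.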

Find out more about cleansing, processing and preparing data in Hadoop.

  • About this blog

    Welcome to The Corner Office blog, where SAS executives post their thoughts on global business, analytics and technology.