Why it's hard to say "big data" without saying "cloud"

We're bombarded every day by the exponential growth, availability and use of big data in our personal and business lives. It may stream in real time from smart devices, social media or video, or arrive as text conversations with a customer service chatbot. Regardless of where it originates, data of this volume and variety can't be managed with traditional computing resources and techniques. That's why we often hear about big data in the context of cloud computing, IoT and artificial intelligence.

Before we talk about how to make the best use of big data to become data-driven, let's look at what big data encompasses and why it's relevant. Because it's not the amount of data that’s important. It’s what we do with it that matters.

Characteristics of big data

Big data is about more than just data volume. Volume is only one of "the 3 V's"; the other two are variety and velocity.

  • Data variety acknowledges structured, semi-structured and unstructured data – as well as other data types, such as sensor data from the Internet of Things (IoT).
  • Data velocity acknowledges how fast big data is produced – which correlates to how fast we need to process and transform it into business insight.

Another "V" of big data is variability, which describes the unpredictable nature of today's data flows. Then there's data veracity, which refers to big data quality. This is a complicated characteristic because big data's biggest bang has been the explosion in the volume, variety and velocity of external data, where we have much less control over data quality. It's important to note, though, that big data has created new use cases. Consider aggregate analytics, which sometimes does not (and does not need to) compensate for – or even check for – data quality issues. You can only determine "how much" quality big data needs on a case-by-case basis.
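To make the aggregate analytics point concrete, here is a minimal sketch in Python. The data is entirely synthetic and the 1% error rate is an assumption chosen for illustration. It shows how a small share of bad records can leave an aggregate almost unchanged; it does not claim that data quality never matters.

```python
def share_positive(records):
    """Fraction of records flagged 1 (for example, a 'positive mention')."""
    return sum(records) / len(records)


# Synthetic data set: 10,000 records, 60% flagged positive.
clean = [1] * 6000 + [0] * 4000

# Assumed quality problem: every 100th record carries the wrong flag.
dirty = [1 - value if i % 100 == 0 else value for i, value in enumerate(clean)]

clean_share = share_positive(clean)
dirty_share = share_positive(dirty)

if __name__ == "__main__":
    print(f"clean share: {clean_share:.3f}")
    print(f"dirty share: {dirty_share:.3f}")
    print(f"difference:  {abs(clean_share - dirty_share):.3f}")
```

If a shift of that size doesn't change the decision you would make, checking every record may not be worth the cost. A larger or more systematic error would be a different story, which is why the call is case by case.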

Many big data projects start as pilots or prototypes that demand immediate and immense IT infrastructure resources. This is where big data and cloud computing often converge.

How do you cloud?

Cloud computing is a subscription-based delivery model that provides scalability, fast delivery and IT efficiencies. It removes many physical and financial barriers to aligning IT needs with evolving business goals. With a promise to deliver better applications, platforms and infrastructure quickly and cheaply, cloud computing has become a major force for business innovation across all industries. There are three cloud hosting options to consider.

Public cloud

A public cloud is a shared computing environment hosted entirely by a cloud service provider that manages and maintains the infrastructure. This option allows for lower costs, quick scalability, storage, and compute-on-demand, where customers pay only for the capacity they use. There is no hardware or software to buy, install, configure or maintain. Many organizations don't recognize how much their employees are using the public cloud. The reason is that our heavily used mobile devices are often connected to – and automatically back up data to – free public cloud services (e.g., Gmail and Google Drive).

Private cloud

In this environment, the server infrastructure is dedicated to a single enterprise. Sometimes a cloud service provider hosts this dedicated environment. Other times an enterprise owns and manages it on-premises, behind its own firewall. The latter allows for the utmost security and control, but it also requires the most in-house maintenance. Many choose this option because they want to protect intellectual property, business-critical applications, and sensitive data while complying with government regulations.

Hybrid cloud

Most enterprises opt for a hybrid cloud platform because it lowers the cost of entry and provides greater security – the best of both worlds. This model allows an enterprise to configure and manage multiple cloud environments (public or private, hosted or on-premises) as a single resource pool. It uses a combined infrastructure in which data and applications are either:

  • On a public cloud – which applies to things like time-tracking applications.
  • On a private cloud – which organizations often use for sensitive data and business-critical applications, like payroll.
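The split described in the list above can be sketched as a simple placement rule. This is an illustrative sketch only: the `Workload` class, its fields and the `place` function are hypothetical names, and a real placement decision would also weigh cost, latency and regulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of an application and the data it handles."""
    name: str
    sensitive_data: bool
    business_critical: bool


def place(workload: Workload) -> str:
    """Apply the rule of thumb: sensitive or critical goes private, the rest public."""
    if workload.sensitive_data or workload.business_critical:
        return "private"
    return "public"


if __name__ == "__main__":
    workloads = [
        Workload("time-tracking", sensitive_data=False, business_critical=False),
        Workload("payroll", sensitive_data=True, business_critical=True),
    ]
    for workload in workloads:
        print(f"{workload.name}: {place(workload)} cloud")
```

Keeping the rule in one small function makes it easy to review and to tighten later, for example when a new regulation reclassifies a data set as sensitive.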

The bottom line is increased speed and agility, decreased IT infrastructure costs, standardized architecture and improved collaboration. These are especially important benefits for today's increasingly mobile workforce. With the cloud, you have the flexibility to use data storage, network and compute resources when you need them. And you can easily scale up and down as business needs change. The results? Faster IT deployment times trigger accelerated time to market, time to value, and pace of innovation.

As (or -aaS) you like it

The cloud provides a fresh way of looking at how you deploy IT services across your enterprise. There are several cloud-based service delivery options you can consider.

Software as a service (SaaS)

Cloud-based software (or application) offerings are designed to scale for unique purposes, usually with a pay-as-you-go deployment approach. Service providers take care of enabling the cloud-based software to run on most types of computers and mobile devices. The provider also manages access and security. SaaS gets you immediate access to the software you need without the deployment challenges associated with buying new software.

Platform as a service (PaaS)

A cloud-based platform (hardware, complete software stack, infrastructure, and even development tools) enables you to create and manage custom cloud applications using programming languages, frameworks, and tools provided by the cloud host. The user doesn’t manage or control the underlying cloud infrastructure (networks, servers, operating systems or storage). But the user does have control over deployed applications and possibly the application-hosting configurations.

Infrastructure as a service (IaaS)

If you want the building blocks that sit beneath PaaS, and you are prepared to manage the operating systems and software yourself, that’s IaaS. In this model, users get their infrastructure equipment and resources from the provider – including storage, networks, processing and other general computing resources. On top of those resources, IaaS users can install and run operating systems, applications and frameworks, and perform general administrative functions.

Results as a service (RaaS)

With RaaS, you simply upload (or enter) your data, select some options or configuration settings, and get your results. Then you can view, download and use those results as you see fit. You usually have no insight into the underlying infrastructure, platform or software. Online tax preparation services are a good example of this.

For enterprise offerings, RaaS is usually associated with big data analytics solutions. In this case, you provide access to your data. Then, analytics experts transform your big data into business insights you can use.

These service models build on one another. IaaS delivers cloud infrastructure support for SaaS and PaaS. PaaS can provide development and support for SaaS, but it is not required because SaaS can also be delivered on top of IaaS.
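One way to picture how the three layered models differ is to list who manages each layer of the stack. The sketch below assumes the commonly used division of responsibilities; the layer names and the `responsibilities` helper are illustrative, and real offerings from individual providers may draw the lines differently.

```python
# Layers of a typical stack, bottom to top (illustrative names, not tied to any vendor).
LAYERS = ["networking", "storage", "servers", "operating_system", "runtime", "application"]

# Index of the first layer the customer manages; everything below it is the provider's job.
CUSTOMER_STARTS_AT = {
    "IaaS": LAYERS.index("operating_system"),
    "PaaS": LAYERS.index("application"),
    "SaaS": len(LAYERS),  # the customer manages none of these layers
}


def responsibilities(model: str) -> dict[str, str]:
    """Return who manages each layer under the given service model."""
    start = CUSTOMER_STARTS_AT[model]
    return {
        layer: ("customer" if index >= start else "provider")
        for index, layer in enumerate(LAYERS)
    }


if __name__ == "__main__":
    for model in ("IaaS", "PaaS", "SaaS"):
        owned = [layer for layer, who in responsibilities(model).items() if who == "customer"]
        print(f"{model}: customer manages {owned or 'none of the stack'}")
```

Reading the table from IaaS to SaaS, the customer's share shrinks while the provider's grows, which is the trade between control and convenience that runs through all of these options.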

Another important term in cloud computing is containers. With containers, you can separate the application code from the underlying IT infrastructure. For software deployment in the cloud, containers have become the standard because it doesn’t matter whether you’re hosting applications in a private data center or on a public cloud. A container runs the same way in either.

Big data cloud

Digital transformation is driving the need for enterprises to establish a unified ecosystem of data, analytics and humans. Cloud computing offers a fundamental change in data consumption, service delivery and enterprise-wide collaboration. By cost-effectively providing faster access to data, adaptive storage, infrastructure, platform, software, and processing resources, the cloud enables agility and elasticity. You can use the cloud to automate many of these components, too. This simplifies processes and makes your big data efforts more productive.

Especially at the enterprise level, big data and the cloud are used in tandem so often that big data cloud is becoming an umbrella term (pun intended) for cloud-based services for big data. Optimizing costs, increasing efficiency through standardization and scalability, improving agility, and inspiring innovation are the key benefits of big data cloud.

Whether it's public, private or hybrid, a flexible big data cloud environment is key. It will free you to customize cloud-based services to ensure they're ideal for your business. In turn, you’ll be ready to transform big data into big business insights.

About Author

Jim Harris

Blogger-in-Chief at Obsessive-Compulsive Data Quality (OCDQ)

Jim Harris is a recognized data quality thought leader with 25 years of enterprise data management industry experience. Jim is an independent consultant, speaker, and freelance writer. Jim is the Blogger-in-Chief at Obsessive-Compulsive Data Quality, an independent blog offering a vendor-neutral perspective on data quality and its related disciplines, including data governance, master data management, and business intelligence.
