By infusing analytics through every phase of the guest journey, hotel managers can strike the complicated balance between the guest experience and their revenue and profit responsibilities – delivering memorable and personalized guest experiences while maximizing revenue and profits. To accomplish this, hotels need to be able to collect, store and analyze the volumes of data generated by their guest interactions, their operations and the broader market. As the volume and complexity of data increase, it is becoming harder to grasp what is available and how it can be useful.
“Big Data” is a challenge for organizations not just because the volume of data has increased, but also because the variety has increased: it has gone beyond traditional transactional data into unstructured formats like text, video, email, call logs, images and click-streams. It is also coming at us fast. Data like tweets or location is stale nearly the minute it is created.
Big data is a “big deal” because the volume and complexity of the data put pressure on traditional technology infrastructures, which are set up to handle primarily structured data (and not that much of it). In these environments, it is difficult, or even impossible, for organizations to access, store and analyze “big data” for accurate and timely decision making. This problem has driven innovations in data storage and processing, such that it is now possible to access more and different kinds of data.
To a certain extent, big data is forcing business leaders (like our analytic hospitality executives) to get more involved in technology decisions than ever before. To help with this, in this post I’ll talk about how technology has evolved to handle big data and give some examples of how companies are innovating with their big data. Next week, I’ll do the same for big analytics. This is not intended to turn everyone into a technology expert, but rather to provide some basic information that can equip hotel managers to start having conversations with their IT counterparts.
This influx of large amounts of complex data has necessitated changes in the way that data is captured and stored. To handle the volumes of unstructured data, databases need to be faster, cheaper, more scalable and, most importantly, more flexible. This is why some have been talking about Hadoop as an emerging platform for storing and accessing big data. Strictly speaking, Hadoop is an open-source storage and processing framework rather than a database, and it is designed to handle large volumes of unstructured data. Hadoop works because it is cheap, scalable, flexible and fast.
- Cheap & Scalable – Hadoop is built on commodity hardware, which is exactly what it sounds like: really cheap, “generic” hardware. It is designed as a “cluster”, tying together groups of inexpensive servers. This means it’s relatively inexpensive to get started, and easy to add more storage space as your data, inevitably, expands. It also has built-in redundancy: data is stored in multiple places, so if any of the servers happen to go down, you don’t lose all the data.
- Flexible – The Hadoop data storage platform does not require a pre-defined data structure, or data schema. I use the analogy of the silverware drawer in your kitchen. The insert that sorts place settings is like a traditional relational database. You had to purchase it ahead of time, planning in advance for the size of the drawer and the kinds of silverware you wanted to put in it. It makes it easy for you to grab the four sets of forks and knives you need for a place setting. However, the pre-defined schema makes it difficult to add additional pieces of silverware should you decide to buy iced tea spoons or butter knives, or if you are looking for a place to store serving utensils. Hadoop, on the other hand, is more like an empty drawer with no insert: it has no pre-defined schema. You can put any silverware you want in there without planning ahead of time. You can see the advantage of this approach with unstructured data. There is no need to “translate” it into a pre-defined schema; you can just “throw it in there” and figure out the relationships later.
- Fast – Hadoop is fast in two ways. First, it uses massively parallel processing to comb through the data and extract the information. Data is stored in a series of smaller containers, and there are many helpers available to reach in and pull out what you are looking for (extending the drawer metaphor: picture four drawers of silverware with a family member retrieving one place setting from each at the same time). The work is split up, as opposed to being done in sequence. Second, because the data store is not constrained by a pre-defined schema, the process of loading the data is a lot faster (picture the time it takes to sort the silverware from the dishwasher rack into the insert, as opposed to dumping the rack contents straight into the drawer).
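
For readers who want to see the “empty drawer” and “many helpers” ideas in miniature, here is a toy sketch in plain Python. It is not Hadoop code and does not use any Hadoop API; the guest records, field names and function names are all made up for illustration. It stores records as-is with no shared schema, splits them into smaller chunks, lets several helpers count record types at the same time, and then combines the partial results.

```python
import json
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical raw guest records, stored as-is with no shared schema
# (the "empty drawer"): every line is JSON, but the fields differ.
RAW_RECORDS = [
    '{"type": "booking", "guest": "A", "nights": 2}',
    '{"type": "tweet", "guest": "B", "text": "Great pool!"}',
    '{"type": "call_log", "guest": "A", "duration_sec": 95}',
    '{"type": "booking", "guest": "C", "nights": 1}',
    '{"type": "tweet", "guest": "A", "text": "Late check-in"}',
    '{"type": "booking", "guest": "B", "nights": 4}',
]


def split_into_chunks(records, n_chunks):
    """Deal the records out into n_chunks smaller 'drawers'."""
    return [records[i::n_chunks] for i in range(n_chunks)]


def count_types(chunk):
    """One helper reads its own chunk; structure is interpreted only now."""
    return Counter(json.loads(line)["type"] for line in chunk)


def count_types_parallel(records, n_workers=3):
    """Run the helpers side by side, then combine their partial counts."""
    chunks = split_into_chunks(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_types, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


if __name__ == "__main__":
    print(count_types_parallel(RAW_RECORDS))
```

On a single laptop this shows how the work is divided rather than any real speed-up; in an actual cluster the chunks would sit on different servers, and each server would process its own share.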
Many companies have put a lot of effort into organizing structured data over the years, and there are some data sets that make sense to store according to traditional methods (like the silverware you use every day). Because of this, most companies see Hadoop as an addition to their existing technology infrastructure rather than a replacement for their structured, relational databases.
Next week I’ll talk about innovations in the execution of analytics that speed up time to results, allowing organizations to take full advantage of all of that big data.