If you're looking for an honest discussion of big data and IT from one technologist to another, look no further than the recent Forbes article, "SAS CTO On IT And Big Data Analytics."* In it, Keith Collins discusses Hadoop, shadow IT and more. Here are three excerpts you won't want to miss:
On Hadoop: People are all hyped up about Hadoop. But what is it, really? It is big and wide record sizes, big block sizes, designed specifically for high-volume, sequential processing. Just like a SAS data set in 1968… The only difference between a SAS data set and Hadoop is that now the disks are cheap enough that you can do replication.
On Shadow IT: The CIO is going to have this interesting challenge and opportunity to be the person who actually knows how to bring all of that data back together to facilitate the business and actually do something with the data. Otherwise, it is living in all its little silos. The new master data management opportunity is, “how do I synchronize and get all of the data that is in all of these software-as-a-service applications?”
On big data analytics: Recently I worked with a customer who currently can only afford to take twenty thousand attributes to do variable reduction down to the four hundred that he models with. But he has enough data to allow him to work with one hundred thousand attributes. The challenge for him was not the volume of data but it was algorithmic and computational. They had the data but could not process it in a cost-effective manner. That is an unsung piece of a lot of conversations around big data. A lot of people had a bunch of data already—they just could not afford to process and analyze it. Now they can.
Read the full article at Forbes to get more tips from Collins, including how to staff and train for the IT shop of the future.
*Keith Collins is currently the CIO of SAS. When this Forbes article was originally published, his title was SAS CTO.