The data lake is a great place to take a swim, but is the water clean? My colleague, Matthew Magne, compared big data to the Fire Swamp from The Princess Bride, and it can seem that foreboding. The questions we need to ask are: How was the data transformed and
Tag: Hadoop
Provisioning data for advanced analytics in Hadoop
Using Hadoop: Emerging options for improved query performance
In my last two posts, we concluded two things. First, because of the need for broadcasting data across the internal network to enable the complete execution of a JOIN query in Hadoop, there is a potential for performance degradation for JOINs on top of files distributed using HDFS. Second, there are
Explore las razones por las cuales los gigantes usan Hadoop
Seguramente ya ha escuchado hablar sobre Hadoop y todas sus potentes capacidades, de no ser así, este sistema no es más que un marco para software de código abierto que permite almacenar y procesar grandes volúmenes de datos de forma distribuida en un gran número de productos de hardware. En