• post-spark

    The case for Spark

    I have been following Big Data, and Hadoop in particular, for the past four years. A lot has changed in that time. I fell into this because I was looking for the next big wave of technologies, especially on the backend, which was my forte at the time. I started looking into […]

  • post-kafka

    Apache Kafka

    This week I want to discuss an up-and-coming topic: Apache Kafka. For a long time, streaming data into Hadoop was not considered relevant because Hadoop meant batch MapReduce. In quite a few respects it is still batch, though there are several efforts to make it faster for interactive […]

  • post-featured-hadoopworld-strata

    Can I get that “Spark” to go?

    Wow, Hadoop World has come to pass and, as predicted, Spark was the main topic on everyone’s mind. Just to give you a hint of the various Spark topics: Paxata announces its Adaptive Data Preparation product, built atop Spark. ClearStory Data announces Collaborative Storyboards, which is also built atop Spark. GraphLab announces its tool GraphLab […]
