Posted in Uncategorized

Red Point and Big Data

Just saw a quality video on a high-end solution by Red Point that abstracts away the complications of Hadoop, YARN and infrastructure. See this presentation on the importance of a data lake.

My understanding is that Red Point's approach is to store all the data in raw format in Hadoop. A refined set of the data would also be stored, and they had a way of managing all the complex keys that link the various elements of data.

Their technology would manage the complications of shifting and storing the data. Further, when querying data they would not use any complex MapReduce code; rather, a binary file would be deployed via YARN. This file would be created via a visual programming interface.

The Red Point solution promised less developer time and faster processing / compute in the Hadoop environment.

Worth keeping an eye on this technology as it's got some promise, particularly given how complex MapReduce can be. The only other consideration is how well this technology would work with something like Spark and Scala.
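To give a feel for what MapReduce asks the programmer to spell out, here is a minimal sketch of a word count in plain Python. This is not Hadoop code and has nothing to do with Red Point's product; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are made up for illustration, and everything runs in one process on one machine. It only shows the three steps you have to think in.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one line of input.
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # would do between the map and reduce stages.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: collapse the list of values for one key into a total.
    return key, sum(values)


def word_count(lines):
    # Hypothetical driver that chains the three phases in-process.
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data big lake", "data lake"])
print(counts)
```

Even for something this small the logic is split across separate stages, and a real Hadoop job adds job configuration, serialisation and cluster deployment on top, which is presumably the overhead a visual tool is trying to hide.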

Scala, from what I can see, seems to bridge the worlds of the data scientist, data analyst and programmer quite well, providing yet another paradigm in the big data world!





Starting my journey in learning about Big Data

In my quest to identify the key players in the Big Data space, I investigated the Gartner Magic Quadrant for web analytics.

Figure 1. Magic Quadrant for Data Warehouse and Data Management Solutions for Analytics

I then looked into courses that would help me learn about the field. IBM’s Big Data University provided some great tutorials, and I have already earned a few badges after passing the online learning tests.

The great thing about the course was that you could easily set up VMware to try out and play with the products and solutions that IBM provides.

The next really good industry resource that I stumbled upon was this video, Elephant Riders, where some of the key Big Data players debate the relevance of Hadoop. The key companies on the panel were MapR, Cloudera, Hortonworks and Continuuity. The video is worth watching just to see the elephant bouncing on the trampoline.

Overall this is a great start to my journey in learning and investigating more about Big Data.