Posted in Courses, Design Thinking, Innovation, Interesting Tools

Design Thinking the Art & Science of Innovation

I recently had the pleasure of completing a Design Thinking course that was recommended by my manager at Capgemini. Normally I would be inclined to take more technical courses, so this was certainly different.

The key motivation was that Capgemini employees should think outside the box and help come up with novel ideas for our clients. Design thinking is particularly strong for any form of human-centric problem solving; it helps accelerate innovation to create better solutions for challenges facing business and society.

The course was based on IDEO’s design thinking approach, and the trainers provided a very comprehensive methodology whilst helping us solve a real-world case study. What was really interesting was how we could collectively come up with insights founded on observations, and experiencing the process first-hand was great. What was even more encouraging was that I found myself on the winning team (which made things even more fun).

Design thinking starts with people; we then apply creative tools, such as storytelling, prototyping and experimentation, to deliver new breakthrough innovations.

When making observations, it’s good to look at extreme cases and then, based on those factual observations, come up with insights that are authentic, non-obvious and revealing. This needs to be followed by framing opportunities, which become a springboard for ideas and solutions.

We came up with this kind of novel thinking whilst on the course, and more recently, as part of an AIE project at Capgemini, we were able to make the same leap again, which makes this a very powerful and credible approach.

The following is a YouTube video that was recorded whilst we were having fun doing some field research for one of the UK’s most innovative brands:

Posted in HDInsights, HIVE, Interesting Tools, Power BI, SPARK

Working with Big Data Using HDInsight and Power BI

Recently I had the opportunity to explore Microsoft’s offering for managing Big Data. It was amazing to see how easy it was to set up a Hadoop and Spark cluster using the Azure HDInsight service. There are numerous setups available, and pre-made distributions are also available from vendors such as Cloudera and Hortonworks.
It’s great to see Microsoft not only embracing but also actively contributing to some of the Big Data solutions in the open-source community. Apache Spark is an open-source framework for cluster computing and one that I actively follow; I have seen regular contributions from Microsoft staff members to this project.
The key advantage of using Azure HDInsight is that a cluster of computers can easily be configured and made available within 20 to 30 minutes. The default Spark cluster also comes pre-configured with lots of applications, such as Ambari, Pig, Hive, Flume and Kafka, as well as a Jupyter notebook that works with Python and Scala.
It’s worth noting that rather than the filesystem residing on commodity hardware, Microsoft utilises Azure Blob Storage. The following are some common patterns for using Microsoft Power BI (a data-visualisation tool for Big Data, similar to Tableau):
Option 1: (Hive to Data Visualisation tool)

Use a Hive table and query it via an ODBC driver; note that with this approach the entire table is downloaded into Power BI. In most cases the Hive table will be derived by querying data in Azure Blob Storage using a cluster of computers.
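To make the ODBC route above concrete, here is a minimal Python sketch using the pyodbc package. The cluster name, credentials, table and column names are all placeholders, and the exact connection-string keywords (particularly the `AuthMech` value) may vary with your Hive ODBC driver version, so treat this as a sketch rather than a definitive recipe. The network call is gated behind an environment variable so the script stays runnable without a cluster.

```python
# Sketch of Option 1: querying a Hive table on HDInsight over ODBC.
# Assumes the Microsoft Hive ODBC driver and the pyodbc package are
# installed; cluster name, user, password, table and columns are
# illustrative placeholders.
import os

# Power BI would issue a query like this through the same driver.
HIVEQL = "SELECT country, SUM(sales) AS total_sales FROM curated_sales GROUP BY country"

def build_connection_string(cluster: str, user: str, password: str) -> str:
    """DSN-less ODBC connection string for an HDInsight Hive endpoint."""
    return (
        "Driver={Microsoft Hive ODBC Driver};"
        f"Host={cluster}.azurehdinsight.net;"
        "Port=443;"
        "HiveServerType=2;"
        "AuthMech=6;"  # HDInsight auth mechanism; value may differ per driver version
        f"UID={user};PWD={password}"
    )

conn_str = build_connection_string("my-cluster", "admin", "secret")

# Only attempt the network call when a real cluster is configured.
if os.environ.get("HIVE_CLUSTER"):
    import pyodbc  # third-party ODBC wrapper
    with pyodbc.connect(conn_str, autocommit=True) as conn:
        for row in conn.cursor().execute(HIVEQL):
            print(row)
```

Remember that, unlike the sketch above (which aggregates on the cluster), Power BI’s import mode pulls the whole table down, so keep the Hive table itself small and curated.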

Option 2: (Process Data-> Save to Azure Blob -> Analyse Flat File with Visualisation tool)
Another way to import data from HDInsight into Power BI is by connecting to flat files in either Blob Storage or the Data Lake Store.
In this situation, use HDInsight to process your data and write the resulting curated or aggregated data to text files. Generally this will give better refresh performance, as we bypass the ODBC driver.
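The essence of this “process then export” step can be shown with a small, self-contained Python sketch. On a real cluster the aggregation would be a Hive or Spark job and the output file would land in Azure Blob Storage rather than in memory; the column names here are invented for illustration.

```python
# Sketch of Option 2: curate raw data down to a small aggregate, then
# export it as a flat text file for Power BI to refresh from.
import csv
import io
from collections import defaultdict

# Stand-in for the raw data the cluster would read from Blob Storage.
raw_events = [
    {"country": "UK", "sales": 120.0},
    {"country": "UK", "sales": 80.0},
    {"country": "FR", "sales": 50.0},
]

# Curate: aggregate the raw events down to one row per country.
totals = defaultdict(float)
for event in raw_events:
    totals[event["country"]] += event["sales"]

# Export: write the small, curated result as a flat CSV that Power BI
# can refresh quickly (no ODBC driver involved).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["country", "total_sales"])
for country in sorted(totals):
    writer.writerow([country, totals[country]])

curated_csv = buffer.getvalue()
print(curated_csv)
```

The key point is that Power BI only ever sees the small curated file, not the raw events, which is what makes the refresh fast.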

Option 3: (This is in Beta) Direct Query with Spark Cluster
This option allows you to keep the data in Azure Blob Storage and utilise technology like Spark SQL to query it; only the summarised results are sent back to Power BI. This approach allows huge data sets to be analysed using a cluster of machines.
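The following sketch shows the kind of Spark SQL query that DirectQuery mode would push to the cluster. The table and column names are invented, and the Spark session is gated behind an environment variable since it requires the pyspark package; on HDInsight the table would be an external table over Azure Blob Storage, whereas here a tiny in-memory DataFrame stands in for it.

```python
# Sketch of Option 3: Spark SQL runs on the cluster and only the
# aggregated result travels back to Power BI. Table/column names are
# illustrative placeholders.
import os

SUMMARY_SQL = """
SELECT country, COUNT(*) AS orders, SUM(sales) AS total_sales
FROM blob_backed_sales
GROUP BY country
"""

# Only start Spark when explicitly requested (requires pyspark).
if os.environ.get("RUN_SPARK_SKETCH"):
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[*]").appName("sketch").getOrCreate()
    # Locally we register a tiny in-memory DataFrame; on HDInsight this
    # would be an external table over Azure Blob Storage.
    df = spark.createDataFrame(
        [("UK", 120.0), ("UK", 80.0), ("FR", 50.0)], ["country", "sales"]
    )
    df.createOrReplaceTempView("blob_backed_sales")
    spark.sql(SUMMARY_SQL).show()
    spark.stop()
```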

Option 4: Direct Query with Azure SQL DB

With DirectQuery using Azure SQL Database (DB), you would process your data in your cluster but write the resulting data to tables in Azure SQL DB (or Azure SQL Data Warehouse). Power BI takes care of the data refresh as well as retrieving only the data that is required from the Azure SQL database.
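The final step of this option can be sketched locally, with Python’s built-in sqlite3 standing in for Azure SQL Database so the example actually runs; the table and column names are illustrative. The cluster writes its aggregated results into a relational table, and Power BI then issues only the queries it needs against that table.

```python
# Sketch of Option 4: write aggregated results to a SQL table, then let
# the visualisation tool query just the rows it needs. sqlite3 is a
# local stand-in for Azure SQL DB; names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_summary (country TEXT, total_sales REAL)")

# These rows would be produced by the Hive/Spark job on the cluster.
aggregated = [("UK", 200.0), ("FR", 50.0)]
conn.executemany("INSERT INTO sales_summary VALUES (?, ?)", aggregated)
conn.commit()

# DirectQuery sends only the query it needs, e.g. a filtered view:
rows = conn.execute(
    "SELECT country, total_sales FROM sales_summary WHERE total_sales > 100"
).fetchall()
print(rows)  # only the qualifying rows come back
conn.close()
```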

In my view the most common implementation scenario is going to be utilising a cluster to mine the initial data and produce aggregated results. This data would then reside in flat files on Azure Blob Storage or in Azure SQL Database. This approach means the cost of keeping a cluster turned on is kept to a minimum. Furthermore, tools like Jupyter notebooks can hold pre-built scripts that can easily be modified for ad hoc data processing.

Let me know if there is anything you would like me to cover further with Azure HDInsight; I also have access to some rolling Microsoft credits to prototype sample solutions.

Posted in Interesting Tools, Meetup Events

IBM Datapalooza Mashup

I was at the IBM Datapalooza Mashup day in June 2016. It was really interesting to learn how IBM goes about capturing all the data and statistics at Wimbledon. The ability of Wimbledon’s team of content creators to get interesting facts out before the press is a real-world example of how data can give a strategic advantage.

Predicting well in advance the types of records that are likely to be broken, and then getting real-time data when those records fall, gives the Wimbledon team an edge over other news organisations.

There were various demonstrations of how IBM’s Bluemix product is being utilised to manage social media through its text-analytics interface.

Another really interesting workshop was about a product called Node-RED. This is a visual programming tool that is ideal for hackathons; it allows you to wire up various REST services visually, and if you need more flexibility you can get under the hood by editing the Node.js code the application produces in the background.


Certainly a great tool to help build proofs of concept and test out code.