The Data Challenge By Outreach Digital

A few days ago I attended a meetup hosted by Outreach Digital, a diverse community of Digital Professionals in Europe.

The format for the challenge was simple:

– 1 mystery dataset
– X randomized teams
– 20 minutes to analyze the problem
– 10 minutes to create your solution
– ...aaaaand 2 minutes to pitch

Federico set the scene for the challenge and we were emailed a copy of the dataset. The objective of the exercise was to find a predictive model that could help identify whether a person survived the crash.

Meetup Challenge

I found myself seated with a multi-disciplinary team, and it quickly became apparent that we all came from different backgrounds.

We approached the challenge by first hypothesising about the story and then looking to validate the hypothesis by analysing the data. It quickly became obvious that men over a certain age were more likely than not to have died. Women were more likely to survive, and their chances improved significantly if they were in the top-tier classes.

We were also very lucky to have Francesco on the team, as he knew a software tool called KNIME. The tool lets you supply a dataset and works out a decision tree that fits it. Obviously some manipulation of the data is required to get the most out of the tool, but given the short time we did not bother with sampling the dataset or any major cleaning.
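To give a feel for what a decision tree does under the hood, here is a minimal Python sketch of a single step: picking the column whose split leaves the purest groups, measured by Gini impurity. This is not what KNIME runs internally and not what we built on the night. The column names and the four toy rows are made up purely for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def split_impurity(rows, feature, target):
    """Weighted Gini impurity after grouping rows by one categorical feature."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    return sum(len(g) / len(rows) * gini(g) for g in groups.values())


def best_split(rows, features, target):
    """Return the feature whose split leaves the purest groups."""
    return min(features, key=lambda f: split_impurity(rows, f, target))


# Made-up toy rows, for illustration only (NOT the meetup dataset).
toy_rows = [
    {"sex": "female", "travel_class": 1, "survived": 1},
    {"sex": "female", "travel_class": 3, "survived": 1},
    {"sex": "male", "travel_class": 1, "survived": 0},
    {"sex": "male", "travel_class": 3, "survived": 0},
]

print(best_split(toy_rows, ["sex", "travel_class"], "survived"))
```

A full tree repeats this step on each resulting group until the groups are pure enough or a depth limit is hit.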


In the end we got a model with an 83% likelihood of predicting the outcome correctly. This was actually quite good; one of the other teams managed 81%. Needless to say, our team won by the narrowest of margins.
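For anyone wondering how such a score is typically worked out, here is a small Python sketch. It assumes the percentage is plain accuracy, i.e. the share of passengers whose outcome the model got right; I have not checked exactly how KNIME reported it. The function name and the sample lists are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the actual outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty set of outcomes")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical example: 3 of 4 predictions match.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```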

One thing that really came out of the session for me was the importance of thinking about the story, as that provided a good foundation for the modelling. It was great to have Pantea on the team as she really drove the storytelling.

The whole team really contributed and engaged on the task and I am sure “Dream Team” will be up for a similar challenge in the not so distant future.

Learning Scala and Gaining Commercial Experience

So… I am now into week 5 of the Coursera course on Functional Programming Principles in Scala. It’s been hard work so far as I have had to learn a lot more about functional programming.

Week 1 was really a challenge as I had to get my head around recursion, particularly the coin counting assignment, which I eventually figured out. What’s really been fascinating is that once you start “getting” Scala you realise how powerful the language really is. One starts to appreciate its elegance and expressiveness.
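For the curious, the recursive idea behind coin counting can be sketched as follows. This is the classic "counting change" formulation, which I am assuming matches the assignment: count the ways to make an amount from a list of coin denominations. It is written in Python here rather than Scala, it is not my submitted solution, and the function name is my own. It assumes every denomination is a positive integer.

```python
def count_change(money, coins):
    """Number of ways to make `money` using any number of each coin.

    Assumes every denomination in `coins` is a positive integer.
    """
    if money == 0:
        return 1  # exactly one way: use no more coins
    if money < 0 or not coins:
        return 0  # overshot the amount, or ran out of denominations
    # Either use the first coin (and stay on it), or skip it for good.
    return count_change(money - coins[0], coins) + count_change(money, coins[1:])


print(count_change(4, [1, 2]))  # 3 ways: 1+1+1+1, 1+1+2, 2+2
```

The trick that took me a while to see is that each call splits the problem into two smaller ones, and the base cases do all the counting.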

My initial reason for learning Scala was to understand Spark better as I see Spark as a key component for many Big Data Solutions. Spark is written in Scala and hence I felt the need to learn Scala.

Having learned the foundations of Scala, I am now debating next steps. I have a couple of choices: either getting a role where I can do some hands-on coding, or building my own software product. Ideally, I would prefer the former, provided I have a good team of people to work with. I also joined the Slack chat for Spark, and there is a nice channel dedicated to Scala Algorithms (Scala_viz).

Coursera is yet to launch the Spark and Scala course but when it’s on I think it will be a really good course.

In terms of the Coursera course, I would highly recommend it. It’s challenging but you really do get a lot out of it particularly if you have not programmed in a functional manner previously.

In terms of gaining experience with Scala, there seems to be a shortage of Scala developers with a few years’ experience. However, for the newbies, it’s the classic chicken and egg situation where it’s hard to find a junior role that will provide sufficient commercial experience. The great thing about coding is that you can still build your own app or contribute to an open source project.