A few days ago I attended a meetup hosted by Outreach Digital, a diverse community of Digital Professionals in Europe.
The format for the challenge was simple:
– 1 mystery dataset
– X randomized teams
– 20 minutes to analyze the problem
– 10 minutes to create your solution
– ...aaaaand 2 minutes to pitch
Federico set the scene for the challenge and we were emailed a copy of the dataset. The objective of the exercise was to find a predictive model that could help identify whether a person survived the crash.
I found myself seated with a multidisciplinary team, and it quickly became apparent that we all came from different backgrounds.
We approached the challenge by first hypothesising about the story and then looking to validate each hypothesis by analysing the data. It quickly became obvious that men over a certain age were unlikely to survive. Females were more likely to survive, and their chances improved significantly if they were in the top-tier classes.
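To give a feel for how such a hypothesis can be checked, here is a minimal sketch in Python. It is not what we ran on the night: the records and the column names (`sex`, `travel_class`, `survived`) are made up for illustration, since the real dataset's layout is not reproduced here.

```python
from collections import defaultdict

# Hypothetical toy records; the real dataset and its column names differ.
people = [
    {"sex": "female", "travel_class": 1, "survived": 1},
    {"sex": "female", "travel_class": 1, "survived": 1},
    {"sex": "female", "travel_class": 3, "survived": 1},
    {"sex": "female", "travel_class": 3, "survived": 0},
    {"sex": "male", "travel_class": 1, "survived": 1},
    {"sex": "male", "travel_class": 1, "survived": 0},
    {"sex": "male", "travel_class": 3, "survived": 0},
    {"sex": "male", "travel_class": 3, "survived": 0},
]


def survival_rate_by(records, *keys):
    """Return {group: share of survivors} for each combination of the given keys."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for record in records:
        group = tuple(record[key] for key in keys)
        totals[group] += 1
        survivors[group] += record["survived"]
    return {group: survivors[group] / totals[group] for group in totals}


rates_by_sex = survival_rate_by(people, "sex")
rates_by_sex_and_class = survival_rate_by(people, "sex", "travel_class")

for group, rate in sorted(rates_by_sex_and_class.items()):
    print(group, f"{rate:.0%}")
```

Comparing the per-group rates side by side is a quick way to see whether the story holds before any modelling starts.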
We were also very lucky to have Francesco on the team, as he knew a software tool called KNIME. You give the tool a dataset and it works out a decision tree that fits the data. Obviously some level of manipulation is required to get the most out of the tool. Given the short time, we did not bother with dataset sampling or any major cleaning of the data.
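For readers who have not met decision trees before, the rough idea can be sketched in a few lines of plain Python. This is only an illustration of the general technique, not KNIME's implementation: the toy data, the column names, the use of Gini impurity and the depth limit are all assumptions made for the example.

```python
# Hypothetical toy records; the real dataset and its column names differ.
people = [
    {"sex": "female", "travel_class": 1, "survived": 1},
    {"sex": "female", "travel_class": 1, "survived": 1},
    {"sex": "female", "travel_class": 2, "survived": 1},
    {"sex": "female", "travel_class": 3, "survived": 1},
    {"sex": "female", "travel_class": 3, "survived": 0},
    {"sex": "male", "travel_class": 1, "survived": 1},
    {"sex": "male", "travel_class": 1, "survived": 0},
    {"sex": "male", "travel_class": 2, "survived": 0},
    {"sex": "male", "travel_class": 3, "survived": 0},
    {"sex": "male", "travel_class": 3, "survived": 0},
]


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the group is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(records, features, target="survived"):
    """Find the 'feature == value' test with the lowest weighted impurity."""
    best = None
    for feature in features:
        for value in sorted({r[feature] for r in records}):
            yes = [r[target] for r in records if r[feature] == value]
            no = [r[target] for r in records if r[feature] != value]
            if not yes or not no:
                continue  # a test that separates nothing is useless
            score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(records)
            if best is None or score < best[2]:
                best = (feature, value, score)
    return best


def majority(labels):
    """Most common label; ties are resolved as 'survived'."""
    return int(2 * sum(labels) >= len(labels))


def build_tree(records, features, depth=0, max_depth=2, target="survived"):
    """Recursively split until the group is pure or max_depth is reached."""
    labels = [r[target] for r in records]
    split = None
    if depth < max_depth and gini(labels) > 0.0:
        split = best_split(records, features, target)
    if split is None:
        return majority(labels)  # leaf node
    feature, value, _ = split
    yes = [r for r in records if r[feature] == value]
    no = [r for r in records if r[feature] != value]
    return {
        "feature": feature,
        "value": value,
        "yes": build_tree(yes, features, depth + 1, max_depth, target),
        "no": build_tree(no, features, depth + 1, max_depth, target),
    }


def predict(tree, record):
    """Walk the tree from the root until a leaf (0 or 1) is reached."""
    while isinstance(tree, dict):
        branch = "yes" if record[tree["feature"]] == tree["value"] else "no"
        tree = tree[branch]
    return tree


tree = build_tree(people, ["sex", "travel_class"])
print(tree["feature"], tree["value"])
print(predict(tree, {"sex": "male", "travel_class": 3}))
```

On this toy data the first question the tree asks is about sex, which matches the story we had hypothesised. A real tool handles numeric columns such as age, missing values and pruning, all of which are left out here.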
In the end we got a model that predicted the outcome of the crash correctly 83% of the time. This was actually quite good; one of the other teams managed 81%. Needless to say, our team managed to win by the narrowest of margins.
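A score like this is usually read as accuracy, meaning the share of people for whom the model's prediction matches what actually happened. The sketch below assumes that reading; the two lists are made-up values, not our results.

```python
def accuracy(predicted, actual):
    """Share of predictions that match the actual outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty list")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical values for illustration only (1 = survived, 0 = did not).
predicted = [1, 0, 0, 1, 0, 1]
actual = [1, 0, 1, 1, 0, 0]

score = accuracy(predicted, actual)
print(f"{score:.0%}")
```

Since we skipped sampling, a score measured on the same data the tree was fitted to would probably flatter the model compared with a score on data it had never seen.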
One thing that really stood out for me in the session was the importance of thinking about the story, as that provided a good foundation for any modelling. It was great to have Pantea on the team as she really drove the storytelling…
The whole team really contributed and engaged with the task, and I am sure “Dream Team” will be up for a similar challenge in the not-so-distant future.