In an interesting post on his blog, Jeff Atwood discusses an old but still highly relevant book by Robert Cringely about the early days of tech start-ups in Silicon Valley and their founders. Jeff argues that a successful software project needs the same kinds of characters that make a tech start-up successful. The same is true for an analytics project.
Commandos are the ones who parachute behind enemy lines and establish a bridgehead before anyone notices. They innovate, working fast and hard to come up with unique new ideas, though perhaps with less professionalism, because professionalism costs time.
The infantry comes in to fortify the defensive position established by the commandos. They test the commandos’ work thoroughly, refactor, improve, write documentation and define business rules: all the things that the commando doesn’t like doing but that are essential for the survival of a project.
After the commandos and infantry are long gone to the next battle, they leave behind the police whose job is maintaining order. They are essential to the long-term success of a project but have often long forgotten who it was that first set foot on the enemy territory that they now occupy.
You need all three kinds of people at the right time for a data science project to be successful in the long run. Once the project enters the maintenance phase, having a commando in the team can actively hurt you, while not sending in the infantry once the commandos get bored will stall your project unnecessarily.
This is an important thought when reflecting on the role of a data scientist. They should be the commando, the first man on the ground, tapping into new data sources or combining data in novel ways, building predictive models that give your organisation a competitive advantage quickly, before everyone else gets there and what was cutting edge analytics becomes common sense. They talk to business people and convince them that the new findings will help the organisation and that they should be put to use today rather than tomorrow.
Then the infantry comes in and cleans up the ETL, fixes the APIs and reasons about the integration of the data model into the company’s data warehouse infrastructure. For the data scientist, this is the boring but necessary part of the project. Once the police take over the maintenance and the occasional feature request, the data scientists should be long gone.
So what mistakes are commonly made when using data scientists?
This is what I see most often. Companies establish data models, data warehousing infrastructure, and internal processes first, buy enterprise tools, and then proceed to hire a data scientist who first gets frustrated and then bored, and either leaves the company or disengages and becomes a liability. Don’t do that. If you’re at the front line, you can’t wait for a week until the developer in your data centre makes that new table for you.
The second common mistake is failing to bring in the infantry once the bridgehead is established. While I think that data scientists absolutely have to know some SQL for basic ETL needs, they usually have neither the patience nor the motivation to become full-time ETL developers. This leads to sub-par ETL and demotivated data scientists.
We as data scientists should definitely care about the long-term performance of our models, and the monitoring thereof, but this can’t become the main task for someone who loves to dig through data and discover new things. Don’t make us give up what we love in order to do something that others might be better at!
Data scientists should be at the front line of your business, finding new insights in your data, hacking away, predicting the future, convincing people. Of course their products have to be integrated into your company’s infrastructure at some point. While it’s okay to let data scientists do this for small projects, they’re usually neither motivated nor the right people to do this for bigger data apps. Let them fight their knife fights. Don’t bring them in too late and force an infrastructure on them that is probably not suitable for them.
Use your data scientists wisely!
Talk to us if you want to learn more.