How to be a Citizen Data Scientist
In a world awash with data, both inside corporations and in open datasets freely available to anyone who wants to use them, data science has become a core part of many businesses. Some have adopted it so fully that they have built business models around it.
Key adopters include marketing and advertising companies, and companies in the retail and logistics sectors that use data to create efficient operating models.
However, the opportunities that come from leveraging data are open to companies of all sizes, to governments, and to many professions.
What is a citizen data scientist?
A person who creates or generates models that use advanced diagnostic analytics or predictive and prescriptive capabilities, but whose primary job function is outside the field of statistics and analytics. Gartner.
The term Citizen Data Scientist has been used to describe non-data-scientists, usually in a company context, who have the analytical and numerical skills required to conduct their own research on specific problem domains without having the deep technical and mathematical skills of a pure data scientist. The spectrum of what constitutes a data scientist is very broad. It ranges from cleaning up and presenting data, to building reusable insights and dashboards, to building neural networks that classify datasets to identify outlier patterns, detect fraudulent activity, and make predictions.
Citizen data scientists can reliably create domain-specific hypotheses; source, clean, blend and restructure the datasets they require; build models; and translate the results into insights. These insights subsequently drive decisions or prompt further questions and more exploration. These are exactly the type of skills needed to understand your customers and their behaviour before, during and after a purchase.
Data analysis skills are important for understanding our world. They help us identify the fundamental patterns at work in it by digging deeper and using evidence to form our own opinions and make decisions. In a world of fake news and misunderstood data, this matters.
The key attributes of a citizen data scientist are a:
- Desire to understand what matters in their domain better
- Desire to understand the value delivered in quantitative terms
- Desire to understand problems and build their own opinions
- Desire to deliver positive commercial outcomes
- Desire to challenge what they are being told, and understand it more deeply
- Desire to collaborate and contribute to meaningful evidence-based discussion
Data has become democratised
It's easier than ever to adopt business intelligence tools. Many now exist, targeted at non-technical business users who can benefit from rapid time to insight and more informed decision making. Extracting valuable information from data is no longer the sole remit of an internal data warehouse team: any business department can task one of its team members with using these tools.
In addition to corporate data, open data is available from many sources, and governments have actively promoted making datasets available for individual research, for example under the UK's Open Government Licence (OGL).
The OpenStreetMap initiative provides a geographical database and mapping service that can be queried for geographical features. These can in turn be mapped to postcodes, ONS (UK Office for National Statistics) labour market data, and census data.
Many other websites aggregate data and present it in easily consumable ways. Some interesting examples include:
- Our World In Data: https://ourworldindata.org/
- Worldometers: https://www.worldometers.info/
- European Centre for Disease Prevention and Control: https://www.ecdc.europa.eu/en
- UK Office for National Statistics: https://www.ons.gov.uk/
- UK Open Data: https://data.gov.uk/
It's almost impossible to create a comprehensive list of data sources, but they are easy to find. With a bit of practice, filtering, cleaning and creating reusable models becomes straightforward, and a short example of how this can be done is below.
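To give a flavour of the filtering and cleaning steps, here is a minimal sketch using only Python's standard library. The CSV extract, its column names and its values are all invented for illustration, and the helper names clean_rows and revenue_per_order are hypothetical. Real datasets will need their own cleaning rules.

```python
import csv
import io

# Hypothetical raw extract: column names and values are invented for illustration.
RAW_CSV = """region,orders,revenue
North, 120 ,3600.50
South,,1500.00
north,80,2400.00
East,45,n/a
"""

def clean_rows(text):
    """Parse CSV text, normalise region names, and drop rows with missing or non-numeric cells."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            cleaned.append({
                "region": row["region"].strip().title(),
                "orders": int(row["orders"].strip()),
                "revenue": float(row["revenue"].strip()),
            })
        except ValueError:
            # Skip rows that cannot be converted (blank or "n/a" cells).
            continue
    return cleaned

def revenue_per_order(rows):
    """Reusable summary: total revenue divided by total orders for each region."""
    totals = {}
    for row in rows:
        orders, revenue = totals.get(row["region"], (0, 0.0))
        totals[row["region"]] = (orders + row["orders"], revenue + row["revenue"])
    return {region: revenue / orders for region, (orders, revenue) in totals.items()}

rows = clean_rows(RAW_CSV)
summary = revenue_per_order(rows)
print(summary)
```

Only the two "North" rows survive cleaning here. Dropping incomplete rows is just one possible policy; depending on the question being asked, it may be more appropriate to flag or investigate them instead.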
Why Citizen Data Scientists are Important
Citizen data science democratises data and avoids dependency on the data 'boffins' who can seem out of rhythm with the business. It also means that we can use data without resorting to complex mathematical models.
It enables people at the coal face of the business to become more accountable for outcomes, with the ability to use a data-driven approach to continuous improvement. For example, eCommerce technology teams can be organised around value to ensure success in business-critical functions: teams responsible for attracting customers to the site, teams that ensure customers see relevant product listings, and teams that reliably convert customers.
At each stage of the customer journey, things can go wrong, causing customers to move on to a competitor. Accountable, cross-functional teams can ensure high-quality data feeds, and make this data available in real time. By taking a data-driven approach, they can find out why customers are navigating away or why checkouts are failing, and develop solutions. The efficacy of these solutions is in turn subjected to the same rigour of data analysis, and the cycle of improvement continues.
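As a rough illustration of finding where customers navigate away, the sketch below computes the fraction of customers lost between consecutive stages of a journey. The stage names and counts are invented, and drop_off is a hypothetical helper, not a prescribed method.

```python
# Hypothetical stage counts for one day; the stage names and numbers are invented.
funnel = [
    ("landed on site", 10000),
    ("viewed product listing", 6000),
    ("added to basket", 1500),
    ("started checkout", 900),
    ("completed purchase", 630),
]

def drop_off(stages):
    """Return (from_stage, to_stage, fraction lost) for each consecutive pair of stages."""
    results = []
    for (name_a, count_a), (name_b, count_b) in zip(stages, stages[1:]):
        # Guard against an empty stage to avoid dividing by zero.
        lost = 1 - count_b / count_a if count_a else 0.0
        results.append((name_a, name_b, lost))
    return results

steps = drop_off(funnel)
worst = max(steps, key=lambda step: step[2])
for name_a, name_b, lost in steps:
    print(f"{name_a} -> {name_b}: {lost:.0%} lost")
print("Largest drop-off:", worst[0], "->", worst[1])
```

With these made-up numbers, the largest loss falls between viewing a product listing and adding to the basket. That shows where to look first; it does not explain why customers leave, which still needs investigation by the team that owns that stage.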
Any part of a business can exploit a similar approach and similar techniques. By being curious, having access to accurate raw data, and adopting a digital try/fail/change approach, all teams can reap the benefits of Citizen Data Science.