Automation is “using data and technology to automate activities in a business that have more than one step.” This improves quality, improves timeliness, simplifies administration and governance, and saves cost.

McKinsey’s research (McKinsey Global Institute, 2018) suggests that 60% of all occupations have at least 30% of tasks that are suitable for automation. Fewer than one in 20 is 100% automatable, and that’s hardly a surprise: employing someone to do only robotic jobs means that person is unlikely to be able to use judgement or skill when there is a non-routine task to deal with. The tasks with the highest potential for automation are predictable physical activities (81%), which is why robots build cars, but those don’t concern us here. McKinsey also argues that 69% of data processing and 64% of data collection can be automated. That’s what this blog is about …

The resistance to automation within organisations is a powerful force. We all like to think that what we do is special, that each working day is unique, and that any job that requires a learned skill can’t be done by a machine. This also holds true of building the key components in our data world – data feeds, data transformations, data loads, reports, dashboards, data quality reporting, data science models. However, many of the tasks we do can and should be automated, and surprisingly this is a significant benefit to our organisations, us and our careers.

Automation is key for ambition

Allow me to explain with an analogy between ambition and the ways we can travel. If you just use data manually, you’re doing the equivalent of making a journey on foot: you won’t get that far, it’ll be tiring, and you’ll feel every step. If you use spreadsheets, it’s like cycling: you start to see the benefit of the bike, you make more progress and you get the excitement of freewheeling occasionally. If you develop reports and dashboards, you’re in a car. The car is doing most of the work automatically and it starts to feel like you’ve got superpowers. However, if we’re really ambitious we may want to get to the moon (and back). For that a car won’t do: we need a rocket, and a rocket needs pretty much everything that can be automated to be automated. There is no time for manual intervention. This is a major step up from a car, cycling or walking. That’s the essence of this blog.

How to measure potential for automation

We started with quotes from McKinsey on the tasks that are suitable for automation. So let’s take a quick look at where the potential lies …

  • Keeping up with the demand - In most organisations the data teams become the victim of their own success. They deliver a couple of projects well, early on, but then can’t keep up with the demand - partly because that demand grows exponentially, and partly because they end up having to spend significant amounts of time documenting and supporting their own code. The value of automation is that it enables developers to focus more on the business goals, and deliver at scale.
  • Increase in job satisfaction - Automation is critical to staying ahead in terms of productivity, creativity, and job satisfaction. The more you automate, the more you can stay focussed on the goals of the activity and stay out of the weeds. You get results faster, and you can then modify, improve and validate with less stress.
  • Frequent changes - Requirements for reports change regularly as systems are upgraded, as projects come and go, and as products and services evolve. This is often cited as a reason why reporting has to be manual, when in fact the opposite is true. You need to be agile enough to respond to these changes, and to pass information around the organisation in a consistent way. The only way to do this is to automate all the data feeds, so that you can reuse them as requirements change.
  • Manage risk - Automation means that risk-based decisions always use the latest data, and that we can be sure the data has not been massaged or biased. This is also hugely important for financial reporting: automation produces, for example, month-end financial reports in a timely manner, without conflict over reconciliation.
  • Reduction in errors - Another benefit of automation is that it reduces the number of manual errors, which in turn reduces the risk to the projects that depend on the data.
  • Undocumented processes - Manual data-wrangling aggregates many, sometimes hundreds, of small decisions and tasks that may exist only in the minds of whoever does it. If you asked that person to write the process down, chances are they would miss some of it out. As your automation maturity develops, you may encounter the same problem with analysts and data scientists, or anyone who decided to write code to extract data: they have not documented what they did. When that person moves jobs, goes on holiday or simply forgets, you need to know how to extract that data.
  • Data access - This is occasionally a huge bottleneck in the functioning of the organisation – and again, it’s not confined to interns and admins. In one role, the administrator of the data warehouse had come to act more like its gatekeeper. If we wanted to extract data, we had to go through him, and do it his way. To him, automation was almost a loss of sovereignty. It is not uncommon in my experience for someone who runs the reports without documentation to “hold the company to ransom” over pay rises and job security.
  • Technological change - We are going through a digital transformation, but what does that mean? Often it means that data will move around as applications are shifted to the cloud. Manual processes need to be updated every time the technology changes, often in a state of panic when someone repeats last month’s manual process and finds that apparently the business suddenly has no staff or zero turnover. A script that extracts data is almost entirely focused on how to extract it, in which format, and which fields to combine. If there are 100 lines in the script, 99 of them will be about this, and one will tell the application where the data lives. If you’ve automated, that one line is all you need to update.
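
To make that last point concrete, here is a minimal sketch in Python of an extraction script in which the location of the data sits on a single line and the rest of the logic never refers to it. Everything in it is an assumption made for illustration: the path in SOURCE_LOCATION, the column names (name, department, status) and the function names are hypothetical, and a real feed would carry far more logic than this.

```python
from __future__ import annotations

import csv
import io
from typing import Iterable, TextIO

# The single line that says where the data lives (hypothetical path).
# If the application moves, this is the only line that needs to change.
SOURCE_LOCATION = "exports/staff.csv"


def read_rows(stream: TextIO) -> list[dict[str, str]]:
    """Parse CSV text into one dict per row, keyed by column header."""
    return list(csv.DictReader(stream))


def headcount_by_department(rows: Iterable[dict[str, str]]) -> dict[str, int]:
    """Count active staff per department.

    This function knows nothing about where the rows came from.
    """
    counts: dict[str, int] = {}
    for row in rows:
        if row["status"].strip().lower() != "active":
            continue  # skip leavers and any other non-active status
        dept = row["department"].strip()
        counts[dept] = counts.get(dept, 0) + 1
    return counts


def run(location: str = SOURCE_LOCATION) -> dict[str, int]:
    """Open the configured source and produce the report."""
    with open(location, newline="", encoding="utf-8") as stream:
        return headcount_by_department(read_rows(stream))


# In-memory stand-in for the real file, so the sketch runs anywhere.
SAMPLE = io.StringIO(
    "name,department,status\n"
    "Ana,Finance,active\n"
    "Ben,Finance,left\n"
    "Cy,Sales,active\n"
)
sample_report = headcount_by_department(read_rows(SAMPLE))
print(sample_report)
```

Because the counting logic only ever receives rows, it can be checked against a small in-memory sample like the one above, and a move to a new environment should touch SOURCE_LOCATION and nothing else.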

Solutions

It’s important that you know there are solutions that work well today, and that the ability to automate and be independent of technology environments is key. My argument is that automation tools should be intuitive and ideally used by business-savvy people in the data team. So instead of having to create code manually, you create a diagram of the flow of data and the processes that it goes through, and it self-documents. My experience suggests that you can automate 70-80% of the data warehouse work and get productivity improvements of 500%.

This approach works because you can experiment and iterate towards what works best and most universally. It also automatically documents what you do.

There is clearly an upfront cost to using these tools: apply them to one report and it’s cheaper to hire someone to do the job manually. Apply them to 50 reports, and your finance director will be supportive not only because you made reporting quicker but also because you’re saving money by automating the activity of collecting and communicating information, as well as the result.

SUMMARY

Lack of automation will curb your ambition, your career prospects and your company’s ability to compete. The truth is that, for many tasks, automation isn’t just possible, it’s the only sensible thing to do. How much can we automate? Experience says there are two answers that are universally true:

  1. More than you are automating today.
  2. More than you think.

So when you’re considering how you’re going to build your data lake, data vault, or data warehouse, start planning to automate from day 1.

Simon At

ABOUT THE AUTHOR

Following decades of experience as a CDO for enterprise organizations, Simon invested time to author a book called "Data and Analytics Strategy for Business", which was just published in June of this year. This book outlines how to build consistent, high-quality sources of data which will create business value and explores how automation, AI, and machine learning can improve performance and decision making. Filled with real-world examples and case studies, this book is a stage-by-stage guide to designing and implementing a results-driven data strategy. This book is available to order on Amazon here.