It’s a supremely underrated fact that Excel is the most widely used piece of consumer software outside of the web browser, and has been for roughly the past 40 years. Simply put, Excel is one of the most dominant pieces of software of all time.
What can explain this dominance? How does a program that has not really changed in 40 years keep its relevance? Despite being a programmer myself, why do I find myself turning to Excel for many of my one-off data tasks? Why am I under Excel’s thumb?
Examining the reasons for Excel’s dominance over the past 40 years can help us understand what Excel is good for and what it sucks at, see where Excel’s competitors are lacking, and anticipate where data work is going over the next 40 years.
Building Dominance over 40 years
In major part, the dominance of Excel can be summarized as “ultimate flexibility, for most data use cases, primarily visually, in the simplest package.” Let’s break it down.
The simplest package
Most data we are likely to encounter is very naturally expressed in a table. Since the famous 1970 paper that introduced the relational model (the foundation SQL was later built on), it’s become clear that most of the data we encounter in a business context is most naturally stored, transformed, and consumed in a “rectangular” form. Think: columns and rows, rows and columns.
Now, let’s think about the simplest way to represent this rectangular data to a user. We have columns and rows… so the simplest thing to show the user is just columns and rows. Representing this data as a grid of cells is as simple as you can possibly get.
And thus you have a spreadsheet! Each row in Excel corresponds to a row in your data. Each column in Excel corresponds to a column in your data. It’s so obvious that this entire section seems silly — “a rectangle holds a rectangle.” Yeah, Nate, duh. We know.
This simplicity of representation makes getting started with Excel easier than any other tool for transforming data.
Primarily visually
Many data tools, like SQL, make the tools that transform the data “what the user sees first”. Think about the first thing you see when you open an SQL editor — it’s not the data you’re editing, it’s the SQL code itself that takes up most of the screen.
Excel is the exact opposite. The data you’re working with is what takes up most of the screen space. By default, cells with formulas show the result of the formula, and if you want to see the formula itself, you have to double-click the cell or look at the formula bar.
As above, this choice to make the data visible first and foremost, rather than the transformations that manipulate it, makes Excel incredibly easy to grok. Your data is always right there, without you needing to do anything to get it.
Most Basic Use Cases
More than just visually displaying our data in the simplest possible format, Excel also provides capabilities that cover most basic data transformation and querying use cases.
Simply put, transforming data requires tools. One option for tooling (à la SQL or pandas) is to use a full programming language.
But again, Excel makes a decision that keeps the transformation tools as simple as possible. Rather than introducing variables, a runtime, and so on, there’s just the grid of cells again. You write formulas that reference other cells in the grid, and they automatically stay up to date. A middle-schooler can pick up the basics of Excel’s transformation model in 10 minutes.
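To see how little machinery that model needs, here is a minimal Python sketch of a grid where formulas reference other cells and stay up to date. It is a toy for illustration only, not how Excel is actually implemented, and the `Sheet` class and its `set`/`get` methods are names invented for this example.

```python
class Sheet:
    """Toy grid: each cell holds a constant or a formula (a function of the sheet)."""

    def __init__(self):
        self._cells = {}

    def set(self, ref, value):
        # Store a constant, or a callable that acts as the cell's formula.
        self._cells[ref] = value

    def get(self, ref):
        cell = self._cells.get(ref, 0)  # empty cells read as 0 in this toy
        # Formulas are evaluated on every read, so results never go stale.
        return cell(self) if callable(cell) else cell


sheet = Sheet()
sheet.set("A1", 10)
sheet.set("A2", 32)
sheet.set("A3", lambda s: s.get("A1") + s.get("A2"))  # plays the role of =A1+A2

total_before = sheet.get("A3")
sheet.set("A1", 100)  # edit an input cell
total_after = sheet.get("A3")  # the formula picks up the new value
print(total_before, total_after)
```

Recomputing on every read is the simplest way to keep results current; a real spreadsheet would presumably track dependencies so it only recalculates what changed.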
These formulas cover the most common basic use cases, all in the most usable possible package.
Ultimate flexibility
And more than just covering most use cases, the capabilities presented by formulas and a huge grid of cells are about as flexible as you can get.
Want to break away from the column and rows model and edit cells individually? Go for it in Excel - but good luck in SQL or Python.
Want to totally break away from rectangular data and start creating random, one-off calculations? Go for it in Excel - good luck in SQL.
Want to ignore data entirely and use your spreadsheet to make a TODO list? Go for it in Excel. Yeah right anywhere else.
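To make the “good luck in SQL” point concrete, here is a rough sketch of what a single-cell edit looks like outside a spreadsheet, using Python’s built-in `sqlite3` module. The `sales` table, its columns, and the values are all made up for this example.

```python
import sqlite3

# An in-memory database with a small, hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (id, region, amount) VALUES (?, ?, ?)",
    [(1, "East", 100.0), (2, "West", 250.0), (3, "North", 75.0)],
)

# The Excel equivalent: click cell C3 and type 300.
# In SQL, you have to identify the row by its key and name the column.
conn.execute("UPDATE sales SET amount = ? WHERE id = ?", (300.0, 2))

rows = conn.execute("SELECT id, region, amount FROM sales ORDER BY id").fetchall()
print(rows)
```

None of this is hard, but a one-cell change needs a schema, a key, and a statement, whereas the spreadsheet needs a click and a keystroke.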
Excel’s dynamic grid places very few limitations on how you can accomplish your use cases, and this makes it well suited to just about any data task you can bring to it!
Bringing it all together
“Ultimate flexibility, for most data use cases, primarily visually, in the simplest package.” Excel accomplishes this better than any other data tool in existence, and as a result has remained the dominant tool for data work for the past 40 years.
In the next post in this series, we’ll explore what competing data products can learn from Excel’s dominance, and in the final post we’ll use these learnings to understand what the next 40 years of spreadsheets hold.