10 Common Mistakes for new programmers transitioning from Excel to Python | Mito

10 Common Mistakes for new programmers transitioning from Excel to Python

Ready to write Python code 4x faster?

  1. Using single equals sign (=) instead of double equals sign (==) for checking equality: In Excel, a single equals sign checks equality in formulas. For example, the formula =IF(A1=B1, "Equal", "Not Equal") checks if the values in A1 and B1 are the same. In Python, on the other hand, = is used for assignment and == is used to check equality. For example, 'a = 5' assigns the value 5 to the variable 'a', whereas 'a == 5' checks whether 'a' is equal to 5. As seen below, using a single equals sign in a conditional statement in Python will create a SyntaxError.
Error caused by using = instead of ==
  1. Forgetting to indent code blocks: Indentation is used in Python to define code blocks for functions, if-else conditions, and loops. Unlike Excel's VBA, where indentation is for readability, Python uses indentation to know what code to execute. Always indent the code following a function definition, if, elif, else, and for loop. Not doing so will create an 'IndentationError'.
Error caused by forgetting to indent function code block
  1. Forgetting to add a colon at the end of functions, conditionals and loops: In Python, a colon : is required at the end of the line that starts a function, if, elif, else, and for loop. The colon signals the start of an indented code block. Omitting the colon will result in a 'SyntaxError'. 
Error caused by forgetting : at the end of function
  1. Ignoring case sensitivity: Python is case-sensitive, but Excel is not. In other words, “Mito” and “mito” are equivalent values in Excel, but not in Python. Forgetting to handle case sensitivity in Python is especially common when filtering data and merging datasets together. To replicate Excel’s behavior, make sure to convert string values to uppercase or lowercase before filtering or merging datasets.
  2. Not managing your data types: Python requires explicit handling of data types (like integers, strings, and floats). Unlike Excel, which automatically interprets and converts data types depending on the context you are using them in, Python needs explicit type conversion.To convert individual values you can use Python built-in formulas like float() or str(). To convert pandas series, you can use the syntax:
df['A'] = df['A'].astype('str')
  1. Mixing up parenthesis () and square bracket []: Square brackets [] are used for indexing elements from a list or columns from a DataFrame. Parentheses, on the other hand, are used to make a function call. 
Code Action
my_list[0] Get the first element from list
df['A'] Get column A from dataframe
my_function() Call a function
  1. Forgetting to use quotes to create a string: Strings must be enclosed in quotes. Putting quotes around a word tells Python to use the value enclosed in quotes instead of interpreting the text as a variable that it must find in your code. The table below shows how removing the quotes would effect the code.
Code Action
company_name = "Mito" Assign the value “Mito” to the variable company_name
company_name = Mito Find the variable Mito and assign it’s value to the variable company_name
df['A'] = 1 Assign the value 1 the column ‘A’ in the dataframe called df
df[A] = 1 Find the variable A, use it’s value find the column in dataframe df, and then assign the value 1 to that column
  1. Not reading error messages to debug: Python's error messages provide clues to the source of the problem. The best way to use error message is following this protocol:
    1. Start at the top of the error message
    2. Find each line of code that the error message has a green arrow pointing to. In the example below, the first line is line 1: check_contains_value(df[’A’], ‘USD’)
    3. Continue reading the lines that have a green arrow pointing to it until you get to code that you did not write. In this example, we wrote the code on line 2: column = column.as_type(’str’), but we did not write the code on line 6204. That tells us we probably need to change the code on line 2.
    4. Read the last line of the error message to understand the specific error. In this case, its an AttributeError caused by as_type function not existing on the Series object.
    5. Aha! the function is actually called astype instead of as_type. That’s an easy fix, but if you don’t identify what is wrong right away, copying and pasting the line of code that is causing the error along with the error message into Google is a great next step.
Error message caused by using a function that does not exist
  1. Forgetting to run code to create or update a function before using the function: Remember that there is a difference between writing code and executing code. Just when you write a formula in Excel it does not execute until you submit the formula, in Python, your code does not execute until you press run. In the example below, the error message tells us that the function, my_function, is not defined. That’s because we never ran the code cell where the function was created.
Error function caused by calling a function you did not yet define
  1. Not breaking down the automation into sections: Successfully automating an Excel report requires planning. A little bit of work upfront will save you days or even weeks of debugging later.
    1. Start by dividing the report into manageable chunks. These chunks should be the smallest possible task that still moves you towards the goal of full automation.
    2. Implement the first chunk of work and validate that it is 100% accurate by comparing it against the original Excel report.
    3. Cleanup and document your code so in a few months if you return to the code you’ll remember what it does.
    4. Repeat those three steps until you finish your automation.

And remember most of all, utilize online resources like Stack Overflow, Pandas Documentation, and Mito documentation. These resources are a programmer’s best friend. 

Ready to write Python code 4x faster?