Excel to Python: REPLACE, SUBSITUTE Function - A Complete Guide | Mito
Home Icon
divider

Functions

divider

Text

divider

REPLACE

How to Use Excel's REPLACE, SUBSITUTE Function in Pandas

There are several formulas in Excel designed to help you replace characters in a string.

The REPLACE function in Excel replaces characters in a string by location.

The SUBSTITUTE function in Excel replaces chracters in string by finding matching substrings.

Mito is an open source library that lets you write Excel formulas in Python. Either write the formula directly in Python or use the REPLACE, SUBSITUTE formula in the Mito Spreadsheet and generate the equivalent Python code automatically.

Mito's REPLACE, SUBSITUTE function works exactly like it does in Excel. That means you don't need worry about managing data types, handling errors, or the edge case differences between Excel and Python formulas.

Install Mito to start using Excel formulas in Python.

# Import the mitosheet Excel functions
from mitosheet.public.v3 import *;

# Use Mito's REPLACE, SUBSITUTE function
# Note: No need to convert the column to a string first
# because Mito's SUBSTITUTE formula does so automatically
df['column'] = SUBSITUTE(df['column'], 'old_text', 'new_text')
Copy!
Clipboard

Pandas offers the replace method which is versatile and can be used for a wide range of replacements. Here are some common implementations:

Excel's REPLACE function allows you to replace a substring by specifying the starting position and the number of characters to be replaced.

In pandas, you can achieve this using the string slicing syntax. For example, to replace the first three characters of a string with 'ABC' in Excel, you'd use the formula =REPLACE(A1, 1, 3, "ABC"). In Pandas you'd use:

# Replace the first three characters with 'ABC'
df['column_name'] = 'ABC' + df['column_name'].str[3:]
Copy!
Clipboard

To replace the last three characters from the end of a string in Excel, you'd use the formula =REPLACE(A1, LEN(A1) - 2, 3, "ABC"). In Pandas you'd use:

# Replace the last three characters with 'ABC'
df['column_name'] = df['column_name'].str[:-3] + 'ABC'
Copy!
Clipboard

To replace characters from the middle of a string in Excel, you'd use the formula =REPLACE(A1, 3, 3, "ABC"). In Pandas you'd use:

# Replace three characters starting from the 3rd position with 'ABC'
df['column_name'] = df['column_name'].str[:2] + 'ABC' + df['column_name'].str[5:]
Copy!
Clipboard

To replace specific values in a column in Excel, you'd use the 'Find & Replace' feature or the SUBSTITUTE formula instead of the REPLACE functionality.

Instead of replacing characters by location, the SUBSTITUTE formula replaces characters by finding matching substrings. To do the same in Pandas, you can use the replace method.

# Replace old_text with new_text
df['column_name'] = df['column_name'].replace(old_text, new_text)
Copy!
Clipboard

To replace multiple values, you can pass a dictionary to the replace method. The keys of the dictionary will be replaced by the corresponding values.

For example, to replace 'NYC' with 'New York City' and 'LA' with 'Los Angeles', you could use the following code:

# Replace multiple values using a dictionary
df['column_name'] = df['column_name'].replace({
    'NYC': 'New York City',
    'LA': 'Los Angeles'
})
Copy!
Clipboard

The replace method in pandas is powerful, but certain pitfalls might produce unexpected results. Here are some of the common mistakes and how to address them.

The string slicing method that we use above only works on string columns. If you try to use it on a numeric column, you'll get an error.

To avoid this, you can convert the column to a string using the astype method.

# Convert a column to a string before replacing characters
df['column_name'] = df['column_name'].astype(str).str[3:]
Copy!
Clipboard

If you having missing values in your column, the string slicing method will return an error. To avoid this, you can use the fillna method to replace missing values with an empty string.

# Replace missing values with an empty string
df['column_name'] = df['column_name'].fillna('').str[3:]
Copy!
Clipboard

The string slicing method is useful when you want to replace a specific number of characters from the front, end or middle of a string. However, if you want to replace a specific value, you can use the replace method directly.

To learn more about the replace method, refer to the SUBSITUTE page.

In Excel, you can use the REPLACE function to replace specific characters in a string by location. For example, to replace the first three characters of a string with 'ABC', you'd use the formula =REPLACE(A1, 1, 3, 'ABC').

REPLACE, SUBSITUTE Excel Syntax

ParameterDescriptionData Type
old_textThe original text or the text which contains the characters you want to replace.string
start_numThe starting position of the character in old_text you want to replace. The first position in text is 1.number
num_charsThe number of characters in old_text that you want to replace. If omitted, it will replace all characters from start_num to the end of old_text.number
new_textThe text string that will replace the characters in old_text.string

Examples

FormulaDescriptionResult
=REPLACE("Hello World", 7, 5, "Pandas")Replaces the text 'World' with 'Pandas' starting at the 7th position.Hello Pandas
=REPLACE("123456", 2, 3, "ABC")Replaces three characters starting from the 2nd position with 'ABC'.1ABC56
=REPLACE("ABCDE", 1, 0, "X")Inserts 'X' at the 1st position without replacing any characters.XABCDE
=REPLACE("ABCDE", 3, 2, "XY")Replaces two characters starting from the 3rd position with 'XY'.ABXYE

Don't re-invent the wheel. Use Excel formulas in Python.

Install Mito

Don't want to re-implement Excel's functionality in Python?

Automate analysis with Mito