# How to Use Excel's MEDIAN Function in Pandas

Excel's MEDIAN function calculates the median of a range of values. This guide explains how to replicate Excel's MEDIAN functionality in Python using pandas.

## Implementing the Median function in Pandas

There are several ways to replicate Excel's MEDIAN function in pandas, depending on your specific needs. Here are some common implementations:

### Median value in each row

In Excel, to find the median of the values in cells A1, B1, and C1, you'd use =MEDIAN(A1, B1, C1).

In pandas, finding the median of values across columns for a given row can be done similarly. Below is how you'd compute the median of 'Col1', 'Col2', and 'Col3' for each row:

```
# Calculate the median of Col1, Col2, Col3
df['Median'] = df[['Col1', 'Col2', 'Col3']].median(axis=1)
```
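As a minimal runnable sketch of the row-wise approach, the example below builds a small DataFrame with made-up values (the data is an assumption for illustration; only the column names come from the snippet above) and computes one median per row:

```python
import pandas as pd

# Hypothetical sample data; the column names match the snippet above
df = pd.DataFrame({
    'Col1': [1, 4, 7],
    'Col2': [2, 5, 8],
    'Col3': [9, 6, 3],
})

# axis=1 computes the median across the selected columns, one value per row
df['Median'] = df[['Col1', 'Col2', 'Col3']].median(axis=1)

print(df['Median'].tolist())  # [2.0, 5.0, 7.0]
```

Selecting the columns explicitly keeps any other columns in the DataFrame out of the calculation.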

### Entire column median

In Excel, to find the median of an entire column, you'd use =MEDIAN(A:A).

In pandas, you can use the median method on the desired column to get a similar result:

```
# Calculate the median of the entire column
col_median = df['Column1'].median()
```
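When a column holds an even number of values, pandas averages the two middle values, which is also what Excel's MEDIAN does. A small sketch with made-up data (the values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical column with an even number of values
df = pd.DataFrame({'Column1': [10, 2, 8, 4]})

# Sorted values are 2, 4, 8, 10, so the median is the mean of 4 and 8
col_median = df['Column1'].median()

print(col_median)  # 6.0
```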

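### Entire dataframe median

In Excel, to find the median of an entire table, you'd use =MEDIAN(A1:D10).

In pandas, calling the median method directly on a dataframe returns one median per column, not a single value. To get one median across every cell, flatten the dataframe's values first and then take the median: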

```
# Flatten the DataFrame and get the median value
df_values = df.values.flatten().tolist()
median_value = pd.Series(df_values).median()
```
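To make the difference between per-column medians and a single overall median concrete, here is a minimal sketch. The data is made up, and using `np.nanmedian` on the underlying array is one possible alternative to the flattening approach above, assuming every column is numeric:

```python
import numpy as np
import pandas as pd

# Hypothetical all-numeric DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]})

# One median per column: A -> 2.0, B -> 20.0
per_column = df.median()

# One median across every cell; np.nanmedian skips NaN values
# Sorted cells are 1, 2, 3, 10, 20, 30, so the result is (3 + 10) / 2
overall = np.nanmedian(df.to_numpy())

print(per_column.tolist())  # [2.0, 20.0]
print(overall)              # 6.5
```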

## Common mistakes when using MEDIAN in Python

These are common mistakes (and solutions) that you might run into while calculating medians in Python.

### Finding the Median of Strings

Calculating the median of a column that contains string values raises an error, typically a `TypeError` (the exact message, such as `could not convert string to float`, depends on your pandas version). You must convert these strings to a numeric type before computing the median.

For example, if you have values in the format $1.99, $2.99, etc., you must convert them to float values before calculating the median:

```
# Convert $ values to float (regex=False treats '$' as a literal character)
df['Column'] = df['Column'].str.replace('$', '', regex=False).astype(float)
# Calculate the median
median = df['Column'].median()
```
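Real-world columns often contain more than a currency symbol, and `astype(float)` fails on the first value it cannot parse. A hedged sketch of a more forgiving cleanup, assuming made-up sample data with thousands separators and a placeholder like `N/A`, uses `pd.to_numeric` with `errors='coerce'` so that unparseable entries become `NaN` and are skipped by the median:

```python
import pandas as pd

# Hypothetical messy currency column
df = pd.DataFrame({'Column': ['$1.99', '$2.99', 'N/A', '$1,000.00']})

# Strip the currency symbol and thousands separators as literal characters
cleaned = (
    df['Column']
    .str.replace('$', '', regex=False)
    .str.replace(',', '', regex=False)
)

# Values that still cannot be parsed (here 'N/A') become NaN
df['Column'] = pd.to_numeric(cleaned, errors='coerce')

# NaN is skipped by default, so the median is taken over 1.99, 2.99, 1000.0
median = df['Column'].median()

print(median)  # 2.99
```

Coercing silently drops bad values from the calculation, so it is worth checking how many `NaN` values were produced before trusting the result.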

### Ignoring Missing Values

Not managing `NaN` values correctly can lead to inaccurate medians. By default, pandas skips `NaN` values, similar to how Excel's MEDIAN function disregards blank cells. However, you can also choose to treat `NaN` values as 0 when calculating the median, or to return `NaN` if any of the values are `NaN`.

```
import pandas as pd

# Create a sample dataframe
df = pd.DataFrame({'Column': [1, 2, 3, None]})
# Calculate median with NaN values
df['Column'].median(skipna=False) # Result: NaN
# Calculate median without NaN values
df['Column'].median(skipna=True) # Result: 2.0
# Calculate median with NaN values treated as 0
df['Column'].fillna(0).median() # Result: 1.5
```

## Understanding the Median Formula in Excel

The MEDIAN function in Excel takes one or more arguments (ranges of numbers) and returns their median.

=MEDIAN(number1, [number2], ...)

### MEDIAN Excel Syntax

Parameter | Description | Data Type |
---|---|---|
number1 | The first number you want to include in the median. | number |
number2 | (Optional) Additional numbers you want to include in the median. | number |
... | (Optional) Add up to 255 numbers you want to include in the median. | number |

### Examples

Formula | Description | Result |
---|---|---|
=MEDIAN(1, 2, 3) | Calculate the median of values 1, 2, and 3. | 2 |
=MEDIAN(A1:A10) | Calculate the median of values from A1 to A10. | Median of A1 to A10 |
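For comparison, the two Excel examples above could be approximated in pandas roughly as follows. This is a sketch only: the column name `A` and its values are assumptions, and the range A1:A10 is taken to mean the first ten rows of that column.

```python
import pandas as pd

# Rough equivalent of =MEDIAN(1, 2, 3)
literal_median = pd.Series([1, 2, 3]).median()
print(literal_median)  # 2.0

# Rough equivalent of =MEDIAN(A1:A10): the first ten rows of a
# hypothetical column 'A' holding the values 1 through 12
df = pd.DataFrame({'A': range(1, 13)})
range_median = df['A'].iloc[:10].median()
print(range_median)  # 5.5
```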
