BUG: .describe() doesn't work for EAs #61707

@andrewgsavage

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import pint_pandas  # importing registers the 'pint[...]' extension dtype
s = pd.Series([1, 2, 3], dtype='pint[kg]')
s.describe()

DimensionalityError                       Traceback (most recent call last)
...

Issue Description

hgrecco/pint-pandas#279

Series.describe sets the dtype of the result to Float64Dtype when the input is an EA. pint's Quantity
cannot be cast to Float64Dtype.

dtype = Float64Dtype()
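
To see the dtype choice described above without installing pint-pandas, the following minimal sketch runs describe() on one of pandas' built-in nullable extension dtypes. Using Int64 here is an assumption made for illustration; it is not part of the original report. On pandas versions that contain the line quoted above, the printed dtype should be Float64, which works for Int64 because its values can be cast to floats, unlike unit-carrying quantities.

```python
import pandas as pd

# Built-in nullable integer EA, used as a stand-in for a third-party EA.
s = pd.Series([1, 2, 3], dtype="Int64")

result = s.describe()

# Print the dtype that describe() picked for the summary Series.
print(result.dtype)
print(result)
```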

Expected Behavior

.describe should return a Series of object dtype, or of the EA's own dtype.

Installed Versions

Replace this line with the output of pd.show_versions()

Activity

kernelism commented on Jun 26, 2025

take

andrewgsavage (Author) commented on Jun 26, 2025

kernelism commented on Jun 29, 2025

@andrewgsavage I’ve been exploring the codebase (still new here), and my initial thought is to check if the list of calculated statistics contains multiple types. If so, setting dtype=None would cause the result to be a Series with object dtype, which should resolve the issue. Does that sound right?
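
The idea in this comment can be sketched roughly as follows. This is not the pandas implementation: `Qty` is a hypothetical stand-in for a unit-aware scalar such as a pint Quantity, and `pick_result_dtype` is a hypothetical helper that only illustrates "fall back to dtype=None when the statistics have mixed types".

```python
import pandas as pd


class Qty:
    """Hypothetical stand-in for a unit-aware scalar (e.g. a pint Quantity)."""

    def __init__(self, magnitude, unit):
        self.magnitude = magnitude
        self.unit = unit

    def __repr__(self):
        return f"{self.magnitude} {self.unit}"


def pick_result_dtype(values):
    """Hypothetical helper: return None for mixed types so pandas infers the dtype."""
    kinds = {type(v) for v in values}
    return None if len(kinds) > 1 else "Float64"


# count is a plain int, while the other statistics carry units.
stats = [3, Qty(2.0, "kg"), Qty(1.0, "kg"), Qty(3.0, "kg")]
result = pd.Series(
    stats,
    index=["count", "mean", "min", "max"],
    dtype=pick_result_dtype(stats),
)

print(result.dtype)  # object
```

With dtype=None, pandas infers the dtype from the values, and a mix of ints and arbitrary Python objects ends up as object dtype instead of raising during a cast.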

andrewgsavage (Author) commented on Jun 29, 2025

I think that is a good solution. I wonder how it would deal with a Series with an int dtype. Should that give object or float dtype, since mean would give a float while statistics such as min would give an int?

Labels: added Needs Discussion (Requires discussion from core team before further action) and removed Needs Triage (Issue that has not been reviewed by a pandas team member) on Jun 30, 2025

kernelism commented on Jun 30, 2025

> I think that is a good solution. I wonder how it would deal with a Series with an int dtype. Should that give object or float dtype, since mean would give a float while statistics such as min would give an int?

In that case, it would never enter the if block that checks whether the Series dtype is an EA. It would automatically get a float dtype from the existing logic. Our fix would only apply to EAs.

elif series.dtype.kind in "iufb":
    # i.e. numeric but exclude complex dtype
    dtype = np.dtype("float")

Should I open a PR?
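
The behaviour described in this comment for plain NumPy integer data can be checked with a short sketch. The sample values are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Plain NumPy-backed integer Series, i.e. not an extension array.
s = pd.Series([1, 2, 3])

# "i" (signed integer) is one of the kinds matched by the "iufb" check.
print(s.dtype.kind in "iufb")  # True

result = s.describe()

# Every statistic, including min and max, is stored as a float.
print(result.dtype)                   # float64
print(result["min"], result["mean"])  # 1.0 2.0
```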

Metadata

Labels

Bug; ExtensionArray (Extending pandas with custom dtypes or arrays.); Needs Discussion (Requires discussion from core team before further action)

Participants

@andrewgsavage, @simonjayhawkins, @kernelism, @arthurlw
