Open
Labels
Bug
ExtensionArray: Extending pandas with custom dtypes or arrays.
Needs Discussion: Requires discussion from core team before further action
Description
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import pint_pandas  # importing this registers the 'pint[...]' dtype with pandas
s = pd.Series([1, 2, 3], dtype='pint[kg]')
s.describe()
DimensionalityError Traceback (most recent call last)
...
Issue Description
Series.describe sets the dtype of the result to Float64Dtype when the input is an EA (ExtensionArray). pint's Quantity cannot be cast to Float64Dtype.
pandas/pandas/core/methods/describe.py
Line 255 in 35b0d1d
dtype = Float64Dtype()
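The coercion described above can be observed without pint. The following is a minimal sketch, assuming a recent pandas (2.x) and using the built-in nullable Int64 dtype as a stand-in for pint[kg]; it only illustrates that an EA input yields a Float64 result, not the pint failure itself.

```python
import pandas as pd

# Int64 is a built-in extension dtype, used here as a stand-in for pint[kg].
s = pd.Series([1, 2, 3], dtype="Int64")
result = s.describe()

# The statistics are coerced to the nullable Float64 dtype,
# regardless of what the extension array's own scalars look like.
print(result.dtype)
```

With Int64 the cast succeeds because every statistic is a plain number; a pint Quantity carries a unit, which is why the same cast raises in the report above.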
Expected Behavior
.describe should return a Series of object dtype, or of the EA's dtype.
Installed Versions
Replace this line with the output of pd.show_versions()
Activity
kernelism commented on Jun 26, 2025
take
andrewgsavage commented on Jun 26, 2025
kernelism commented on Jun 29, 2025
@andrewgsavage I've been exploring the codebase (still new here), and my initial thought is to check if the list of calculated statistics contains multiple types. If so, setting dtype=None would cause the result to be a Series with object dtype, which should resolve the issue. Does that sound right?
andrewgsavage commented on Jun 29, 2025
I think that is a good solution. I wonder how it would deal with a Series with int dtype. Should that give object or float dtype? The mean would be a float, while min and max would be ints.
kernelism commented on Jun 30, 2025
In that case, it would never go into the if block checking if the series dtype is EA. It would automatically get a float type based on existing logic. Our fix would only pertain to EAs.
pandas/pandas/core/methods/describe.py
Lines 256 to 258 in 35b0d1d
Should I open a PR?
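The branching discussed in this thread can be sketched as follows. This is only an illustration, not pandas' actual code: choose_describe_dtype and FakeQuantity are hypothetical names introduced here, and the "multiple types" check is one possible reading of the proposal.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionDtype


class FakeQuantity:
    """Hypothetical stand-in for a pint Quantity (a value plus a unit)."""

    def __init__(self, value, unit):
        self.value, self.unit = value, unit


def choose_describe_dtype(series, stats):
    """Hypothetical helper mirroring the dtype choice discussed above."""
    if isinstance(series.dtype, ExtensionDtype):
        # Proposed change: mixed statistic types -> let pandas infer the dtype.
        if len({type(x) for x in stats}) > 1:
            return None
        return pd.Float64Dtype()
    if series.dtype.kind in "iufb":
        # Plain numeric input never enters the EA branch and stays float.
        return np.dtype("float")
    return None


plain_int = pd.Series([1, 2, 3])
int_dtype = choose_describe_dtype(plain_int, [3, 2.0, 1.0])

ea_series = pd.Series([1, 2, 3], dtype="Int64")
mixed_stats = [3, FakeQuantity(2.0, "kg")]
ea_dtype = choose_describe_dtype(ea_series, mixed_stats)

# With dtype=None, pandas falls back to object for mixed Python objects.
inferred = pd.Series(mixed_stats, dtype=ea_dtype).dtype
print(int_dtype, ea_dtype, inferred)
```

In this sketch a plain int Series keeps the existing float result, while an EA whose statistics mix a count with unit-carrying values ends up with object dtype.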