
Pandas Isin() Returns Different Result As Eq() - Floating Dtype Dependency Issue

pandas' isin method seems to have a dtype dependency (using Python 3.5 with pandas 0.19.2). I just came across this by accident in a related topic where we couldn't explain a non-w

Solution 1:

Maybe this will make it more clear:

>>> '%20.18f' % df[0].astype(np.float64)
'1.199999999999999956'

>>> '%20.18f' % df[0].astype(np.float32)
'1.200000047683715820'

Generally you don't want to see 18 decimal places so pandas will make reasonable choices about how many decimals to display -- but the difference is still there, albeit invisibly. So you need to make sure to compare float64 to float64 and float32 to float32. That's the floating point life we have chosen for ourselves...
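As a minimal sketch of the point above, the snippet below stores the literal 1.2 in each dtype and compares the results. The variable names are illustrative and do not come from the question; only the value 1.2 does.

```python
import numpy as np

# Hypothetical names for illustration; 1.2 is the value discussed above.
as_f64 = np.float64(1.2)
as_f32 = np.float32(1.2)

# Widening the float32 value to float64 keeps the rounding error it already carries.
widened = np.float64(as_f32)

# Print both stored values with 18 decimal places.
print('%20.18f' % as_f64)
print('%20.18f' % widened)

# Comparing within one dtype works; comparing across dtypes does not.
same_dtype_equal = bool(np.float32(1.2) == as_f32)   # float32 vs float32
mixed_dtype_equal = bool(widened == as_f64)          # float32 value vs float64 value
print(same_dtype_equal, mixed_dtype_equal)
```

The mixed comparison fails because the float32 value is a different number once both sides are held at float64 precision, which is presumably what trips up isin when the frame and the lookup values have different dtypes.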

Alternatively, if you were comparing to the values one at a time you could use np.isclose (after import numpy as np) to identify an approximate equality:

>>> np.isclose( df.astype(np.float64), 1.2 )
array([[ True, False, False, False, False, False]], dtype=bool)

>>> np.isclose( df.astype(np.float32), 1.2 )
array([[ True, False, False, False, False, False]], dtype=bool)

(You don't need the astype(), of course, it's just to prove that you would get the same answer for both float32 and float64.)

I don't know if there is a way to make isin work in a comparable way so you may have to do something like:

>>> np.isclose( df, 1.2 ) | np.isclose( df, 1.4 )
array([[ True, False, False,  True, False, False]], dtype=bool)
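If there are more than two values to check, chaining np.isclose calls with | gets unwieldy. One possible generalization is sketched below. Note that isin_close is a hypothetical helper, not a pandas function, and the example frame is made up: only the 1.2 and 1.4 entries are taken from the output above.

```python
import numpy as np
import pandas as pd

def isin_close(frame, values, rtol=1e-05, atol=1e-08):
    """Hypothetical helper: an isin-like check that uses np.isclose tolerances."""
    data = frame.to_numpy(dtype=np.float64)
    candidates = np.asarray(values, dtype=np.float64)
    # Add a trailing axis so every cell is compared against every candidate,
    # then collapse that axis: a cell matches if it is close to any candidate.
    mask = np.isclose(data[..., np.newaxis], candidates, rtol=rtol, atol=atol).any(axis=-1)
    return pd.DataFrame(mask, index=frame.index, columns=frame.columns)

# Assumed example data: one row, with 1.2 and 1.4 in the first and fourth columns.
df = pd.DataFrame([[1.2, 3.4, 5.6, 1.4, 2.2, 0.1]], dtype=np.float32)

result = isin_close(df, [1.2, 1.4])
print(result)
```

The default rtol and atol are the same as np.isclose's own defaults. Whether they are appropriate depends on the magnitude of your data, so treat them as a starting point rather than a recommendation.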

Solution 2:

# Try this: cast both the frame and the lookup values to float32
import numpy as np
df = df.astype(np.float32)
# A plain list has no .apply(), so build a float32 array instead
test = np.array([1.2, 1.4], dtype=np.float32)
df.isin(test)
