python如何把dataframe中字符型的nan替换为'',数值型的nan替换为null?

Python019

python如何把dataframe中字符型的nan替换为'',数值型的nan替换为null?,第1张

dataframe是字符串还是列表?

字符串用re.sub(pattern, repl, string)

列表的话用列表推导式

['' if x in NaN for x in dataframe[:-1]]

['null' if x in NaN for in dataframe[-1]]

null/None/NaN

null经常出现在数据库中

None是Python中的缺失值,类型是NoneType

NaN也是python中的缺失值,意思是不是一个数字,类型是float

在pandas和Numpy中会将None替换为NaN,而导入数据库中的时候则需要把NaN替换成None

找出空值

isnull()

notnull()

添加空值

numeric容器会把None转换为NaN

In [20]: s = pd.Series([1, 2, 3])

In [21]: s.loc[0] = None

In [22]: s

Out[22]:

0NaN

12.0

23.0

dtype: float641234567891012345678910

object容器会储存None

In [23]: s = pd.Series(["a", "b", "c"])

In [24]: s.loc[0] = None

In [25]: s.loc[1] = np.nan

In [26]: s

Out[26]:

0None

1 NaN

2 c

dtype: object123456789101112123456789101112

空值计算

arithmetic operations(数学计算)

NaN运算的结果是NaN

statistics and computational methods(统计计算)

NaN会被当成空置

GroupBy

在分组中会忽略空值

清洗空值

填充空值

fillna

DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)

参数

value : scalar, dict, Series, or DataFrame

method : {‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}, default None(bfill使用后面的值填充,ffill相反)

axis : {0 or ‘index’, 1 or ‘columns’}

inplace : boolean, default False

limit : int, default None

downcast : dict, default is None

返回值

filled : DataFrame

Interpolation

replace

删除空值行或列

DataFrame.dropna(axis=0, how=’any’, thresh=None, subset=None, inplace=False)

参数

axis : {0 or ‘index’, 1 or ‘columns’}, or tuple/list thereof

how : {‘any’, ‘all’}

thresh : int, default None

subset : array-like

inplace : boolean, default False

返回

dropped : DataFrame