Python Data Analysis: Scientific Notation

2023-03-03 13:38:02 Python


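When analyzing data in Python, values are often displayed automatically in scientific notation, or a large table shows only the first and last few rows and columns, with ellipses in between.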

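import sys
import numpy as np
np.set_printoptions(suppress=True, threshold=sys.maxsize)

suppress=True disables scientific notation.

threshold=sys.maxsize prints the array in full (no ellipsis). Older tutorials pass threshold=np.nan for this, but recent NumPy versions reject it with a ValueError.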
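The effect of the two settings can be checked with a minimal sketch like the one below. It uses np.printoptions, the context-manager counterpart of np.set_printoptions, so the global settings are left untouched. The array contents and variable names are illustrative assumptions.

```python
import sys
import numpy as np

small = np.array([0.00001234, 123456.789])  # illustrative values
long_arr = np.arange(2000)                  # longer than the default threshold of 1000

# Default settings: mixed magnitudes fall back to scientific notation,
# and long arrays are summarized with "...".
default_small = str(small)
default_long = str(long_arr)

# Temporarily suppress scientific notation and disable summarization.
with np.printoptions(suppress=True, threshold=sys.maxsize):
    plain_small = str(small)
    full_long = str(long_arr)

print(default_small)
print(plain_small)
print("..." in default_long, "..." in full_long)
```

Outside the with block the previous print settings apply again, which is convenient when only one print call needs the full output.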

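pandas exposes, among others, the following display options:

display.[max_categories, max_columns, max_colwidth, max_info_columns, max_info_rows, max_rows, max_seq_items, memory_usage, multi_sparse, notebook_repr_html, pprint_nest_depth, precision, show_dimensions]

Detailed documentation: pd.set_option

To show floats as ordinary decimals, set the display.float_format option through pd.set_option. For example, the following shows 3 digits after the decimal point: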

pd.set_option('display.float_format', lambda x: '%.3f' % x)
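A small sketch of what this formatter changes, assuming an illustrative Series with one very small and one very large value. It uses pd.option_context so the formatter applies only inside the block, whereas pd.set_option would change it globally.

```python
import pandas as pd

s = pd.Series([1.5e-7, 123456789.123456])  # illustrative values

# Apply the 3-decimal formatter only within this block.
with pd.option_context('display.float_format', lambda x: '%.3f' % x):
    formatted = s.to_string()

print(formatted)
```

Note that the formatter only changes how values are displayed; the stored floats keep their full precision.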

set_option also controls other settings, such as the default number of columns and rows displayed:

pd.set_option('display.max_columns',5, 'display.max_rows', 100)

import pandas as pd
pd.set_option('display.max_columns', 10000, 'display.max_rows', 10000)

display.max_columns: maximum number of columns displayed

display.max_rows: maximum number of rows displayed
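To see how the two limits affect the output, here is a hedged sketch that renders the same DataFrame under tight and generous limits. The 100 x 30 frame of zeros and the display.width value of 1000 are assumptions chosen only so that the wide output fits on one line.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((100, 30)))  # illustrative 100 x 30 frame

# Tight limits: rows and columns beyond the limits are replaced by "...".
with pd.option_context('display.max_rows', 10, 'display.max_columns', 5):
    truncated = repr(df)

# Generous limits: every row and column is rendered.
with pd.option_context('display.max_rows', 10000,
                       'display.max_columns', 10000,
                       'display.width', 1000):
    full = repr(df)

print(truncated)
print(len(truncated.splitlines()), len(full.splitlines()))
```

When output is truncated, pandas appends a summary line with the frame dimensions, which is a quick way to tell that rows or columns were hidden.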

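1. pd.set_option('display.expand_frame_repr', False)

True allows a wide DataFrame to wrap across multiple lines. False disables wrapping.

2. pd.set_option('display.max_rows', 10)

pd.set_option('display.max_columns', 10)

The maximum number of rows and columns displayed; anything beyond the limit is shown as an ellipsis. max_columns counts the columns of the DataFrame. With many columns and wrapping disabled, the output becomes hard to read.

3. pd.set_option('display.precision', 5)

The number of digits shown after the decimal point. Use the full name display.precision: in recent pandas versions the short form 'precision' matches more than one option and raises an error.

4. pd.set_option('display.large_repr', 'truncate')

Accepts 'truncate' or 'info'. 'truncate' shows a truncated table, 'info' shows a summary like df.info(). 'truncate' is the usual choice.

5. pd.set_option('display.max_colwidth', 5)

The maximum column width in characters; longer cell contents are cut off with an ellipsis.

6. pd.set_option('display.chop_threshold', 0.5)

Values whose absolute value is below 0.5 are displayed as 0.0.

7. pd.set_option('display.colheader_justify', 'left')

Controls the alignment of column headers, 'left' or 'right'.

8. pd.set_option('display.width', 200)

The maximum number of characters per line. The default of 80 is narrow for wide screens, so 200 is a common choice.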
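Three of the options above (precision, chop_threshold, max_colwidth) can be tried out with a short sketch like this one. The DataFrame contents and the specific option values are illustrative assumptions, and pd.option_context keeps each change local to its block.

```python
import pandas as pd

df = pd.DataFrame({
    'value': [0.123456789, 12.3456789],
    'text': ['short', 'a much longer piece of text'],
})

# Fewer digits after the decimal point.
with pd.option_context('display.precision', 3):
    rounded = repr(df)

# Values with absolute value below the threshold are shown as zero.
with pd.option_context('display.chop_threshold', 0.5):
    chopped = repr(df)

# Long cell contents are cut off with an ellipsis.
with pd.option_context('display.max_colwidth', 10):
    narrow = repr(df)

print(rounded)
print(chopped)
print(narrow)
```

All three options affect only the rendered text; the values stored in df are unchanged.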

np.set_printoptions(precision=None, threshold=None, edgeitems=None, linewidth=None, suppress=None, nanstr=None, infstr=None, formatter=None)

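Parameters:

precision: number of digits of precision for floating-point output (default: 8)

threshold: number of array elements above which the output is summarized with an ellipsis (default: 1000). Pass sys.maxsize for full output; np.nan, suggested in older tutorials, is rejected by recent NumPy versions.

edgeitems: number of items shown at the beginning and end of each dimension when the output is summarized (default: 3)

suppress: whether to suppress scientific notation (default: False)

Example: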

import numpy as np
np.set_printoptions(precision=4, threshold=8, edgeitems=4, linewidth=75, suppress=True, nanstr='nan', infstr='inf')
print("precision=4, floats shown with 4 decimal places: ", np.array([1.23446789]))
print("threshold=8, edgeitems=4, arrays over 8 items show the first 4 and last 4: ", np.arange(10))
# Custom formatter applied to every element: prefix 'int:' and negate the value
np.set_printoptions(formatter={'all': lambda x: 'int:' + str(-x)})
print("formatter, custom formatted output: ", np.arange(5))

Output: [screenshot missing in the original post]

Note: precision rounds the displayed value automatically.
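The rounding and summarizing behaviour can be verified with a minimal sketch such as the following. It uses np.printoptions so the settings apply only inside each block; the variable names are illustrative.

```python
import numpy as np

values = np.array([1.23446789])

# precision=4 rounds the displayed value to 4 decimal places.
with np.printoptions(precision=4):
    shown = str(values)

# threshold=8 summarizes arrays with more than 8 elements;
# edgeitems=4 keeps 4 items at each end.
with np.printoptions(threshold=8, edgeitems=4):
    summarized = str(np.arange(10))

print(shown)        # [1.2345]
print(summarized)   # [0 1 2 3 ... 6 7 8 9]
print(values[0])    # the stored value itself is not rounded
```

Only the printed representation is rounded; computations continue to use the full stored value.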

Detailed documentation: np.set_printoptions

pd.set_option

pd.set_option(pat, value)
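As a brief sketch of the pat/value form, the snippet below sets an option, reads it back with pd.get_option, and restores the default with pd.reset_option. The value 100 is an arbitrary illustration.

```python
import pandas as pd

# pat is the option name, value is the new setting.
pd.set_option('display.max_rows', 100)
current = pd.get_option('display.max_rows')

# Restore the library default for this option.
pd.reset_option('display.max_rows')
default = pd.get_option('display.max_rows')

print(current, default)
```

Resetting after a temporary change avoids surprising output later in the same session.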

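Data science work draws on mathematics and statistics, so the choice of a data science language should take into account how well it supports numerical computing, statistical analysis, matrix operations, and the like. Python offers the third-party packages NumPy and SciPy, which provide these capabilities. Second, data science work also involves integrating various data sources and developing data applications. This requires a language that can connect to different databases, read files in different data formats, integrate with the applications a company already runs, and crawl external data. As a high-level, object-oriented programming language, Python handles all of these tasks well. Finally, data scientists also need to combine domain knowledge to carry out data analysis and machine learning, and ultimately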