pandas notes: handling missing data
2018-12-17 10:50:25 Source: 博客园
# Assumes: import numpy as np; from pandas import Series, DataFrame
In [14]: string_data = Series(['aardvark','artichoke',np.nan,'avocado'])

In [15]: string_data
Out[15]:
0     aardvark
1    artichoke
2          NaN
3      avocado
dtype: object

In [16]: string_data.isnull()
Out[16]:
0    False
1    False
2     True
3    False
dtype: bool
In [17]: string_data[0] = None

In [18]: string_data.isnull()
Out[18]:
0     True
1    False
2     True
3    False
dtype: bool
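The session above relies on names imported elsewhere. As a self-contained sketch of the same check (the variable names `mask`, `missing_count`, and `present` are introduced here for illustration and are not from the original post), both `np.nan` and `None` are reported as missing by `isnull()`:

```python
import numpy as np
import pandas as pd

# Object-dtype Series holding strings and one NaN, as in the session above.
string_data = pd.Series(["aardvark", "artichoke", np.nan, "avocado"], dtype=object)

# None is treated as missing as well.
string_data[0] = None

mask = string_data.isnull()            # True where the value is missing
missing_count = int(mask.sum())        # number of missing entries
present = string_data[~mask].tolist()  # values that are not missing

print(mask.tolist())
print(missing_count, present)
```

Inverting the mask with `~mask` is one way to keep only the values that are present; `notnull()` gives the same result directly.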
# NA is presumably numpy.nan here, e.g. from numpy import nan as NA
In [20]: data = Series([1,NA,3.5,NA,7])

In [21]: data.dropna()
Out[21]:
0    1.0
2    3.5
4    7.0
dtype: float64
In [22]: data[data.notnull()]
Out[22]:
0    1.0
2    3.5
4    7.0
dtype: float64
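The two sessions above produce the same output, which suggests that on a Series `dropna()` and boolean indexing with `notnull()` are interchangeable. A minimal sketch that checks this directly (the names `dropped` and `filtered` are chosen here for illustration):

```python
import numpy as np
import pandas as pd

data = pd.Series([1, np.nan, 3.5, np.nan, 7])

dropped = data.dropna()          # drop missing entries
filtered = data[data.notnull()]  # keep entries where notnull() is True

# Both keep the original index labels of the surviving values.
print(dropped.index.tolist())
print(dropped.equals(filtered))
```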
In [23]: data = DataFrame([[1.,6.5,3.],[1.,NA,NA],[NA,NA,NA],[NA,6.5,3.]])

In [24]: cleaned = data.dropna()

In [25]: data
Out[25]:
     0    1    2
0  1.0  6.5  3.0
1  1.0  NaN  NaN
2  NaN  NaN  NaN
3  NaN  6.5  3.0

In [26]: cleaned
Out[26]:
     0    1    2
0  1.0  6.5  3.0
In [27]: data.dropna(how="all")
Out[27]:
     0    1    2
0  1.0  6.5  3.0
1  1.0  NaN  NaN
3  NaN  6.5  3.0
In [28]: data[4] = NA

In [29]: data
Out[29]:
     0    1    2   4
0  1.0  6.5  3.0 NaN
1  1.0  NaN  NaN NaN
2  NaN  NaN  NaN NaN
3  NaN  6.5  3.0 NaN

In [30]: data.dropna(axis=1,how="all")
Out[30]:
     0    1    2
0  1.0  6.5  3.0
1  1.0  NaN  NaN
2  NaN  NaN  NaN
3  NaN  6.5  3.0
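The DataFrame sessions above use three variants of `dropna`: the default (drop a row if any value is missing), `how="all"` (drop a row only if every value is missing), and `axis=1` (apply the rule to columns instead of rows). A runnable sketch of the three side by side, with result names (`any_rows`, `all_rows`, `non_empty_cols`) that are illustrative only:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame([[1.0, 6.5, 3.0],
                     [1.0, np.nan, np.nan],
                     [np.nan, np.nan, np.nan],
                     [np.nan, 6.5, 3.0]])

any_rows = data.dropna()           # default how="any": keep only complete rows
all_rows = data.dropna(how="all")  # drop only rows that are entirely missing

data[4] = np.nan                   # add a column that is entirely missing
non_empty_cols = data.dropna(axis=1, how="all")  # drop all-missing columns

print(any_rows.index.tolist())
print(all_rows.index.tolist())
print(non_empty_cols.columns.tolist())
```

Note that `dropna` returns a new object each time; `data` itself is only changed by the explicit column assignment.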
In [41]: df
Out[41]:
          0         1         2
0 -0.184676       NaN       NaN
1  0.565214       NaN       NaN
2  0.440203       NaN       NaN
3  0.188283       NaN  0.146847
4  1.696903       NaN  0.554640
5 -1.287915  0.139527 -0.494558
6  0.854922  0.299511  0.773247

In [42]: df.dropna(thresh=2)  # thresh=2 keeps rows with at least two non-missing values
Out[42]:
          0         1         2
3  0.188283       NaN  0.146847
4  1.696903       NaN  0.554640
5 -1.287915  0.139527 -0.494558
6  0.854922  0.299511  0.773247

In [43]: df.dropna(thresh=1)
Out[43]:
          0         1         2
0 -0.184676       NaN       NaN
1  0.565214       NaN       NaN
2  0.440203       NaN       NaN
3  0.188283       NaN  0.146847
4  1.696903       NaN  0.554640
5 -1.287915  0.139527 -0.494558
6  0.854922  0.299511  0.773247
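The post does not show how `df` was built. The sketch below assumes one plausible construction: random normal values with the first five rows of column 1 and the first three rows of column 2 set to NaN, which matches the pattern of missing values in the output above. The seed and the names `at_least_two` and `at_least_one` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Assumed setup: 7x3 random values, then blank out part of columns 1 and 2.
rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
df = pd.DataFrame(rng.standard_normal((7, 3)))
df.iloc[:5, 1] = np.nan
df.iloc[:3, 2] = np.nan

# thresh=n keeps rows that have at least n non-missing values.
at_least_two = df.dropna(thresh=2)
at_least_one = df.dropna(thresh=1)

print(at_least_two.index.tolist())
print(at_least_one.index.tolist())
```

Rows 0 to 2 have only one non-missing value each, so they survive `thresh=1` but not `thresh=2`.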
In [9]: df.fillna(0)
Out[9]:
          0         1         2
0  0.863556  0.000000  0.000000
1 -0.099558  0.000000  0.000000
2 -0.605804  0.000000  0.000000
3 -0.934688  0.000000 -1.198976
4  0.741383  0.000000  0.229845
5 -1.415495  0.511485 -0.086808
6 -0.748325  0.437964 -2.458319
In [11]: df.fillna({1:0.5,2:-1})
Out[11]:
          0         1         2
0  0.863556  0.500000 -1.000000
1 -0.099558  0.500000 -1.000000
2 -0.605804  0.500000 -1.000000
3 -0.934688  0.500000 -1.198976
4  0.741383  0.500000  0.229845
5 -1.415495  0.511485 -0.086808
6 -0.748325  0.437964 -2.458319
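The two `fillna` calls above differ only in their argument: a scalar fills every missing value with the same number, while a dict maps column labels to per-column fill values. A small sketch with hand-picked numbers (the data and the names `filled_zero` and `filled_per_col` are made up for illustration, not taken from the post):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({0: [1.0, 2.0, 3.0],
                   1: [np.nan, np.nan, 0.5],
                   2: [np.nan, -1.2, 0.2]})

filled_zero = df.fillna(0)                   # same fill value everywhere
filled_per_col = df.fillna({1: 0.5, 2: -1})  # column 1 gets 0.5, column 2 gets -1

print(filled_zero)
print(filled_per_col)
print(int(df.isnull().sum().sum()))  # the original frame is left unchanged
```

Because `fillna` returns a new DataFrame by default, the result has to be assigned if it should be kept.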