NumPy Key Points

2018-11-13 07:33:46 Source: cnblogs (博客园)


  I have recently been working through the book Python for Data Analysis (《利用python进行数据分析》). Below are some key points summarized from Chapter 4.
1. ndarray
  An ndarray is an N-dimensional array object.
  Creating an ndarray:
In [4]: import numpy
In [5]: data = [[1,2,3],[4,5,6]]
In [6]: arr = numpy.array(data, dtype=numpy.int32)
In [7]: arr
Out[7]: array([[1, 2, 3],
               [4, 5, 6]])

  Checking the size of each dimension:

In [9]: arr.shape
Out[9]: (2, 3)

  Checking the array's data type:

In [10]: arr.dtype
Out[10]: dtype('int32')

  Other ways to create arrays:

In [11]: numpy.zeros((3,6))  # create an array of shape (3, 6) filled with zeros
Out[11]:
array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]])

  arange is similar to Python's built-in range, but returns an ndarray:

In [12]: numpy.arange(15)
Out[12]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
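
  As a small sketch going slightly beyond the notes above: arange also accepts start, stop and step like range, and the creation functions take an optional dtype. The variable names here are only illustrative.

```python
import numpy

# arange(start, stop, step) mirrors range(start, stop, step); stop is excluded.
stepped = numpy.arange(2, 11, 3)
print(stepped.tolist())        # [2, 5, 8]

# zeros defaults to float64, which is why the output above shows "0.".
zeros = numpy.zeros((2, 3))
print(zeros.dtype)             # float64

# Pass dtype to get a different element type.
ones = numpy.ones((2, 3), dtype=numpy.int32)
print(ones.tolist())           # [[1, 1, 1], [1, 1, 1]]
```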

  Converting the dtype:

In [15]: farr = arr.astype(numpy.float64)
In [16]: farr.dtype
Out[16]: dtype('float64')

  Note: when floating-point numbers are converted to integers, the fractional part is truncated.
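
  A minimal sketch of that truncation, using made-up sample values. It also shows one possible way to round instead of truncate (numpy.rint followed by astype), which is my own addition and not something the notes cover.

```python
import numpy

# Sample floats with fractional parts, including negative values.
farr = numpy.array([3.7, -1.2, 0.5, -2.9])

# astype returns a new array; the fractional part is dropped
# (truncation toward zero), not rounded.
iarr = farr.astype(numpy.int32)
print(iarr.tolist())       # [3, -1, 0, -2]

# To round to the nearest integer instead, round first, then convert.
# rint rounds halves to the nearest even value, so 0.5 becomes 0.
rounded = numpy.rint(farr).astype(numpy.int32)
print(rounded.tolist())    # [4, -1, 0, -3]
```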

  A slice of an array is a view on the original array, not a copy of the data, so modifying the slice is reflected in the original array:
In [2]: arr = numpy.arange(10)
In [3]: arr_slice = arr[5:8]
In [4]: arr_slice[0] = 123456
In [5]: arr
Out[5]:
array([     0,      1,      2,      3,      4, 123456,      6,      7,      8,      9])

  Note: NumPy behaves this way because, when working with large amounts of data, frequent copying would degrade performance.

  To get a copy of a slice rather than a view, use copy:
In [7]: arr2 = arr[5:8].copy()
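
  The difference between a view and a copy can be checked directly. This is a small sketch; the names view and copied are only illustrative, and numpy.shares_memory is used here as one way to test whether two arrays overlap in memory.

```python
import numpy

arr = numpy.arange(10)

view = arr[5:8]            # basic slice: a view on arr's data
copied = arr[5:8].copy()   # explicit copy: independent data

# shares_memory reports whether two arrays overlap in memory.
print(numpy.shares_memory(arr, view))     # True
print(numpy.shares_memory(arr, copied))   # False

view[:] = 0       # writes through to arr
copied[:] = -1    # leaves arr untouched
print(arr.tolist())   # [0, 1, 2, 3, 4, 0, 0, 0, 8, 9]
```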

  Both scalar values and arrays can be assigned to a selection of an ndarray:

In [13]: data = [[[1,2,3],[4,5,6]],[[4,5,6],[7,8,9]]]
In [14]: arr = numpy.array(data)
In [15]: arr2 = arr[0].copy()
In [16]: arr[0] = 123
In [17]: arr
Out[17]:
array([[[123, 123, 123],
        [123, 123, 123]],
       [[  4,   5,   6],
        [  7,   8,   9]]])
In [18]: arr[0] = arr2
In [19]: arr
Out[19]:
array([[[1, 2, 3],
        [4, 5, 6]],
       [[4, 5, 6],
        [7, 8, 9]]])

  Boolean array indexing can be combined with slicing:

In [1]: arr[name=="liu", :2]  # name: a 1-D array of strings with one entry per row of arr
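
  The line above relies on arrays that the notes do not define, so here is a self-contained sketch with hypothetical data: name holds one label per row of arr, and the boolean mask selects rows while the slice selects columns.

```python
import numpy

# Hypothetical data: one name per row of arr.
name = numpy.array(["liu", "wang", "liu", "zhao"])
arr = numpy.arange(12).reshape((4, 3))

# Comparing an array with a scalar yields a boolean array (the mask).
mask = name == "liu"
print(mask.tolist())             # [True, False, True, False]

# Rows where the mask is True, first two columns only.
selected = arr[mask, :2]
print(selected.tolist())         # [[0, 1], [6, 7]]

# Combine masks with | and &, and negate with ~ (Python's "and"/"or"
# do not work element-wise on arrays).
either = arr[(name == "liu") | (name == "zhao"), 2]
print(either.tolist())           # [2, 8, 11]
```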

  To select a subset of rows in a particular order, simply index with a list or ndarray of integers:

In [9]: arr
Out[9]:
array([[0., 0., 0., 0.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.],
       [5., 5., 5., 5.],
       [6., 6., 6., 6.],
       [7., 7., 7., 7.]])
In [10]: arr[[4,3,0,6]]
Out[10]:
array([[4., 4., 4., 4.],
       [3., 3., 3., 3.],
       [0., 0., 0., 0.],
       [6., 6., 6., 6.]])

  Reshaping a one-dimensional array into a two-dimensional array:

In [11]: arr = numpy.arange(32).reshape((8,4))
In [12]: arr
Out[12]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

  Fancy indexing:

In [13]: arr[numpy.ix_([1,5,7,2],[0,3,1,2])]
Out[13]:
array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [28, 31, 29, 30],
       [ 8, 11,  9, 10]])

  Note: unlike slicing, fancy indexing copies the data into a new array.
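
  A short sketch of why numpy.ix_ is used in the example above, and of the copy behavior. Passing two index lists directly pairs them element by element, whereas numpy.ix_ selects the full block of rows and columns. The names pairs, block and two_step are only illustrative.

```python
import numpy

arr = numpy.arange(32).reshape((8, 4))

# Two index lists are paired element-wise:
# this picks (1, 0), (5, 3), (7, 1) and (2, 2).
pairs = arr[[1, 5, 7, 2], [0, 3, 1, 2]]
print(pairs.tolist())      # [4, 23, 29, 10]

# numpy.ix_ selects every listed row against every listed column.
block = arr[numpy.ix_([1, 5, 7, 2], [0, 3, 1, 2])]
print(block.shape)         # (4, 4)

# The same block can be obtained in two steps: rows first, then columns.
two_step = arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]]
print(numpy.array_equal(block, two_step))   # True

# Fancy indexing returns a copy, so writing to it leaves arr unchanged.
block[0, 0] = -1
print(arr[1, 0])           # 4
```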

  Transposing data (transpose):
In [14]: arr
Out[14]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])
In [15]: arr.T
Out[15]:
array([[ 0,  4,  8, 12, 16, 20, 24, 28],
       [ 1,  5,  9, 13, 17, 21, 25, 29],
       [ 2,  6, 10, 14, 18, 22, 26, 30],
       [ 3,  7, 11, 15, 19, 23, 27, 31]])

  For higher-dimensional arrays, transpose needs a tuple of axis numbers that says how to permute the axes:

In [16]: arr = numpy.arange(16).reshape((2,2,4))
In [17]: arr
Out[17]:
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],
       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]]])
In [18]: arr.transpose((1,0,2))
Out[18]:
array([[[ 0,  1,  2,  3],
        [ 8,  9, 10, 11]],
       [[ 4,  5,  6,  7],
        [12, 13, 14, 15]]])
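
  To make the axis numbers less mysterious, the following sketch checks how elements move under transpose((1, 0, 2)). The comparison with swapaxes and the memory check are my own additions for illustration.

```python
import numpy

arr = numpy.arange(16).reshape((2, 2, 4))
t = arr.transpose((1, 0, 2))   # swap axes 0 and 1, keep axis 2 in place

# Element [i, j, k] of the result is element [j, i, k] of the original.
print(t[0, 1, 2], arr[1, 0, 2])              # 10 10

# The shape is permuted the same way as the axes.
print(arr.transpose((2, 0, 1)).shape)        # (4, 2, 2)

# Swapping just two axes can also be written with swapaxes.
print(numpy.array_equal(t, arr.swapaxes(0, 1)))   # True

# Like slicing, transposing returns a view rather than a copy.
print(numpy.shares_memory(arr, t))           # True
```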

 

2. Data processing using arrays
In [2]: point = numpy.arange(-5,5,0.01)
In [3]: xs, ys = numpy.meshgrid(point, point)
In [4]: ys
Out[4]:
array([[-5.  , -5.  , -5.  , ..., -5.  , -5.  , -5.  ],
       [-4.99, -4.99, -4.99, ..., -4.99, -4.99, -4.99],
       [-4.98, -4.98, -4.98, ..., -4.98, -4.98, -4.98],
       ...,
       [ 4.97,  4.97,  4.97, ...,  4.97,  4.97,  4.97],
       [ 4.98,  4.98,  4.98, ...,  4.98,  4.98,  4.98],
       [ 4.99,  4.99,  4.99, ...,  4.99,  4.99,  4.99]])
 
In [6]: import matplotlib.pyplot as plt
In [7]: z = numpy.sqrt(xs**2+ ys**2)
In [8]: z
Out[8]:
array([[7.07106781, 7.06400028, 7.05693985, ..., 7.04988652, 7.05693985, 7.06400028],
       [7.06400028, 7.05692568, 7.04985815, ..., 7.04279774, 7.04985815, 7.05692568],
       [7.05693985, 7.04985815, 7.04278354, ..., 7.03571603, 7.04278354, 7.04985815],
       ...,
       [7.04988652, 7.04279774, 7.03571603, ..., 7.0286414 , 7.03571603, 7.04279774],
       [7.05693985, 7.04985815, 7.04278354, ..., 7.03571603, 7.04278354, 7.04985815],
       [7.06400028, 7.05692568, 7.04985815, ..., 7.04279774, 7.04985815, 7.05692568]])
In [9]: plt.imshow(z,cmap=plt.cm.gray);plt.colorbar()
[Figure: the resulting grayscale image of z, with a colorbar]
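
  With 1000 points per axis the grids are hard to read, so here is the same computation on a tiny three-point grid (made-up values) to show what meshgrid actually returns. The broadcasting version at the end is an alternative I am adding for comparison, not something from the book chapter summary above.

```python
import numpy

point = numpy.array([0.0, 1.0, 2.0])
xs, ys = numpy.meshgrid(point, point)

# xs repeats the points along each row, ys repeats them down each column.
print(xs.tolist())   # [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
print(ys.tolist())   # [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]

# Evaluate sqrt(x^2 + y^2) at every grid point in one vectorized expression.
z = numpy.sqrt(xs**2 + ys**2)
print(z.shape)       # (3, 3)

# Broadcasting a row against a column gives the same grid of values
# without materializing xs and ys.
z_broadcast = numpy.sqrt(point[numpy.newaxis, :]**2 + point[:, numpy.newaxis]**2)
print(numpy.allclose(z, z_broadcast))   # True
```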

 

3. Expressing conditional logic as array operations
In [9]: xarr = numpy.array([1.1,1.2,1.3,1.4,1.5])
In [10]: yarr=numpy.array([2.1,2.2,2.3,2.4,2.5])
In [11]: cond =numpy.array([True,False,True,True,False])
In [12]: numpy.where(cond,xarr,yarr)
Out[12]: array([1.1, 2.2, 1.3, 1.4, 2.5])

The second and third arguments do not have to be arrays; scalars work as well:

In [9]: numpy.where(arr>0,2,-2)
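
A self-contained sketch of numpy.where with scalar arguments, using a small made-up array. It also shows that a scalar and an array can be mixed, and compares the vectorized call against the equivalent pure-Python list comprehension.

```python
import numpy

arr = numpy.array([[1.5, -0.3], [-2.0, 0.7]])

# Scalars for both branches: 2 where positive, -2 elsewhere.
signs = numpy.where(arr > 0, 2, -2)
print(signs.tolist())      # [[2, -2], [-2, 2]]

# A scalar and an array can be mixed: replace only the positive values.
capped = numpy.where(arr > 0, 2, arr)
print(capped.tolist())     # [[2.0, -0.3], [-2.0, 2.0]]

# numpy.where(cond, x, y) is the vectorized form of "x if c else y".
xarr = numpy.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = numpy.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = numpy.array([True, False, True, True, False])
slow = [float(x) if c else float(y) for x, y, c in zip(xarr, yarr, cond)]
print(slow == numpy.where(cond, xarr, yarr).tolist())   # True
```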

 
