
Day 21: Common Modules, Part 2

2018-10-19 06:18:12  Source: cnblogs (博客园)

Modules covered

pickle
shelve
json
xml
configparser

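The three basic questions

What is serialization?
Serialization converts data from its in-memory format into an intermediate form that can be stored on disk or sent over a network.
Deserialization is the reverse: it converts the intermediate form, read from disk or received over the network, back into the in-memory format.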
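
A minimal sketch of that round trip, using the standard json module purely as an illustration (the variable names here are made up for the example):

```python
import json

# An in-memory Python object
data = {'name': 'alex', 'age': 32}

# Serialize: in-memory object -> intermediate form (here, a str)
text = json.dumps(data)

# Deserialize: intermediate form -> a new in-memory object
restored = json.loads(text)

print(type(text).__name__, text)
# Equal in value, but not the same object
print(restored == data, restored is data)
```

The restored object is equal to the original but is a separate copy, which is what makes it safe to store the text or send it elsewhere.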

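Why do we need serialization?
1. To persist data
Data exists to be used, and losing it whenever the computer loses power is unacceptable, so we have to store it in files. There are two ways to do that: open a file and write to it ourselves, or let a module store the data for us.
2. To exchange data across platforms
When we wrote the ATM program earlier, we persisted information by opening a file with open ourselves, converting our data to strings, and writing them out. When we needed the data again, we parsed it back according to our own format. This is not only tedious but also poor for cross-platform use: whoever receives the data has to be told how it was stored and how to read it, and even then they may not be able to extract the values in their own language, or may need a long time to parse it.
Serialization exists to solve this problem.
The idea is that instead of everyone storing data in an ad hoc way, everyone stores and parses it according to an agreed format. Many serialization formats have appeared as a result. In the end, each one is a common standard for storing data so that others can parse it easily.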

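How do we use serialization?
There are many ways to serialize data. pickle and shelve are Python-specific modules, while xml and json are general-purpose formats that other languages can read as well. The sections below show how to use each of them.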

Module 1: pickle

Method 1: dumps and loads

import pickle
user = {
    'name': 'alex',
    'sex': 'male',
    'age': 32
}

# Serialize: convert the in-memory object to a byte stream and store it in a file
with open('a.kle','wb') as f:
    f.write(pickle.dumps(user))

# Deserialize: read the byte stream from the file and convert it back into an object
with open('a.kle', 'rb') as f:
    res = pickle.loads(f.read())
    print(res)

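Method 2: dump and load

# dump and load wrap the file's write and read calls,
# which makes the module more convenient to use.
# (This reuses the pickle import and the user dict from Method 1.)
# Serialize
with open('a', 'wb') as f:
    pickle.dump(user, f)
# Deserialize
with open('a', 'rb') as f:
    print(pickle.load(f))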
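
To show why pickle is described as Python-specific, here is a small sketch comparing it with json on types that JSON has no equivalent for. The record dict is a made-up example.

```python
import json
import pickle

# A set and a tuple have no direct JSON equivalent
record = {'tags': {'a', 'b'}, 'point': (1, 2)}

# pickle round-trips them unchanged
restored = pickle.loads(pickle.dumps(record))
print(restored == record)
print(type(restored['point']).__name__)

# json refuses to serialize the set
try:
    json.dumps(record)
except TypeError as exc:
    print('json failed:', exc)
```

The trade-off is that only Python can read the result, and pickle data should only be loaded from sources you trust, because unpickling can execute arbitrary code.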

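Module 2: shelve

import shelve
# shelve's main entry point is the open() function.
# The file may or may not exist yet.
# writeback=True caches every entry that is accessed and writes it back on
# close(), so in-place changes to stored mutable objects are saved.
f = shelve.open(r'shelve.txt', writeback=True)
# The shelf can be read and written until it is closed.
f['user'] = {'name': 'hu'}
print(f['user'])
f['user']['sex'] = 'male'
f.close()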
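
The effect of writeback is easiest to see side by side. The sketch below uses a temporary directory and a hypothetical shelf name, demo_shelf, so that it leaves no files behind.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo_shelf')

    # Default (writeback=False): each db['user'] access returns a fresh copy,
    # so an in-place change to that copy is never saved
    with shelve.open(path) as db:
        db['user'] = {'name': 'hu'}
        db['user']['sex'] = 'male'
        without_writeback = db['user']

    # writeback=True: accessed entries are cached and written back on close
    with shelve.open(path, writeback=True) as db:
        db['user']['sex'] = 'male'

    with shelve.open(path) as db:
        with_writeback = db['user']

print(without_writeback)
print(with_writeback)
```

Without writeback, the usual alternative is to read the value, change it, and assign it back to the key.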

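General-purpose modules

Module 3: json

What is JSON?
JSON is a lightweight data-interchange format. Its simple, clear hierarchical structure makes it easy for people to read and for machines to parse, and its compact text keeps the amount of data sent over the network small.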

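'''
How JSON data types map to Python data types

JSON type                     Python type
object {}                     dict
array []                      list
number                        int/float
string (double-quoted "")     str
true/false                    True/False
null                          None


JSON syntax rules:
    The outermost value is usually an object or an array
    Strings must use double quotes
    Values can be nested to any depth
'''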
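
The mapping in the table can be checked directly. This is a small sketch with made-up sample data:

```python
import json

text = '{"name": "hu", "scores": [90, 85.5], "admin": true, "nick": null}'
obj = json.loads(text)
print(obj)

# Going the other way, Python-only details are normalised:
# tuples become arrays and non-string keys become strings
back = json.loads(json.dumps({1: (1, 2)}))
print(back)
```

Because of that normalisation, a Python object that goes through JSON does not always come back identical to the original.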

Method 1: dumps, loads

# Used the same way as pickle: dumps and loads work on strings,
# so they need f.write and f.read to reach a file
import json
user = {
    'name': 'hu',
    'sex': 'male',
    'age': 123
}

# Serialize
with open('a.json', 'wt', encoding='utf-8') as f:
    f.write(json.dumps(user))

# Deserialize
with open('a.json', 'rt', encoding='utf-8') as f:
    print(json.loads(f.read()))

Method 2: dump, load

# dump and load write to and read from the file object directly
with open('a.json', 'wt', encoding='utf-8') as f:
    json.dump(user, f)

with open('a.json', 'rt', encoding='utf-8') as f:
    print(json.load(f))
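
Two optional parameters are often useful when the data contains non-ASCII text or needs to be read by people. A short sketch, with a made-up user dict:

```python
import json

user = {'name': '胡', 'age': 18}

# Default: non-ASCII characters are escaped as \uXXXX sequences
escaped = json.dumps(user)
print(escaped)

# ensure_ascii=False keeps the characters readable; indent pretty-prints
pretty = json.dumps(user, ensure_ascii=False, indent=2)
print(pretty)
```

Both forms load back to the same dict. When writing the readable form to a file, open the file with encoding='utf-8' as in the examples above.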

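Module 4: xml

What is XML?
XML (Extensible Markup Language) defines rules for writing structured text so that programs can parse the data easily. Like JSON, it is used to exchange data between platforms.

XML syntax rules
  1. Every tag must have a closing tag
  2. Tags can be nested
  3. Every attribute must have a value
  4. Every attribute value must be quoted
  5. A single tag can act as both start and end tag (a self-closing tag), e.g. <百度百科词条/>
Example of nested tags:
  <a>
    <b name="hu">
    </b>
  </a>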
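
A parser enforces these rules. The sketch below feeds one well-formed and one broken string (both made up for illustration) to ElementTree:

```python
import xml.etree.ElementTree as ET

good = '<a><b name="hu"></b><c/></a>'
bad = '<a><b name="hu"></a>'      # <b> is never closed

root = ET.fromstring(good)
print([child.tag for child in root])

try:
    ET.fromstring(bad)
except ET.ParseError as exc:
    print('not well-formed:', exc)
```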

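'''
import xml.etree.ElementTree as ET
tree = ET.parse('b.xml')
root = tree.getroot()
Working with the tree:
Step 1: get the root tag
    parse     parse an XML document and return a tree object
    getroot   return the root tag of the parsed document
Step 2: find child tags starting from the root
    iter      search the whole subtree for a tag
    find      find the first matching tag among the direct children
    findall   find all matching tags among the direct children
    getchildren   get the direct children of the current tag (removed in Python 3.9; use list(elem) instead)
Step 3: change the content of a tag that was found:
    text       the tag's text (normally used on leaf nodes)
    attrib     dict of the tag's attributes
    set()      set an attribute
Step 4: delete a tag
    remove     called on the parent to remove one of its child nodes
Step 5: add a tag
    Create the node:
        year2 = ET.Element('year2')
        # set the node's text
        year2.text = '新年'
        # set the node's attributes
        year2.attrib = {'updated': 'yes'}
    Then attach it to its parent with append()
Step 6: once all changes are done, write the result out:
    tree.write()   write the tree to a file
    ET.dump()      print an element as XML text to standard output
'''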

Example:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

XML file contents (saved as b.xml for the code below)

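Method 1: parse the file and get the root tag

import xml.etree.ElementTree as ET
# Parse the XML document into a tree. If this raises an error, the document
# is not well-formed; an XML formatter or validator can help locate the problem.
tree = ET.parse('b.xml')
# Get the root of the tree (that is, the root tag)
root = tree.getroot()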

Method 2: iterate over the direct children

# Walk the first level of the XML document
for child in root:
    print('=====》', child.tag, child.attrib, child.attrib['name'])
    # for i in child:
    #     print(i.tag, i.attrib, i.text)

Output:
=====》 country {'name': 'Liechtenstein'} Liechtenstein
=====》 country {'name': 'Singapore'} Singapore
=====》 country {'name': 'Panama'} Panama

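Method 3: use iter to search the whole document for a tag

# iter searches the whole document for nodes whose tag is year, then we loop over them
for node in root.iter('year'):
    # Print the node's tag name, attributes, and text
    print(node.tag, node.attrib, node.text)

# Output:
# year {} 2008
# year {} 2011
# year {} 2011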
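
The difference between iter and findall matters here, because year is not a direct child of the root. A self-contained sketch on a cut-down, made-up version of the data:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<data>'
    '<country name="A"><year>2008</year></country>'
    '<country name="B"><year>2011</year></country>'
    '</data>'
)

# findall only looks at direct children of root, so no <year> is found
direct = root.findall('year')
# iter walks the whole subtree
nested = [node.text for node in root.iter('year')]
# a path lets find/findall reach deeper levels
by_path = [node.text for node in root.findall('country/year')]
print(direct, nested, by_path)
```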

Method 4: change a node's value and add attributes

for node in root.iter('year'):
    # Add one to the year
    new_year = int(node.text) + 1
    # Assign the new value back to the node
    node.text = str(new_year)
    # Set two attributes on the node: updated='yes' and version='1.0'
    node.set('updated', 'yes')
    node.set('version', '1.0')
# Write the tree back to the file
tree.write('b.xml')

Method 5: delete a node

# Delete a country node when its rank is greater than 50
# Find all country nodes among the direct children
for country in root.findall('country'):
    # Find the rank node under this country and read its value
    rank = int(country.find('rank').text)
    # Remove the country if rank is greater than 50
    if rank > 50:
        root.remove(country)
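
The same removal logic as a self-contained sketch that can be run without b.xml. The data is a shortened stand-in, and the country named Elsewhere is invented for the example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<data>'
    '<country name="Liechtenstein"><rank>2</rank></country>'
    '<country name="Panama"><rank>69</rank></country>'
    '<country name="Elsewhere"><rank>70</rank></country>'
    '</data>'
)

# findall returns a list, so removing from root while looping over it is safe
for country in root.findall('country'):
    if int(country.find('rank').text) > 50:
        root.remove(country)

remaining = [c.get('name') for c in root]
print(remaining)
```

Looping over the list from findall, rather than over root itself, avoids skipping elements while the tree is being modified.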

Method 6: add a node

for country in root.findall('country'):
    for year in country.findall('year'):
        # Add a node when the year is greater than 2000
        if int(year.text) > 2000:
            # Create a node with tag 'year2'
            year2 = ET.Element('year2')
            # Set the node's text
            year2.text = '新年'
            # Set the node's attributes
            year2.attrib = {'updated': 'yes'}
            # Attach the new node to the country
            country.append(year2)
# Then write the tree to the XML file
tree.write('b.xml')

Method 7: create an XML document

# Create the root node with Element
new_xml = ET.Element('namelist')
# Create child nodes with SubElement
name = ET.SubElement(new_xml, 'name', attrib={'enrolled': 'yes'})
age = ET.SubElement(new_xml, "age", attrib={'checked':'no'})
sex = ET.SubElement(new_xml, 'sex')
sex.text = '33'
# Create more child nodes with SubElement
name2 = ET.SubElement(new_xml, 'name2', attrib={'enrolled': 'no'})
age = ET.SubElement(new_xml, 'age', attrib={'enrolled': 'no'})
age.text = '19'
# Build the document object
et = ET.ElementTree(new_xml)
# Write the document object to the file text.xml
et.write('text.xml', encoding='utf-8', xml_declaration=True)
# Print the generated XML to standard output
ET.dump(new_xml)
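
When the XML is needed as a string rather than a file, for example to send over a network, ET.tostring can be used instead of write. A brief sketch with a smaller, made-up tree:

```python
import xml.etree.ElementTree as ET

root = ET.Element('namelist')
name = ET.SubElement(root, 'name', attrib={'enrolled': 'yes'})
name.text = 'hu'

# encoding='unicode' returns a str rather than bytes
xml_text = ET.tostring(root, encoding='unicode')
print(xml_text)

# Parsing the string back gives an equivalent tree
again = ET.fromstring(xml_text)
print(again.find('name').text, again.find('name').attrib)
```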

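Module 5: configparser

'''
A module for parsing configuration files
Configuration file format:
    1. sections
    2. options
    e.g. the layout below has two sections and three options
    (the # notes are annotations for the reader, not part of the file)
        [user_list]  # a name enclosed in [] is a section
            username = hu   # key=value lines are options
            password = 123
        [db]
            file_path = 'C://user'
'''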
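
The layout above can be tried without creating a file by using read_string. In this sketch the annotations and the quotes around the path are left out, because configparser keeps quotes as part of the value and does not strip inline comments by default.

```python
import configparser

sample = """
[user_list]
username = hu
password = 123

[db]
file_path = C://user
"""

config = configparser.ConfigParser()
config.read_string(sample)

print(config.sections())
print(config['user_list']['username'])
# Values are stored as strings; getint converts on request
print(config.getint('user_list', 'password'))
print(config['db']['file_path'])
```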

'''
config = configparser.ConfigParser()
config.read('')
Reading:
1. Look up sections
    config.sections()
2. Look up options
    config.items()
    config.options()
    config.get('section', 'k1')
    config.getint('section', 'k1')
    config.getfloat('section', 'k1')
    config.getboolean('section', 'k1')

Deleting:
1. Delete a section
    config.remove_section()
2. Delete an option
    config.remove_option()

Adding:
1. Add a section
    config.add_section()
2. Add an option
    config.set()

Writing:
    config.write()
'''

Example:

Configuration file contents:

# comment 1
; comment 2

[section1]
k1 = v1
k2:v2
user=egon
age=18
is_admin=true
salary=31

[section2]
k1 = v1

(The file above is saved as conf.cfg)

Reading and modifying the configuration

import configparser

# Do not chain the calls: read() returns the list of files it parsed, not the parser
# config = configparser.ConfigParser().read('conf.cfg', encoding='utf-8')

config = configparser.ConfigParser()
config.read('conf.cfg', encoding='utf-8')

# Read
print(config.sections())
print(config.items('section1'))
print(config.options('section1'))
print(config.get('section1', 'k1'))
print(config.getint('section1', 'age'))
print(config.getfloat('section1', 'age'))
print(config.getboolean('section1', 'is_admin'))

# Modify
# config.remove_section('section2')  # removes the whole section
config.remove_option('section2', 'k1')
print(config.has_section('section2'))
print(config.has_option('section2', 'sje'))
# A section must exist before options can be set on it
if not config.has_section('egon'):
    config.add_section('egon')
config.set('egon','name', 'egon')
config.set('egon', 'age', '18')
# Write the changes back; the with block closes the file
with open('conf.cfg', 'w', encoding='utf-8') as f:
    config.write(f)
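
For experimenting without touching conf.cfg, the sketch below does the same kind of read, modify, and write cycle against a temporary file. The file name demo.cfg and the missing key are made up for the example.

```python
import configparser
import os
import tempfile

config = configparser.ConfigParser()
config.read_string("[section1]\nage = 18\nis_admin = true\n")

# fallback avoids NoOptionError when a key is missing
print(config.get('section1', 'missing', fallback='n/a'))
print(config.getboolean('section1', 'is_admin'))

# Add a section, write to a temporary file, and read it back
config.add_section('egon')
config.set('egon', 'name', 'egon')

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.cfg')
    with open(path, 'w', encoding='utf-8') as f:
        config.write(f)
    reloaded = configparser.ConfigParser()
    reloaded.read(path, encoding='utf-8')

print(reloaded.sections())
```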