A Summary of Ways to Process XML Files in Python

This guide summarizes the common ways to process XML files in Python.

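What Is XML

XML (Extensible Markup Language) is an extensible markup language for describing, storing, and transporting data. An XML document is built from tags (elements), attributes, and values (text content); parsing the document lets a program read that data.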
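As a small illustration of those three parts, the sketch below parses a one-element XML string with the standard library and prints its tag, attributes, and text. The sample string is made up for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up one-element document, used only for illustration.
element = ET.fromstring('<book id="book1">Python Programming</book>')

print(element.tag)     # the tag name
print(element.attrib)  # the attributes, as a dict
print(element.text)    # the text content between the tags
```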

Processing XML in Python

XML Parsing Libraries

Python has several libraries for parsing XML files, including:

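xml.etree.ElementTree (standard library, tree-based API)
lxml (third-party package)
xml.dom (standard library, DOM API)
xml.sax (standard library, event-driven SAX API)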
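The walkthrough in this article covers ElementTree and lxml. For comparison, here is a minimal sketch of the other two standard-library options, xml.dom (through xml.dom.minidom) and xml.sax. The SAMPLE string and the BookIdHandler class are hypothetical names introduced only for this illustration.

```python
import xml.sax
from xml.dom import minidom

# Illustrative sample data, not read from a real file.
SAMPLE = (
    '<catalog>'
    '<book id="book1"><title>Python Programming</title></book>'
    '<book id="book2"><title>Java Programming</title></book>'
    '</catalog>'
)

# DOM style: the whole document is loaded into a tree of node objects.
dom = minidom.parseString(SAMPLE)
dom_ids = [node.getAttribute('id') for node in dom.getElementsByTagName('book')]


class BookIdHandler(xml.sax.ContentHandler):
    """Hypothetical handler that collects the id of every <book> element."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def startElement(self, name, attrs):
        # Called once for each opening tag as the parser streams the input.
        if name == 'book':
            self.ids.append(attrs.get('id'))


# SAX style: the parser streams through the input and calls the handler.
handler = BookIdHandler()
xml.sax.parseString(SAMPLE.encode('utf-8'), handler)
sax_ids = handler.ids

print(dom_ids)
print(sax_ids)
```

Both approaches collect the same ids. The DOM version keeps the full tree in memory, while the SAX version only sees one event at a time, which is why SAX is usually the choice when memory matters more than convenience.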

Parsing an XML File with ElementTree

Import the ElementTree module

import xml.etree.ElementTree as ET

Parse the XML file

tree = ET.parse('example.xml')
root = tree.getroot()

Get every 'book' element that is a direct child of the root

for book in root.findall('./book'):
    print(book.attrib)
    for child in book:
        print(child.tag, child.text)
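Looping over every child prints everything, but often only a few fields are needed. The following is a minimal sketch of pulling specific fields with get() and findtext(), both part of the ElementTree Element API. The inline SAMPLE string mirrors the two-book catalog used as this article's example, and the choice to convert price to float is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Inline copy of a two-book catalog, so the sketch runs without a file.
SAMPLE = """<catalog>
    <book id="book1">
        <author>John Smith</author>
        <title>Python Programming</title>
        <price>29.99</price>
    </book>
    <book id="book2">
        <author>Jane Doe</author>
        <title>Java Programming</title>
        <price>19.99</price>
    </book>
</catalog>"""

root = ET.fromstring(SAMPLE)

books = [
    {
        'id': book.get('id'),                     # attribute lookup
        'title': book.findtext('title'),          # text of the <title> child
        'price': float(book.findtext('price')),   # text converted to a number
    }
    for book in root.findall('book')              # direct children named 'book'
]

print(books)
```

Note that findtext() returns None when the child is missing, so real data with optional fields would need a check before the float() conversion.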

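Parsing an XML File with lxml

lxml is an efficient third-party XML library (installable with pip install lxml) that can parse large XML files quickly.

Import the lxml module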

from lxml import etree

Parse the XML file

tree = etree.parse('example.xml')
root = tree.getroot()

Get every 'book' element anywhere in the document

for book in root.xpath('//book'):
    print(book.attrib)
    for child in book:
        print(child.tag, child.text)
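Because xpath() accepts full XPath 1.0 expressions, filtering can be pushed into the query instead of the Python loop. The sketch below assumes lxml is installed; the inline SAMPLE string mirrors the two-book catalog used as this article's example, and the price threshold of 25 is an arbitrary value chosen for illustration.

```python
from lxml import etree

# Inline copy of a two-book catalog, so the sketch runs without a file.
SAMPLE = """<catalog>
    <book id="book1">
        <author>John Smith</author>
        <title>Python Programming</title>
        <price>29.99</price>
    </book>
    <book id="book2">
        <author>Jane Doe</author>
        <title>Java Programming</title>
        <price>19.99</price>
    </book>
</catalog>"""

root = etree.fromstring(SAMPLE)

# Titles of books whose <price> is below 25 (XPath compares it as a number).
cheap_titles = root.xpath('//book[price < 25]/title/text()')

# The id attribute of every <book>, selected directly with @id.
ids = root.xpath('//book/@id')

print(cheap_titles)
print(ids)
```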

Examples:

Suppose we have a file named example.xml with the following contents:

<?xml version="1.0" encoding="UTF-8"?>
<catalog>
    <book id="book1">
        <author>John Smith</author>
        <title>Python Programming</title>
        <price>29.99</price>
    </book>
    <book id="book2">
        <author>Jane Doe</author>
        <title>Java Programming</title>
        <price>19.99</price>
    </book>
</catalog>

Example 1: Parsing the XML file with ElementTree

import xml.etree.ElementTree as ET

tree = ET.parse('example.xml')
root = tree.getroot()

for book in root.findall('./book'):
    print(book.attrib)
    for child in book:
        print(child.tag, child.text)

Output:

{'id': 'book1'}
author John Smith
title Python Programming
price 29.99
{'id': 'book2'}
author Jane Doe
title Java Programming
price 19.99

Example 2: Parsing the XML file with lxml

from lxml import etree

tree = etree.parse('example.xml')
root = tree.getroot()

for book in root.xpath('//book'):
    print(book.attrib)
    for child in book:
        print(child.tag, child.text)