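To convert every row and column of an HTML table into an array, so that the table's data can be scraped, follow these steps:
1. First, locate the target table by the id or class attribute of its table tag.
2. Use PHP's DOMDocument class to parse the HTML into a DOM tree, then use the DOMXPath class to find each row of the table.
3. Loop over each row, store the content of each cell in an array for that row (an indexed array in the examples here), and append that row array to an outer indexed array.
4. Finally, return (or print) the resulting two-dimensional array.
The example code is as follows:
Example 1: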
<?php
$html = '<table class="table">
<tr>
<td>Name</td>
<td>Age</td>
<td>Gender</td>
</tr>
<tr>
<td>John</td>
<td>25</td>
<td>Male</td>
</tr>
<tr>
<td>Jane</td>
<td>30</td>
<td>Female</td>
</tr>
</table>';
// Parse the HTML string into a DOM tree.
$dom = new DOMDocument();
$dom->loadHTML($html);
// Select every row of the table whose class attribute is "table".
$xpath = new DOMXPath($dom);
$tableRows = $xpath->query('//table[@class="table"]//tr');
$data = [];
foreach ($tableRows as $row) {
    // Collect the text of each cell in this row.
    $rowData = [];
    foreach ($row->getElementsByTagName('td') as $cell) {
        $rowData[] = $cell->nodeValue;
    }
    $data[] = $rowData;
}
print_r($data);
?>
Output:
Array
(
[0] => Array
(
[0] => Name
[1] => Age
[2] => Gender
)
[1] => Array
(
[0] => John
[1] => 25
[2] => Male
)
[2] => Array
(
[0] => Jane
[1] => 30
[2] => Female
)
)
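Because the first row of this table holds the column names, each remaining row can also be keyed by those names to get associative arrays. The following is only a minimal sketch under that assumption; the helper name rowsToAssoc is hypothetical and does not appear in the original example, and it expects the two-dimensional array shown in the output above as input.

```php
<?php
// Hypothetical helper (not part of the original example):
// key every data row by the values of the first (header) row.
function rowsToAssoc(array $rows): array
{
    if ($rows === []) {
        return [];
    }
    // Take the first row as the list of column names.
    $header = array_shift($rows);
    $result = [];
    foreach ($rows as $row) {
        // Skip rows whose cell count does not match the header.
        if (count($row) !== count($header)) {
            continue;
        }
        $result[] = array_combine($header, $row);
    }
    return $result;
}

// Same shape as the output printed above.
$data = [
    ['Name', 'Age', 'Gender'],
    ['John', '25', 'Male'],
    ['Jane', '30', 'Female'],
];
$people = rowsToAssoc($data);
echo $people[0]['Name'], "\n"; // John
```

Rows with a different number of cells than the header are skipped here. That is one possible design choice; padding or raising an error may suit other tables better.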
Example 2:
<?php
$html = '<table id="my-table">
<tr>
<td>Product</td>
<td>Price</td>
</tr>
<tr>
<td>Shoes</td>
<td>$50</td>
</tr>
<tr>
<td>Shirt</td>
<td>$20</td>
</tr>
</table>';
// Parse the HTML string into a DOM tree.
$dom = new DOMDocument();
$dom->loadHTML($html);
// Select every row of the table whose id attribute is "my-table".
$xpath = new DOMXPath($dom);
$tableRows = $xpath->query('//table[@id="my-table"]//tr');
$data = [];
foreach ($tableRows as $row) {
    // Collect the text of each cell in this row.
    $rowData = [];
    foreach ($row->getElementsByTagName('td') as $cell) {
        $rowData[] = $cell->nodeValue;
    }
    $data[] = $rowData;
}
print_r($data);
?>
Output:
Array
(
[0] => Array
(
[0] => Product
[1] => Price
)
[1] => Array
(
[0] => Shoes
[1] => $50
)
[2] => Array
(
[0] => Shirt
[1] => $20
)
)
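As the examples above show, changing the XPath expression selects a different HTML table, which can then be converted into a two-dimensional array in the same way.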
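Since only the XPath expression differs between the two examples, the shared logic could be wrapped in a function that takes the expression as a parameter. This is a sketch rather than a definitive implementation: the function name htmlTableToArray and its parameters are assumptions, and it additionally reads th cells, trims whitespace, and silences libxml parse warnings, none of which the original examples do.

```php
<?php
// Hypothetical helper: the name and parameters are illustrative only.
function htmlTableToArray(string $html, string $rowQuery): array
{
    $dom = new DOMDocument();
    // Collect libxml parse warnings internally instead of
    // emitting PHP warnings for imperfect real-world HTML.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $rows = $xpath->query($rowQuery);
    if ($rows === false) {
        // The XPath expression was malformed.
        return [];
    }

    $data = [];
    foreach ($rows as $row) {
        $rowData = [];
        // Read header (th) and data (td) cells that are direct children of the row.
        foreach ($xpath->query('./th | ./td', $row) as $cell) {
            $rowData[] = trim($cell->textContent);
        }
        $data[] = $rowData;
    }
    return $data;
}

$html = '<table id="my-table"><tr><th>Product</th><th>Price</th></tr>'
      . '<tr><td>Shoes</td><td>$50</td></tr></table>';
$result = htmlTableToArray($html, '//table[@id="my-table"]//tr');
print_r($result);
```

Selecting a table by class instead only requires passing a different expression, for example '//table[@class="table"]//tr'.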