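Python Code Example: Converting XML to TXT
Training a deep learning model usually means organizing a large amount of annotation data and unifying annotations that arrive in different formats. Reading the data from TXT files is the usual habit, but in practice the annotations often come as XML. This example shows how to: 1. read XML annotation data; 2. write it to a TXT file.
The XML annotation data looks like this: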
<annotation verified="no">
  <folder>suE</folder>
  <filename>Drivingrecord_001</filename>
  <path>C:\Desktop\Drivingrecord_001.jpg</path>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>1920</width>
    <height>1080</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>蘇E*****-藍-1-白,灰-大眾-上海大眾-桑塔納-尚納</name>
    <flag>polygon</flag>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <leftTopx>170</leftTopx>
      <leftTopy>704</leftTopy>
      <rightTopx>167</rightTopx>
      <rightTopy>729</rightTopy>
      <rightBottomx>242</rightBottomx>
      <rightBottomy>735</rightBottomy>
      <leftBottomx>243</leftBottomx>
      <leftBottomy>710</leftBottomy>
    </bndbox>
  </object>
  <object>
    <name>蘇E*****-藍-1-黃-雷克薩斯-雷克薩斯(進口)-雷克薩斯RX</name>
    <flag>polygon</flag>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <leftTopx>733</leftTopx>
      <leftTopy>721</leftTopy>
      <rightTopx>733</rightTopx>
      <rightTopy>759</rightTopy>
      <rightBottomx>881</rightBottomx>
      <rightBottomy>760</rightBottomy>
      <leftBottomx>882</leftBottomx>
      <leftBottomy>722</leftBottomy>
    </bndbox>
  </object>
  <object>
    <name>蘇*****-藍-1-黑-寶馬-寶馬(進口)-寶馬7系</name>
    <flag>polygon</flag>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <leftTopx>1274</leftTopx>
      <leftTopy>657</leftTopy>
      <rightTopx>1274</rightTopx>
      <rightTopy>671</rightTopy>
      <rightBottomx>1325</rightBottomx>
      <rightBottomy>670</rightBottomy>
      <leftBottomx>1326</leftBottomx>
      <leftBottomy>656</leftBottomy>
    </bndbox>
  </object>
  <object>
    <name>蘇*****-藍-1-灰-標致-東風標致-標致307</name>
    <flag>polygon</flag>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <leftTopx>1609</leftTopx>
      <leftTopy>658</leftTopy>
      <rightTopx>1611</rightTopx>
      <rightTopy>671</rightTopy>
      <rightBottomx>1659</rightBottomx>
      <rightBottomy>669</rightBottomy>
      <leftBottomx>1657</leftBottomx>
      <leftBottomy>656</leftBottomy>
    </bndbox>
  </object>
</annotation>
From this file we only need the image name (filename) and the coordinates of each object (the x and y values of its four corner points), written on a single line:
Drivingrecord_001.jpg 170 704 167 729 242 735 243 710 733 721 733 759 881 760 882 722 1274 657 1274 671 1325 670 1326 656 1609 658 1611 671 1659 669 1657 656
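To make the target format concrete, a line like the one above can be read back into an image name plus one four-point polygon per object. This is only a minimal sketch: the function name parse_landmark_line is hypothetical (it does not appear in the original article), and it assumes the image name contains no spaces and that every object contributes exactly eight integer values.

```python
def parse_landmark_line(line):
    """Split one landmark line into (image_name, list of 4-point polygons)."""
    parts = line.split()
    image_name = parts[0]
    values = [int(v) for v in parts[1:]]
    if len(values) % 8 != 0:
        raise ValueError("expected 8 coordinates per object")
    polygons = []
    for i in range(0, len(values), 8):
        chunk = values[i:i + 8]
        # Pair up the x and y values: [(x1, y1), (x2, y2), (x3, y3), (x4, y4)]
        polygons.append(list(zip(chunk[0::2], chunk[1::2])))
    return image_name, polygons


# First two objects of the sample line, shortened for readability.
line = ("Drivingrecord_001.jpg 170 704 167 729 242 735 243 710 "
        "733 721 733 759 881 760 882 722")
name, polygons = parse_landmark_line(line)
print(name, len(polygons), polygons[0])
```

The points come out in the same order as the XML tags (leftTop, rightTop, rightBottom, leftBottom), so downstream code can rely on that ordering.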
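We use the xml.dom.* modules. When the Document Object Model (DOM) reads an XML file, it loads the whole file at once and stores all of its data in a tree structure; the DOM functions can then be used to pull out the target data. Here, xml.dom.minidom parses the XML file, and the target data is written to a TXT file.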
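As a small illustration of the DOM calls this approach relies on, the sketch below parses a tiny made-up annotation from a string instead of a file. The XML text and the variable names are assumptions chosen for the example; parseString, documentElement, getElementsByTagName and childNodes are standard xml.dom.minidom features.

```python
import xml.dom.minidom

# A tiny, made-up annotation used only for illustration.
xml_text = """<annotation>
  <filename>demo_001</filename>
  <object><bndbox><leftTopx>170</leftTopx><leftTopy>704</leftTopy></bndbox></object>
  <object><bndbox><leftTopx>733</leftTopx><leftTopy>721</leftTopy></bndbox></object>
</annotation>"""

dom = xml.dom.minidom.parseString(xml_text)  # the whole document is loaded into a tree
root = dom.documentElement                   # the <annotation> element

# getElementsByTagName searches all descendants and returns matches in document order.
filename_node = root.getElementsByTagName("filename")[0]
# The text inside an element is a child Text node; its .data attribute holds the string.
image_name = filename_node.childNodes[0].data + ".jpg"

left_top_xs = [node.childNodes[0].data
               for node in root.getElementsByTagName("leftTopx")]
print(root.tagName, image_name, left_top_xs)
```

Note that the values come back as strings; convert them with int() if numeric coordinates are needed.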
# -*- coding: utf-8 -*-
"""
Created on Fri Mar 2 15:36:44 2018
@author: gg
"""
import xml.dom.minidom
import os

# Raw strings keep the backslashes in Windows paths from being read as escapes.
save_dir = r'D:\plate_train'
if not os.path.exists(save_dir):
    os.mkdir(save_dir)
f = open(os.path.join(save_dir, 'landmark.txt'), 'w')

# Parse the whole XML file into a DOM tree.
DOMTree = xml.dom.minidom.parse(r'D:\plate_train\label\Drivingrecord_001.xml')
annotation = DOMTree.documentElement

# The image name is the <filename> text plus the .jpg extension.
filename = annotation.getElementsByTagName("filename")[0]
imgname = filename.childNodes[0].data + '.jpg'
print(imgname)

objects = annotation.getElementsByTagName("object")
loc = [imgname]  # output line format: file name followed by the coordinates
for obj in objects:  # renamed from `object` to avoid shadowing the built-in
    bbox = obj.getElementsByTagName("bndbox")[0]
    leftTopx = bbox.getElementsByTagName("leftTopx")[0]
    lefttopx = leftTopx.childNodes[0].data
    print(lefttopx)
    leftTopy = bbox.getElementsByTagName("leftTopy")[0]
    lefttopy = leftTopy.childNodes[0].data
    print(lefttopy)
    rightTopx = bbox.getElementsByTagName("rightTopx")[0]
    righttopx = rightTopx.childNodes[0].data
    print(righttopx)
    rightTopy = bbox.getElementsByTagName("rightTopy")[0]
    righttopy = rightTopy.childNodes[0].data
    print(righttopy)
    rightBottomx = bbox.getElementsByTagName("rightBottomx")[0]
    rightbottomx = rightBottomx.childNodes[0].data
    print(rightbottomx)
    rightBottomy = bbox.getElementsByTagName("rightBottomy")[0]
    rightbottomy = rightBottomy.childNodes[0].data
    print(rightbottomy)
    leftBottomx = bbox.getElementsByTagName("leftBottomx")[0]
    leftbottomx = leftBottomx.childNodes[0].data
    print(leftbottomx)
    leftBottomy = bbox.getElementsByTagName("leftBottomy")[0]
    leftbottomy = leftBottomy.childNodes[0].data
    print(leftbottomy)
    # Append this object's eight values to the single output line.
    loc = loc + [lefttopx, lefttopy, righttopx, righttopy,
                 rightbottomx, rightbottomy, leftbottomx, leftbottomy]

# Write all fields separated by spaces, then end the line.
for i in range(len(loc)):
    f.write(str(loc[i]) + ' ')
f.write('\t\n')
f.close()
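The eight near-identical lookups can also be driven by a list of tag names, with the logic wrapped in a function so it can be applied to many annotation files at once. The sketch below is one possible arrangement, not the original author's code: the names CORNER_TAGS, annotation_to_line and convert_directory are hypothetical, and the batch helper assumes every .xml file in the directory follows the annotation layout shown earlier. Unlike the script above, it joins fields with single spaces and writes no trailing space or tab.

```python
import os
import xml.dom.minidom

# Tag names in the order they should appear in the TXT line.
CORNER_TAGS = ["leftTopx", "leftTopy", "rightTopx", "rightTopy",
               "rightBottomx", "rightBottomy", "leftBottomx", "leftBottomy"]


def annotation_to_line(dom):
    """Build 'image_name x1 y1 ... x4 y4 ...' from a parsed annotation document."""
    root = dom.documentElement
    name = root.getElementsByTagName("filename")[0].childNodes[0].data
    fields = [name + ".jpg"]
    for obj in root.getElementsByTagName("object"):
        bbox = obj.getElementsByTagName("bndbox")[0]
        for tag in CORNER_TAGS:
            fields.append(bbox.getElementsByTagName(tag)[0].childNodes[0].data.strip())
    return " ".join(fields)


def convert_directory(xml_dir, txt_path):
    """Write one line per .xml file found in xml_dir (hypothetical batch helper)."""
    with open(txt_path, "w", encoding="utf-8") as f:
        for entry in sorted(os.listdir(xml_dir)):
            if entry.lower().endswith(".xml"):
                dom = xml.dom.minidom.parse(os.path.join(xml_dir, entry))
                f.write(annotation_to_line(dom) + "\n")


# Single-object sample taken from the annotation shown earlier.
sample = """<annotation><filename>Drivingrecord_001</filename>
<object><bndbox>
<leftTopx>170</leftTopx><leftTopy>704</leftTopy>
<rightTopx>167</rightTopx><rightTopy>729</rightTopy>
<rightBottomx>242</rightBottomx><rightBottomy>735</rightBottomy>
<leftBottomx>243</leftBottomx><leftBottomy>710</leftBottomy>
</bndbox></object></annotation>"""
line = annotation_to_line(xml.dom.minidom.parseString(sample))
print(line)
```

Keeping the tag order in one list means a change to the output layout only has to be made in a single place.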

