C/C++: Efficiently Writing Large XML Files with libxml2

Updated: 2017-11-21 11:57:48  Author: infoworld

This article explains how to write large XML files efficiently from C/C++ with libxml2. The sample code walks through the approach in detail and should be a useful reference for study or work.

Preface

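Libxml2 is an XML parser written in C. It was originally developed for the GNOME project and is free, open-source software released under the MIT License. Besides C, bindings are available for C++, PHP, Pascal, Ruby, Tcl and other languages, and it runs on Windows, Linux, Solaris, Mac OS X and other platforms. It is powerful enough to cover the needs of most users.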

Common libxml2 data types

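xmlChar is the character type in libxml2 (a typedef for unsigned char that holds UTF-8 data). Every character and string in the library is based on this type.

xmlChar* is the corresponding pointer type. Many functions return a dynamically allocated xmlChar*, so remember to release it with xmlFree after use, otherwise the memory leaks. For example: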

xmlChar *name = xmlNodeGetContent(CurNode);
if (name != NULL) {
    strcpy(data.name, (const char *)name); // data.name must be large enough
    xmlFree(name);
}
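The strcpy call above relies on data.name being large enough for whatever xmlNodeGetContent returns. A bounded copy removes that assumption. The following is a minimal sketch that uses only the standard library: XmlCharLike is a stand-in for xmlChar (which libxml2 defines as unsigned char), and CopyBounded is a hypothetical helper name, not a libxml2 function.

```cpp
#include <cstddef>
#include <cstring>

// Stand-in for libxml2's xmlChar, which is a typedef for unsigned char.
typedef unsigned char XmlCharLike;

// Hypothetical helper: copies `src` into `dst` (capacity `cap` bytes) and
// always NUL-terminates. Returns true only if the whole string fit.
// Note: truncation is byte-based and may cut a multi-byte UTF-8 sequence.
bool CopyBounded(char* dst, std::size_t cap, const XmlCharLike* src)
{
    if (dst == nullptr || cap == 0)
        return false;
    if (src == nullptr) {          // e.g. the getter returned NULL
        dst[0] = '\0';
        return false;
    }
    const char* s = reinterpret_cast<const char*>(src);
    const std::size_t len = std::strlen(s);
    const std::size_t n = (len < cap) ? len : cap - 1;
    std::memcpy(dst, s, n);
    dst[n] = '\0';
    return len < cap;
}
```

Assuming data.name is a fixed-size char array, the call would look like CopyBounded(data.name, sizeof(data.name), name), followed by xmlFree(name) as before.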

xmlDoc, xmlDocPtr // document structure and its pointer type
xmlNode, xmlNodePtr // node structure and its pointer type
xmlAttr, xmlAttrPtr // node attribute structure and its pointer type
xmlNs, xmlNsPtr // node namespace structure and its pointer type
BAD_CAST // a macro that expands to the cast (xmlChar *)

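Scenario

1. libxml2 is practically the standard C/C++ library for reading and writing XML. Linux and macOS support it out of the box. Windows, unfortunately, has its own proprietary MSXML and does not ship libxml2. Annoyingly, MSXML is not a standard component either and has to be downloaded and installed separately. So the preferred XML library on Windows is the cross-platform libxml2.

2. expat, a SAX-style XML reading library, is also a good choice, but it does not support writing.

3. The usual way to write XML is to build a complete DOM tree and then serialize it to XML text, either with the library's own save functions or with standard IO functions. The drawback is that a very large DOM makes memory usage spike while the tree is being built, spike a second time when the tree is serialized into a memory buffer, and only then is the buffer written to the file.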

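Notes

1. A DOM tree is a very memory-hungry representation. With large data sets, the parent-child links, text values, attribute values and so on all consume memory. If we serialize one element at a time and free each element as soon as it has been written, memory usage stays close to the minimum.

2. Writing elements piece by piece also gives full control over where the output goes, for example through IO functions that are subject to permission restrictions, or to a user interface.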
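The two points above can be sketched without any XML library. This is a minimal illustration under stated assumptions, not the libxml2 code itself: ChunkSink, EscapeXmlText and StreamElements are hypothetical names, and the element layout is simplified to a single child. It shows one element being serialized at a time into a reused buffer and handed to a caller-supplied sink.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical sink type: receives each serialized chunk as soon as it is ready.
using ChunkSink = std::function<void(const char* data, std::size_t size)>;

// Escapes the characters that must not appear verbatim in XML text content.
std::string EscapeXmlText(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;";  break;
        case '>': out += "&gt;";  break;
        default:  out += c;       break;
        }
    }
    return out;
}

// Emits <ROOT>, then `count` <one> elements one at a time, then </ROOT>.
// Only a single serialized element is held in memory at any moment.
void StreamElements(int count, const std::string& content, const ChunkSink& sink)
{
    const std::string root_begin = "<ROOT>";
    sink(root_begin.data(), root_begin.size());

    std::string chunk;  // scratch buffer reused for every element
    for (int i = 0; i < count; ++i) {
        chunk.clear();
        chunk += "<one><node1>";
        chunk += EscapeXmlText(content);
        chunk += "</node1></one>";
        sink(chunk.data(), chunk.size());
    }

    const std::string root_end = "</ROOT>";
    sink(root_end.data(), root_end.size());
}
```

Because the sink is just a callable, the same loop can feed fwrite, a socket, or a UI control; peak memory depends on the largest single element rather than on the whole document.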

Example

The following example uses libxml2 on Windows, with libxml2 built by MinGW. It opens the file with _wfopen, which accepts a Unicode (wide-character) file path.

#include "stdafx.h"
#include <libxml/parser.h>
#include <libxml/tree.h>
#include <cstdio>
#include <cstring>
#include <memory>

void TestStandardIOForXml()
{
    xmlNodePtr one_node = NULL, node = NULL, node1 = NULL; /* node pointers */

    // An empty document: it only supplies the XML declaration and the
    // document context passed to xmlNodeDump.
    xmlDocPtr doc = xmlNewDoc(BAD_CAST "1.0");
    if (!doc)
        return;
    std::shared_ptr<xmlDoc> sp_doc(doc, [](xmlDocPtr d) { xmlFreeDoc(d); });

    FILE* file = _wfopen(L"test.xml", L"wb");
    if (!file)
        return;
    std::shared_ptr<FILE> sp_file(file, [](FILE* f) { fclose(f); });

    // Write the XML declaration
    xmlChar* doc_buf = NULL;
    int size = 0;
    xmlDocDumpMemoryEnc(doc, &doc_buf, &size, "UTF-8");
    std::shared_ptr<xmlChar> sp_xc(doc_buf, [](xmlChar* p) { xmlFree(p); });
    if (!doc_buf || size <= 0)
        return;
    fwrite(doc_buf, size, 1, file);

    // Scratch buffer reused for every element
    xmlBufferPtr buf = xmlBufferCreate();
    if (!buf)
        return;
    std::shared_ptr<xmlBuffer> sp_buf(buf, [](xmlBufferPtr b) { xmlBufferFree(b); });

    const char* kRootBegin = "<ROOT>";
    fwrite(kRootBegin, strlen(kRootBegin), 1, file);
    for (int i = 0; i < 10; ++i) {
        one_node = xmlNewNode(NULL, BAD_CAST "one");
        xmlNewChild(one_node, NULL, BAD_CAST "node1",
                    BAD_CAST "content of node 1");
        xmlNewChild(one_node, NULL, BAD_CAST "node2", NULL);
        node = xmlNewChild(one_node, NULL, BAD_CAST "node3",
                           BAD_CAST "this node has attributes");
        xmlNewProp(node, BAD_CAST "attribute", BAD_CAST "yes");
        xmlNewProp(node, BAD_CAST "foo", BAD_CAST "bar");

        node = xmlNewNode(NULL, BAD_CAST "node4");
        node1 = xmlNewText(BAD_CAST "other way to create content (which is also a node)");
        xmlAddChild(node, node1);
        xmlAddChild(one_node, node);

        // Serialize only this element, write it out, then release it
        xmlNodeDump(buf, doc, one_node, 1, 1);
        fwrite(xmlBufferContent(buf), xmlBufferLength(buf), 1, file);

        xmlUnlinkNode(one_node);
        xmlFreeNode(one_node);
        xmlBufferEmpty(buf);
    }

    const char* kRootEnd = "</ROOT>";
    fwrite(kRootEnd, strlen(kRootEnd), 1, file);
}
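The example calls _wfopen, which exists only on Windows. If the same code also has to build on Linux or macOS, the file-opening step can be isolated in a small wrapper. The following is a sketch under assumptions: OpenUtf8ForWrite is a hypothetical helper name, the path is assumed to be UTF-8 encoded, and on non-Windows systems the file system is assumed to accept UTF-8 byte paths.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <wchar.h>
#include <windows.h>
#include <vector>
#endif

// Hypothetical helper: opens a file for binary writing from a UTF-8 path.
// Returns nullptr on failure; the caller owns the FILE* and must fclose it.
FILE* OpenUtf8ForWrite(const std::string& utf8_path)
{
#ifdef _WIN32
    // Convert UTF-8 to UTF-16 so that _wfopen can handle non-ASCII paths.
    int len = MultiByteToWideChar(CP_UTF8, 0, utf8_path.c_str(), -1, nullptr, 0);
    if (len <= 0)
        return nullptr;
    std::vector<wchar_t> wide(static_cast<std::size_t>(len));
    if (MultiByteToWideChar(CP_UTF8, 0, utf8_path.c_str(), -1, wide.data(), len) <= 0)
        return nullptr;
    return _wfopen(wide.data(), L"wb");
#else
    // Assumption: the file system accepts UTF-8 encoded byte paths.
    return std::fopen(utf8_path.c_str(), "wb");
#endif
}
```

The returned FILE* can be wrapped in a std::shared_ptr with an fclose deleter exactly as in the example, so the rest of the function stays unchanged.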

Output file:

<?xml version="1.0" encoding="UTF-8"?>
<ROOT><one>
 <node1>content of node 1</node1>
 <node2/>
 <node3 attribute="yes" foo="bar">this node has attributes</node3>
 <node4>other way to create content (which is also a node)</node4>
 </one><one>
 <node1>content of node 1</node1>
 <node2/>
 <node3 attribute="yes" foo="bar">this node has attributes</node3>
 <node4>other way to create content (which is also a node)</node4>
 </one><one>
 <node1>content of node 1</node1>
 <node2/>
 <node3 attribute="yes" foo="bar">this node has attributes</node3>
 <node4>other way to create content (which is also a node)</node4>
 </one><one>
 <node1>content of node 1</node1>
 <node2/>
 <node3 attribute="yes" foo="bar">this node has attributes</node3>
 <node4>other way to create content (which is also a node)</node4>
 </one><one>
 <node1>content of node 1</node1>
 <node2/>
 <node3 attribute="yes" foo="bar">this node has attributes</node3>
 <node4>other way to create content (which is also a node)</node4>
 </one><one>
 <node1>content of node 1</node1>
 <node2/>
 <node3 attribute="yes" foo="bar">this node has attributes</node3>
 <node4>other way to create content (which is also a node)</node4>
 </one><one>
 <node1>content of node 1</node1>
 <node2/>
 <node3 attribute="yes" foo="bar">this node has attributes</node3>
 <node4>other way to create content (which is also a node)</node4>
 </one><one>
 <node1>content of node 1</node1>
 <node2/>
 <node3 attribute="yes" foo="bar">this node has attributes</node3>
 <node4>other way to create content (which is also a node)</node4>
 </one><one>
 <node1>content of node 1</node1>
 <node2/>
 <node3 attribute="yes" foo="bar">this node has attributes</node3>
 <node4>other way to create content (which is also a node)</node4>
 </one><one>
 <node1>content of node 1</node1>
 <node2/>
 <node3 attribute="yes" foo="bar">this node has attributes</node3>
 <node4>other way to create content (which is also a node)</node4>
 </one></ROOT>

Summary

That is all for this article. I hope it serves as a useful reference for your study or work. If you have questions, feel free to leave a comment. Thank you for supporting 脚本之家.
