class XML_QNC
{
private $reader = "";
private $section = "";
private $callback = "";
// PHP 4-style constructors (a method named after the class) no longer run in PHP 8.
function __construct($data, $section, $callback="", $type=0)
{
$this->reader = new XMLReader();
$this->section = $section;
$this->callback = $callback;
if($type === 0)
{
// File
$this->reader->open($data);
}
else
{
// String
$this->reader->XML($data);
}
}
// http://us2.php.net/manual/en/ref.domxml.php#71493
// Need to see if we can optimize this function.
function xml2array($domnode)
{
$nodearray = array();
$domnode = $domnode->firstChild;
while(!is_null($domnode))
{
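$currentnode = $domnode->nodeName;
// Reset per node so attributes from an earlier sibling do not leak into this one.
$elementarray = null;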
switch ($domnode->nodeType)
{
case XML_TEXT_NODE:
if(!(trim($domnode->nodeValue) == "")) $nodearray['cdata'] = $domnode->nodeValue;
break;
case XML_ELEMENT_NODE:
if($domnode->hasAttributes())
{
$elementarray = array();
$attributes = $domnode->attributes;
foreach($attributes as $index => $domobj)
{
$elementarray[$domobj->name] = $domobj->value;
}
}
break;
}
if($domnode->hasChildNodes())
{
$nodearray[$currentnode][] = $this->xml2array($domnode);
if(isset($elementarray))
{
$currnodeindex = count($nodearray[$currentnode]) - 1;
$nodearray[$currentnode][$currnodeindex]['@'] = $elementarray;
}
}
else
{
if(isset($elementarray) && $domnode->nodeType != XML_TEXT_NODE)
{
$nodearray[$currentnode]['@'] = $elementarray;
}
}
$domnode = $domnode->nextSibling;
}
return $nodearray;
}
function parseIt()
{
while($this->reader->read())
{
if($this->reader->nodeType == XMLReader::ELEMENT &&
$this->reader->localName == $this->section)
{
do
{
// Expand the rest of the data
$element = $this->reader->expand();
// Call back
$call = $this->callback;
// The callback defaults to "", so only invoke it when it is callable.
if(is_callable($call))
{
$call($this->xml2array($element));
}
}while($this->reader->next($this->section));
break;
}
}
}
}
Refactorings
James Stansfield
June 17, 2008 11:07
I agree, why reinvent something that is a part of PHP's core?
Nick Stinemates
June 30, 2008 03:54
Last question.
Why parse to an array when you can parse to a Model object?
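For what it's worth, here is a minimal sketch of what parsing into a model object instead of an array might look like. The `Book` class, its fields, the `<library>`/`<book>` layout, and the `readBooks()` generator are all hypothetical names made up for illustration; only the XMLReader, DOM, and SimpleXML calls are standard PHP.

```php
<?php
// Hypothetical model class; the fields are assumptions for illustration only.
final class Book
{
    public $id;
    public $title;

    public function __construct(string $id, string $title)
    {
        $this->id = $id;
        $this->title = $title;
    }
}

/**
 * Hypothetical helper: stream <book> elements and yield one Book at a time,
 * so only a single record is expanded into memory at any moment.
 */
function readBooks(string $xml): Generator
{
    $reader = new XMLReader();
    $reader->XML($xml);
    $doc = new DOMDocument();

    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'book') {
            do {
                // Expand only the current element, then hand it to SimpleXML.
                $node = simplexml_import_dom($doc->importNode($reader->expand(), true));
                yield new Book((string) $node['id'], (string) $node->title);
                // next() skips the subtree just handled and stops at the next <book>.
            } while ($reader->next('book'));
            break;
        }
    }
    $reader->close();
}

$xml = '<library><book id="b1"><title>Dune</title></book>'
     . '<book id="b2"><title>Emma</title></book></library>';

foreach (readBooks($xml) as $book) {
    echo $book->id, ': ', $book->title, PHP_EOL;
}
```

Because the generator yields one object per element, the caller can stop early or write each record to a database without ever holding the whole document.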
ellisgl.myopenid.com
July 18, 2008 18:22
@lilJon: Try parsing a 5 GB XML file.
@James: XMLReader is fast and uses little memory. It can also handle big files.
@Nick: This was basically just a quick test.
Anthony Gordon
October 15, 2009 00:39
When you use an object model like SimpleXML, the whole document is allocated in memory, so parsing a 300 MB document takes a heck of a lot of memory. In most cases your server won't allow it and will tell you so. XMLReader is like a CD player: it reads the data and then forgets it. When you have to parse a 300 MB XML file, like the problem I am facing now, you will eventually see why using XMLReader isn't just a wise choice but the only choice.
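To illustrate the "reads it and then forgets it" point, here is a rough sketch that writes a throwaway file and walks it with `read()` alone, never building a tree. The `<records>`/`<record>`/`<value>` layout and the figure of 10000 records are assumptions chosen for the demo, not anything from a real data set.

```php
<?php
// Build a throwaway XML file; the layout and record count are assumptions.
$path = tempnam(sys_get_temp_dir(), 'xml');
$out = fopen($path, 'w');
fwrite($out, '<records>');
for ($i = 0; $i < 10000; $i++) {
    fwrite($out, '<record id="' . $i . '"><value>' . ($i * 2) . '</value></record>');
}
fwrite($out, '</records>');
fclose($out);

// Pull-parse the file node by node instead of loading the whole document.
$reader = new XMLReader();
$reader->open($path);
$count = 0;
$sum = 0;
while ($reader->read()) {
    if ($reader->nodeType !== XMLReader::ELEMENT) {
        continue;
    }
    if ($reader->localName === 'record') {
        $count++;
    } elseif ($reader->localName === 'value') {
        // readString() returns the text content of the current element.
        $sum += (int) $reader->readString();
    }
}
$reader->close();
unlink($path);

printf("%d records, sum %d, peak memory %.1f MB\n",
    $count, $sum, memory_get_peak_usage() / 1048576);
```

The peak memory figure depends on the PHP build, so treat it as something to compare against a `simplexml_load_file()` run on the same file rather than as a fixed number.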
Jason Brumwell
July 25, 2010 02:34
I needed something similar that would still take advantage of XMLReader's memory savings but convert the data to an array. This is what I came up with; it is not complete, but it fits my needs.
<?php
/**
* Usage
*
* $reader = new Nexus_Xml_Reader();
*
* $reader->XML($response);
*
* //Optional go to the section you need;
* while($reader->read())
* {
* if ($reader->localName == 'response') {
* break;
* }
* }
*
* $array = $reader->toArray();
*/
class Nexus_Xml_Reader extends XMLReader {
protected $array = array();
public function toArray()
{
$array = array();
if (self::NONE == $this->nodeType)
{
$this->read();
}
while ($this->nodeType != self::ELEMENT) {
$this->read();
}
do {
$array[$this->localName] = $this->_toArray();
}while($this->read());
unset($array['#text']);
return $array;
}
protected function _toArray(array $array = array())
{
$current = $this->localName;
while (true) {
switch ($this->nodeType) {
case self::SIGNIFICANT_WHITESPACE:
break;
case self::END_ELEMENT:
if ($this->localName == $current) {
break 2;
}
break;
case self::ATTRIBUTE:
break;
case self::COMMENT:
break;
case self::ELEMENT:
$key = $this->localName;
if ($current != $key)
{
$data = $this->_toArray();
if (is_string($data))
{
$data = array($key => $data);
}
if ($this->hasAttributes)
{
$index = 0;
// Attribute indexes run from 0 to attributeCount - 1.
for ($index=0; $index<$this->attributeCount; $index++)
{
$this->moveToAttributeNo($index);
$data['#attributes'][$this->name] = $this->value;
}
$this->moveToElement();
}
if (true == isset($array[$key]))
{
if (false === isset($array[$key][0]))
{
$oldData = $array[ $key ];
unset($array[ $key ]);
$array[ $key ][] = $oldData;
}
$array[ $key ][] = $data;
} else {
$array[ $key ] = $data;
}
}
break;
case self::DOC:
break;
case self::END_ENTITY:
break;
case self::ENTITY:
break;
case self::ENTITY_REF:
break;
case self::LOADDTD:
break;
case self::NONE:
break 2;
case self::NOTATION:
break;
case self::PI:
break;
case self::TEXT:
$array = $this->value;
break;
case self::CDATA:
break;
default:
var_dump($this->nodeType);
}
$this->read();
}
return $array;
}
}
Caden
April 23, 2011 10:15
Touchdown! That's a really cool way of putting it!
XMLReader is fast and uses little memory. Here's something I came up with, but it needs a little refactoring of the array creation method.
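One possible direction for the array creation step, offered only as a sketch: instead of walking the DOM by hand, round-trip each expanded node through SimpleXML and JSON. The `nodeToArray()` name and the sample `<items>` document are hypothetical, and this swaps in a different technique from the `xml2array()` walker, so the resulting array shape is not the same (attributes land under `@attributes` rather than `@`).

```php
<?php
/**
 * Hypothetical replacement for a hand-written DOM-to-array walker:
 * import the node into SimpleXML, then encode and decode it as JSON.
 */
function nodeToArray(DOMNode $node): array
{
    $doc = new DOMDocument();
    $simple = simplexml_import_dom($doc->importNode($node, true));
    return json_decode(json_encode($simple), true);
}

$reader = new XMLReader();
$reader->XML(
    '<items><item id="1"><title>First</title><tag>a</tag><tag>b</tag></item></items>'
);

$array = array();
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'item') {
        // Only the current <item> subtree is expanded and converted.
        $array = nodeToArray($reader->expand());
        break;
    }
}
$reader->close();

print_r($array);
```

The trade-off is that the JSON round trip is lossy for some documents (mixed content, for example), so it is worth checking against real data before dropping the hand-written walker.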