XMLReader is fast and uses little memory. Here's something I came up with, but the array-creation method (xml2array) still needs a little refactoring.

class XML_QNC
 {
  private $reader   = "";
  private $section  = "";
  private $callback = "";
  
  // Use __construct: PHP 8 no longer treats a method named after the class as the constructor.
  function __construct($data, $section, $callback="", $type=0)
   {
    $this->reader   = new XMLReader();
    $this->section  = $section;
    $this->callback = $callback;

    if($type === 0)
     {
      // File
      $this->reader->open($data); 
     }
    else
     {
      // String
      $this->reader->XML($data);
     }
   }

  // http://us2.php.net/manual/en/ref.domxml.php#71493
  // Need to see if we can optimize this function.
  function xml2array($domnode)
   {
    $nodearray = array();
    $domnode   = $domnode->firstChild;

     while(!is_null($domnode))
      {
       $currentnode = $domnode->nodeName;

       // Reset per node so attributes do not leak from a previous sibling.
       unset($elementarray);

       switch ($domnode->nodeType)
        {
         case XML_TEXT_NODE:
          if(!(trim($domnode->nodeValue) == "")) $nodearray['cdata'] = $domnode->nodeValue;
         break;

         case XML_ELEMENT_NODE:
          if($domnode->hasAttributes())
           {
            $elementarray = array();
            $attributes   = $domnode->attributes;

            foreach($attributes as $index => $domobj)
             {
              $elementarray[$domobj->name] = $domobj->value;
             }
           }
         break;
        }

       if($domnode->hasChildNodes())
        {
         $nodearray[$currentnode][] = $this->xml2array($domnode);

         if(isset($elementarray))
          {
           $currnodeindex = count($nodearray[$currentnode]) - 1;
           $nodearray[$currentnode][$currnodeindex]['@'] = $elementarray;
          }
        }
       else
        {
         if(isset($elementarray) && $domnode->nodeType != XML_TEXT_NODE)
          {
           $nodearray[$currentnode]['@'] = $elementarray;
          }
        }
       $domnode = $domnode->nextSibling;
      }
    return $nodearray;
   }

  function parseIt()
   {
    while($this->reader->read())
     {
      if($this->reader->nodeType == XMLReader::ELEMENT &&
         $this->reader->localName == $this->section)
       {
        do
         { 
          // Expand the rest of the data
          $element = $this->reader->expand(); 

          // Call back
          $call = $this->callback;
          $call($this->xml2array($element));
         }while($this->reader->next($this->section)); 
        break; 
       } 
     }
   } 
 }
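
The read/expand/next loop in parseIt() can also be sketched on its own. This is a minimal illustration, not a replacement for the class: stream_sections() is a hypothetical helper name, and handing each section to the callback as a SimpleXMLElement (instead of the xml2array() array) is an assumption made to keep the example short.

```php
<?php
// Hypothetical helper: stream every <$section> element of an XML string and
// pass each one to $callback. Returns the number of sections seen.
function stream_sections(string $xml, string $section, callable $callback): int
{
    $reader = new XMLReader();
    $reader->XML($xml);
    $count = 0;

    // Advance to the first element with the requested name.
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT
            && $reader->localName === $section) {
            do {
                // expand() builds a DOM subtree for the current element only.
                $node = $reader->expand();
                $doc  = new DOMDocument();
                $callback(simplexml_import_dom($doc->importNode($node, true)));
                $count++;
            } while ($reader->next($section)); // skip ahead to the next <$section>
            break;
        }
    }

    $reader->close();
    return $count;
}

// Usage: collect the id attribute of every <item>.
$ids = array();
$seen = stream_sections(
    '<root><item id="1">a</item><item id="2">b</item><other/></root>',
    'item',
    function (SimpleXMLElement $item) use (&$ids) {
        $ids[] = (string) $item['id'];
    }
);
```

Only one section is expanded at a time, so the working set should stay small even when the input is large.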

Refactorings

lilJon

June 14, 2008 17:26

Doesn't SimpleXML do that?

http://ca3.php.net/SimpleXML

James Stansfield

June 17, 2008 11:07

I agree, why reinvent something that is a part of PHP's core?

Nick Stinemates

June 30, 2008 03:54

Last question.

Why parse to an array when you can parse to a Model object?

ellisgl.myopenid.com

July 18, 2008 18:22

@lilJon: Try parsing a 5 GB XML file with SimpleXML.
@James: XMLReader is fast and uses little memory. It can also handle very large files.
@Nick: This was basically just a quick test of something.

Anthony Gordon

October 15, 2009 00:39

When you use something like SimpleXML, which builds an object model of the document, you allocate memory for the whole document. So if you are parsing a 300 MB document, it takes a heck of a lot of memory to parse it. In most cases your server won't allow it and will tell you so. XMLReader is like a CD: it reads the data and then forgets it. When you have to parse a 300 MB XML file, like the problem I am facing now, you will eventually see why using XMLReader isn't just a wise choice, but the only choice.
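
The "reads the data and then forgets it" behaviour can be sketched with a generator. This is only an illustration under stated assumptions: the sample file, the row and id element names, and the read_rows() helper are all made up for the example, and the exact memory figure will vary by PHP and libxml version.

```php
<?php
// Assumed sample data: a temporary file with many small <row> records.
$path = tempnam(sys_get_temp_dir(), 'xml');
$out  = fopen($path, 'w');
fwrite($out, '<rows>');
for ($i = 0; $i < 10000; $i++) {
    fwrite($out, "<row><id>$i</id></row>");
}
fwrite($out, '</rows>');
fclose($out);

// Hypothetical generator: yields one record at a time, so only the current
// <row> is turned into an object.
function read_rows(string $path): Generator
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT
            && $reader->localName === 'row') {
            yield new SimpleXMLElement($reader->readOuterXml());
        }
    }
    $reader->close();
}

$sum = 0;
foreach (read_rows($path) as $row) {
    $sum += (int) $row->id; // each $row can be garbage-collected after this iteration
}
unlink($path);

echo "sum of ids: $sum, peak memory: ", memory_get_peak_usage(), " bytes\n";
```

Loading the same file with simplexml_load_file() would instead keep every row in memory at once, which is where the large-file problem described above comes from.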

Jason Brumwell

July 25, 2010 02:34

I needed something similar that would still take advantage of the memory savings of XMLReader but convert the data to an array. This is what I came up with; it is not complete, but it fits my needs.

<?php

/**
 * Usage
 *
 * $reader = new Nexus_Xml_Reader();
 * 
 * $reader->XML($response);
 * 
 * //Optional go to the section you need;
 * while($reader->read())
 * {
 *     if ($reader->localName == 'response') {
 *         break;
 *     }
 * }
 * 
 * $array = $reader->toArray();
 */
class Nexus_Xml_Reader extends XMLReader {
    protected $array = array();

    public function toArray()
    {
        $array = array();
        
        if (self::NONE == $this->nodeType)
        {
            $this->read();
        }
        
        while ($this->nodeType != self::ELEMENT) {
            $this->read();
        }
        
        do {
           $array[$this->localName] = $this->_toArray();
        }while($this->read());

        unset($array['#text']);

        return $array;
    }

    protected function _toArray(array $array = array())
    {
        $current = $this->localName;

        while (true) {
            switch ($this->nodeType) {
                case self::SIGNIFICANT_WHITESPACE:
                    break;
                case self::END_ELEMENT:                    
                    if ($this->localName == $current) {                        
                        break 2;
                    }
                    break;
                case self::ATTRIBUTE:
                    break;
                case self::COMMENT:
                    break;
                case self::ELEMENT:
                    $key = $this->localName;
                    
                    if ($current != $key)
                    {                        
                        $data = $this->_toArray();
                        
                        if (is_string($data))
                        {
                            $data = array($key => $data);
                        }

                        if ($this->hasAttributes)
                        {
                            // Attribute indexes run from 0 to attributeCount - 1.
                            for ($index = 0; $index < $this->attributeCount; $index++)
                            {
                                $this->moveToAttributeNo($index);
                                $data['#attributes'][$this->name] = $this->value;
                            }

                            $this->moveToElement();
                        }

                        if (true == isset($array[$key]))
                        {
                            if (false === isset($array[$key][0]))
                            {
                                $oldData = $array[ $key ];

                                unset($array[ $key ]);

                                $array[ $key ][] = $oldData;
                            }
                            
                            $array[ $key ][] = $data;
                        } else {
                            $array[ $key ] = $data;
                        }
                    }
                    break;
                case self::DOC:
                    break;
                case self::END_ENTITY:
                    break;
                case self::ENTITY:
                    break;
                case self::ENTITY_REF:
                    break;
                case self::LOADDTD:
                    break;
                case self::NONE:
                    break 2;
                case self::NOTATION:
                    break;
                case self::PI:
                    break;
                case self::TEXT:
                    $array = $this->value;
                    break;
                case self::CDATA:
                    break;
                default:
                    var_dump($this->nodeType);
            }
            $this->read();            
        }

        return $array;
    }    
}
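
As an alternative to indexing attributes by number, the cursor-style calls on XMLReader can be used. This is a minimal sketch, not part of the class above: read_attributes() is a hypothetical helper, and it assumes the reader is positioned on the start element when it is called, so it may be safer to collect attributes before descending into the element's children.

```php
<?php
// Hypothetical helper: collect the attributes of the element the reader is
// currently positioned on, then move the cursor back to that element.
function read_attributes(XMLReader $reader): array
{
    $attributes = array();

    if ($reader->hasAttributes) {
        // From an element, the first call moves to the first attribute.
        while ($reader->moveToNextAttribute()) {
            $attributes[$reader->name] = $reader->value;
        }
        $reader->moveToElement();
    }

    return $attributes;
}

$reader = new XMLReader();
$reader->XML('<item id="7" lang="en">text</item>');
$reader->read(); // position on <item>

$attrs = read_attributes($reader);
$back  = $reader->localName; // the cursor is on the element again
```

Because the loop stops when moveToNextAttribute() returns false, there is no index bound to get wrong.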

Caden

April 23, 2011 10:15

Touchdown! That's a really cool way of putting it!
