Hi everybody!
I'm kinda new here (I've been lurking for a long time, but didn't feel like asking until now), and I have a problem.

In my quest to stop @|#|\ repeating myself at work, I decided to build a simple toolkit that packs up all the things I've done before so I can reuse them whenever needed. Thing is, most of my tools are built around this little booger here, which is an extremely barebones version of CakePHP's Set. What it does is essentially simple:
Collection translates a dot.separated.key into a path through a multidimensional array and writes a given value to, or reads one from, that array.
What it also does is suck at speed:
In the bench below, on my development machine, it does some 570-590 operations per second overall. Since it's the foundation of everything else I built, it sounded like a good idea to make it faster (as much as possible), but my knowledge of this is really limited... Could you give me a hand here?

<?php
//class
	class Collection{
		private $data = array();
		public function data($key,$value=null){
			$keys = explode('.',$key);
			$ptr = &$this->data;
			if(!is_null($value)){
				while(($key=array_shift($keys))!==null){
					if (!isset($ptr[$key])) 
						$ptr[$key] = array();
					$ptr = &$ptr[$key];
				}
	
				$ptr[]=$value;
				return true;
			}
			else{
				while(($key=array_shift($keys))!==null){
					$ptr = &$ptr[$key];
				}
				return array_shift($ptr);
			} 
		}		
	}
?>
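
For illustration, here is a standalone sketch of the nested structure that a dotted key maps to. nestFromDotKey is a made-up helper for this example only and is not part of the class above; it just mirrors the write path, including the fact that values are appended at the leaf.

```php
<?php
// Hypothetical helper (not part of the original class): builds the nested
// array described by a dot-separated key and appends $value at the leaf.
function nestFromDotKey($key, $value) {
	$result = array();
	$ptr = &$result;
	foreach (explode('.', $key) as $part) {
		$ptr[$part] = array();   // create one level per key segment
		$ptr = &$ptr[$part];     // and step into it
	}
	$ptr[] = $value;             // values are appended, as in Collection::data()
	return $result;
}

$nested = nestFromDotKey('user.profile.name', 'Ann');
// $nested is array('user' => array('profile' => array('name' => array('Ann'))))
```
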
<?php
//bench
	error_reporting(E_ALL ^ E_STRICT);
	$set = new Collection();
	$tests=500;
	$counter = 50;	
	$i=$deltas=0;
	$nums=array();
	$random_crap = array();
	
	for($i=0;$i<$tests;$i++){
		$iterator = 0;
		$begin = microtime(true);
		while($iterator < $counter){
			 $set->data('random.numbers.put.in.a.really.deeply.nested.array',(int)mt_rand(0,$counter));
			 $iterator++;
		}
		while($iterator > 0){
			 $nums[]=$set->data('random.numbers.put.in.a.really.deeply.nested.array');
			 $iterator--;
		}	
		$deltas+=(microtime(true)-$begin);		
	}
echo "In total it took $deltas seconds to do $tests tests. That's ".
	($tests/$deltas).
	" tests per second, or a total of ".
	(2*$tests*$counter).
	" writes and reads (50%/50%) at a rate of ".((2*$tests*$counter)/$deltas).
	" per second";	
?>

<pre><?php var_dump($nums)?></pre>

Refactorings

Ben Marini

December 8, 2009, 08:13

Your benchmarking code is benchmarking more than just your Collection class. It's also benchmarking microtime() and, more importantly, mt_rand(). You should isolate your code in the benchmark. With the code below, I was getting roughly 97,951 writes per second on my 2.4 GHz MacBook. I was able to speed that up to roughly 150% of that rate (150,428 writes per sec) by replacing your while loops with foreach loops in the data function.

<?php
  // Benchmarks for writes and reads, separated.
  $times = 10000;
  $set   = new Collection();
  $start = microtime(true);
  for ($i=0; $i < $times; $i++) { 
    $set->data('random.numbers.put.in.a.really.deeply.nested.array', 5); // Write
  }
  $diff = microtime(true) - $start;
  echo "Writes, long key\n";
  echo "$times writes in $diff\n";
  echo $times / $diff . " writes/sec\n\n";

  $times = 10000;
  $set   = new Collection();
  $set->data('random.numbers.put.in.a.really.deeply.nested.array', 5);
  $start = microtime(true);
  for ($i=0; $i < $times; $i++) { 
    $set->data('random.numbers.put.in.a.really.deeply.nested.array'); // Read
  }
  $diff = microtime(true) - $start;
  echo "Reads, long key\n";
  echo "$times reads in $diff\n";
  echo $times / $diff . " reads/sec\n\n";

  // Collection::data() with the while loops replaced by foreach loops
  // (this method goes inside the Collection class)

    public function data($key,$value=null) {
      $keys = explode('.', $key);
      $ptr  = &$this->data;
      if ( !is_null($value) ) {
        foreach ($keys as $key) {
          if ( !isset($ptr[$key]) ) $ptr[$key] = array();
          $ptr = &$ptr[$key];
        }

        $ptr[] = $value;
        return true;
      } else {
        foreach ($keys as $key) {
          $ptr = &$ptr[$key];
        }
        return array_shift($ptr);
      }
    }
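
As a rough way to see how much of the original bench was spent in mt_rand() rather than in Collection, something like the following could be used. It is only a sketch: timeCalls is a made-up helper, closures need PHP 5.3 or later, and the numbers will vary between runs and machines.

```php
<?php
// Hypothetical helper: run $fn $times times and return the elapsed seconds.
function timeCalls($fn, $times) {
	$start = microtime(true);
	for ($i = 0; $i < $times; $i++) {
		$fn();
	}
	return microtime(true) - $start;
}

$times = 10000;
// Baseline: cost of the loop and of calling an empty closure.
$noop = timeCalls(function () {}, $times);
// Same loop, but each call also draws a random number.
$rand = timeCalls(function () { mt_rand(0, 50); }, $times);

echo "Empty calls: $noop sec\n";
echo "mt_rand() calls: $rand sec\n";
echo "Approximate mt_rand() overhead: " . max(0, $rand - $noop) . " sec for $times calls\n";
```

Subtracting the baseline is crude, but it gives an idea of how much time should not be attributed to the class under test.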

blackBear

December 8, 2009, 15:39

Showed your code to my roommate, who in turn elaborated on it; somehow I had missed that I had duplicated loops, and I didn't know those don't get optimized out. The end result:

<?php
class Collection {
	private $data = array();

	public function data($key, $value = null) {
		$keys = explode('.', $key);
		$ptr = &$this->data;
		// Walk down to the target node once, for both writes and reads.
		foreach ($keys as $key) {
			$ptr = &$ptr[$key];
		}
		if (!is_null($value)) {
			$ptr[] = $value;
			return true;
		} else {
			return array_shift($ptr);
		}
	}
}
?>
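
One side effect of the merged loop is worth noting: walking by reference creates every missing level, so reading a key that was never written adds empty entries to the data, and array_shift() then receives null (a warning on older PHP versions, a TypeError on PHP 8). The following is a sketch of one possible guard, not code from the thread; GuardedCollection and raw() are hypothetical names, and the extra isset() check per level costs some speed.

```php
<?php
// Sketch only: GuardedCollection is a hypothetical name, not code from the thread.
class GuardedCollection {
	private $data = array();

	public function data($key, $value = null) {
		$ptr = &$this->data;
		$isWrite = ($value !== null);
		foreach (explode('.', $key) as $part) {
			// On reads, stop at the first missing level instead of creating it.
			if (!$isWrite && (!is_array($ptr) || !isset($ptr[$part]))) {
				return null;
			}
			$ptr = &$ptr[$part];
		}
		if ($isWrite) {
			$ptr[] = $value;
			return true;
		}
		// Same queue-like read as the original: remove and return the first value.
		return is_array($ptr) ? array_shift($ptr) : null;
	}

	// Hypothetical accessor, only here to inspect the internal state.
	public function raw() {
		return $this->data;
	}
}

$c = new GuardedCollection();
$c->data('a.b', 1);
$c->data('a.b', 2);
$first   = $c->data('a.b');   // 1 (the first value written)
$missing = $c->data('x.y');   // null, and no 'x' entry is created
```

Whether the guard is worth its cost depends on how often unknown keys are read; if every key is known to exist, the unguarded version above is the faster choice.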
