Thursday, November 26, 2015

PowerShell: Adding Elements to an Array

If you've been using PowerShell for a while, you may have stumbled across this syntax:
$Array += $Item
Or this:
$Array = $Array + $Item
This is inefficient and slow, and here is why. The += operator, instead of simply adding a new element to the existing array, creates a new array containing the old elements plus the new one. To visualize:
$Array = @('A')
#New array = A
#Sum of array sizes = 1
$Array += 'B'
#Old array = A
#New array = A,B
#Sum of array sizes = 3
$Array += 'C'
#Old array = A,B
#New array = A,B,C
#Sum of array sizes = 5
...
$Array += 'Z'
#Old array = A,B,C,...,Y
#New array = A,B,C,...,Y,Z
#Sum of array sizes = 51
This may seem innocuous, but the cost grows with the array. Say the array has millions of elements. Every time an element is added, a completely new array containing millions of items is allocated and every existing element is copied into it. So while the append runs, memory usage is roughly double the size of the array (plus one element), and the total amount of copying grows much faster than the array itself.
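If you want to see the copy for yourself, here is a minimal sketch: keep a second reference to the array before appending, then check whether both variables still point at the same object. The variable names ($Before, $SameObject) are just mine for illustration.

```powershell
# Start with a real array (not a single string).
$Array = @('A', 'B')

# Keep a second reference to the same underlying array object.
$Before = $Array

# Append with +=. PowerShell builds a new array and reassigns $Array to it.
$Array += 'C'

# The old array is untouched; $Array now refers to a different object.
$SameObject = [object]::ReferenceEquals($Before, $Array)
"Old array count: $($Before.Count)"
"New array count: $($Array.Count)"
"Same object: $SameObject"

# Arrays are fixed-size, which is why += cannot append in place.
"Fixed size: $($Array.IsFixedSize)"
```

The old array still has two elements, the new one has three, and the two variables no longer refer to the same object.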

The more efficient way to build the array is to move the assignment up a level, outside the loop, and let PowerShell collect the loop's output. So instead of:
$Array = @()
foreach ($Item in $Items) {
    $Array += $Item
}
Use this:
$Array = foreach ($Item in $Items) {
    $Item
}
PowerShell collects everything the loop outputs and creates the array once, when the loop finishes. Much quicker, much more efficient. I'm not a mathematician, but I've run my own tests and have seen the difference.
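One caveat with this pattern, sketched below: if the loop outputs exactly one item, PowerShell assigns that item itself rather than a one-element array. Wrapping the loop in the array subexpression operator @( ) forces an array in every case. $Items here is a made-up example list, and $Scalar is just my name for the unwrapped result.

```powershell
# Hypothetical input with a single element.
$Items = @('OnlyOne')

# Without @( ), a single output item is stored as a scalar string.
$Scalar = foreach ($Item in $Items) {
    $Item
}

# With @( ), the result is always an array, even for zero or one item.
$Array = @(foreach ($Item in $Items) {
    $Item
})

"Without @(): $($Scalar.GetType().Name)"
"With @(): $($Array.GetType().Name), Count = $($Array.Count)"
```

If the rest of your script relies on $Array.Count or indexing, the @( ) form is probably the safer habit.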

This took 1.2 seconds:
$Array = 1..50000 | ForEach-Object { "$_" }
This took 143 seconds:
$Array = @()
1..50000 | ForEach-Object { $Array += "$_" }
I didn't have the patience to test higher than 50000. Your mileage may vary, but += is not going to win this race.
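If you'd like to reproduce the comparison on your own machine, something like the following sketch should work. It uses Measure-Command and a smaller, arbitrarily chosen count of 5000 (the $Count, $Fast and $Slow variables are my own) so the slow case finishes quickly. The absolute timings will differ from the ones above depending on hardware and PowerShell version.

```powershell
$Count = 5000

# Fast: let the pipeline collect the output into the array once.
$Fast = Measure-Command {
    $Array = 1..$Count | ForEach-Object { "$_" }
}

# Slow: start from an empty array and append with += on every iteration.
$Slow = Measure-Command {
    $Array = @()
    1..$Count | ForEach-Object { $Array += "$_" }
}

"Pipeline assignment: $($Fast.TotalMilliseconds) ms"
"+= in a loop:        $($Slow.TotalMilliseconds) ms"
```

Try raising $Count in steps and watch how the two numbers pull apart.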
