My JavaScript book is out! Don't miss the opportunity to upgrade your beginner or average dev skills.

Tuesday, July 27, 2010

Array.prototype.slice VS Arrayfication

One of the most common operations performed on daily basis directly or indirectly via frameworks and libraries is Array.prototype.slice calls over non Array elements such HTMLCollection, NodeList, and Arguments.

Why We Perform Such Operation


The Function.prototype.apply works only with object created through the [[Class]] Array or Arguments.
In latter case we may like to avoid ES3 arguments and named arguments mess when dealing with indexes.
Finally, in most of the case we would like to perform Array operations over ArrayLike objects to filer, map, splice, change, modify, etc etc ...

The Slice Cost


Let's perform over an ArrayLike object of length 2000, 2000 slice calls (change the length if you have such powerful machine):

// create the ArrayLike object
for(var arguments = {length:2000}, i = 0; i < 2000; ++i)
arguments[i] = i;
;

// define the bench function
function testSlice(arguments) {
var t = new Date;
for (var slice = Array.prototype.slice, i = 0, length = arguments.length; i < length; ++i)
slice.call(arguments);
;
return new Date - t;
}

// test it
alert(testSlice(arguments));

The average time in my Atom N270 based Netbook is 1750 ms, and numbers are not that unreasonable.
While a list of 2000 generic values may be not that common, the number of slice.call may be definitively more than 2000 during an application life cycle. All this wasted time to simply transform a generic list into an Array? And maybe just to loop and re-loop over it to filter or change values and indexes?

Alternative One: __proto__


If we modify the __proto__ property of whatever object and in almost all browsers (not IE, of course), we can somehow "promote" the current object to Array, except for the [[Class]] that will still be the generic Object, or Arguments.
The undesired side effect is that the modified object won't find anymore in its prototype its original methods, so we can define this technique really greedy.
But what about performances?

// function to test
function testProto(arguments) {
var t = new Date;
for (var proto = Array.prototype, i = 0, length = arguments.length; i < length; ++i)
arguments.__proto__ = proto;
;
return new Date - t;
}

// the result using precedent arguments object
alert(testProto(arguments));

The average elapsed time in this case is 40 ms but we have to remember is that we are not converting the object into an Array instance, we are simply overwriting it's inherited properties and methods with those part of the Array.prototype object.
This means that push, unshift, splice, forEach, or other operations, will be accessible directly through the object (e.g. arguments.forEach( ... ))
If these methods are the reason we would like to slice the generic object, this solution is definitively preferred.

Alternative Two: Arrayfication


To avoid the undesired side effect obtained via __proto__ assignment, the removal of inherited methods, we may consider to use call or apply directly via Array.prototype.
The imminent side effect of this technique is that potentially every function may like to use one of the Array prototype methods against the same object which is always passed by reference.
A simple solution could be the one to attach directly a method to this object, so that every part of the application will be able to use, as example, a forEach call for this object, without accessing every time the Array.prototype.forEach method.

arguments.forEach = Array.prototype.forEach;
// now use forEach wherever we need
// with arguments object

Since every function may like to use one or more Array methods, how about creating a function able to attach all of them in one shot?

Array.fy = (function () {
// (C) WebReflection - Mit Style License
for (var
m = [
"pop", "push", "reverse", "shift", "sort", "splice", "unshift",
"concat", "join", "slice", "indexOf", "lastIndexOf",
"filter", "forEach", "every", "map", "some", "reduce", "reduceRight"
],
i = m.length; i--;
) {
m[i] = "o." + m[i] + "=p." + m[i];
}
m.push("return o");
return Function("p",
"return function Arrayfy(o){" + m.join(";") + "}"
)(Array.prototype);
}());

With a single call we can attach all current available Array.prototype methods once without getting rid of current inherited properties or methods.
Let's see how much does a call cost:

// test function for Array.fy
function testArrayfy(arguments) {
var t = new Date;
for (var fy = Array.fy, i = 0, length = arguments.length; i < length; ++i)
fy(arguments);
;
return new Date - t;
}

// bench
alert(testArrayfy(arguments));

The average elapsed time for this operation is 3 ms.
After a single call we can consider the generic object an Array duck, preserving its inheritance.
The greedy aspect is about possible overwrites, but I have personally never called a method forEach if this is not exactly representing the Array.forEach method.

Arrayfied Operations


The last benchmark we can do is about common Array operations over our Arrayfied object.
The first consideration to do is that slice, as every other Array operation, seems to be extremely optimized for real Array objects.
In few words if we need to transform because we need many calls to forEach, filter, map, slice, etc, the transformation via slice is probably what we are looking for.
But if we need a generic loop over a generic callback and just few times, Array.fy proposal is probably the most indicated one. Here some extra test:

// slice plus a forEach operation
function testSliceEach(arguments) {
var t = new Date;
for (var slice = Array.prototype.slice, fn = function() {}, i = 0, length = arguments.length; i < length; ++i)
slice.call(arguments).forEach(fn);
;
return new Date - t;
}

// just forEach through Arrayfied object
function testArrayfied(arguments) {
var t = new Date;
Array.fy(arguments);
for (var fn = function() {}, i = 0, length = arguments.length; i < length; ++i)
arguments.forEach(fn);
;
return new Date - t;
}

// just slice through Arrayfied object
function testArrayfiedSliced(arguments) {
var t = new Date;
Array.fy(arguments);
for (var fn = function() {}, i = 0, length = arguments.length; i < length; ++i)
arguments.slice();
;
return new Date - t;
}

// bench

alert(testSliceEach(arguments)); // 3200ms
alert(testArrayfied(arguments)); // 2400ms
alert(testArrayfiedSliced(arguments));// 1700ms

The last test is against a classic slice.call and it costs basically the same.
First and seconds demonstrate that if we slice to use native power we are actually spending more time than using native power directly.
Bear in mind that if the variable is already an Array, slice will cost much less and the average against the last test will be 1350 ms.

IE And Conclusions


I keep saying that IE should simply have normalized Array.prototype, ignoring those person that wrongly rely in for in loops over arrays when it's not necessary, and making this Array.fy portable for IE world as well since the internal proto variable is a pointer to the original Array.prototype then automatically ready for natives standard enhancements.
The nice part is that rather than see Array.prototype.something.call in every piece of code we can easily use one fast call to have them all ... so, you decide :-)

.. last minute Example ...


Just because sometimes we lack of fantasy, here a generic usage for Array.fy:

function $(selector, parentNode) {
return Array.fy((parentNode || document).querySelectorAll(selector));
}

$("div").forEach(function (div, i, nodeList) {
div.innerHTML = "Array.fy rocks!";
});


Update
just for testing/performances purpose:

Array.ify = (function () {
// (C) WebReflection - Mit Style License
for (var
m = [
"pop", "push", "reverse", "shift", "sort", "splice", "unshift",
"concat", "join", "slice", "indexOf", "lastIndexOf",
"filter", "forEach", "every", "map", "some", "reduce", "reduceRight"
],
f = [],
i = m.length; i--;
) {
f[i] = "var " + m[i] + "=p." + m[i];
m[i] = "o." + m[i] + "=" + m[i];
}
m.push("return o");
return Function("p", f.join(";") +
";return function Arrayfy(o){" + m.join(";") + "}"
)(Array.prototype);
}());

8 comments:

Ken Snyder said...

Very interesting! Learned a lot. Thanks!

HB said...

Nice, this is a great comparison of solutions to a pretty much universally common problem. It's interesting that there's not really one "best" or "fastest" answer, it's just a matter of what you need at a particular time.

a.in.the.k (@ainthek) said...

I love the Array.fy code.
Just being curious, I have tested several other ways to implement this "without using evil eval".

All of them have been slower just because of method lookup on Array.prototype. (tested on MSIE 7.0,
orig code: 47ms/10000 loops of Array.fy({}),

rewrite without Function was 180ms/10000 loops

Gooood Job, I start to enjoy your code style more and more.

Andrea Giammarchi said...

thanks, btw it may speed up a little bit assigning directly the function, trapped in the scope, rather than the prototype.method

This should ensure best performances ever, I'll update the post ASAP with some result ;-)

Regards

Andrea Giammarchi said...

yep, please let me know how it is the latest piece of code I have put there, please note it's called Array.ify() :)

a.in.the.k (@ainthek) said...

New code reached 1.2 ration on my machine with just lame quick tests:

MSIE 7.0 10000 loops:
.fy: 860ms
ify: 670ms

SAFARI:
.fy: 160ms
ify: 150ms

Unknown said...

Hmm, two little problems with your Array.fy code:

1 - the type of the return object is wrong:

new Function().apply({}, Array.fy(document.getElementsByTagName("*"))); //error, instanceof Array == false

2 - extend is faster
var a = []
function extend(x) {
for (var i in a) x[i] = a[i]
return x
}

On my computer, FF3.6, testArraify() runs in 3ms and testExtend runs in 1

Andrea Giammarchi said...

1. Arraification does NOT want to change the prototype so you still have the original one

2. that loop is completely pointless, of course if faster, you don't do anything there ( for|in never executed )

Native prototypes have a DONT_ENUM attribute by default, put an alert inside the forIn and see that it will never be executed.

Regards