One of the biggest issues with older versions of Select2 was the slow
performance when working with thousands of options in a <select>. This was
primarily because Select2 had to generate a JavaScript object for each element,
as well as a DOM element within the dropdown for each possible option. There was
a long-standing ticket in the Select2 repository for improving the
performance. A few ideas came up over the years, and some improvements
eventually made their way into the code.
For the rewrite of Select2, we decided that fixing the
performance issues would be a major focus. As Select2 4.0 would be using the
<select> element for all cases, we realized that telling everyone to use a
remote data set would no longer be a magical fix, as all of the results were
now generated in a consistent format. This post covers the biggest improvements
that were discovered along the way, as well as the reasons why the affected
code was slow in the first place.
The easiest issue to fix in Select2 involved how it stripped out the diacritics to allow for a fuzzy search to be done on results. This has the benefit of making Select2 much easier to use when working with other languages, but it had the downside of making result matching considerably slower at the same time. The issue was originally pointed out by @ycdtosa after using a profiler, and after a few jsPerf tests we were able to get it to a state where it could be integrated into Select2.
We settled on the badly-named “What?” test from jsPerf, which outperformed the “String builder smarter” test (which is what we previously used) and gave us around an 8x increase in performance.
The old code worked by manually replacing each character that had a diacritic mark with the normalized version. It used a for loop, which would be almost perfect if every letter had a diacritic, but this is almost never the case as diacritic marks tend to be few and far between when working with large data sets.
    function removeDiacritics(str) {
        var newStr = "", letter;

        for (var i = 0, len = str.length; i < len; i++) {
            letter = str[i];
            newStr += diacriticsMap[letter] || letter;
        }

        return newStr;
    }
This code iterates over every letter in the string, even if it probably doesn’t need to be checked. While this doesn’t affect performance that much on smaller data sets, when working with large data sets it adds up very quickly. The code checks to see if the letter is in the diacritics map, using the normalized value if it is, and adds the letter to the string containing the normalized data.
There are clearly a few places that can be optimized, the first being the
fact that it checks every single character in the string. The second place
that we can improve is the for loop, as there are other ways to replace
characters in a string, such as JavaScript’s String.replace method.
The new code works using the replace method provided by JavaScript, in
combination with a regular expression and a callback function. In most cases,
the replace method is used to replace one string with another, but it can also
be used to match regular expressions and replace only the part of the string
that matches them.
    function removeDiacritics(str) {
        return str.replace(/[^\u0000-\u007E]/g, function(a) {
            return diacriticsMap[a] || a;
        });
    }
This uses a regular expression matching all characters not within the basic ASCII character set, allowing us to skip the characters which we know have no diacritics and will never need to be replaced. It also uses a callback function when replacing the characters which will quickly check if the character is in the diacritics map and replace it if needed.
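To make the behaviour concrete, here is a minimal runnable sketch of the same approach. The four-entry diacriticsMap is an illustrative assumption (the real map covers far more characters); everything else mirrors the function above.

```javascript
// Illustrative subset only (an assumption for this sketch);
// the real diacritics map is much larger.
var diacriticsMap = {
  '\u00E9': 'e', // é
  '\u00FC': 'u', // ü
  '\u00F1': 'n', // ñ
  '\u00C5': 'A'  // Å
};

function removeDiacritics(str) {
  // Only characters outside U+0000-U+007E ever reach the callback,
  // so plain ASCII text is skipped by the regular expression engine.
  return str.replace(/[^\u0000-\u007E]/g, function (ch) {
    // Characters that are not in the map are returned unchanged.
    return diacriticsMap[ch] || ch;
  });
}

console.log(removeDiacritics('plain ascii'));             // "plain ascii"
console.log(removeDiacritics('jalape\u00F1o caf\u00E9')); // "jalapeno cafe"
console.log(removeDiacritics('\u4E2D'));                  // unchanged: not in the map
```

Note that this only handles precomposed characters that appear as keys in the map. A letter followed by a separate combining mark would pass through untouched.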
As Select2 can be used with many different types of data, ranging from a
standard <select> to a remote data source, it internally uses a standardized
object representation to determine how the data should be rendered in different
parts of Select2. The internal object stores information such as whether the
option is disabled or selected, the value that should be sent to the server
(the id), and the option text. It also includes information such as whether the
option has children (it’s an <optgroup>), along with the objects for those
children.
As these objects are used just about everywhere, any performance issue in creating them slows down the entire plugin, and we identified a few things that we could improve. Before going over those, it is worth covering how to find performance issues within your own plugins.
We used the JavaScript profiler provided with Google Chrome extensively when we were searching for the slow parts of our code. Since we knew that the performance issues happened when working with large data sets, we knew to look for the parts of the code that were called most often, not just the parts of the code that were slow. Improving a function so that it runs in 10ms instead of 30ms is great, but if it only runs once or twice then it doesn’t help the problem that much. Fixing small performance issues works well if there are many of them and they are easy to identify, but it is quicker if you can fix the larger ones.
This means looking at the waterfall and looking for the slow performing methods, but also knowing how to filter them. If you start looking down the waterfall and you see the internal methods of jQuery are your slow spots, then look around and see if you can either avoid using them or make them work faster.
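When a profiler is not at hand, a rough way to confirm which functions are called most often is to wrap them in a counter. The sketch below is only an illustration of that idea: countCalls and normalize are hypothetical names, not part of Select2 or of any profiler API.

```javascript
// Hypothetical helper: wraps a function and records how often it is called.
function countCalls(fn) {
  function wrapped() {
    wrapped.calls += 1;
    return fn.apply(this, arguments);
  }
  wrapped.calls = 0;
  return wrapped;
}

// Stand-in for a small function that runs once per option.
var normalize = countCalls(function (text) {
  return text.toLowerCase();
});

var options = [];
for (var i = 0; i < 10000; i++) {
  options.push('Option ' + i);
}

options.forEach(function (text) {
  normalize(text);
});

console.log(normalize.calls); // 10000
```

A function that is cheap per call but runs once for every option can easily cost more in total than an expensive function that runs once.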
If you have functions which perform the same calls on an object (in our case, a DOM element), and the result from the function is not dependent on other information, you may have better luck caching the data in a central place instead of wasting time computing the information and getting the same result.
The fastest call is the one that is never made.
In our case, since we constantly needed to get the object form of an <option>
tag, we could cache the object itself on the element and avoid computing it if
we had already done so before. We had the option of either using our own caching
mechanism, such as a global object which mapped elements to objects, or using
a caching mechanism already provided with jQuery. jQuery allows you to
associate data with an element using the $el.data() method, which was perfect
for us as our key is a DOM element.
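For comparison, the first alternative mentioned above (a cache of your own that maps elements to objects) could look roughly like the sketch below. It is not what Select2 does, since Select2 stores the object through jQuery. It also assumes an environment where WeakMap is available, and buildItem and getItem are hypothetical names.

```javascript
// Assumption: WeakMap is available. Entries disappear automatically
// once the element they are keyed on is garbage collected.
var optionCache = new WeakMap();
var conversions = 0;

// Hypothetical converter from an <option>-like object to the internal format.
function buildItem(option) {
  conversions += 1;
  return {
    id: option.value,
    text: option.text,
    disabled: !!option.disabled,
    selected: !!option.selected
  };
}

// Returns the cached object if there is one, computing it only once.
function getItem(option) {
  var cached = optionCache.get(option);
  if (cached) {
    return cached;
  }
  var item = buildItem(option);
  optionCache.set(option, item);
  return item;
}

// A plain object stands in for an <option> element so that the
// sketch also runs outside of a browser.
var fakeOption = { value: '1', text: 'One', disabled: false, selected: true };
var first = getItem(fakeOption);
var second = getItem(fakeOption);

console.log(first === second); // true
console.log(conversions);      // 1
```

The second lookup returns the same object without running the conversion again, which is the whole point of caching on the element.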
Avoid .data() unless you need data-* attributes
jQuery provides a .data method that can be called on jQuery
elements, which will allow you to set and retrieve any type of data, not just
the string data that the HTML5 data-* attributes are limited to. This allows
users of your plugin to specify options using the data-* attributes which you
can then use within your plugin, and it also allows you to set arbitrary data on
an element, in the same way that a key-value store works.
If you don’t need your users to specify options through data-* attributes, and
you are working with data that only you need to handle, there is no
reason[1] to not use the more direct method of setting data on an
element.
jQuery reads the data-* attributes every time .data() is called
If you don’t need information from the data attributes, or you do but you’ve already stored it elsewhere, there is no need to get all of the information from the attributes before getting the information you have stored on the element. There is a large amount of overhead in getting the data attributes from the element, around 10ms per call, which adds up quickly when you need to cache data on a large number of elements.
This includes when you are only setting data on the element, as jQuery has to
get all of the possible data before calling the same $.data methods. This can
be a serious performance hit if you need to set the data on a large number of
objects, and you don’t care about the HTML attributes.
Use $.data for caching information
jQuery provides the $.data method for saving data on a DOM
element, completely independent of the HTML5 data APIs. Internally, it is what
jQuery uses when setting the data on an element and it does not alter the
element itself. Especially when dealing with information that should never be
altered by users of your plugin, using the lower level API allows you to
skip much of the overhead of accessing the attributes.
You can access the DOM element from any jQuery element as the first element in
the array. This means that if your jQuery element is $e, the DOM element is
$e[0].
Polyfilling $.data for jQuery-compatible libraries
Some libraries, such as Zepto.js, do not provide the $.data
function by default as they implement their .data function in a way that
eliminates much of the need. In these cases, you can polyfill the missing
function with:
    $.data = function (el, key, val) {
        var $el = $(el);

        return $el.data(key, val);
    };
If you are working on a public plugin, you may want to avoid modifying the
$ object with a polyfill and instead create your own methods for getting and
setting data on an element.
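One way to do that, sketched here under some assumptions, is a small pair of helpers that use $.data when the library provides it and fall back to the instance-level .data method otherwise. The names createDataUtils, getData and setData are hypothetical, and the library object is passed in as a parameter so that nothing is added to $ itself.

```javascript
// Hypothetical helpers; `$` is jQuery or a jQuery-compatible library.
function createDataUtils($) {
  var hasStaticData = typeof $.data === 'function';

  // Stores a value on the element and returns the stored value.
  function setData(el, key, value) {
    if (hasStaticData) {
      $.data(el, key, value);
    } else {
      $(el).data(key, value);
    }
    return value;
  }

  // Reads a previously stored value from the element.
  function getData(el, key) {
    if (hasStaticData) {
      return $.data(el, key);
    }
    return $(el).data(key);
  }

  return { getData: getData, setData: setData };
}

// In a page that has loaded jQuery, usage might look like:
//   var Utils = createDataUtils(jQuery);
//   Utils.setData($e[0], 'item', { id: 1 });
//   Utils.getData($e[0], 'item');
```

Keeping the helpers in your own namespace means that the plugin behaves the same whether or not another script has patched $.data.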
jQuery is a fantastic tool when you need to work with the DOM, especially when supporting older browsers with a complex DOM layout. Its speed comes up often in discussions about jQuery being slow, and much of the problem comes from making changes to the DOM. In the case of Select2, where most of the time spent interacting with the DOM is when elements are being created, our problems centered around creating new objects.
jQuery is slow compared to the DOM and it’s not likely that jQuery will speed up considerably in the future. Many of the performance issues come from jQuery’s support of older browsers and of edge cases that have come up throughout the years, which is what allows a consistent interface to be used on any browser.
As an example that will be used throughout this section, we are going to need to
append 10,000 <div> tags to a common container. The resulting HTML should be
close to the following:
    <div class="my-wrapper">
        <div class="test">My amazing div #0</div>
        <div class="test">My amazing div #1</div>
        <div class="test">My amazing div #2</div>
        <!-- ... -->
        <div class="test">My amazing div #9997</div>
        <div class="test">My amazing div #9998</div>
        <div class="test">My amazing div #9999</div>
    </div>
One of the most common recommendations when working with large collections of elements and jQuery is to append all of your elements to the DOM at once, instead of appending each element as you create it, which is what the following loop does.
    var $wrapper = $(".my-wrapper");

    for (var i = 0; i < 10000; i++) {
        var $e = $('<div class="test">My amazing div #' + i + '</div>');
        $wrapper.append($e);
    }
You should instead place everything into an array, and then pass the array to
.append so it all happens at once. This will allow the browser to trigger a
single repaint for the change, and you won’t have to worry about the .append
call being as slow.
    var $wrapper = $(".my-wrapper");
    var $_elements = [];

    for (var i = 0; i < 10000; i++) {
        var $e = $('<div class="test">My amazing div #' + i + '</div>');
        $_elements.push($e);
    }

    $wrapper.append($_elements);
When working with the DOM, jQuery allows you to pass a DOM node or a string
containing HTML into the $ object. In our example, we are passing a string
that builds a <div> element with the class and content that we are looking
for. This forces jQuery to parse the string and verify that it is HTML, and then
build the DOM that it represents. This is incredibly slow, especially when you
are only building a single DOM node, and it ended up being the largest slowdown
when working with jQuery. Even when you are creating a completely bare DOM node,
jQuery still has to parse it and build the DOM node on its own.
document.createElement
We can create the <div> outside of jQuery using document.createElement, which
is supported in every browser, and then have jQuery wrap it, which will allow
us to skip jQuery’s string parsing but still get the benefits of using jQuery.
    var $wrapper = $(".my-wrapper");
    var $_elements = [];

    for (var i = 0; i < 10000; i++) {
        var e = document.createElement('div');
        var $e = $(e);

        $e.text('My amazing div #' + i);
        $e.addClass('test');

        $_elements.push($e);
    }

    $wrapper.append($_elements);
This alone will improve the speed, as jQuery knows how to handle raw DOM elements better (and faster) than it knows how to handle HTML strings.
jQuery allows you to interact with the DOM in a cross-browser way, which will
save you a lot of time when working with existing DOM nodes. When you are
working with your own DOM nodes though, you can often skip jQuery entirely. We
are adding the test class to the newly created <div> element, which is the
same as setting the .className property on the DOM node, a property
available in all browsers for setting classes on an element.
    var $wrapper = $(".my-wrapper");
    var $_elements = [];

    for (var i = 0; i < 10000; i++) {
        var e = document.createElement('div');

        e.className = 'test';

        var $e = $(e);

        $e.text('My amazing div #' + i);

        $_elements.push($e);
    }

    $wrapper.append($_elements);
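Taking the same idea one step further, the text can also be set directly on the node, so that jQuery is only involved in the final append. The sketch below assumes that the textContent property is available (it is missing from very old browsers such as IE 8). buildDivs is a hypothetical helper, and the document is passed in as a parameter so that the function can also be exercised outside of a browser.

```javascript
// Hypothetical helper: builds `count` <div> nodes without touching jQuery.
// `doc` is expected to be the page's `document` object.
function buildDivs(doc, count) {
  var elements = [];

  for (var i = 0; i < count; i++) {
    var e = doc.createElement('div');

    e.className = 'test';
    // Assumption: textContent is supported by the target browsers.
    e.textContent = 'My amazing div #' + i;

    elements.push(e);
  }

  return elements;
}

// In a browser with jQuery loaded, the nodes can still be appended at once:
//   $(".my-wrapper").append(buildDivs(document, 10000));
```

The single .append call still receives an array, so the advice about appending everything at once continues to apply.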
It is important to mention that the .className property contains all classes
set on the element, separated by whitespace. In IE 10+, the .classList
property exists which has .add and .remove methods that more closely match the
.addClass and .removeClass methods provided by jQuery.
jQuery internally uses the .className property with some regular expressions
to add and remove classes for backwards compatibility with browsers which do not
support the .classList property. While .className on its own is slightly
faster than .classList, any speed advantage is lost once you have to check
that a class is not already present before adding it.
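The extra bookkeeping is easy to see when the class handling is written out by hand. The following is a simplified sketch that works on the className string, not jQuery’s actual implementation, and addClassName and removeClassName are hypothetical names.

```javascript
// Returns a new className string with `name` added, without duplicating it.
function addClassName(current, name) {
  var classes = current.split(/\s+/).filter(Boolean);

  if (classes.indexOf(name) === -1) {
    classes.push(name);
  }

  return classes.join(' ');
}

// Returns a new className string with every occurrence of `name` removed.
function removeClassName(current, name) {
  return current.split(/\s+/).filter(function (cls) {
    return cls && cls !== name;
  }).join(' ');
}

console.log(addClassName('test', 'active'));         // "test active"
console.log(addClassName('test active', 'test'));    // "test active"
console.log(removeClassName('test active', 'test')); // "active"
```

On an element you would use it as e.className = addClassName(e.className, 'active'). When you create the node yourself and know that it has no classes yet, a plain assignment to .className avoids all of this work.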