In debug mode Visual C++ STL iterators can provide
useful information at run-time via failed assertions, but there’s a performance
cost and in certain scenarios they can seem to grind your program to a halt.
This post shows why that happens, and how to fix it. Consider the following code:
vector<int>
myInts;
myInts.push_bach(1);
cout <<
myInts[4] << endl;
It will compile O.K., and when done so in release mode the
resultant program will happily reach into memory beyond the myInts vector and send whatever
garbage it finds there to cout. However,
in debug mode stepping over the third line produces the popup “Debug Assertion Failed… vector subscript
out of range”. That may not be a big deal in the above case but if the index were calculated in a loop dependent on a number of variables that information could be very useful.
For another example consider what’s wrong with now adding
the following lines:
auto it =
myInts.begin();
myInts.push_back(2);
cout << *it
<< endl;
Chances are this code doesn't do what you want it to because
pushing the 2 onto the vector probably
requires it to grow, which requires it to move, which leaves the iterator it pointing at unallocated memory.
Again, in release mode cout will
get whatever it is now pointing
at, but in debug mode we get “vector iterator
not dereferencable”. While dereferencable
may not be a real word you’ll soon figure out what it means and it will
probably save some debugging time.
However, these extra checks come at a cost. Pretty much any call
to an iterator involves a lot of run-time checks. For example dereferencing an
iterator involves checking that the parent container isn’t null, checking the
iterator’s internal pointer isn’t null, and checking that the iterator doesn’t
point to memory before the parent vector’s first iterator, or after its last
iterator. The parent container also has extra concerns when debug iterators are
being used. All iterators that point into one container form a singly linked
list. This allows the parent container to communicate with those iterators, but
walking the list when it needs to is slow.
Example of slow iterators in debug mode
As a simple example, consider a list of 10,000 numbers from
1-100. I wish to find the number of pairs within that list that sum to 100.
Here’s some code to achieve that:
auto noCombosSumTo100 = 0;
for (auto it1 = begin(myNums); it1 != end(myNums); it1++) {
for (auto it2 = it1; it2 != end(myNums); it2++) {
if (*it1 + *it2 == 100) {
noCombosSumTo100++;
}
}
}
On my machine this takes about 34 milli-seconds fully optimised in release mode, and about 75 seconds in debug. That's about 2200x difference between debug and release!
Obviously this difference can't be entirely attributed to the iterators, there are lots of other opportunities for the compiler to optimise the above code in release mode. However, if I implement the same algorithm using vector indexing, like this:
for (auto index1 = 0; index1 < numTerms; index1++) {
for (auto index2 = index1; index2 < numTerms; index2++) {
if (mynums[index1] + mynums[index2] == 100) {
noCombosSumTo100++;
}
}
}
it runs in about 4.5 seconds, so there's clearly a big cost to using debug iterators.
Ways to improve iterator performance
The improvements I'm about to show for the first code snippet could apply to any code, but it's worth understanding that any inefficiencies are exacerbated when debug iterators are involved. To improve the first code snippet while still getting the benefit of debug iterators I can:
These three modifications bring the execution time down to 21 seconds. Interestingly none of them make any difference in the second snippet where built in types are involved. Again, it's worth understanding that these improvements would apply to any user defined type (in debug mode), but it probably wouldn't be just as noticeable as when iterators are involved.
If you want a further speedup then you'll need to sacrifice some additional run-time checks by setting _ITERATOR_DEBUG_LEVEL=1 or 0 (This macro supersedes and combines the functionality of the older _SECURE_SCL and _HAS_ITERATOR_DEBUGGING macros). Doing so decreases the amount of debug help you get, but it brings the original example's execution time down to about 8.5 seconds.
At 8.5 seconds it's still about twice as slow as indexing the vectors, and this is because even with the reduced checking there are more indirections (extra function calls) associated with dereferencing a debug iterator.
In summary...
Remember, the example code used here takes 34 milli-seconds in release mode. The example using vector indexing also takes about 34 milli-seconds in release, and the same algorithm indexing a raw array of ints takes the same time to execute, about 34 milli-seconds. This indicates that the code is being optimised to similar assembly in all three cases.
So, to summarise, STL iterators provide useful debug information for development, they allow you to access STL containers as if you are using a pointer, and they create production code that is as fast as using raw pointers. However, there is a cost in debug mode as they can slow your code down a lot. But, if these costs become too much of a hindrance there are ways to reduce the effect without changing how you are using the iterators.