When to use ToArray()

February 11th, 2008 by Eddie Sullivan

Some thoughts on when to return a List<> object and when to return an array in C#.

The analogy: strings

Many times in C#, a function that needs to build a string from constituent parts will use a StringBuilder internally, and then call ToString() on the StringBuilder object on the last line of the function. This is because strings in C# are immutable. That is, once they are created, they can never be changed. Code that looks like it is changing a string in place is actually allocating a new string to hold the new value. For a function that builds a string incrementally by repeatedly appending to it, this means a fresh allocation and copy on every append, which slows things down.
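To make the immutability point concrete, here is a minimal sketch. The class and method names are invented for illustration. It checks that "appending" to a string leaves the original object untouched and hands back a new one:

```csharp
using System;

public static class ImmutabilityDemo
{
    // Returns true if "appending" produced a different string object
    // and left the original value unchanged.
    public static bool AppendCreatesNewString()
    {
        string original = "Initiating";
        string alias = original;      // second reference to the same object

        original += " countdown";     // allocates a new string; nothing is edited in place

        return alias == "Initiating"
            && !object.ReferenceEquals(original, alias);
    }

    public static void Main()
    {
        Console.WriteLine(AppendCreatesNewString());  // prints True
    }
}
```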

Here's an example of how it's normally done:

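    // Requires "using System.Text;" for StringBuilder.
    public string GimmeString(int num)
    {
        // Accumulate the pieces in a mutable buffer.
        StringBuilder sb = new StringBuilder("Initiating countdown: ");

        for (int i = num; i > 0; i--)
        {
            sb.AppendFormat("{0}...", i);
        }
        sb.Append("Blastoff!\n");

        // Convert to an immutable string only once, at the end.
        return sb.ToString();
    }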
    

This is a pretty obvious design pattern. It's rare you would want to return the StringBuilder object itself, so almost always it is converted to a string before returning. The problem gets trickier, however, when we try to use a similar design pattern for arrays.

Arrays

Arrays in C# are not immutable, at least not in the way strings are. It could be said they are "partially immutable" (my own made-up term). You can swap out elements in the array as much as you like, but the array's length is fixed. Therefore, if you need to build up an array from constituent parts, it makes sense to use a List<> object for doing the work.
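A small sketch of that difference, using made-up class and method names: an array lets you overwrite a slot but never changes length, while a List<> grows as you add to it.

```csharp
using System;
using System.Collections.Generic;

public static class FixedLengthDemo
{
    // Elements of an array can be replaced, but its Length never changes.
    public static int[] SwapFirstElement()
    {
        int[] numbers = new int[] { 1, 2, 3 };
        numbers[0] = 42;              // fine: overwrite an existing slot
        // An int[] variable has no Add method; to "grow" it you would
        // have to allocate a new, larger array and copy the elements over.
        return numbers;
    }

    // A List<int> grows as needed while it is being built.
    public static List<int> BuildList()
    {
        List<int> numbers = new List<int>();
        numbers.Add(1);
        numbers.Add(2);
        numbers.Add(3);
        numbers.Add(4);               // the list resizes itself internally
        return numbers;
    }

    public static void Main()
    {
        Console.WriteLine(SwapFirstElement().Length);  // prints 3
        Console.WriteLine(BuildList().Count);          // prints 4
    }
}
```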

The analogy then becomes, in SAT terms:

StringBuilder : string :: List<> : array

The question is whether or not to call ToArray() on the List<> before returning it. Here the answer is not as obvious. Let's examine the pros and cons of each approach:

Reasons to call ToArray()

  • If the returned value is not meant to have elements added or removed, returning it as an array makes that fact a bit clearer.
  • If the caller is expected to perform many non-sequential (random) accesses to the data, indexing an array can be somewhat faster than indexing a List<>.
  • If you know the caller will need to pass the returned value to a third-party function that expects an array.
  • If callers need to work with .NET Framework 1.0 or 1.1. These versions don't have the List<> type (or any generic types, for that matter).

Reasons not to call ToArray()

  • If the caller ever does need to add or remove elements, a List<> is needed; given an array, the caller would first have to copy it into a list.
  • The performance benefits are not guaranteed, especially if the caller is accessing the data in a sequential fashion. There is also the additional step of converting from List<> to array, which copies every element and so costs both time and memory.
  • The caller can always convert the list to an array themselves.
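Converting in either direction is a one-liner for the caller, though each conversion copies the data. A minimal sketch, with hypothetical helper names that are not part of any library:

```csharp
using System;
using System.Collections.Generic;

public static class ConversionDemo
{
    // List<T>.ToArray() copies the elements into a new array.
    public static int[] ListToArray(List<int> list)
    {
        return list.ToArray();
    }

    // The List<T>(IEnumerable<T>) constructor copies an array into a new list.
    public static List<int> ArrayToList(int[] array)
    {
        return new List<int>(array);
    }

    public static void Main()
    {
        List<int> list = new List<int>(new int[] { 1, 2, 3 });

        int[] copy = ListToArray(list);
        list.Add(4);                          // does not affect the array copy
        Console.WriteLine(copy.Length);       // prints 3

        List<int> grown = ArrayToList(copy);
        grown.Add(99);                        // the caller can now add elements
        Console.WriteLine(grown.Count);       // prints 4
    }
}
```

Because both calls produce independent copies, later changes to the list are not reflected in the array, and vice versa.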

ToArray() or not ToArray()? That is the question.

Based on these points, it seems to make the most sense as a general rule to simply return the List<> object directly, rather than converting it to an array before returning. Let me know if you disagree.

Here's an example:

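    // A contrived example. Similar to Python's "range" function, but only
    // supports positive step.
    // Requires "using System;" and "using System.Collections.Generic;".
    public List<int> GimmeInts(int start, int end, int step)
    {
        if (step <= 0)
        {
            // Guard against a zero or negative step, which the loop
            // below does not handle.
            throw new ArgumentOutOfRangeException("step", "step must be positive");
        }

        List<int> ret = new List<int>();

        for (int i = start; i < end; i += step)
        {
            ret.Add(i);
        }
        // Here you could have (with the return type changed to int[]):
        // return ret.ToArray();
        return ret;
    }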
    
