Thursday, October 21, 2010

C# Split String Improvement

A common mistake when parsing text in the C# programming language is to call Split twice, once per delimiter, when a single call would do. This article shows the mistake, how to fix it, and benchmark timings for both versions.

=== C# string Split performance improvements ===

One Split call used: 2340 ms [faster]
Two Split calls nested: 4150 ms

Improving Split code

Here we look at how to use one Split call instead of two nested Split calls to parse text. In this example, each input line holds several key-value pairs: a colon separates a key from its value, and a comma separates one pair from the next. Because Split accepts an array of delimiters, we can split on both characters in a single call.

=== Program that Splits once (C#) ===

using System;

class Program
{
    static void Main()
    {
        string[] arr = new string[]
        {
            "something:1,something:2,more:3,bloviate:5,alpaca:65,spaniels:3",
            "elementary:4,string:4,miserable:6,reprimands:3,eats:6,trustworthy:5"
        };

        char[] del = new char[]
        {
            ':',
            ','
        };

        foreach (string line in arr)
        {
            //
            // Split on multiple delimiters
            //
            string[] tokens = line.Split(del,
                StringSplitOptions.RemoveEmptyEntries);

            for (int a = 0; a < tokens.Length; a += 2)
            {
                string s1 = tokens[a];
                string s2 = tokens[a + 1];
                Console.WriteLine("{0},{1}",
                    s1,
                    s2);
            }
        }
    }
}

=== Output of the program ===

something,1
something,2
more,3
bloviate,5
alpaca,65
spaniels,3
elementary,4
string,4
miserable,6
reprimands,3
eats,6
trustworthy,5

Description. This version reduces each line of keys and values to a single flat array of tokens. Keys land at even indexes and values at odd indexes, so the for loop uses the "a += 2" expression to advance one pair per iteration. In the body of the loop, tokens[a] is the key and tokens[a + 1] is its value.
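The pairing logic described above assumes every key has a value. As a minimal sketch of a more defensive variant, the helper below collects the pairs into a list and stops before a trailing key that has no value. The names PairParser and ParsePairs are hypothetical and are not part of the original article.

=== Sketch: collecting pairs with a bounds guard (C#) ===

```csharp
using System;
using System.Collections.Generic;

static class PairParser
{
    // Hypothetical helper (not from the original article): splits a line
    // once on ':' and ',' and returns the tokens as key/value pairs.
    public static List<KeyValuePair<string, string>> ParsePairs(string line)
    {
        char[] del = new char[] { ':', ',' };
        string[] tokens = line.Split(del,
            StringSplitOptions.RemoveEmptyEntries);

        var pairs = new List<KeyValuePair<string, string>>();

        // The condition "a + 1 < tokens.Length" skips a trailing key with
        // no value instead of throwing IndexOutOfRangeException.
        for (int a = 0; a + 1 < tokens.Length; a += 2)
        {
            pairs.Add(new KeyValuePair<string, string>(
                tokens[a], tokens[a + 1]));
        }
        return pairs;
    }
}

class Program
{
    static void Main()
    {
        // The last key, "alpaca", has no value and is dropped.
        foreach (var pair in PairParser.ParsePairs("more:3,bloviate:5,alpaca"))
        {
            Console.WriteLine("{0},{1}", pair.Key, pair.Value);
        }
    }
}
```

One caveat with this approach: because RemoveEmptyEntries discards empty tokens, a pair with an empty value in the middle of a line (for example "a:,b:1") shifts the pairing of everything after it. If the input may contain empty values, the flat even/odd scheme is not safe.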

Inefficient version

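Next, the inefficient version. Here we split each line on commas and then split each resulting string again on the colon. This produces the same output, but it is slower because every pair requires its own Split call and array allocation, and the nested loops may be somewhat harder to manage.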

=== Program that uses Split twice (C#) ===

using System;

class Program
{
    static void Main()
    {
        string[] arr = new string[]
        {
            // [omitted]
        };

        foreach (string line in arr)
        {
            string[] pairs = line.Split(',');
            foreach (string pair in pairs)
            {
                string[] parts = pair.Split(':');
                string s1 = parts[0];
                string s2 = parts[1];
                Console.WriteLine("{0},{1}",
                    s1,
                    s2);
            }
        }
    }
}
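
The article quotes timings for the two versions but does not show how they were measured. The following is a rough sketch of one way to compare them with System.Diagnostics.Stopwatch. The class name SplitBenchmark, the helper methods, and the iteration count are assumptions made for illustration, so the numbers it prints will not match the figures quoted at the top.

=== Sketch: timing both versions (C#) ===

```csharp
using System;
using System.Diagnostics;

static class SplitBenchmark
{
    static readonly char[] Delimiters = new char[] { ':', ',' };

    // One Split call on both delimiters; returns the number of pairs.
    public static int SplitOnce(string line)
    {
        string[] tokens = line.Split(Delimiters,
            StringSplitOptions.RemoveEmptyEntries);
        int count = 0;
        for (int a = 0; a + 1 < tokens.Length; a += 2)
        {
            count++;
        }
        return count;
    }

    // Two nested Split calls; returns the number of pairs.
    public static int SplitTwice(string line)
    {
        int count = 0;
        foreach (string pair in line.Split(','))
        {
            string[] parts = pair.Split(':');
            if (parts.Length == 2)
            {
                count++;
            }
        }
        return count;
    }

    // Runs the parser repeatedly and returns elapsed milliseconds.
    public static long Time(Func<string, int> parse, string line, int iterations)
    {
        parse(line); // warm-up call so JIT compilation is not timed
        int total = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            total += parse(line);
        }
        sw.Stop();
        if (total < 0) Console.WriteLine(total); // keeps the result in use
        return sw.ElapsedMilliseconds;
    }
}

class Program
{
    static void Main()
    {
        string line = "something:1,something:2,more:3,bloviate:5,alpaca:65,spaniels:3";
        const int iterations = 1000000; // assumed; not stated in the article

        Console.WriteLine("One Split call used:    {0} ms",
            SplitBenchmark.Time(SplitBenchmark.SplitOnce, line, iterations));
        Console.WriteLine("Two Split calls nested: {0} ms",
            SplitBenchmark.Time(SplitBenchmark.SplitTwice, line, iterations));
    }
}
```

Results depend on the runtime, build configuration, and hardware, so run it in a Release build and treat the output as a relative comparison only.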

Credits: http://dotnetperls.com/split-improvement
