DaedTech

Stories about Software

Splitting Strings With Substrings

The String.Split() method in C# is probably something with which any C# developer is familiar.

string x = "Erik.Dietrich";
var tokens = x.Split('.');

Here, “tokens” will be an array of strings containing “Erik” and “Dietrich”. It’s not exactly earth shattering to tokenize a string in this fashion. And some incarnation or another of this predates .NET, C# and probably even my time on this planet.

It’s Actually Harder Than You’d Think to Split Strings Using Sub-Strings

But what about if we want to split over a string instead?

What about if we have “..” as a delimiter instead of ‘.’ and I want to split “Erik..Dietrich” in the same way? Probably an overload of String.Split() that takes a string instead of a char, right? Well, actually no. As it turns out, the API for string.Split() is pretty unintuitive.

First of all, that call to x.Split('.') is not actually invoking Split(char), but rather Split(params char[]). (Notwithstanding the fact that this isn’t advertised on the MSDN page unless you drill into the individual method.)

So, calling x.Split('.') and x.Split('.', '&', '%', '^') are equally valid, syntax-wise, in the case of “Erik.Dietrich” (and in this case, both will give me back my first and last name).
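
To make the params behavior concrete, here is a minimal sketch. The helper SplitOnAny and the class name are hypothetical, made up for illustration. It assumes the .NET Framework overloads described above; newer runtimes also offer an overload that takes a single char, but the tokens that come back are the same.

```csharp
using System;

public static class ParamsSplitDemo
{
    // Hypothetical helper (not a framework method): shows how a params
    // parameter is declared. It must be the last parameter in the list.
    public static string[] SplitOnAny(string input, params char[] separators)
    {
        return input.Split(separators);
    }

    public static void Main()
    {
        string x = "Erik.Dietrich";

        // One separator, written without creating an array by hand.
        string[] single = x.Split('.');

        // Several separators passed with the same params syntax.
        string[] several = x.Split('.', '&', '%', '^');

        // Passing the array explicitly is equivalent to the call above.
        string[] explicitArray = x.Split(new[] { '.', '&', '%', '^' });

        Console.WriteLine(string.Join("|", single));        // Erik|Dietrich
        Console.WriteLine(string.Join("|", several));       // Erik|Dietrich
        Console.WriteLine(string.Join("|", explicitArray)); // Erik|Dietrich
    }
}
```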

So, what one might expect is that there would be an overload Split(params string[]) to allow the same behavior as splitting over zero or more characters. Nope. Instead you have Split(string[] separator, StringSplitOptions options).

What’s Really Not Great about the Default Way to Split Strings with Sub-Strings

Two things suck about this.

  1. I have to specify some enum that I don’t care about in the first place and that has only two options, one of which is “none”. I mean, really? You can’t just assume “none” and let users specify a different case if they want with another overload?
  2. But what sucks even more about this is that params have to be the last argument in the parameter list, so that option is out the window. You no longer get that snazzy params syntax that the char version has, and now you have to actually awkwardly create a string array.

Here is the new syntax following the old. Note that the new syntax is pretty hideous.

string x = "Erik.Dietrich";
var tokens = x.Split('.');

string y = "Erik..Dietrich";
var newTokens = y.Split(new string[] { ".." }, StringSplitOptions.None);
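
One way to hide that awkward call from client code is a small extension method. What follows is only a sketch under stated assumptions: SplitOn and the class names are hypothetical, not part of the framework, and the method simply assumes StringSplitOptions.None.

```csharp
using System;

public static class StringSplitExtensions
{
    // Hypothetical extension method (not a framework API): accepts string
    // delimiters with params syntax and assumes StringSplitOptions.None.
    public static string[] SplitOn(this string input, params string[] delimiters)
    {
        return input.Split(delimiters, StringSplitOptions.None);
    }
}

public static class SplitOnDemo
{
    public static void Main()
    {
        string y = "Erik..Dietrich";

        // Reads like the char version: no array, no enum.
        string[] tokens = y.SplitOn("..");
        Console.WriteLine(string.Join("|", tokens)); // Erik|Dietrich

        // Multiple string delimiters work the same way.
        string[] more = "a..b--c".SplitOn("..", "--");
        Console.WriteLine(string.Join("|", more)); // a|b|c
    }
}
```

The name SplitOn rather than Split is a deliberate choice here, so the extension never competes with the built-in overloads during overload resolution.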

This Gets a Lot Easier and Prettier using Regex.Split

I was getting ready to write something to hide this mess from myself as a client, when I stumbled across a better alternative than rolling my own extension method or string splitting class: Regex.Split(). Here’s how it works:

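// The second argument is a regex pattern, so the dots must be escaped;
// a bare ".." would match any two characters.
string x = "Erik..Dietrich";
var tokens = Regex.Split(x, Regex.Escape(".."));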

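No fuss, no muss, and much closer to what String.Split() should do. One caveat: the second argument is a regular expression pattern rather than a literal string, so a delimiter containing metacharacters such as '.' has to be escaped (with Regex.Escape() or by hand); otherwise ".." would match any two characters. Granted, the arguments to Regex.Split() are both single strings (so if you want to specify multiple delimiters, you’ll have to cook up a regex recipe) and it’s a static method, but it has the advantage of already existing in the framework and being a much, much cleaner API than x.Split().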
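
Because the delimiter is treated as a pattern, it is worth seeing what escaping changes, and what a "regex recipe" for several delimiters might look like. This is a sketch, not a definitive implementation: SplitOnLiterals and the class name are hypothetical, and the approach (escape each delimiter, join with "|") is just one reasonable way to build the pattern.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexSplitDemo
{
    // Hypothetical helper: escapes each literal delimiter and joins them
    // into a regex alternation, then splits on that pattern.
    public static string[] SplitOnLiterals(string input, params string[] delimiters)
    {
        string pattern = string.Join("|", delimiters.Select(d => Regex.Escape(d)));
        return Regex.Split(input, pattern);
    }

    public static void Main()
    {
        string x = "Erik..Dietrich";

        // Unescaped, "." means "any character", so ".." consumes the whole
        // string two characters at a time and only empty strings are left.
        string[] unescaped = Regex.Split(x, "..");
        Console.WriteLine(unescaped.All(s => s.Length == 0)); // True

        // Escaped, the pattern matches two literal dots.
        string[] escaped = Regex.Split(x, Regex.Escape(".."));
        Console.WriteLine(string.Join("|", escaped)); // Erik|Dietrich

        // Several literal delimiters through the hypothetical helper.
        string[] multi = SplitOnLiterals("Erik..Dietrich&&DaedTech", "..", "&&");
        Console.WriteLine(string.Join("|", multi)); // Erik|Dietrich|DaedTech
    }
}
```

If one delimiter is a prefix of another, list the longer one first, since regex alternation tries the alternatives from left to right.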

Use in good health!

5 Comments
Jim Wang
11 years ago

Ha, I miss coding. Oh wait no I don’t. 🙂

Erik Dietrich
11 years ago
Reply to  Jim Wang

What you really miss is staying up all night trying to get C code to work using Pico in a telnet session before an 8 AM deadline. Man, those were the days. By the way, love the bargaineering site. It’s like a more practical and much more frequent version of Money magazine.

Jim Nowak
11 years ago

I love C#. I really do. But when I found out it would not take, very elegantly, multiple delimiters, which I use a lot… WHAT?! Even .js works better than this! This was the nice elegant solution I was looking for. Thank you!

Erik Dietrich
11 years ago
Reply to  Jim Nowak

My thoughts were the same as yours before I found this solution, so I definitely empathize. Glad if it helped!

ling maaki
10 years ago

C# String operations http://csharp.net-informations.com/string/csharp_string_tutorial.htm covering most of the string class methods