The String.Split function uses a separator to divide a string into an array of strings. Unfortunately, it does not support text qualifiers. As a result, if the separator appears inside a text-qualified block of characters, that block gets split as well.
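To make the problem concrete, here is a minimal sketch. The sample input and the `NaiveSplit` helper name are my own illustration, not part of the original code. The single quote acts as the text qualifier, yet the built-in split still cuts the quoted block at the embedded comma:

```vb
Imports System

Module SplitProblemDemo
    ' Hypothetical helper: splits with the built-in String.Split only.
    Public Function NaiveSplit(ByVal value As String) As String()
        Return value.Split(","c)
    End Function

    Sub Main()
        ' The quoted block 'Test,2b' is broken into two pieces,
        ' so four values are printed instead of three.
        For Each part As String In NaiveSplit("Test1,'Test,2b',Test3")
            Console.WriteLine(part)
        Next
    End Sub
End Module
```

The output contains the fragments `'Test` and `2b'`, which is exactly the behavior FullSplit is meant to avoid.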
In this article we will create a new extension called FullSplit that implements the same basic functionality as String.Split, with added support for text qualifiers and assignment operators. When an assignment operator is used, the return value is a StringDictionary in which the left side of the operator becomes the DictionaryEntry.Key property and the right side becomes the DictionaryEntry.Value property.
Finally, we will conclude the article by updating the code from a previous post, Extending IEnumerable, so that it supports dictionary entries by separating each key/value pair with an operator.
This article is the second edition of an article I previously wrote and posted on The Code Project titled “Split Function that Supports Text Qualifiers.” In this version I am merely refactoring the code into a format that is cleaner and easier to support. At the time of the original article extension methods didn’t exist, but a quick comparison of this article and the previous one demonstrates how much easier they can make life!
To give credit where credit is due, I would like to thank Abishek Bellamkonda once again for helping with the regular expression used in the original article and today’s post.
Creating the Interface
Our interface needs to support the following capabilities:
- Split values into a string array, even when a text block contains the separator.
- Split values that contain a key/value pair into a StringDictionary. The key cannot contain a separator.
- When splitting values that contain a text qualifier, remove the qualifier so that the raw text block can easily be accessed.
- Compress an array or list into a single text value optionally adding text qualifiers to the beginning and end of each value.
- Compress a Dictionary into a single text value where the key/value pair is preserved by separating the pair with an operator character and optionally adding the text qualifier around the value.
- Compress a collection of objects in the same manner as a Dictionary by specifying a key property name and value property name.
In order to achieve these objectives we will need to create a new module with the following routines:
    Public Module ParseExtensions

        <Extension()> _
        Public Function FullSplit(ByVal value As String, ByVal separator As Char, ByVal qualifier As Char) As String()
            Throw New NotImplementedException("FullSplit")
        End Function

        <Extension()> _
        Public Function FullSplit(ByVal value As String, ByVal separator As Char, ByVal qualifier As Char, ByVal assignmentOperator As Char) As StringDictionary
            Throw New NotImplementedException("FullSplit")
        End Function

        <Extension()> _
        Public Function Collapse(ByVal value As IEnumerable, ByVal separator As Char, ByVal qualifier As Char, ByVal assignmentOperator As Char, ByVal keyPropertyName As String, ByVal valuePropertyName As String) As String
            Throw New NotImplementedException("Collapse")
        End Function

    End Module
To make function calls easier on the consumer there will be a few helper functions:
    Public Module ParseExtensions

        <Extension()> _
        Public Function FullSplit(ByVal value As String, ByVal separator As Char) As String()
            Return FullSplit(value, separator, Nothing)
        End Function

        <Extension()> _
        Public Function Collapse(ByVal value As IEnumerable, ByVal separator As Char) As String
            Return Collapse(value, separator, Nothing, Nothing, Nothing, Nothing)
        End Function

        <Extension()> _
        Public Function Collapse(ByVal value As IEnumerable, ByVal separator As Char, ByVal propertyName As String) As String
            Return Collapse(value, separator, Nothing, Nothing, Nothing, propertyName)
        End Function

        <Extension()> _
        Public Function Collapse(ByVal value As IEnumerable, ByVal separator As Char, ByVal qualifier As Char) As String
            Return Collapse(value, separator, qualifier, Nothing, Nothing, Nothing)
        End Function

        <Extension()> _
        Public Function Collapse(ByVal value As IEnumerable, ByVal separator As Char, ByVal qualifier As Char, ByVal propertyName As String) As String
            Return Collapse(value, separator, qualifier, Nothing, Nothing, propertyName)
        End Function

        <Extension()> _
        Public Function Collapse(ByVal value As IEnumerable, ByVal separator As Char, ByVal qualifier As Char, ByVal assignmentOperator As Char) As String
            Return Collapse(value, separator, qualifier, assignmentOperator, Nothing, Nothing)
        End Function

    End Module
Why Support an Assignment Operator?
In case you are wondering why the split function supports assignment operators, consider command line arguments. The Environment.GetCommandLineArgs function returns an array of values in which the space acts as the separator. The Environment.CommandLine property returns a single string value.
There is no built-in function that returns command line arguments as key/value pairs. With our new FullSplit function we can build a dictionary of key/value pairs from the command line. When an argument has no assignment operator, we simply store the value as its own key. You can then test for on/off switches with the StringDictionary.ContainsKey function and read the value for a given key through the StringDictionary.Item property, which makes command line argument parsing much easier!
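The idea can be sketched without FullSplit at all. The snippet below is a simplified stand-in of my own, not the article's implementation: `ToDictionary` and the sample arguments `/verbose` and `/out=log.txt` are hypothetical, and it works on an already-split argument array such as the one Environment.GetCommandLineArgs returns.

```vb
Imports System
Imports System.Collections.Specialized

Module CommandLineSketch
    ' Hypothetical helper: maps "key=value" arguments to dictionary entries.
    ' An argument without the operator is stored as both key and value.
    Public Function ToDictionary(ByVal args As String(), ByVal assignmentOperator As Char) As StringDictionary
        Dim results As New StringDictionary
        For Each arg As String In args
            Dim index As Integer = arg.IndexOf(assignmentOperator)
            If index > 0 Then
                results(arg.Substring(0, index)) = arg.Substring(index + 1)
            Else
                results(arg) = arg
            End If
        Next
        Return results
    End Function

    Sub Main()
        ' Sample arguments are made up for illustration.
        Dim options As StringDictionary = ToDictionary(New String() {"/verbose", "/out=log.txt"}, "="c)
        Console.WriteLine(options.ContainsKey("/verbose"))  ' on/off switch test
        Console.WriteLine(options("/out"))                  ' value lookup by key
    End Sub
End Module
```

In a real program you would pass Environment.GetCommandLineArgs() instead of the hard-coded array; the first element of that array is the executable name, so you may want to skip it.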
The FullSplit Function
Before we begin to get into the code remember to define the import statements for the necessary namespaces:
    Imports System.Collections               'DictionaryEntry located here
    Imports System.Collections.Specialized   'StringDictionary located here
    Imports System.Text.RegularExpressions   'Regex located here
    Imports System.Reflection                'PropertyInfo located here
    Imports System.Runtime.CompilerServices  'Extension attribute located here
Our custom FullSplit function is actually very simple and straightforward. We begin by building the regular expression from the separator and qualifier. Then the expression is applied to the value. Finally, we remove the qualifier from the beginning and end of each value in the results.
Since there will be two separate routines for FullSplit, one returning the string array and the other returning a StringDictionary, I’ve moved the logic to remove the text qualifier into a separate routine. If needed these helper routines can be re-scoped as public and turned into extensions as well.
    <Extension()> _
    Public Function FullSplit(ByVal value As String, ByVal separator As Char, ByVal qualifier As Char) As String()
        'Split on a separator only when an even number of qualifiers follows it,
        'which means the separator sits outside any qualified block.
        Dim regExPattern As String = String.Format("{0}(?=(?:[^{1}]*{1}[^{1}]*{1})*(?![^{1}]*{1}))", _
                                                   Regex.Escape(separator), Regex.Escape(qualifier))
        Dim results As String() = Regex.Split(value, regExPattern, _
                                              RegexOptions.Compiled Or RegexOptions.Multiline Or RegexOptions.IgnoreCase)
        Return RemoveQualifier(results, qualifier)
    End Function

    Private Function RemoveQualifier(ByVal value As IEnumerable, ByVal qualifier As Char) As String()
        Return RemoveQualifier(value, qualifier, Nothing)
    End Function

    Private Function RemoveQualifier(ByVal value As IEnumerable, ByVal qualifier As Char, ByVal propertyName As String) As String()
        Dim results As New ArrayList
        Dim instanceValue As Object

        For Each item As Object In value
            'Read the named property when one is given; otherwise use the item itself.
            If propertyName IsNot Nothing AndAlso propertyName.Trim.Length > 0 Then
                Dim propertyItem As PropertyInfo = item.GetType.GetProperty(propertyName)
                instanceValue = propertyItem.GetValue(item, Nothing)
            Else
                instanceValue = item.ToString
            End If

            If instanceValue Is Nothing Then
                results.Add(instanceValue)
            Else
                results.Add(RemoveQualifier(instanceValue.ToString, qualifier))
            End If
        Next

        Return CType(results.ToArray(GetType(String)), String())
    End Function

    Private Function RemoveQualifier(ByVal value As String, ByVal qualifier As Char) As String
        'Nothing to strip without a qualifier. The length check keeps a lone
        'qualifier character from producing a negative Substring length.
        If qualifier = Nothing OrElse value.Length < 2 Then Return value
        If value(0) = qualifier AndAlso value(value.Length - 1) = qualifier Then
            Return value.Substring(1, value.Length - 2)
        End If
        Return value
    End Function
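To see what the lookahead in that pattern does, it helps to run it in isolation. The following is a minimal sketch, assuming a comma separator and a single-quote qualifier; the `BuildPattern` and `QualifiedSplit` names are hypothetical helpers of mine and leave the qualifiers in place. The lookahead `(?=(?:[^']*'[^']*')*(?![^']*'))` only lets a separator match when the rest of the string contains an even number of qualifiers, which is true only outside a qualified block.

```vb
Imports System
Imports System.Text.RegularExpressions

Module QualifiedSplitDemo
    ' Hypothetical helper: builds the same lookahead pattern FullSplit uses.
    Public Function BuildPattern(ByVal separator As Char, ByVal qualifier As Char) As String
        Return String.Format("{0}(?=(?:[^{1}]*{1}[^{1}]*{1})*(?![^{1}]*{1}))", _
                             Regex.Escape(separator.ToString()), _
                             Regex.Escape(qualifier.ToString()))
    End Function

    ' Hypothetical helper: splits without removing the qualifiers.
    Public Function QualifiedSplit(ByVal value As String, ByVal separator As Char, ByVal qualifier As Char) As String()
        Return Regex.Split(value, BuildPattern(separator, qualifier))
    End Function

    Sub Main()
        ' The comma inside 'Test,2b' is followed by one qualifier (odd count),
        ' so the lookahead rejects it and the block stays whole.
        For Each part As String In QualifiedSplit("Test1,'Test2a','Test,2b',Test3", ","c, "'"c)
            Console.WriteLine(part)
        Next
    End Sub
End Module
```

This prints four values, with `'Test,2b'` intact. One consequence of counting qualifiers is that an unbalanced qualifier in the input changes which separators match, so the input is assumed to be well formed.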
In the next code snippet we handle converting a string array of key/value pairs into a StringDictionary. We’ll reuse the previous FullSplit function to get a string array of key/value pairs, then split each pair and add it to the StringDictionary. Again, the beginning and ending text qualifiers are removed before the value is added to the collection. If a value contains no assignment operator, the original value is saved as both the key and the value.
    <Extension()> _
    Public Function FullSplit(ByVal value As String, ByVal separator As Char, ByVal qualifier As Char, ByVal assignmentOperator As Char) As StringDictionary
        Dim results As New StringDictionary

        For Each pair As String In FullSplit(value, separator, qualifier)
            Dim indexOfOperator As Integer = pair.IndexOf(assignmentOperator)
            'Without an operator the pair serves as both key and value.
            Dim keyName As String = pair
            Dim keyValue As String = pair

            If indexOfOperator > 0 Then
                keyName = pair.Substring(0, indexOfOperator)
                keyValue = pair.Substring(indexOfOperator + 1)
            End If

            'The key cannot contain a separator.
            keyName = keyName.Replace(separator, String.Empty)
            keyValue = RemoveQualifier(keyValue, qualifier)
            results.Add(keyName, keyValue)
        Next

        Return results
    End Function
Updating the Collapse Function to Support Assignment Operators
We finish by modifying the Collapse extension for the IEnumerable interface. Replace the propertyName parameter with keyPropertyName and valuePropertyName parameters. The modified routine continues to support IEnumerable collections that contain either standard variable types or objects.
For a collection of objects, specify which property holds the value to save by passing the property name in either the keyPropertyName parameter or the valuePropertyName parameter. To save a key/value pair, specify the property that contains the key with the keyPropertyName parameter and the property that contains the value with the valuePropertyName parameter.
As this routine is a little more complicated, inline documentation has been provided to further explain how the logic supports the functional requirements.
    Public Module ParseExtensions

        <Extension()> _
        Public Function Collapse(ByVal value As IEnumerable, ByVal separator As Char, ByVal qualifier As Char, ByVal assignmentOperator As Char, ByVal keyPropertyName As String, ByVal valuePropertyName As String) As String
            Dim unitedValue As String = String.Empty
            Dim instanceKey As Object
            Dim instanceValue As Object

            For Each item As Object In value
                'Check for DictionaryEntry
                If assignmentOperator <> Nothing _
                AndAlso (keyPropertyName Is Nothing OrElse keyPropertyName.Trim.Length = 0) _
                AndAlso (valuePropertyName Is Nothing OrElse valuePropertyName.Trim.Length = 0) _
                AndAlso TypeOf item Is DictionaryEntry Then
                    keyPropertyName = "Key"
                    valuePropertyName = "Value"
                End If

                'Get Key from Object
                If keyPropertyName IsNot Nothing AndAlso keyPropertyName.Trim.Length > 0 Then
                    Dim propertyItem As PropertyInfo = item.GetType.GetProperty(keyPropertyName)
                    instanceKey = propertyItem.GetValue(item, Nothing)
                Else
                    instanceKey = item
                End If

                'Get Value from Object
                If valuePropertyName IsNot Nothing AndAlso valuePropertyName.Trim.Length > 0 Then
                    Dim propertyItem As PropertyInfo = item.GetType.GetProperty(valuePropertyName)
                    instanceValue = propertyItem.GetValue(item, Nothing)
                Else
                    instanceValue = item
                End If

                If assignmentOperator <> Nothing Then
                    'When uniting a key/value pair we have to have a key name.
                    'In the event that a key is not specified use the value.
                    If keyPropertyName IsNot Nothing AndAlso keyPropertyName.Trim.Length > 0 Then
                        If instanceKey IsNot Nothing Then unitedValue &= instanceKey.ToString.Replace(separator, String.Empty)
                    Else
                        If instanceValue IsNot Nothing Then unitedValue &= instanceValue.ToString.Replace(separator, String.Empty)
                    End If
                    unitedValue &= assignmentOperator
                End If

                'When uniting a key/value pair we have to have a value name.
                'In the event that a value name is not specified use the key.
                'When uniting a regular value then if there is no value name specified use the key value.
                'That way if there is a key specified we get the value from the object otherwise we use the object.
                If qualifier <> Nothing Then unitedValue &= qualifier
                If valuePropertyName IsNot Nothing AndAlso valuePropertyName.Trim.Length > 0 Then
                    If instanceValue IsNot Nothing Then unitedValue &= instanceValue.ToString
                Else
                    If instanceKey IsNot Nothing Then unitedValue &= instanceKey.ToString
                End If
                If qualifier <> Nothing Then unitedValue &= qualifier
                If separator <> Nothing Then unitedValue &= separator
            Next

            'A separator was appended after every item, so drop the trailing one.
            If separator <> Nothing AndAlso unitedValue.Length > 0 Then
                unitedValue = unitedValue.Substring(0, unitedValue.Length - 1)
            End If

            Return unitedValue
        End Function

    End Module
To demonstrate that the functions work, use the following test code. Remember that StringDictionary does not guarantee order.
    Dim collapsedValue As String
    Dim parsedValue() As String
    Dim parsedKeyValue As StringDictionary

    collapsedValue = "Test1,'Test2a','Test,2b',Test3"
    parsedValue = collapsedValue.FullSplit(","c, "'"c)
    collapsedValue = String.Empty
    collapsedValue = parsedValue.Collapse(","c, "'"c)

    collapsedValue = "Key1=Test1,Key2a='Test2a',Key2b='Test,2b',Key3=Test3"
    parsedKeyValue = collapsedValue.FullSplit(","c, "'"c, "="c)
    collapsedValue = String.Empty
    collapsedValue = parsedKeyValue.Collapse(","c, "'"c, "="c)
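One more thing to expect when inspecting the dictionary results: besides not preserving order, StringDictionary treats keys case-insensitively and stores them in lower case, so a key such as Key1 comes back as key1 when the entries are enumerated or collapsed. The short sketch below shows this in isolation; the `StoredKey` helper is hypothetical and exists only for the demonstration.

```vb
Imports System
Imports System.Collections
Imports System.Collections.Specialized

Module StringDictionaryNote
    ' Hypothetical helper: returns the key as StringDictionary actually stores it.
    Public Function StoredKey(ByVal key As String) As String
        Dim entries As New StringDictionary
        entries.Add(key, "Test1")
        For Each entry As DictionaryEntry In entries
            Return CStr(entry.Key)
        Next
        Return Nothing
    End Function

    Sub Main()
        Console.WriteLine(StoredKey("Key1"))  ' the stored key is lower case

        Dim entries As New StringDictionary
        entries.Add("Key1", "Test1")
        ' Lookups ignore case, so any casing of the key finds the entry.
        Console.WriteLine(entries.ContainsKey("KEY1"))
    End Sub
End Module
```

If the original casing of the keys matters to you, a Dictionary(Of String, String) would be an alternative return type, though that is a design change beyond what this article covers.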