Thursday, May 5, 2011

SUBSTRING & CHARINDEX: A great relationship

I have recently been utilizing these awesome tools for some serious string manipulation and 
wanted to write a quick post describing what I am doing. First lets look at what they are. MSDN
 describes SUBSTRING capability: “Returns part of a character, binary, text, or image expression. 
Ok that is simple it returns a portion of a string. The syntax is also very simple: 

 
SUBSTRING ( value_expression , start_expression , length_expression )
 

Lets look at an example.:

Lets say we have the string: ‘Brandon will isolate a portion of this string’ and we want to select just the word ‘isolate’ from this string. This can be done using SUBSTRING like this..

DECLARE @String VARCHAR(50)

SET @String = 'Brandon will isolate a portion of this string'

SELECT SUBSTRING(@String, 14, 8)

Results:

--------

isolate

So in this case we defined a variable called @string with a data type of VARCHAR(50). Then we defined the variable with our string so now @String = ‘Brandon will isolate a portion of this string’. Then we used substring to select the word ‘isolate’ out of the entire string.

· The syntax required three expressions:

o the value_expression: This is the string it self which we stuffed in a variable

o Start_expresion: This is an integer value and it represents which character you want to start. Because we wanted to grab the word ‘isolate’ the ‘i’ would be our start_expression which is the 14th character in the entire string

o Length_expression: This is also an integer value but this one represents how many characters long you want your string to be. In our case 8 characters incased the word we were looking for.

Ok so now that we know how SUBSTRING works lets define CHARINDEX and see how it works. MSDN Defines CHARINDEX as “Searches expression2 for expression1 and returns its starting position if found.” Again, very simple. Using CHARINDEX we can obtain the starting position of a string within a string. The value retrieved is an integer value. The syntax is as follows

CHARINDEX ( expression1 ,expression2 [ , start_location ] ) 

So using our original example lets put CHARINDEX work with an example. In the first example we were able to isolate the word ‘isolate’ by manually inputting the start_expression of the string. But what if we did not know where in the string the word ‘isolate’ was. With CHARINDEX we could retrieve the starting position dynamically. Let see how..

DECLARE @String VARCHAR(50)

SET @String = 'Brandon will isolate a portion of this string'

SELECT CHARINDEX(‘isolate’, @String, 0)

Results:

--------

14

Again we defined a variable called @string with a data type of VARCHAR(50). Then we defined the variable with our string so now @String = ‘Brandon will isolate a portion of this string’. Then using CHARINDEX we retrieved the starting position from the string in question.

· Again the syntax required three expressions

o Expression1: This would be the string you are looking for, in our case it was ‘isolate’

o Expression2: This is the string in which SQL will search for expression1, in our case it was ‘Brandon will isolate a portion of this string’ (defined in a variable)

o Start_location: This is an integer value and represents which character space to start looking for expression1. In our example we selected 0 which informs SQL to start at the beginning of expression2 in its search for expression1

o So to sum it up, SQL searched for expression1 in expression2 and started its search at character number 0. Once it found expression1 it then provided to us the starting position of it which was 14

The first two examples were very simple to help explain how the two functions work seperately. The scenerio which requires you use them together is a bit more complicated and requires a more complicated example. Like the SUBSTRING example above, lets say we had a string and we wanted to isolate just a portion of that string. Well like above we can use SUBSTRING function to do this. In the example above however we statically input the start_expression and legnth expression. This is great and doable if you know both, where the string you are looking for is in the main string, and two how many characters long it is. What if you don’t know where it is, and the length varies? Now we have a scenerio that will require both SUBSTRING and CHARINDEX working together.

In our new example our string will be ‘Brandon has eaten 1,004,584 Cookies in his lifetime’ And we want to isolate the number of cookies I have eaten (1,004,584). Again SUBSTRING requires three expressions:

· the value_expression:

· Start_expresion:

· Length_expression:

We know the value_expression and could figure out the start and length_expression, but lets pretend for learning sake that the string (value_expression) was a lot larger and the length varied. So we need to use CHARINDEX with SUBSTRING to dynamically retrieve the start_expression and length_expression. So here it is:

DECLARE @String VARCHAR(50)

SET @String = '‘Brandon has eaten 1,004,584 Cookies in his lifetime’'

SELECT SUBSTRING(@String, CHARINDEX('eaten', @String, 0)+6, (CHARINDEX('cookies', @String, 0))-(CHARINDEX('eaten', @String, 0)+6))

Results:

------------

1,004,584

The above select statement looks kinda crazy but it makes simple sense when you break it up. Just remember the three expressions required for SUBSTRING. Instead of entering static values for the start_expression and length_expression, we used CHARINDEX to retrieve the values we needed to dynamically give us the starting postion and length. Lets break it up, here is the string so you can reference while we go through each part:

‘Brandon has eaten 1,004,584 Cookies in his lifetime’

· the value_expression: @String

o This should be understood now, @string represents the larger string in which we are searching and is our value_expression in the SUBSTRING call.

· Start_expresion: CHARINDEX('eaten', @String, 0)+6

o What we needed for the start_expression was the character number representing the first character of the string we want to isolate. In our case the string we want to isolate is ‘1,004,584’ So we need to retrieve the character value for ‘1’. To do this we used CHARINDEX looking for the word before our desired string and added 6 to this number as that is the amount of spaces from the beginning of ‘eaten’ to the beginning of our desired string. Thus we retrieve our starting location. This is helpful if the amount of characters before our desired string varies.

· Length_expression: (CHARINDEX('cookies', @String, 0))-(CHARINDEX('eaten', @String, 0)+6)

o The length_expression is the most complicated part to explain, but once you get it, its extremely simple. To retrieve the length_expression dynamically we will use two CHARINDEX references and subtract one from the other. To visualize this lets break it apart. If we take the original SUBSTRING and input all the correct values manually it would look like this: SELECT SUBSTRING(@String,20,10) . The 10 is the length of our number of cookies in our string. So we retrieve this number dynamically using CHARINDEX…

IF (CHARINDEX('cookies', @String, 0)) = 30

AND (CHARINDEX('eaten', @String, 0)+6) = 20

THEN (CHARINDEX('cookies', @String, 0))-(CHARINDEX('eaten', @String, 0)+6) = 10

So even if the length varies and we don’t know exactly where the starting position of our desired string is, we can use CHARINDEX in conjunction with SUBSTRING to retrieve our reference values dynamically.

-B

No comments:

Post a Comment