The previous post presented a User Defined Function (UDF) using regular expressions to extract all numbers, or a selected number, from any string (or a list of strings). In this post I will look more closely at the code used to achieve this, and present a general purpose UDF allowing regular expressions to be used on strings. A spreadsheet with examples of both functions, and full open-source code, can be downloaded from RegExpres.xlsb.

First the Regular Expression object must be created and given the appropriate criteria to return all numbers (0-9), the decimal character (“.”), the exponent symbol (“E”), and the plus and minus symbols (“+-“). It is also set to return all matching strings (Global = True) and to ignore the case of alphabetic characters (IgnoreCase = True):

Const ExpString As String = "E", pmString As String = "+-"
'Initialize variables
Set RE = CreateObject("VBScript.RegExp")
'Criteria for Regular Expression
REPatt = "[0-9" & DecString & ExpString & pmString & "]"
With RE
.Pattern = REPatt
.Global = True
.IgnoreCase = True
End With

There follows some code to convert the input range to a variant array and count the number of rows; the numbers in each string are then extracted with the line:

Set Matches = RE.Execute(NumString)

As discussed in the previous post, for a string containing a single value, that is all that is required, but if there is more than one number all the number strings will be concatenated, and must be separated. This can be done by stepping through the original string and comparing the characters with those in the Matches variable, as shown below:

'Loop Matches collection and compare with each character of original string to build strings of all numbers in the sample string
strNumber = ""
NumMatches = Matches.Count
NumChar = Len(NumString)
j = 0
k = 1
PrevMatch = False
For i = 1 To NumChar
MatchString = Mid(NumString, i, 1)
If PrevMatch = False And UCase(MatchString) = ExpString Then
j = j + 1
ElseIf PrevMatch = False And (MatchString = "+" Or MatchString = "-") And _
IsNumeric(Mid(NumString, i + 1, 1)) = False Then
j = j + 1
ElseIf MatchString <> IgnoreString Then
Digit = Matches(j)
If Digit = MatchString Then
strNumber = strNumber + Matches(j)
j = j + 1
PrevMatch = True
ElseIf PrevMatch = True Then
If Position = 0 Then
OutA(Row, k) = CDbl(strNumber)
ElseIf Position = k Then
OutA(Row, 1) = CDbl(strNumber)
Exit For
End If
strNumber = ""
k = k + 1
PrevMatch = False
End If
End If
Next i
If PrevMatch = True Then
If Position = 0 Then
OutA(Row, k) = CDbl(strNumber)
ElseIf k = Position Then
OutA(Row, 1) = CDbl(strNumber)
End If
End If
If k < Position Then OutA(Row, 1) = CVErr(xlErrNA)
Next Row

Regular Expressions also have many other uses, and the Reg_Exp UDF allows them to be applied quickly and easily. The Regular Expression object has the methods “List”, “Test”, and “Replace”, and the screen-shots below show the use of these methods on a list of strings:

Extract all numbers

Extract all occurrences of the string “Node”

Find strings containing “Node”

Replace “Node” with “Number”

Find all four letter strings starting with “n” and ending with “e”

For more details of setting up regular expression patterns see:

VBA RegEx: How to Evaluate and Use “Regular Expressions”

Finally note that if you want to apply these UDFs to a range they must be entered as an Array Function to return all the results.