Using Excel formulas to parse a substring from a cell value is usually difficult especially if the cell value or pattern is not easily defined. I have found that it requires various combinations of SEARCH, FIND, LEFT, RIGHT, MID, etc. The resulting formula is both unreadable and it does not successfully handle all cell values thus resulting in manual clean up. It would be nice if the SEARCH or FIND functions were able to use regular expressions by default.
The following VBA function allows you to parse a cell value based on a regular expression. To use this function, you must enable Microsoft VBScript Regular Expressions 5.5.
- On the Developer ribbon, click Visual Basic
- Once the Visual Basic window opens, go to Tools -> References…
- From the References dialog box, check/enable Microsoft Regular Expressions 5.5
- Click the OK button
Now that the regular expression reference is activated, insert a new module into the workbook if one doesn’t already exist. The code will not work if added directly to a worksheet object.
Add the below code to the module. You can then use it as part of formulas in the workbook, e.g. =RegExParse(A1,”ID[0-9]+”). This would parse the value in cell A1 returning the substring matching the pattern ID[0-9]+, e.g. ID0, ID1234, ID183274917234. I’ve also included an regular expression example where a slightly more complex pattern,e.g. US telephone numbers, is matched and extracted from a cell value.
Option Explicit Public Function RegExParse(val As String, SearchPattern As String) As String On Error GoTo errorHandler Dim regEx As New RegExp With regEx .Global = True .MultiLine = True .IgnoreCase = True .Pattern = SearchPattern End With If regEx.Test(val) Then RegExParse = regEx.Execute(val)(0) Else RegExParse = CVErr(xlErrNA) End If errorHandler: If Err.Number <> 0 Then RegExParse = CVErr(xlErrNA) End If End Function