Automate E-mail Information Extraction from Outlook into Excel Using VBA

In this scenario, assume that well-formatted e-mails containing information to be added to an Excel sheet are received. By using VBA to automate the data extract from Outlook directly into Excel, the data quality/accuracy is improved and it’s significantly faster over large volumes of e-mail vs. a human performing a copy/paste from each e-mail.

I’ll use a simple example to demonstrate the code. I receive account creation e-mails and I need to track the names and attributes associated with those accounts in an Excel worksheet. Each e-mail has the Subject “Template Email Subject” and the content follows the format “The account for <Full Name> has been created.” where <Full Name> represents the full name associated with the created account. The e-mail also contains a structured table with a few additional attributes. Using the sample e-mail below, the macro extracts the Full Name, e.g. John Dalesandro, and adds it to an Excel worksheet. The macro also extracts the additional information about the user such as Last Name, First Name, Name, Department, Company, and Job Title from the structured table.

This example will use two different methods to extract the information. The text-based extract is performed using a combination of REPLACE and SPLIT functions assuming there is common text surrounding the information to extract while the structured table data is extracted using HTML object library functions.

Sample E-mail Structure
Sample E-mail Structure

Outlook Configuration

In Outlook, I create a sub-folder under the default Inbox named “Automation” to collect the relevant e-mails. An Outlook rule may be used to automatically move the relevant e-mails into this folder. By having the e-mails located in a separate folder, it helps reduce the VBA search effort. This folder name “Automation” is used in the VBA code to reference the correct folder. The code assumes the folder exists under the default inbox. If the folder is created elsewhere, then the code would obviously need to be adjusted.

Outlook Folder Structure
Outlook Folder Structure

Enable Microsoft Outlook Object Library and Microsoft HTML Object Library

To use this function, you must enable both the Microsoft Outlook 16.0 Object Library which is needed to access Outlook and the Microsoft HTML Object Library which is used to extract HTML table based information. The version number will vary based on the installed version of Microsoft Excel.

  1. On the Developer ribbon, click Visual Basic
  2. Once the Visual Basic window opens, go to Tools -> References…
  3. From the References dialog box, check/enable:
    • Microsoft Outlook 16.0 Object Library
    • Microsoft HTML Object Library
  4. Click the OK button
Microsoft Excel Visual Basic Tools References
Microsoft Excel Visual Basic Tools References

Add Source Code to VBA Project

The code is added to “ThisWorkbook” within the VBA Project. For the example, I am using the default worksheet named “Sheet1” to capture the full name extract from the e-mails.

VBA Project Structure
VBA Project Structure

Run Macro

From the Developer ribbon, click Macros. The Macro dialog box opens and the ExtractDetailsFromOutlookFolderEmails procedure is listed. Click the Run button to execute the procedure.

Run VBA Macro in Excel
Run VBA Macro in Excel

Result

After executing the macro, the worksheet is populated with both the information extracted through the text-based method and the table-based method. This example only used one e-mail to test, but if multiple e-mails were in the Outlook folder, the sheet would reflect an additional row per e-mail.

Information Extract Result
Information Extract Result

Source Code

This code has been tested in Microsoft Office Professional Plus 2019. Again, there are some hardcoded values that would need to be changed, e.g. the structure of the e-mail template.

Using the example e-mail above, the following code connects to Outlook and obtains a reference to the folder named “Automation” in the default inbox.

It then loops through all of the items in the folder looking for any items with the phrase “Template Email Subject” in the Subject.

For the text-based information extract, the code takes the item’s Body and replaces the text phrases “The account for” and “has been created.” with “####################”. Replacing those phrases with a common delimiter makes it easier to extract the text found between the delimiters. As an example, the newly delimited Body would appear as follows after the replacing the phrases:

############## Dalesandro, John ##############

Now that the Body has been properly delimited, the Split function is used to extract the text found between the delimiters. Surrounding whitespace, line feeds, form feeds, and carriage returns are removed using through a combination of Trim and Replace functions.

The cleaned text is then written to “Sheet1” in Column A.

For the HTML table-based information extract, the code obtains a reference to the HTML table tag in the source data through the getElementsByTagName. The code then loops through the table rows and columns to pull all information in the second column. The first column is ignored since it contains information headers.

Option Explicit

Public Sub ExtractDetailsFromOutlookFolderEmails()
  On Error GoTo errorHandler
  
  Dim olApp As Outlook.Application
  Dim olNameSpace As Namespace
  Dim olFolder As Folder
  Dim olItem As Object
  Dim olItemBody As String
  Dim olHTML As MSHTML.HTMLDocument
  Dim olElementCollection As MSHTML.IHTMLElementCollection
  Dim htmlElementTable As IHTMLElement
  Dim iTableRow As Integer
  Dim iTableColumn As Integer
  Dim wksList As Worksheet
  Dim iRow As Integer
  Dim iColumn As Integer
  Dim strFullName As String
  
  Set wksList = ThisWorkbook.Sheets("Sheet1")
  
  Set olApp = CreateObject("Outlook.Application")
  Set olNameSpace = olApp.GetNamespace("MAPI")
  Set olFolder = olNameSpace.GetDefaultFolder(olFolderInbox).Folders("Automation")
  
  iRow = 1

  For Each olItem In olFolder.Items
    If InStr(olItem.Subject, "Template Email Subject") > 0 Then
      iColumn = 1
      
      'Begin extracting the full name from the e-mail body text:  The account for LAST NAME, FIRST NAME has been created.
      olItemBody = Replace(olItem.body, "The account for", "####################")
      olItemBody = Replace(olItemBody, "has been created.", "####################")
      
      strFullName = Trim(Replace(Replace(Replace(Split(olItemBody, "####################")(1), Chr(10), ""), Chr(12), ""), Chr(13), ""))
  
      wksList.Cells(iRow, iColumn) = strFullName
      'End full name extract.
      
      'Begin extracting table data.
      Set olHTML = New MSHTML.HTMLDocument

      olHTML.body.innerHTML = olItem.HTMLBody
      
      Set olElementCollection = olHTML.getElementsByTagName("table")
      
      For Each htmlElementTable In olElementCollection
        For iTableRow = 0 To htmlElementTable.Rows.Length - 1
          For iTableColumn = 0 To htmlElementTable.Rows(iTableRow).Cells.Length - 1
            On Error Resume Next
            'Only extract data from the second column in the table since the first column is a header.
            If iTableColumn Mod 2 = 1 Then
              iColumn = iColumn + 1
              wksList.Cells(iRow, iColumn).Value = htmlElementTable.Rows(iTableRow).Cells(iTableColumn).innerText
            End If
            On Error GoTo errorHandler
          Next iTableColumn
        Next iTableRow
      Next htmlElementTable
      'End extracting table data.
      
      iRow = iRow + 1
    End If
  Next olItem
  
errorHandler:
  If Err.Number <> 0 Then
    MsgBox "Error extracting details. [ExtractDetailsFromOutlookFolderEmails]"
  End If
End Sub