Automate E-mail Information Extraction from Outlook into Excel Using VBA

In this scenario, assume that well-formatted e-mails containing information to be added to an Excel sheet are received. By using VBA to automate the data extract from Outlook directly into Excel, the data quality/accuracy is improved and it’s significantly faster over large volumes of e-mail vs. a human performing a copy/paste from each e-mail.

I’ll use a simple example to demonstrate the code. I receive account creation e-mails and I need to track the names and attributes associated with those accounts in an Excel worksheet. Each e-mail has the Subject “Template Email Subject” and the content follows the format “The account for <Full Name> has been created.” where <Full Name> represents the full name associated with the created account. The e-mail also contains a structured table with a few additional attributes. Using the sample e-mail below, the macro extracts the Full Name, e.g. John Dalesandro, and adds it to an Excel worksheet. The macro also extracts the additional information about the user such as Last Name, First Name, Name, Department, Company, and Job Title from the structured table.

This example will use two different methods to extract the information. The text-based extract is performed using a combination of REPLACE and SPLIT functions assuming there is common text surrounding the information to extract while the structured table data is extracted using HTML object library functions.

Sample E-mail Structure
Sample E-mail Structure

Outlook Configuration

In Outlook, I create a sub-folder under the default Inbox named “Automation” to collect the relevant e-mails. An Outlook rule may be used to automatically move the relevant e-mails into this folder. By having the e-mails located in a separate folder, it helps reduce the VBA search effort. This folder name “Automation” is used in the VBA code to reference the correct folder. The code assumes the folder exists under the default inbox. If the folder is created elsewhere, then the code would obviously need to be adjusted.

Outlook Folder Structure
Outlook Folder Structure

Enable Microsoft Outlook Object Library and Microsoft HTML Object Library

To use this function, you must enable both the Microsoft Outlook 16.0 Object Library which is needed to access Outlook and the Microsoft HTML Object Library which is used to extract HTML table based information. The version number will vary based on the installed version of Microsoft Excel.

  1. On the Developer ribbon, click Visual Basic
  2. Once the Visual Basic window opens, go to Tools -> References…
  3. From the References dialog box, check/enable:
    • Microsoft Outlook 16.0 Object Library
    • Microsoft HTML Object Library
  4. Click the OK button
Microsoft Excel Visual Basic Tools References
Microsoft Excel Visual Basic Tools References

Add Source Code to VBA Project

The code is added to “ThisWorkbook” within the VBA Project. For the example, I am using the default worksheet named “Sheet1” to capture the full name extract from the e-mails.

VBA Project Structure
VBA Project Structure

Run Macro

From the Developer ribbon, click Macros. The Macro dialog box opens and the ExtractDetailsFromOutlookFolderEmails procedure is listed. Click the Run button to execute the procedure.

Run VBA Macro in Excel
Run VBA Macro in Excel

Result

After executing the macro, the worksheet is populated with both the information extracted through the text-based method and the table-based method. This example only used one e-mail to test, but if multiple e-mails were in the Outlook folder, the sheet would reflect an additional row per e-mail.

Information Extract Result
Information Extract Result

Source Code

This code has been tested in Microsoft Office Professional Plus 2019. Again, there are some hardcoded values that would need to be changed, e.g. the structure of the e-mail template.

Using the example e-mail above, the following code connects to Outlook and obtains a reference to the folder named “Automation” in the default inbox.

It then loops through all of the items in the folder looking for any items with the phrase “Template Email Subject” in the Subject.

For the text-based information extract, the code takes the item’s Body and replaces the text phrases “The account for” and “has been created.” with “####################”. Replacing those phrases with a common delimiter makes it easier to extract the text found between the delimiters. As an example, the newly delimited Body would appear as follows after the replacing the phrases:

############## Dalesandro, John ##############

Now that the Body has been properly delimited, the Split function is used to extract the text found between the delimiters. Surrounding whitespace, line feeds, form feeds, and carriage returns are removed using through a combination of Trim and Replace functions.

The cleaned text is then written to “Sheet1” in Column A.

For the HTML table-based information extract, the code obtains a reference to the HTML table tag in the source data through the getElementsByTagName. The code then loops through the table rows and columns to pull all information in the second column. The first column is ignored since it contains information headers.

Option Explicit

Public Sub ExtractDetailsFromOutlookFolderEmails()
  On Error GoTo errorHandler
  
  Dim olApp As Outlook.Application
  Dim olNameSpace As Namespace
  Dim olFolder As Folder
  Dim olItem As Object
  Dim olItemBody As String
  Dim olHTML As MSHTML.HTMLDocument
  Dim olElementCollection As MSHTML.IHTMLElementCollection
  Dim htmlElementTable As IHTMLElement
  Dim iTableRow As Integer
  Dim iTableColumn As Integer
  Dim wksList As Worksheet
  Dim iRow As Integer
  Dim iColumn As Integer
  Dim strFullName As String
  
  Set wksList = ThisWorkbook.Sheets("Sheet1")
  
  Set olApp = CreateObject("Outlook.Application")
  Set olNameSpace = olApp.GetNamespace("MAPI")
  Set olFolder = olNameSpace.GetDefaultFolder(olFolderInbox).Folders("Automation")
  
  iRow = 1

  For Each olItem In olFolder.Items
    If InStr(olItem.Subject, "Template Email Subject") > 0 Then
      iColumn = 1
      
      'Begin extracting the full name from the e-mail body text:  The account for LAST NAME, FIRST NAME has been created.
      olItemBody = Replace(olItem.body, "The account for", "####################")
      olItemBody = Replace(olItemBody, "has been created.", "####################")
      
      strFullName = Trim(Replace(Replace(Replace(Split(olItemBody, "####################")(1), Chr(10), ""), Chr(12), ""), Chr(13), ""))
  
      wksList.Cells(iRow, iColumn) = strFullName
      'End full name extract.
      
      'Begin extracting table data.
      Set olHTML = New MSHTML.HTMLDocument

      olHTML.body.innerHTML = olItem.HTMLBody
      
      Set olElementCollection = olHTML.getElementsByTagName("table")
      
      For Each htmlElementTable In olElementCollection
        For iTableRow = 0 To htmlElementTable.Rows.Length - 1
          For iTableColumn = 0 To htmlElementTable.Rows(iTableRow).Cells.Length - 1
            On Error Resume Next
            'Only extract data from the second column in the table since the first column is a header.
            If iTableColumn Mod 2 = 1 Then
              iColumn = iColumn + 1
              wksList.Cells(iRow, iColumn).Value = htmlElementTable.Rows(iTableRow).Cells(iTableColumn).innerText
            End If
            On Error GoTo errorHandler
          Next iTableColumn
        Next iTableRow
      Next htmlElementTable
      'End extracting table data.
      
      iRow = iRow + 1
    End If
  Next olItem
  
errorHandler:
  If Err.Number <> 0 Then
    MsgBox "Error extracting details. [ExtractDetailsFromOutlookFolderEmails]"
  End If
End Sub

Retrieve Outlook Address Book Data Using Custom Excel VBA Function

I have an Excel worksheet containing thousands of rows of names. For each name, I need to add the individual’s job title as listed in the Outlook address book. I came across several solutions that would loop through the worksheet and add every available attribute from Outlook to the corresponding row. These solutions were all implemented as a sub procedure that operated across the entire worksheet. Instead of a sub procedure and looping, I implemented this as a generic function. This allows for more precision with respect to the attributes/properties to retrieve from Outlook and avoids operating on the entire worksheet when a name is added/changed.

The following custom VBA function will query the Outlook address book given a specific name/alias and return the requested attribute/property value.

To use this function, you must enable Microsoft Outlook 16.0 Object Library. The version number will vary based on the installed version of Microsoft Excel.

  1. On the Developer ribbon, click Visual Basic
  2. Once the Visual Basic window opens, go to Tools -> References…
  3. From the References dialog box, check/enable Microsoft Outlook 16.0 Object Library
  4. Click the OK button
Microsoft Excel Visual Basic Tools References - Microsoft Outlook Object Library
Microsoft Excel Visual Basic Tools References – Microsoft Outlook Object Library

Now that the Microsoft Outlook reference is activated, insert a new module into the workbook if one doesn’t already exist. The code will not work if added directly to a worksheet object.

Add Module to Microsoft Excel VBA Project
Add Module to Microsoft Excel VBA Project

Add the below code to the module. The function is then a usable formula in the workbook.

Usage

Once the code is properly added to the workbook, the function may be called as part of an Excel formula.

=GetOutlookAddressBookProperty(alias, propertyName)

alias accepts any string value, but the function works best if the value is a uniquely resolvable attribute of an address book entry (full name, alias name, alias, etc.). The function fails if it is unable to resolve to an individual entry in the address book.

propertyName accepts the following values:

  • Job Title
  • Company Name
  • Department
  • Name
  • First Name
  • Last Name

There are many other properties available as part of the ExchangeUser, however, I’ve only exposed a handful through this function. The code may be easily modified to include additional properties as needed.

Examples

Assume the Outlook address list includes an entry with the following attributes:

  • Last Name: Dalesandro
  • First Name: John
  • Name: Dalesandro, John
  • Department: Super Important Button Clickers
  • Company Name: Widgets International
  • Job Title: Supervisor
Example Outlook Address Book Entry Contact Information
Example Outlook Address Book Entry Contact Information
=GetOutlookAddressBookProperty("Dalesandro, John", "Job Title")

Returns the value “Supervisor”.

=GetOutlookAddressBookProperty("Dalesandro, John", "Last Name")

Returns the value “Dalesandro”.

VBA Function Results - Information Pulled from Outlook Address Book (Formula)
VBA Function Results – Information Pulled from Outlook Address Book (Formula)
VBA Function Results - Information Pulled from Outlook Address Book (Results)
VBA Function Results – Information Pulled from Outlook Address Book (Results)

Source Code

Option Explicit

Public Function GetOutlookAddressBookProperty(alias As String, propertyName As String) As Variant
  On Error GoTo errorHandler

  Dim olApp As Outlook.Application
  Dim olNameSpace As Namespace
  Dim olRecipient As Outlook.Recipient
  Dim olExchUser As Outlook.ExchangeUser
  Dim olContact As Outlook.AddressEntry

  Set olApp = CreateObject("Outlook.Application")
  Set olNameSpace = olApp.GetNamespace("MAPI")
  Set olRecipient = olNameSpace.CreateRecipient(LCase(Trim(alias)))

  olRecipient.Resolve

  If olRecipient.Resolved Then
    Set olExchUser = olRecipient.AddressEntry.GetExchangeUser

    If Not olExchUser Is Nothing Then
      'Attempt to extract information from Exchange
      GetOutlookAddressBookProperty = Switch(propertyName = "Job Title", olExchUser.JobTitle, _
                                             propertyName = "Company Name", olExchUser.CompanyName, _
                                             propertyName = "Department", olExchUser.Department, _
                                             propertyName = "Name", olExchUser.Name, _
                                             propertyName = "First Name", olExchUser.FirstName, _
                                             propertyName = "Last Name", olExchUser.LastName)
    Else
      'If Exchange not available, then attempt to extract information from local Contacts
      Set olContact = olRecipient.AddressEntry
      
      If Not olContact Is Nothing Then
        GetOutlookAddressBookProperty = Switch(propertyName = "Job Title", olContact.GetContact.JobTitle, _
                                               propertyName = "Company Name", olContact.GetContact.CompanyName, _
                                               propertyName = "Department", olContact.GetContact.Department, _
                                               propertyName = "Name", olContact.GetContact.FullName, _
                                               propertyName = "First Name", olContact.GetContact.FirstName, _
                                               propertyName = "Last Name", olContact.GetContact.LastName)
      Else
        GetOutlookAddressBookProperty = CVErr(xlErrNA)
      End If
    End If
  Else
    GetOutlookAddressBookProperty = CVErr(xlErrNA)
  End If

errorHandler:
  If Err.Number <> 0 Then
    GetOutlookAddressBookProperty = CVErr(xlErrNA)
  End If
End Function

Bash Script to Automatically Backup Amazon Lightsail WordPress Server to Google Drive

The intent of this Bash script is to backup the WordPress installation directory, the MySQL database, and a handful of Apache/PHP/MySQL configuration files to Google Drive on a regular basis in an encrypted archive. While this script references paths/files specific to a Bitnami based Amazon Lightsail instance, it can be easily modified to any other Linux based environment. I fully understand that AWS has an automatic snapshot/backup capability, but I want to use Google Drive and I don’t want to incur any additional costs since AWS snapshots are not free.

Usage

If no options are specified when executing this script it will backup all MySQL databases accessible to the user specified in the .mysqldump.cnf file. No other files will be included in the backup. All files are packaged in a tar.gz file and encrypted using AES-256.

  • -f (full backup) includes both the MySQL dump as well as other files specified in the script.
  • -e is used to encrypt the backup archive. If this flag is set, then a password must be supplied in the .backup.cnf file.
  • -n x sets the number of files to retain on Google Drive in the upload directory (x is an integer value). The default is 30 files.
  • -d sets the parent folder where the backup file is uploaded on Google Drive. If the folder doesn’t exist, then the script will create it. The default folder name is “backup”.
  • -p sets the filename prefix of the backup file. The script will use the prefix followed by the date/time to set the full filename. The default filename prefix is “backup”.
  • -m is used to specify the database name to backup. The default is all databases.

Examples

This example uses all of the available options and will create a full encrypted backup of both MySQL and other files. The backup file is named backup-example-com (with a date/time suffix) and uploaded it to a folder named backup-full-example-com on Google Drive. If the specified folder contains more than 5 files, the script deletes any files beyond the 5 based on a filename sort.

./backup.sh -e -f -n 5 -d backup-full-example-com -p backup-example-com -m db_example_com_wp_prd

Installation

As a personal preference, I store user scripts and associated configuration files in a subdirectory under home named “scripts”.

The following command creates the “scripts” subdirectory.

mkdir -m 700 ~/scripts

Next, create a file named .mysqldump.cnf. This file will be used by the mysqldump command in the backup script to pass the MySQL database user and password to dump the WordPress database.

touch ~/scripts/.mysqldump.cnf
chmod 600 ~/scripts/.mysqldump.cnf
vim ~/scripts/.mysqldump.cnf

In the vim editor, add the database username and password with access to dump all of the required tables and data for a backup. Apostrophes surrounding the password are needed if any special characters are used in the actual password.

[mysqldump]
user=username
password='password'

The following commands create the .backup.cnf file which contains a secret key used by the script to encrypt the backup file.

touch ~/scripts/.backup.cnf
chmod 600 ~/scripts/.backup.cnf
vim ~/scripts/.backup.cnf

In the vim editor, add a secret key/password to the file.

pass=passsword

Now we’ll create the actual backup script. Please be sure to modify any referenced paths and constants to your own environment.

touch ~/scripts/backup.sh
chmod 700 ~/scripts/backup.sh
vim ~/scripts/backup.sh

Source Code

The following is the full Bash script. Please modify any constants/paths for your own environment and requirements.

#!/bin/bash

getConfigVal() {
  echo `grep ${1} "${2}" | cut -d '=' -f 2`
}

dirname_temp_backup="automated_backup"
filename_prefix_backup="backup"
filename_suffix_backup="_$(date +'%Y_%m_%d_%H_%M')"
db_name="--all-databases"
gdrive_backup_dirname="backup"
num_files_to_retain=30
path_script="$(dirname "$(readlink -f "$0")")"
path_mysqldump="$(which mysqldump)"
path_gdrive="$(which drive)"
path_backup_cnf="${path_script}/.backup.cnf"
path_mysqldump_cnf="${path_script}/.mysqldump.cnf"
path_backup="${HOME}/${dirname_temp_backup}"

if [ -z "${path_script}" ] || [ -z "${path_mysqldump}" ] || [ -z "${path_gdrive}" ] || [ -z "${path_backup_cnf}" ] || [ -z "${path_mysqldump_cnf}" ] || [ -z "${path_backup}" ]
then
  echo "ERROR: One or more required path variable(s) are undefined or missing."
  exit 1
fi

while getopts efn:d:p:m: flag
do
  case "${flag}" in
    e)
      bool_encrypt_backup=1
      encryption_key=$(getConfigVal "pass" "${path_backup_cnf}")

      if [ -z "${encryption_key}" ]
      then
        echo "ERROR: Encryption key not found."
        exit 1
      fi
      ;;
    f)
      bool_full_backup=1
      ;;
    n)
      if ! [[ "${OPTARG}" =~ ^[0-9]+$ ]]
      then
        echo "WARNING: Number of backups to retain must be an integer. Reverting to default value (${num_files_to_retain})."
      else
        num_files_to_retain="${OPTARG}"
      fi
      ;;
    d)
      gdrive_backup_dirname="${OPTARG}"
      ;;
    p)
      filename_prefix_backup="${OPTARG}"
      ;;
    m)
      db_name="--databases ${OPTARG}"
      ;;
  esac
done

num_files_to_retain=$((num_files_to_retain+1))

sudo rm -rf "${path_backup}"
sudo rm "${HOME}/${filename_prefix_backup}"*.tar.gz

mkdir "${path_backup}"

if [ -n "${bool_full_backup}" ]
then
  find /www -type d -not -path '*/cache/*' -exec mkdir -p "${path_backup}"/{} \;
  find /www -type f -not -path '*/cache/*' -exec sudo cp '{}' "${path_backup}"/{} \;
  cp /opt/bitnami/apache2/conf/vhosts/example-com-https-vhost.conf "${path_backup}"
  cp /opt/bitnami/apache2/conf/vhosts/example-com-vhost.conf "${path_backup}"
  cp /opt/bitnami/apache/conf/httpd.conf "${path_backup}"
  cp /opt/bitnami/apache/conf/bitnami/bitnami-ssl.conf "${path_backup}"
  cp /opt/bitnami/php/etc/php.ini "${path_backup}"
  cp /opt/bitnami/mysql/conf/my.cnf "${path_backup}"
fi

"${path_mysqldump}" --defaults-extra-file="${path_mysqldump_cnf}" ${db_name} --no-tablespaces -ce > "${path_backup}/mysqldump.sql"

touch "${HOME}/${filename_prefix_backup}.tar.gz"

chmod 600 "${HOME}/${filename_prefix_backup}.tar.gz"

sudo tar -czf "${HOME}/${filename_prefix_backup}.tar.gz" -C "${path_backup}" .

if [ -n "${bool_encrypt_backup}" ]
then
  touch "${HOME}/${filename_prefix_backup}_enc.tar.gz"

  chmod 600 "${HOME}/${filename_prefix_backup}_enc.tar.gz"

  openssl enc -e -a -md sha512 -pbkdf2 -iter 100000 -salt -AES-256-CBC -pass "pass:${encryption_key}" -in "${HOME}/${filename_prefix_backup}.tar.gz" -out "${HOME}/${filename_prefix_backup}_enc.tar.gz"

  sudo rm "${HOME}/${filename_prefix_backup}.tar.gz"
  mv "${HOME}/${filename_prefix_backup}_enc.tar.gz" "${HOME}/${filename_prefix_backup}.tar.gz"
fi

mv "${HOME}/${filename_prefix_backup}.tar.gz" "${HOME}/${filename_prefix_backup}${filename_suffix_backup}.tar.gz"

folder_id=`"${path_gdrive}" list -m 1000 -n -q 'trashed = false' | grep -m 1 "${gdrive_backup_dirname}" | head -1 | cut -d ' ' -f1`

if [ -z "${folder_id}" ]
then
  "${path_gdrive}" folder -t "${gdrive_backup_dirname}"
  folder_id=`"${path_gdrive}" list -m 1000 -n -q 'trashed = false' | grep -m 1 "${gdrive_backup_dirname}" | head -1 | cut -d ' ' -f1`
fi

"${path_gdrive}" upload --file "${HOME}/${filename_prefix_backup}${filename_suffix_backup}.tar.gz" --parent "${folder_id}"

sudo rm -rf "${path_backup}"
sudo rm "${HOME}/${filename_prefix_backup}"*.tar.gz

expired_file_ids=`"${path_gdrive}" list -m 1000 -n -q "'${folder_id}' in parents and trashed = false and mimeType != 'application/vnd.google-apps.folder'" | sort -k 2r | tail -n +"${num_files_to_retain}" | cut -d ' ' -f1`

if [ -n "${expired_file_ids}" ]
then
  while read -r file_id; do
    "${path_gdrive}" delete --id "${file_id}"
  done <<< "${expired_file_ids}"
fi

Google Drive

In order for the script to communicate with Google Drive, it has a dependency on a freely available binary on GitHub. The following commands download the binary to a new subdirectory under home named “bin”.

cd ~
mkdir -m 700 ~/bin
cd ~/bin
wget -O drive https://drive.google.com/uc?id=0B3X9GlR6Embnb095MGxEYmJhY2c
chmod 500 ~/bin/drive

Execute the binary once it is downloaded and follow the directions to grant it permission to Google Drive.

~/bin/drive

Go to the following link in your browser:

Enter verification code:

Scheduling

Now that the script and supporting files are set up, the following commands are used to schedule the script in the user crontab. The following example schedules the script to run a full encrypted backup once per month on the 1st of the month at 1 AM and a database only encrypted backup runs every day at 2 AM. The full backup is uploaded to a Google Drive folder named “backup_full-example-com” and the five most recent backups are retained. The database only backup is uploaded to a Google Drive folder named “backup_db_only-example-com” and the thirty most recent backups are retained. Since the script depends on other binaries, e.g. mysqldump and drive, please make sure cron has access to the appropriate user path. If it doesn’t have access to the appropriate path, then add it to the crontab. Use the command env $PATH todisplay the current user path and replace the example below with the appropriate path for the environment.

env $PATH

crontab -e

PATH=$PATH:/home/bitnami/bin:/opt/bitnami/mysql/bin

0 1 * * 1 /home/bitnami/scripts/backup.sh -e -f -n 5 -d backup_full-example-com
0 2 * * * /home/bitnami/scripts/backup.sh -e -n 30 -d backup_db_only-example-com

crontab -l

Decrypting

Let’s assume everything is executing as expected and Google Drive is populated with backups from the script. Download a backup from Google Drive to your local machine. Since the file is encrypted, you won’t be able to open it directly. In order to decrypt the file, OpenSSL is used with the -d flag and the same options that were used to encrypt it. Specify the filename of the downloaded encrypted file with the -in option and the filename of resulting decrypted file in the -out option. When prompted, please be sure to enter the same password used by the script to encrypt the file (found in the .backup.cnf file). If you are downloading/decrypting on a Windows machine, you will need to install OpenSSL.

openssl enc -d -a -md sha512 -pbkdf2 -iter 100000 -salt -AES-256-CBC -in "c:\backup_full-example-com_2021_03_23_01_00.tar.gz" -out "c:\decrypted_backup_full-example-com_2021_03_23_01_00.tar.gz"

The resulting decrypted file is now readable as a regular tar.gz file.