Google Apps domain administrators can use the Email Audit API to download mailbox accounts for audit purposes in accordance with the Customer Agreement. To improve the security of the data retrieved, the service creates a PGP-encrypted copy of the mailbox which can only be decrypted by providing the corresponding RSA key.
When decrypted, the exported mailbox will be in mbox format, a standard file format used to represent collections of email messages. The mbox format is supported by many email clients, including Mozilla Thunderbird and Eudora.
If you don’t want to install a specific email client to check the content of exported mailboxes, or if you are interested in automating this process and integrating it with your business logic, you can also programmatically access mbox files.
You could fairly easily write a parser for the simple, text-based mbox format. However, some programming languages have native mbox support or libraries which provide a higher-level interface. For example, Python has a module called mailbox that exposes such functionality, and parsing a mailbox with it only takes a few lines of code:
import mailbox def print_payload(message): # if the message is multipart, its payload is a list of messages if message.is_multipart(): for part in message.get_payload(): print_payload(part) else: print message.get_payload(decode=True) mbox = mailbox.mbox('export.mbox') for message in mbox: print message['subject'] print_payload(message)
Let me know your favorite way to parse mbox-formatted files by commenting on Google+.
For any questions related to the Email Audit API, please get in touch with us on the Google Apps Domain Info and Management APIs forum.