Getting data out of IMAP or Gmail with Python

I have a contact form that has been sending me data via email. It was quick and dirty, and I didn’t expect to get nearly as many responses as I did. The data really needs to be in a spreadsheet, but the thought of copying it over by hand bothered me, so I wrote some Python code to do it. Here it is. Note that:

  • it is quick and dirty
  • change the Gmail user name to your own; this should work on any IMAP server, so change the server name too if you’re not on Gmail
  • change the name of the IMAP folder containing the messages - mine go into a folder called P/package-registration
  • once I’ve downloaded a batch of messages I mark them as read in my email client so that they won’t download again the next time the script runs (that’s what the “UNSEEN” bit is for)

# Python 3
import imaplib, base64, re, getpass

FIELDS = ['product', 'fname', 'lname', 'company', 'email', 'phone']

def parseline(line):
    # line is the whole message body with its line breaks replaced by '||||'
    results = dict.fromkeys(FIELDS, '')
    for i in line.split('||||'):
        # the product name only shows up in the "A new X registration" sentence
        k = re.match(r"A new (?P<product>[^ ]+) registration", i)
        if k:
            results['product'] = k.group('product')
        elif i:
            # every other non-empty chunk should look like 'key: value'
            j = i.split(': ', 1)
            if len(j) == 2:
                results[j[0]] = j[1]
    return results


mail = imaplib.IMAP4_SSL('imap.gmail.com', 993)
mail.login('yourloginname', getpass.getpass())
log = open('emaillog.txt', 'w')
log.write('\t'.join(FIELDS) + '\n')

mail.select('P/package-registration')
typ, data = mail.search(None, 'UNSEEN')

for num in data[0].split():
    typ, msgdata = mail.fetch(num, '(RFC822)')

    # headers end at the first blank line; the rest is the base64-encoded body
    headers, sep, body = msgdata[0][1].partition(b'\r\n\r\n')
    # assumes the form mailer sends UTF-8 text
    text = base64.b64decode(body).decode('utf-8')

    e2 = text.replace('\r\n\r\n', '||||').replace('\r\n', '||||').replace(',||||', '||||')
    e3 = parseline(e2)
    log.write('\t'.join(e3[f] for f in FIELDS) + '\n')

log.close()
mail.close()
mail.logout()
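
If I were doing this again, I’d probably let the standard library’s email parser deal with the headers and the base64 instead of splitting on blank lines myself, and I’d have the script flag messages as read rather than doing it in my email client. Here is a rough Python 3 sketch of that idea, not what the script above does. The function names decode_body and fetch_unseen_bodies and the sample message are made up for illustration, and it assumes the form mails are single-part text and that conn is an imaplib connection that has already logged in.

```python
import base64
import email


def decode_body(raw_bytes):
    """Return the text body of a raw single-part message as a str."""
    msg = email.message_from_bytes(raw_bytes)
    if msg.is_multipart():
        # assumption: the contact-form mails are single-part; bail out otherwise
        raise ValueError('multipart messages are not handled in this sketch')
    payload = msg.get_payload(decode=True)  # undoes base64 or quoted-printable
    charset = msg.get_content_charset() or 'utf-8'  # utf-8 fallback is a guess
    return payload.decode(charset, errors='replace')


def fetch_unseen_bodies(conn, folder):
    """Decode every unseen message in folder, then flag it as read.

    conn is assumed to be an imaplib.IMAP4_SSL object that has already logged in.
    """
    conn.select(folder)
    typ, data = conn.search(None, 'UNSEEN')
    bodies = []
    for num in data[0].split():
        # BODY.PEEK[] returns the whole message without touching the \Seen flag
        typ, msgdata = conn.fetch(num, '(BODY.PEEK[])')
        bodies.append(decode_body(msgdata[0][1]))
        # flag it only after decoding worked, so failures get retried next run
        conn.store(num, '+FLAGS', '\\Seen')
    return bodies


# Offline check with a made-up message, so no server is needed to try decode_body
sample_raw = (
    b'Subject: A new Widget registration\r\n'
    b'Content-Type: text/plain; charset="utf-8"\r\n'
    b'Content-Transfer-Encoding: base64\r\n'
    b'\r\n'
    + base64.b64encode(b'fname: Ada\r\nlname: Lovelace\r\n') + b'\r\n'
)
print(decode_body(sample_raw).splitlines())
```

With a real connection this would be called as fetch_unseen_bodies(mail, 'P/package-registration') after mail.login(...). Flagging each message right after it decodes means a crash halfway through a batch leaves the remaining messages unseen for the next run.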

I’m not proud of the messy code, but at least I can justify it… I started out using a simple regex, then I realized that the fields in the email sometimes arrive in a different order, and my regex started growing big and ugly. Then the project started taking too long and I just wanted to get it done. Voilà, you get a tangled mess.
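
For what it’s worth, the field-order problem is what pushed me from one big regex to splitting each line on “key: value”. A minimal Python 3 sketch of that idea on its own, assuming one field per line: parse_fields is a hypothetical helper, the sample values are invented, and csv.DictWriter is swapped in for the hand-built tab-separated line.

```python
import csv
import io

FIELDS = ['product', 'fname', 'lname', 'company', 'email', 'phone']


def parse_fields(text):
    """Turn 'key: value' lines into a dict, whatever order they arrive in."""
    row = dict.fromkeys(FIELDS, '')
    for line in text.splitlines():
        key, sep, value = line.partition(': ')
        # skip lines without a separator and keys the spreadsheet doesn't have
        if sep and key in row:
            row[key] = value.rstrip(',')  # drop a trailing comma, if any
    return row


# the same (made-up) registration with the fields in two different orders
a = parse_fields('fname: Ada\r\nlname: Lovelace\r\nemail: ada@example.com')
b = parse_fields('email: ada@example.com\r\nlname: Lovelace\r\nfname: Ada')
print(a == b)

# write a header and one row as tab-separated text, in memory for the demo
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter='\t', lineterminator='\n')
writer.writeheader()
writer.writerow(a)
print(out.getvalue())
```

Because each field is looked up by name, the order in the email stops mattering, and the writer takes care of quoting if a value ever contains a tab.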

Now if I could just get a system in place for pasting code to my blog in a way that comes out looking pretty…
