Getting data out of IMAP or Gmail with Python
I have a contact form that has been sending me data via email. It was quick and dirty, and I didn’t expect to get nearly as many responses as I did. The data really needs to be in a spreadsheet, but I didn’t want to copy it over by hand, so I wrote some Python code to do it. Here it is. Note that:
- it is quick and dirty
- change the Gmail user name to your own; this should work on any IMAP server, so change the server name if you are not using Gmail
- change the name of the IMAP folder containing the messages - mine go into a folder called P/package-registration
- once I’ve downloaded a batch of messages, I mark them as read in my email client so that they won’t download again the next time the script runs (that’s what the “UNSEEN” bit is for)
    # Written for Python 3.
    import imaplib, base64, re, getpass

    def parseline(line):
        # The message arrives flattened into '||||'-separated pieces;
        # collect the "key: value" pairs into a dict.
        data = line.split('||||')
        results = {'product': '', 'fname': '', 'lname': '', 'company': '', 'email': '', 'phone': ''}
        for i in data:
            # The product name comes from the "A new X registration" line.
            k = re.match(r"A new (?P<product>[^ ]+) registration", i)
            if k:
                results['product'] = k.group(1)
            elif i:
                j = i.split(': ', 1)
                if len(j) == 2:
                    results[j[0]] = j[1]
        return results

    mail = imaplib.IMAP4_SSL('imap.gmail.com', 993)
    mail.login('yourloginname', getpass.getpass())
    log = open('emaillog.txt', 'w')
    log.write('product\tfname\tlname\tcompany\temail\tphone\n')
    mail.select('P/package-registration')
    typ, data = mail.search(None, 'UNSEEN')
    for num in data[0].split():
        typ, msg_data = mail.fetch(num, '(RFC822)')
        # fetch() returns bytes; decode before splitting into lines.
        body = msg_data[0][1].decode('utf-8', 'replace').split('\r\n')
        b = False
        msg = ''
        # Skip the headers: everything after the first blank line is the
        # base64-encoded body.
        for line in body:
            if not line and not b:
                b = True
            if b:
                msg += line
        email = base64.b64decode(msg).decode('utf-8', 'replace')
        e2 = email.replace('\r\n\r\n', '||||').replace('\r\n', '||||').replace(',||||', '||||')
        e3 = parseline(e2)
        try:
            log.write('%s\t%s\t%s\t%s\t%s\t%s\n' % (e3['product'], e3['fname'], e3['lname'], e3['company'], e3['email'], e3['phone']))
        except KeyError:
            print(e3)
    log.close()
    mail.close()
    mail.logout()
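The “UNSEEN” search only skips old messages if they get marked as read after each run. That step could probably be done from the script instead of the email client. Here is a minimal sketch, assuming a logged-in imaplib connection with the folder already selected; the helper name `mark_seen` is made up for illustration, while `store` is imaplib's own method for changing flags.

```python
def mark_seen(conn, message_numbers):
    """Flag each message as read so a later UNSEEN search skips it.

    conn is assumed to be a logged-in imaplib connection with a mailbox
    selected read-write; message_numbers are the ids returned by search().
    Returns the ids for which the server did not answer 'OK'.
    """
    failed = []
    for num in message_numbers:
        # '+FLAGS' adds the flag without touching the message's other flags.
        typ, _ = conn.store(num, '+FLAGS', '\\Seen')
        if typ != 'OK':
            failed.append(num)
    return failed
```

In the script this would be called as `mark_seen(mail, data[0].split())`, using the search result, once the batch has been written to the log.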
I’m not proud of the messy code, but at least I can justify it… I started out using a simple regex, then I realized that the fields in the email sometimes came in a different order, and my regex started growing big and ugly. Then the project started taking too long and I just wanted to get it done. Voilà, you get a tangled mess.
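Part of the tangle can probably be avoided by letting the standard library do the work: the `email` package can split headers from the body and undo the base64 encoding, and the `csv` module can write the tab-separated rows. This is a rough sketch rather than a drop-in replacement; the names `FIELDS`, `body_text` and `write_rows` are invented for this example, and it assumes each message carries a plain-text body.

```python
import csv
import email

# Column order for the spreadsheet (same fields as the log header).
FIELDS = ['product', 'fname', 'lname', 'company', 'email', 'phone']

def body_text(raw_bytes):
    """Return the decoded text body of a raw RFC 822 message."""
    msg = email.message_from_bytes(raw_bytes)
    # For a multipart message, use the first text/plain part.
    if msg.is_multipart():
        for part in msg.walk():
            if part.get_content_type() == 'text/plain':
                msg = part
                break
    # decode=True undoes base64 or quoted-printable transfer encoding.
    payload = msg.get_payload(decode=True)
    if payload is None:
        return ''
    charset = msg.get_content_charset() or 'utf-8'
    return payload.decode(charset, 'replace')

def write_rows(rows, out):
    """Write dicts as tab-separated lines; unknown keys are ignored,
    missing ones become empty cells."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter='\t',
                            extrasaction='ignore', restval='',
                            lineterminator='\n')
    writer.writeheader()
    writer.writerows(rows)
```

With these, the inner loop would shrink to something like `parseline(body_text(raw))` per message, where `raw` is the bytes returned by the fetch and the line separators are still replaced with `||||` first, as in the script. When `out` is a real file, the csv documentation recommends opening it with `newline=''`.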
Now if I could just get a system in place for pasting code to my blog in a way that comes out looking pretty…
Bearfruit