
Extract names and emails from a text file UNIX
I have a recurring need to extract full names and email addresses from a plaintext archive of email messages. The archive is created by selecting a bunch of emails in Mail, copying them, pasting into TextEdit, and converting to plain text.

For each message in the file, the first line contains the information I wanted:
From: Joe Example <joe@example.com>
I wanted one email address per line, suitable for pasting into another location. I am far from an expert with the bash shell, but here's what I came up with—I imagine there are many more efficient ways to do this, as I'm sure experienced perl, sed, awk, etc. users may point out. Note that this is highly dependent on the format created by Apple's Mail app in OS X 10.8.

grep 'From:' /path/to/archive.txt | cut -f2 -d\< | cut -f1 -d\> | pbcopy

The grep bit pulls out the entire From: line, then the first cut command grabs the email address and the trailing close-bracket, by setting the delimiter to an open bracket. The second cut eliminates the closing bracket, by setting that as the delimiter. The output will be one email address per line, sitting on your clipboard ready for pasting. (To debug, just remove the | pbcopy bit to see the output.)
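To see what each stage does without touching a real archive, here is a minimal sketch that runs the same grep/cut chain on a made-up two-message sample. The sample text and variable names are hypothetical, and the pattern is anchored as '^From:' (an addition, not part of the original command) so that a "From:" appearing inside a message body is not picked up.

```shell
# Hypothetical sample standing in for the plain-text archive.
sample='From: Joe Example <joe@example.com>
Subject: License request
From: Ann Other <ann@example.org>
Subject: Re: License request'

# Same grep/cut chain as above, minus pbcopy so the result is printed.
# cut -f2 -d\<  keeps everything after the first "<"
# cut -f1 -d\>  keeps everything before the first ">"
emails=$(printf '%s\n' "$sample" | grep '^From:' | cut -f2 -d\< | cut -f1 -d\>)
printf '%s\n' "$emails"
```

This should print one address per line, which is the form that gets piped to pbcopy in the real command.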

I also wanted to extract the names, and came up with a variant to do just that:

grep 'From:' ~/Desktop/testfile.txt | sed -e 's/: /:^/g' | sed -e 's/ </^</g' | cut -f2 -d^ | pbcopy

This one is messier, as names can contain one or more spaces. After getting the From: line, sed is used (twice) to add a caret delimiter immediately after From:, and immediately before the opening bracket of the email address. I then used cut, with the delimiter changed to the caret, to extract the full name (field two) from the found lines. Again, the results are copied to the clipboard; leave this bit off for debugging.
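The caret trick is easier to follow with the intermediate result printed. The sketch below is a hypothetical single-line walk-through, not the exact command from the hint: it writes the bracket as a plain < (in GNU sed, \< means "start of word" rather than a literal bracket) and drops the g flag, since each marker is only needed once per line.

```shell
# Hypothetical From: line; the name contains a space.
line='From: Joe Example <joe@example.com>'

# Stages 1 and 2: mark both ends of the name with a caret.
marked=$(printf '%s\n' "$line" | sed -e 's/: /:^/' | sed -e 's/ </^</')
printf '%s\n' "$marked"   # From:^Joe Example^<joe@example.com>

# Stage 3: field two, between the two carets, is the full name.
name=$(printf '%s\n' "$marked" | cut -f2 -d^)
printf '%s\n' "$name"     # Joe Example
```

Any character that cannot appear in a name or address would work as the marker; the caret is simply the one used here.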

With the names and addresses extracted, it's fairly easy to do other stuff with them. In my case, I'm reading them into a couple of array variables in a bash script, so I can then output a name and email address pair to consecutive locations on my multi-pasteboard. If you want to use the names in an array in a bash script, you'll want to change the array delimiter from a space to a newline:

IFS='
'

Without this, your array will get split anywhere there's a space in the name values ... or so I've heard, not that it's ever happened to me!
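As a rough illustration of the array step, here is a minimal bash sketch. The two text variables are hypothetical stand-ins for the output of the name and address pipelines, and saving and restoring IFS is an assumption about what you would want in a longer script.

```shell
#!/bin/bash
# Hypothetical extracted values, one per line, as the pipelines would print them.
names_text='Joe Example
Ann Other'
emails_text='joe@example.com
ann@example.org'

old_ifs=$IFS            # remember the default so it can be restored
IFS='
'
names=( $names_text )   # unquoted on purpose: split on newlines only
emails=( $emails_text )
IFS=$old_ifs

# Walk both arrays in step and print name/address pairs.
for i in "${!names[@]}"; do
  printf '%s <%s>\n' "${names[$i]}" "${emails[$i]}"
done
```

With the default IFS, "Joe Example" would land in two array elements and the names would no longer line up with the addresses.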

Extract names and emails from a text file
Authored by: Lri on Feb 20, '13 09:52:53AM

You could also use sed -n 's///p':

sed -En 's/^From: .*<(.*)>$/\1/p'
sed -En 's/^From: (.*) <.*>$/\1/p'

IFS affects ( $value ) but not ( "$value" ):

line=x,y,z; IFS=, cols=( $line ); echo "${cols[1]}" # y
line="x y z"; cols=( "$line" ); echo "${cols[0]}" # x y z


Extract names and emails from a text file
Authored by: arcticmac on Feb 20, '13 10:08:46AM
You want to be careful because both
From: User Name <email@site.com>
and
From: email@site.com
are valid ways to address an email, where the second form doesn't include the user's name. In a little bit I'll throw together an example of how to deal with this in perl or maybe sed...
Edited on Feb 20, '13 10:10:27AM by arcticmac
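One possible way to cover both forms with sed is sketched below. This is only an illustration under simple assumptions (one address per From: line, no nested brackets, made-up sample data), not a complete header parser.

```shell
# Hypothetical sample with both header forms plus an unrelated line.
sample='From: User Name <user@site.com>
From: other@site.com
Subject: not an address line'

# Two substitutions, tried in order on every line:
#   1. "From: anything <addr>"  -> addr
#   2. "From: addr" (no brackets) -> addr
# After the first one matches, the line no longer starts with "From: ",
# so the second cannot fire on the same line.
addresses=$(printf '%s\n' "$sample" | sed -En \
  -e 's/^From: .*<([^<>]*)>[[:space:]]*$/\1/p' \
  -e 's/^From: *([^[:space:]<>]+)[[:space:]]*$/\1/p')
printf '%s\n' "$addresses"
```

Lines that match neither pattern are dropped, because -n suppresses the default output and only successful substitutions are printed.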


Extract names and emails from a text file
Authored by: Supp0rtLinux on Feb 20, '13 01:37:32PM

While this works on a Mac, it's far from a Mac hint. Technically, this is just UNIX command line work. Any of the O'Reilly Press books of UNIX Power Tools, Learning Sed and Awk, Learning vi, etc. cover this type of stuff. I guess this is a good example of one use case for a Mac… and of course it's a power you don't get on a PC without something like Cygwin or MinGW installed, but a "Mac OS X Hint" it is not. grep, sed, awk, & cut are all common UNIX/Linux command line text formatting tools, not Mac specific. But thanks for sharing...



Extract names and emails from a text file
Authored by: Supp0rtLinux on Feb 20, '13 01:40:41PM

Of course, on the flip side it just dawned on me that you posted a way to harvest email addresses. Hopefully you aren't planning to spam anybody with your newly found command line-fu. If you do… be sure the gods will find you and the end result will be worse than anything you can ever imagine. ;)



Extract names and emails from a text file
Authored by: robg on Feb 21, '13 05:30:44AM

No spamming going on, just something I have to do for our small company to send licenses to users from time to time.

-rob.

---
Now: at Many Tricks, maker of useful apps
http://manytricks.com. Previously: founded this site.



Extract names and emails from a text file
Authored by: kirkmc on Feb 21, '13 12:24:17AM

Right, I'll just go delete all the command line hints on the site, then, since they're not Mac-specific...

---
Mac OS X Hints editor - Macworld senior contributor
http://www.mcelhearn.com



Extract names and emails from a text file
Authored by: robg on Feb 21, '13 05:47:18AM

When I founded macosxhints back in November of 2000, one of the key reasons for the site's existence was for Mac users to have a place to learn about everything new in OS X, including the Unix underbelly. The Unix category has been here since day one, and has been (in my eyes) an important feature of the site: it's a way for Mac users to gain Unix knowledge without having to dive into a full-blown Unix book, or to visit a Unix-centric site (which may contain information not relevant to the version of Unix in OS X).

If you argue this doesn't belong here as related to OS X, then it also seems valid that Applications should go away, because those programs aren't directly related to OS X either -- they merely run on top of the OS, they're not part of it (I guess hints on Apple's bundled apps would remain).

In short, Mac OS X Hints was always about getting the most out of all aspects of OS X, including the Unix side.

-rob.




Extract names and emails from a text file
Authored by: Supp0rtLinux on Feb 21, '13 07:36:44AM

Actually, I wasn't complaining that a UNIX hint was posted here… OS X is, after all, UNIX at its core. And I've personally enjoyed many of the Mac-specific UNIX command line tools… I use some to grep out my network status to feed it into GeekTool, for example. I was just pointing out that this specific hint was not Mac specific at all… just UNIX specific. It could have been run unaltered on most Solaris, AIX, or Linux systems, for example. Wasn't trying to flame anyone or piss anyone off. Just making a comment… hence why the prompt is to "Post a Comment" :)



Extract names and emails from a text file
Authored by: trueger on Feb 20, '13 02:27:17PM

I recall from O'Reilly's "Mastering Regular Expressions" that correctly matching any RFC-compliant email address with a regexp was highly nontrivial. The example expression in the back of the book expanded to over 1k characters. (Awesome book, by the way.)
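For a list pulled from your own mail, a deliberately loose sanity check is often enough. The sketch below is an assumption-laden illustration, not an RFC-compliant matcher: it only requires something@something.something with no spaces or angle brackets, so it will reject some valid addresses (for example ones without a dot in the domain) and accept some invalid ones. The candidate list is made up.

```shell
# Hypothetical candidate list, one entry per line.
candidates='joe@example.com
not-an-address
ann@example.org
broken@'

# Loose filter: local part, "@", domain containing at least one dot.
plausible=$(printf '%s\n' "$candidates" |
  grep -E '^[^[:space:]<>@]+@[^[:space:]<>@]+\.[^[:space:]<>@]+$')
printf '%s\n' "$plausible"
```

Anything this filter drops is worth a manual look rather than silent deletion.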



Extract names and emails from a text file
Authored by: timmacman on Feb 21, '13 04:53:22AM

A side note: if you use a clear plastic iPhone case such as mine, and lay it LED side down on a desk, the case will flash with light, making a nice, noticeable silent alert.



Wrong hint
Authored by: SeanAhern on Feb 25, '13 08:45:32AM

Uh, wrong hint.



Extract names and emails from a text file
Authored by: gopes on Feb 23, '13 01:15:53PM
Not for nothing, but you can do this with AppleScript as well.
set theAddress to "Tim Cook <tim@apple.com>"
tell the application "Mail" to extract address from theAddress -- returns "tim@apple.com"
tell the application "Mail" to extract name from theAddress -- returns "Tim Cook"
…Or on the command line:
osascript -e "tell the application \"Mail\" to extract address from \"Tim Cook <tim@apple.com>\""
That being said, I appreciate robg's original hint, as I use those command line tools from AppleScript as well, and it's always good to learn :)
