
A script to clean up Mozilla's mailbox files UNIX
After several years of using Mozilla Mail on Linux, I recently switched to the Mac and decided to switch to Apple's Mail.app, as well. Mail.app has most of the functionality of Mozilla Mail, but it has a more polished feel (although its junk mail filtering doesn't seem nearly as good as Mozilla's).

Mail.app and Mozilla Mail both use the standard Unix mbox format, so I assumed that moving my mail archive would be trivial. Mail.app even has a Mozilla option for importing mailboxes! Unfortunately, I had very little luck importing: Mail.app wouldn't even detect some of my mailboxes as valid mbox files, and many of those that it could detect were only imported partially. (A mailbox containing 750 messages in Mozilla Mail would show up in Mail.app with only 12, for example.)

After lots of investigation, I discovered that the mbox files that Mozilla Mail had created were very inconsistent. Message boundaries had different forms, lines would end with different characters (sometimes carriage return, other times carriage return plus linefeed), etc. In a word, the files had simply gotten dirty, perhaps because my archive had been growing through several versions of Mozilla Mail, each of which had slightly different algorithms for reading and writing mbox files. As a result, Mail.app's inflexible mbox importer couldn't handle my mbox files.
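One way to check whether a mailbox file has the mixed line endings described above is to count each style directly. The following is only a minimal sketch and is not taken from the clean_moz script; the function name `count_line_endings` and the sample bytes are made up for illustration.

```python
import re

def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare CR, and bare LF line endings in raw bytes."""
    crlf = len(re.findall(rb"\r\n", data))
    # A CR that is not followed by LF (old Mac style).
    bare_cr = len(re.findall(rb"\r(?!\n)", data))
    # An LF that is not preceded by CR (Unix style).
    bare_lf = len(re.findall(rb"(?<!\r)\n", data))
    return {"crlf": crlf, "cr": bare_cr, "lf": bare_lf}

# Hypothetical sample containing one line ending of each style.
sample = b"From a@b\r\nSubject: x\rBody\n"
print(count_line_endings(sample))
```

For a real mailbox, the bytes could be read with `open(path, "rb").read()`. More than one non-zero count suggests the file mixes conventions.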

I first tried manually correcting the files by loading them into TextEdit, but I soon realized that this would require days of effort. Instead, I wrote a Python script that cleaned up the Mozilla Mail mbox files and allowed every one to be imported into Mail.app. I used the script successfully on over 10,000 emails scattered throughout several dozen mailboxes.

I am posting the script here in the hope that someone may find it useful in solving the same problem with Mail.app that I had. Although I have not tested it with other mbox-based email clients such as Mozilla Thunderbird, it should be of use for them, as well.

[robg adds: Due to the length of the script, I uploaded it as a separate text file: clean_moz.txt.]
[28,148 views]  

A script to clean up Mozilla's mailbox files | 11 comments
A script to clean up Mozilla's mailbox files
Authored by: kidventus on Jun 13, '04 11:39:43PM
"In a word, the files had simply gotten dirty, perhaps because my archive had been growing through several versions of Mozilla Mail, each of which had slightly different algorithms for reading and writing mbox files. As a result, Mail.app's inflexible mbox importer couldn't handle my mbox files."

Sounds like someone is picking on Mail.app unfairly. Inflexible mbox importing? Sunshine, if Mozilla couldn't keep their mbox format the same, it's not Apple's fault. Don't call Mail.app inflexible; call Mozilla poor quality.

If you like people not adhering to standards... go back to Mozilla. If you're done with having to do all this Python conversion crap due to open source sloppiness, welcome to the world of Mac OS X.

A script to clean up Mozilla's mailbox files
Authored by: vocaro on Jun 28, '04 04:05:03PM

"Sounds like someone is picking on Mail.app unfairly. Inflexible mbox importing? Sunshine, if Mozilla couldn't keep their mbox format the same, it's not Apple's fault. Don't call Mail.app inflexible; call Mozilla poor quality."

Mail.app is inflexible when it comes to importing mbox files, and it does not fully support the mbox standard. See this post for more details.

"If you like people not adhering to standards... go back to Mozilla."

Actually, Mozilla is one of the most standards-compliant browsers available. The mbox problem I refer to is simply a matter of excess whitespace that Mail.app cannot handle.

"If you're done with having to do all this Python conversion crap due to open source sloppiness, welcome to the world of Mac OS X."

If you are a representative of the Mac community, then I don't want to have any part of it. I have never seen such a rude and hateful response to a macosxhints post!



A script to clean up Mozilla's mailbox files
Authored by: rgillig on Sep 08, '05 03:30:46PM

The link to the apple article is not working. Any chance you could find it again?
Thanks,
R-



A script to clean up Mozilla's mailbox files
Authored by: norrix on Jun 21, '04 07:44:50AM

Thanks a lot, this script is perfect.
The bug isn't in Mozilla; it's in Mail.app, as you can read here:
http://discussions.info.apple.com/webx?50@6.05Lea0L4p3a.2@.6893cb28

Greetings

NoRriX



A script to clean up Mozilla's mailbox files
Authored by: norrix on Jun 23, '04 06:45:49AM

The script does not repair all e-mails. If anyone could help, I would be very happy.



A script to clean up Mozilla's mailbox files
Authored by: PeaceFreak on May 29, '05 03:17:07AM

Hi,

This sounds like the answer to the problems I have been having, however... How do I run the script? What do I do with it?

Could someone give me the basics?

Thanks!



A script to clean up Mozilla's mailbox files
Authored by: PeaceFreak on Jun 08, '05 08:51:54AM

I have tried this script on my Mozilla folder and unfortunately it did not work for me, which is very disappointing! Some 'cleaned' folders are unchanged from their earlier import count... Would really love to see this fixed!



A script to clean up Mozilla's mailbox files
Authored by: PeaceFreak on Jun 18, '05 05:34:06PM

What follows below finally worked for me!!! I copied it from a thread in the Apple Users Forums:

Hi all,

With the help of somebody having these problems (since I have not used Mozilla, I didn't have a testbed myself), I took a look at the problem and came up with a solution.

1. The problem:
Mozilla does not seem to use mbox files according to the relevant RFCs: some older versions of Mozilla seem to have used Mac end-of-line (EOL) characters instead of the *nix ones (i.e., char(13) instead of char(10)), and some mailbox files might even contain a mixture of the two.
Mozilla stores the message status flags, as well as whether a message has been deleted, in two additional headers (X-Mozilla-Status and X-Mozilla-Status2). Mail's import ignores those headers, so all messages import as unread. Additionally, Mail might also import messages which had already been deleted in Mozilla.
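To illustrate how the X-Mozilla-Status header carries these flags, here is a small Python sketch that decodes the four-digit hexadecimal value. The bit values (0x0001 read, 0x0002 replied, 0x0008 deleted, 0x1000 forwarded) are the ones noted in the comments of the AppleScript posted later in this thread; the function name `decode_mozilla_status` is hypothetical, and other bits that Mozilla may define are ignored here.

```python
# Bit masks as given in the AppleScript comments in this thread.
FLAGS = {
    "read": 0x0001,
    "replied": 0x0002,
    "deleted": 0x0008,
    "forwarded": 0x1000,
}

def decode_mozilla_status(value: str) -> dict:
    """Decode a hex X-Mozilla-Status value such as '0009' into named flags."""
    bits = int(value.strip(), 16)
    return {name: bool(bits & mask) for name, mask in FLAGS.items()}

# Hypothetical header value: 0x0009 has the read and deleted bits set.
print(decode_mozilla_status("0009"))
```

Working on the whole number rather than on single characters avoids listing every hex digit that contains a given bit.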

2. The solution:
First, you will have to change all EOLs in all mailboxes to *nix style - I created an AppleScript which takes care of that (in order not to hit the maximum post length here, I will post the script in a separate message).
Second, you use Apple's import function to import the mailboxes (you can use "Other" or "Mozilla" for the import)
Third, you run another AppleScript (an improved version of the one I posted above) which adjusts all the message flags and deletes messages in Mail which should not have been imported
Fourth, you let Apple know that they should fix this problem (and, by linking here, tell them how to fix it) :-) http://www.apple.com/macosx/feedback/

Andreas

Here is the script to change all EOL characters to *nix style.

NOTE: the script changes the files you drop onto it - if you plan on using the mailboxes in Mozilla again, you might want to work on a copy of the files instead of the originals.

Select all the text inside the gray box and create a new AppleScript from it (use the "Services > Apple Script > Make New AppleScript" menu entry in the application menu, which currently most probably says "Safari"):

-- Run by double-clicking: ask for a folder and convert everything inside it.
on run
    set theFolder to choose folder with prompt "Please choose the folder containing your mailbox files"
    processObject(theFolder)
    display dialog "Conversion finished" buttons {"OK"} default button 1
end run

-- Run by dropping files and/or folders onto the script's icon.
on open theObjects
    repeat with theObject in theObjects
        processObject(theObject)
    end repeat
    display dialog "Conversion finished" buttons {"OK"} default button 1
end open

on processObject(theObject)
    set theInfo to info for theObject
    -- Test the "folder" property of the info record, not the record itself.
    if folder of theInfo then
        -- Recurse into every item of the folder.
        set theContents to list folder theObject
        repeat with theItem in theContents
            set thePath to ((theObject as string) & theItem)
            processObject(thePath as alias)
        end repeat
    else
        set thePath to POSIX path of theObject
        -- Convert CRLF first, then any remaining bare CR, to LF (edits the file in place).
        do shell script "perl -pi -e 's/\\r\\n/\\n/g' " & quoted form of thePath
        do shell script "perl -pi -e 's/\\r/\\n/g' " & quoted form of thePath
    end if
end processObject


Save the script as an application (using the drop-down item in the save dialog).

You can now drop files and/or folders onto the script's icon (or run the script by double-clicking and select a folder to convert all files therein). The script will change the EOL character for all files inside this folder (recursively).
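For anyone who would rather not use AppleScript, the same line-ending conversion can be sketched in Python. This is an illustrative equivalent, not part of the scripts posted in this thread: the names `normalize_eols` and `normalize_file` and the ".unix" suffix are assumptions. Unlike the perl commands above, it writes a copy instead of changing the file in place.

```python
from pathlib import Path

def normalize_eols(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    # Replace CRLF first so that each pair becomes one LF, not two.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

def normalize_file(source: str, suffix: str = ".unix") -> Path:
    """Write a normalized copy next to the source file and return its path."""
    src = Path(source)
    dest = src.with_name(src.name + suffix)
    # Reads the whole mailbox into memory, which is fine for moderate sizes.
    dest.write_bytes(normalize_eols(src.read_bytes()))
    return dest

print(normalize_eols(b"From a@b\r\nSubject: x\rBody\n"))
```

A call such as `normalize_file("Inbox")` would leave the original untouched and produce `Inbox.unix`, which can then be handed to Mail's importer.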

Here is the script which adjusts all message status flags and deletes messages in Mail which had been deleted in Mozilla.

Select all the text inside the gray box and create a new AppleScript from it (use the "Services > Apple Script > Make New AppleScript" menu entry in the application menu, which currently most probably says "Safari"):

tell application "Mail"
    set theMsgs to (get selection)
    repeat with eachMsg in theMsgs
        try
            set MozHeader to content of header "X-Mozilla-Status" of eachMsg
            if (character 4 of MozHeader) is in {"8", "9", "A", "B", "C", "D", "E", "F"} then
                set deleted status of eachMsg to true -- 0x0008
            else
                if (character 4 of MozHeader) is in {"1", "3", "5", "7"} then
                    set read status of eachMsg to true -- 0x0001
                else
                    set read status of eachMsg to false
                end if
                if (character 4 of MozHeader) is in {"2", "3", "6", "7"} then
                    set was replied to of eachMsg to true -- 0x0002
                else
                    set was replied to of eachMsg to false
                end if
                if (character 1 of MozHeader) is in {"1", "3", "5", "7", "9", "B", "D", "F"} then set was forwarded of eachMsg to true -- 0x1000
            end if
        end try
    end repeat
end tell



You can run this script either directly from the Script Editor (make sure not to show the "Event Log" in the bottom pane since this will significantly slow down its operation) or save it as an application and run it by double-clicking.

Select all the messages in your imported mailboxes (you can select multiple mailboxes in the mailbox list, and all messages from those mailboxes will be shown in the message list; click in the message list and choose "Select All") and run the script. When it is finished, all your messages should have the proper message status, and you shouldn't see any messages you deleted in Mozilla.

Andreas
A possible future version of my freeware Eudora Mailbox Cleaner should be able to handle this importing problem. Check it out... <http://homepage.mac.com/aamann/Eudora_Mailbox_Cleaner.html>



A script to clean up Mozilla's mailbox files
Authored by: schalliol on Oct 22, '07 09:15:38PM

These AppleScripts are great. Thanks!



A script to clean up Mozilla's mailbox files
Authored by: vocaro on May 15, '08 12:51:01PM

To run the clean_moz.txt script:

  1. Right-click on its link and select the option to download it.
  2. Launch the Terminal and type:
    python SCRIPT MBOX
    where SCRIPT is the clean_moz script you downloaded and MBOX is an mbox file you want to clean. For example:
    python ~/Downloads/clean_moz.txt.py ~/Desktop/inbox.mbox


A script to clean up Mozilla's mailbox files
Authored by: autumn_oaks on Mar 10, '10 01:10:21PM

For a hint posted in 2004 to have now exceeded 20,000 hits, it must be pretty good!

With regard to your comment that Thunderbird had not been tested, I just tested your script on a small Thunderbird (09/2009, vers 3.0?, I don't remember) mbox file:

localhost:~ Me$ python ~/Desktop/clean_moz.py ~/Desktop/Mails.mbox
Reading messages from '/Users/Me/Desktop/Mails.mbox'
Processing 101 messages... done.
Wrote to '/Users/Me/Desktop/Mails.mbox.cleaned'
Cleaned 1 mailbox(es).

Under Import in Mail (Version 2.1.3 (753.1)), it reads the 101 messages, but appends them as a single message in Mail. Looking at the single message, Mail displays a blank line between each individual message entry.

I am not very knowledgeable about Mail's message structure. Any hints on how to change your script so that the result in Mail contains 101 individual messages? Thanks...
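One way to narrow down a problem like this is to count the messages in the cleaned file with an independent mbox parser. The sketch below uses Python's standard `mailbox` module, which starts a new message at each line beginning with "From ". It does not reproduce Mail's importer, so it can only hint at whether the separator lines are intact; the helper name `count_messages` and the two-message sample are made up for illustration.

```python
import mailbox
import os
import tempfile

def count_messages(path: str) -> int:
    """Return the number of messages Python's mbox parser finds in a file."""
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()

# Hypothetical two-message mailbox with Unix line endings.
sample = (
    "From alice@example.com Mon Jan  1 00:00:00 2007\n"
    "Subject: first\n"
    "\n"
    "Hello\n"
    "\n"
    "From bob@example.com Mon Jan  1 00:01:00 2007\n"
    "Subject: second\n"
    "\n"
    "World\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".mbox", delete=False, newline="\n") as handle:
    handle.write(sample)

found = count_messages(handle.name)
os.remove(handle.name)
print(found)
```

If this reports 101 for `Mails.mbox.cleaned` while Mail still shows a single message, the "From " lines are probably present and the cause lies elsewhere, for example in the exact form of the separator lines that Mail expects.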


