[SATLUG] I've newly moved to Ubuntu 11.10 and a few SED & bash questions

Bruce Dubbs bruce.dubbs at gmail.com
Sat Dec 10 19:09:58 CST 2011


John-Eric wrote:

> I would like to 
> make a single pass through the file, reading each line only once. I have 
> placed each substitution on a separate line for easier reading, but is 
> there a way to span -e over several lines? 

Your script is a little hard to read.  Try to keep the lines under
80 characters; a trailing backslash continues the command on the next
line, so each -e can sit on its own line.

#!/bin/bash
# This script trims the subject line of Thunderbird folder files to make
# duplicates easier to find. It would be ideal if this script only
# evaluated lines that start with "Subject:"

# Syntax= sed -e 's/find1/sub1/' -e 's/find2/sub2/g' <old >new
# Syntax= sed '/^Subject:/s/foo/bar/g' #Replace "foo" with "bar" only
#   on lines that begin with "Subject:"

# input1= Subject: [Group One] (OT)
#   "This is the text." [Archivo Adjunto 1]

# input2= Subject: {Message: 1234)
#    @##-Some Different Text##@ [Archivos Adjuntos 78]


#Remove anything inside any mixed pair of [,{,( and ),},]
# if it follows "Subject:"

#Remove "fsrv" anywhere on the line

#Remove "[Archivo Adjunto 1]" and "[Archivos Adjuntos 78]"
# and surrounding whitespace anywhere on the line

#Allow only alphanumeric characters and spaces

#Compress white space


sed -e '/^subject:/s/^subject: *[\[\{\(](.*?)[\)\}\]] */Subject: /g' \
     -e '/^subject:/s/ fsrv//g'                                       \
     -e '/^subject:/s/ \[fsc\]//g'                                    \
     -e '/^subject:/s/ *\[Archivos? adjuntos? *\d+\] *//g'            \
     -e '/^subject:/s/\(\s-\|[^A-Za-z0-9-]\)/ /g'                     \
     -e '/^subject:/s/  +/ /g'                                        \
   <'/media/RAID-S/Thunderbird/Mail/Local Folders/InFile'             \
   >'/media/RAID-S/Thunderbird/Mail/Local Folders/OutFile'
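
On the question of spanning -e over several lines: besides backslash
continuation, one possible alternative is a single quoted sed script with
the commands grouped in braces, so the "Subject:" test is written once.
What follows is only a sketch, not the original script. It assumes GNU sed
(for -r), shows just three of the substitutions, and the function name
trim_subjects and the sample input are made up for illustration.

```shell
#!/bin/bash
# Sketch only: assumes GNU sed. trim_subjects is a hypothetical helper name.
trim_subjects() {
    sed -r '
        # Everything inside the braces runs only on lines starting with "Subject:"
        /^Subject:/ {
            s/ fsrv//g
            s/ \[fsc\]//g
            s/  +/ /g
        }
    '
}

# Made-up sample input: one Subject line and one body line
result=$(printf '%s\n' 'Subject: hello fsrv   world [fsc]' 'body  fsrv text' |
    trim_subjects)
printf '%s\n' "$result"
```

The body line passes through unchanged because it never enters the braces,
and the file is still read in a single pass.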

--------

1.  You can probably remove some of the backslashes with the -r
(extended regular expressions) option.  Check 'man 7 regex' as well as
'man sed'.

2.  You probably don't need /g on every line.  The first substitution,
for instance, is anchored with ^ and can match only once per line.

3.  Does the TBird path really have a space in it
('Local Folders')?

4.  I'd debug this with a very simple input file that is a subset of the 
whole file, checking one line at a time.  For instance, the 4th line 
could probably be made a bit more explicit with:

sed -r -e '/^subject:/s/ *\[Archivos.*adjuntos.*[[:digit:]]+\] *//'
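
A small test bed along these lines might look like the following.  It is
a sketch under a few assumptions: GNU sed (for -r and the I flag, which
makes a match case-insensitive), the two sample subjects from the script
comments joined onto single lines, an extra made-up body line, and a
made-up variable name, sample.  Note that a POSIX class has to sit inside
a bracket expression, [[:digit:]], and that the samples say "Subject:" and
"Adjunto" with capitals, hence the I flags.

```shell
#!/bin/bash
# Sketch only: assumes GNU sed (-r, and the I flag for case-insensitive matching).
# The first two sample lines are the inputs from the script comments, each on
# one line; the third is a made-up body line that should not be changed.
sample='Subject: [Group One] (OT) "This is the text." [Archivo Adjunto 1]
Subject: {Message: 1234) @##-Some Different Text##@ [Archivos Adjuntos 78]
Body line mentioning [Archivo Adjunto 1] stays untouched'

# Test one substitution at a time; here, only the attachment-marker removal.
result=$(printf '%s\n' "$sample" |
    sed -r -e '/^subject:/I s/ *\[Archivos? adjuntos? *[[:digit:]]+\] *//I')
printf '%s\n' "$result"
```

Once one substitution behaves on the sample, the next -e can be added and
checked the same way.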

   -- Bruce

