
I have a problem with a mail file. I need to extract all messages exchanged between two people: [email protected] and [email protected].

The file:

From: [email protected]
to: [email protected]
<body of the message1>

From: [email protected]
to: [email protected]
<body of the message1>

From: [email protected]
to: [email protected]
<body of the message1>

From: [email protected]
to: [email protected]
<body of the message1>

From: [email protected]
to: [email protected]
<body of the message1>

I tried to use the following sed:

sed -n "/From: [Ss]omebody1/,/From: /p" inputfile > test.txt

As a result, I got all mails from somebody1 in test.txt.

Question: how should the sed command be structured to get only the mails between somebody1 and person?

chaos
wtk

2 Answers


With sed:

sed -n '/^From: [email protected]/{h;n;/^to: [email protected]/{H;g;p;:x;n;p;s/.//;tx}}' file

  • /^From: [email protected]/: first, search for the From: email address.
    • h; store that line in the hold space.
    • n; load the next line (the to: line).
  • /^to: [email protected]/: search for the to: email address.
    • H; append that line to the hold space.
    • g; copy the hold space to the pattern space.
    • p; print the pattern space.
    • :x; set a label called x.
    • n; load the next line (the email body).
    • p; print that line.
    • s/.//; do a substitution in that line (just delete one character), so that...
    • tx; ... the t command can check whether that substitution succeeded. It succeeds whenever the line is not empty; the empty line marks the end of the email body. If it succeeded, jump back to the label x and repeat until an empty line appears; if not, fall through to the end of the script.

The output:

From: [email protected]
to: [email protected]
<body of the message1>

From: [email protected]
to: [email protected]
<body of the message1>
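
For readers who find the one-liner dense, here is a sketch of the same script written one command per line and run on a small sample. The example.com addresses and the bodies are placeholders made up for illustration, and the dots in the addresses are escaped so they match literally.

```shell
# Sample mailbox; the example.com addresses are illustrative placeholders.
mbox=$(mktemp)
cat > "$mbox" <<'EOF'
From: somebody1@example.com
to: person1@example.com
body A

From: somebody1@example.com
to: person2@example.com
body B

From: person1@example.com
to: somebody1@example.com
body C

From: somebody1@example.com
to: person1@example.com
body D
EOF

result=$(sed -n '
# Sender line found: keep it in the hold space and fetch the next line.
/^From: somebody1@example\.com/ {
    h
    n
    # Recipient matches: print both header lines, then the body.
    /^to: person1@example\.com/ {
        H
        g
        p
        :x
        n
        p
        # Deleting one character succeeds only on a non-empty line,
        # so the loop ends at the first empty line.
        s/.//
        tx
    }
}' "$mbox")

printf '%s\n' "$result"
rm -f "$mbox"
```

On this sample only the two messages from somebody1 to person1 (bodies A and D) are printed. As in the one-liner, the loop stops at the first empty line, so a body that itself contains blank lines would be cut short.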
chaos
  • The output would probably be cleaner without the first `p;`. That avoids a list of isolated `From: [email protected]` matches that are not followed by the second person's match and the body of the letter. – Hastur Oct 13 '15 at 12:58
  • @Hastur Good hint. I corrected it; now it no longer prints isolated matches. – chaos Oct 13 '15 at 13:11
  • Thanks a lot for that. I would like to ask another question: what I should get in return is the whole message body (which may contain newline characters) up to the next occurrence of "From:". Right now I get more info, but it's not enough. Example output: From: [email protected] To: [email protected] Date: Mon, 06 Jul 2015 17:41:03 GMT Subject: *************** Content-type: ********************************* X-Scanned-By: ********************** and no body after it. – wtk Oct 13 '15 at 13:31
  • Look in your file at the point where your chunk stops: _probably_ you will find the keyword `From: [email protected]` there again... You have to select a different, unique key that does not occur again in the body of your message. The same applies to the `awk` answer. Give it a try too. – Hastur Oct 13 '15 at 15:09
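
One possible way to handle the follow-up in the comments (a body that contains blank lines and should run until the next "From:") is to let the next line starting with "From: " end a message, instead of the first empty line. The following is only a sketch under assumptions: the example.com addresses, the variable names `from`, `to`, `held` and `printing`, and the sample text are all made up for illustration, and the recipient line is assumed to come directly after the From: line.

```shell
# Sample mailbox with blank lines inside the bodies (placeholder data).
mbox=$(mktemp)
cat > "$mbox" <<'EOF'
From: somebody1@example.com
To: person1@example.com
Subject: hello

first paragraph

second paragraph
From: somebody1@example.com
To: person2@example.com

not wanted
From: somebody1@example.com
To: person1@example.com

wanted again
EOF

thread=$(awk -v from='somebody1@example.com' -v to='person1@example.com' '
    # A line starting with "From: " opens a new message: stop printing and
    # remember the line only if it names the wanted sender.
    /^From: / {
        printing = 0
        held = (index($0, from) > 0) ? $0 : ""
        next
    }
    # The line right after a remembered From: line decides the message.
    # tolower() accepts both "to:" and "To:".
    held != "" {
        if (tolower($0) ~ /^to: / && index($0, to) > 0) {
            print held
            printing = 1
        }
        held = ""
    }
    # Print everything (blank lines included) until the next From: line.
    printing
' "$mbox")

printf '%s\n' "$thread"
rm -f "$mbox"
```

The addresses are compared with `index()`, so they are treated as literal strings and not as regular expressions. The caveat from the comment above still applies: a body line that itself begins with "From: " would end the message early.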

With awk:

awk '/From: [Ss]omebody1/{flag=1;next}
  /to: person1/ {if (flag>0) {flag=2; print; next} else {flag=0; next}}
  /From/{flag=0} {if (flag==2){print NR, $0}} ' input.txt

  • /From: [Ss]omebody1/{flag=1;next} On a match, set the flag variable to 1 and skip the line.
  • /to: person1/ If the flag is already set (greater than 0), set it to 2 and print the line; otherwise reset it to 0.
  • /From/{flag=0} On a match, reset the flag to 0.
  • {if (flag==2){print NR, $0}} If the flag is 2, print the line number and the line.

Change the value of person1 to have different matches.
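
As a sketch of how the recipient could be changed without editing the script, the name can be passed in with `-v`. The function `pick_recipient`, the variable `who` and the example.com sample data are hypothetical names introduced here for illustration; the flag logic is the same as above.

```shell
# Two-message sample; addresses and bodies are placeholders.
mbox=$(mktemp)
cat > "$mbox" <<'EOF'
From: somebody1@example.com
to: person1@example.com
body one

From: somebody1@example.com
to: person2@example.com
body two
EOF

# pick_recipient (hypothetical helper): $1 = recipient name, $2 = mail file.
pick_recipient() {
    awk -v who="$1" '
        /From: [Ss]omebody1/ { flag = 1; next }
        # index() looks for the literal text "to: <who>" in the line.
        index($0, "to: " who) {
            if (flag > 0) { flag = 2; print } else { flag = 0 }
            next
        }
        /From/ { flag = 0 }
        flag == 2 { print NR, $0 }
    ' "$2"
}

for_person1=$(pick_recipient person1 "$mbox")
for_person2=$(pick_recipient person2 "$mbox")

printf '%s\n' "$for_person2"
rm -f "$mbox"
```

Using `index()` instead of a regular expression means the recipient is matched as plain text, so characters such as `.` in an address need no escaping.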

Input file used

From: [email protected]
to: [email protected]
<body of the message1>

From: [email protected]
to: [email protected]
<body of the message2>

From: [email protected]
to: [email protected]
<body of the message3>

From: [email protected]
to: [email protected]
<body of the message4>

From: [email protected]
to: [email protected]
<body of the message5>
Hastur