Sed/Awk save text between patterns if contains string

Question

I'm facing an issue with mails. I need to get all messages between 2 people: [email protected] and [email protected].

The file:

From: [email protected]
to: [email protected]
<body of the message1>

From: [email protected]
to: [email protected]
<body of the message1>

From: [email protected]
to: [email protected]
<body of the message1>

From: [email protected]
to: [email protected]
<body of the message1>

From: [email protected]
to: [email protected]
<body of the message1>

I tried to use the following sed:

sed -n "/From: [Ss]omebody1/,/From: /p" inputfile > test.txt

As a result, test.txt contains all mails from somebody1.

The question is: how should the sed command be structured to get only the mails between somebody1 and person?

chaos · Answer 1 · 2015-10-13T13:10:44.220

1

With sed:

sed -n '/^From: [email protected]/{h;n;/^to: [email protected]/{H;g;p;:x;n;p;s/.//;tx}}' file

/^From: [email protected]/: first search for the From: email address.
- h; store that line in the hold space.
- n; load the next line (the to: line).
/^to: [email protected]/: then check that line for the to: email address.
- H; append that line to the hold space.
- g; copy the hold space to the pattern space.
- p; print the pattern space (both header lines).
- :x; set a label called x.
- n; load the next line (the email body).
- p; print that line.
- s/.//; try a substitution on that line (it just deletes one character) ...
- tx; ... so that the t command can check whether the substitution succeeded, which it does on any non-empty line. If it did, jump back to the label x and repeat. Once an empty line appears (the end of the email body), the substitution fails and the script ends for this cycle.
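
The steps above can be tried on a small sample. This is only a minimal sketch: the example.com addresses are placeholders (not the addresses from the question), and the script is written one command per line so that it does not depend on how a particular sed handles `;` or `}` after a label.

```shell
#!/bin/sh
# Placeholder mailbox; the example.com addresses are assumptions for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
From: somebody1@example.com
to: person1@example.com
body A

From: somebody1@example.com
to: person2@example.com
body B

From: somebody1@example.com
to: person1@example.com
body C
EOF

# Same commands as the one-liner, one per line.
result=$(sed -n '/^From: somebody1@example\.com/{
h
n
/^to: person1@example\.com/{
H
g
p
:x
n
p
s/.//
tx
}
}' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

On this sample only the first and third messages are printed. Note that the loop stops at the first empty line, so a body containing blank lines would be cut short.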

The output:

From: [email protected]
to: [email protected]
<body of the message1>

From: [email protected]
to: [email protected]
<body of the message1>

edited Oct 13 '15 at 13:10

answered Oct 13 '15 at 12:25


You can probably get cleaner output without the first `p;`. That avoids printing isolated `From: [email protected]` matches that are not followed by the second person's match and the body of the message. – Hastur Oct 13 '15 at 12:58
@Hastur Good hint, I corrected it; now it's not printing isolated matches anymore. – chaos Oct 13 '15 at 13:11
Thanks a lot for that. I would like to ask another question: what I should get back is the whole message body (which may contain newline characters) up to the next occurrence of "From:". Right now I get more info, but it's not enough. Example output: From: [email protected] To: [email protected] Date: Mon, 06 Jul 2015 17:41:03 GMT Subject: *************** Content-type: ********************************* X-Scanned-By: ********************** and no body after it. – wtk Oct 13 '15 at 13:31
Search your file for the point at which your chunk stops; _probably_ you will find the keyword `From: [email protected]` there again... You have to choose a different, unique key that does not occur again in the body of your message. The same applies to the `awk` answer. Give it a try too. – Hastur Oct 13 '15 at 15:09
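
For the follow-up about bodies that contain blank lines, one option is to treat everything from one `From:` line up to the next as a single message and decide at the end of each message whether to print it. The following is a rough sketch, not a drop-in fix. It assumes that `From: ` at the start of a line occurs only as the first header of a message (so the caveat in the comment above still applies), that the recipient is on a `To:` or `to:` header before the first blank line, and it uses placeholder example.com addresses. The names `sender`, `rcpt` and `emit` are made up for this example.

```shell
#!/bin/sh
# Placeholder mailbox; addresses and header layout are assumptions for illustration.
mbox=$(mktemp)
cat > "$mbox" <<'EOF'
From: somebody1@example.com
To: person1@example.com
Subject: hello

first paragraph

second paragraph
From: somebody1@example.com
To: person2@example.com

other mail
From: somebody1@example.com
to: person1@example.com
short body
EOF

result=$(awk -v sender='somebody1@example.com' -v rcpt='person1@example.com' '
# Print the buffered message if both addresses matched, then reset the state.
function emit() {
    if (from_ok && to_ok) printf "%s", msg
    msg = ""; from_ok = 0; to_ok = 0; in_body = 0
}
# A "From: " line starts a new message: decide about the previous one first.
/^From: / { emit(); from_ok = (index($0, sender) > 0) }
# The first blank line ends the header block.
/^$/ { in_body = 1 }
# Only a To:/to: header (not body text) may satisfy the recipient check.
!in_body && /^[Tt]o: / && index($0, rcpt) > 0 { to_ok = 1 }
# Buffer every line of the current message.
{ msg = msg $0 "\n" }
END { emit() }
' "$mbox")

printf '%s\n' "$result"
rm -f "$mbox"
```

Because `index()` does a plain substring search, the dots in the addresses do not need escaping.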

score 0 · Answer 2 · answered Oct 13 '15 at 12:48

With awk:

awk '/From: [Ss]omebody1/ {flag=1; next}
     /to: person1/ {if (flag>0) {flag=2; print; next} else {flag=0; next}}
     /From/ {flag=0}
     {if (flag==2) {print NR, $0}}' input.txt

/From: [Ss]omebody1/ {flag=1; next} sets the flag variable to 1 on a match and skips the line.
/to: person1/ if the flag is already set, it is updated to 2 and the line is printed; otherwise it is reset to 0.
/From/ {flag=0} resets the flag on any other From line.
{if (flag==2) {print NR, $0}} if the flag is 2, prints the line number and the line.

Change the value of person1 to have different matches.
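
To see the flag logic at work, here is a small run on placeholder data. It is a sketch under assumptions: the example.com addresses are invented for illustration and the sample has only two messages.

```shell
#!/bin/sh
# Placeholder mailbox; the example.com addresses are assumptions for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
From: somebody1@example.com
to: person1@example.com
body A

From: somebody1@example.com
to: person2@example.com
body B
EOF

result=$(awk '
# Sender matched: remember it and skip the line itself.
/From: [Ss]omebody1/ { flag = 1; next }
# Recipient matched: print it only if the sender matched just before.
/to: person1/ { if (flag > 0) { flag = 2; print; next } else { flag = 0; next } }
# Any other From line ends the current message.
/From/ { flag = 0 }
# Inside a wanted message: print line number and line.
flag == 2 { print NR, $0 }
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

On this sample the run prints the `to:` line without a number, then `3 body A` and the blank separator as line 4. The `From:` line itself is skipped by `next`, and nothing from the second message appears.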

Input file used

From: [email protected]
to: [email protected]
<body of the message1>

From: [email protected]
to: [email protected]
<body of the message2>

From: [email protected]
to: [email protected]
<body of the message3>

From: [email protected]
to: [email protected]
<body of the message4>

From: [email protected]
to: [email protected]
<body of the message5>
