
I have many files. The filename format is a 4-digit year, 2-digit month, and 2-digit day (YYYYMMDD).

Sample filenames:

  • 20150101.txt
  • 20150102.txt

Content of a sample file:

00:00:13 -> 001528

I want to extract the date from the filename and insert it into the file's contents.

Desired output

2015-01-01T00:00:13 001528

or

2015-01-01 00:00:13 001528

I tried the code below:

for files in *txt; do
awk -F "->" 'BEGIN{OFS=""} {print FILENAME" ",$1, $2}' <$files > $files.edited
mv $files.edited $files
done

Please guide.

chess_freak

2 Answers


If you have GNU awk (gawk) then you could use its built-in Time Functions to convert pieces of the file name and contents into an epoch time, and then convert it according to a chosen format.

For example, given

$ cat 20150101.txt 
00:00:13 -> 001528

Then

$ awk -F ' -> ' '
    split($1,a,/:/) {
      ds = sprintf("%04d %02d %02d %02d %02d %02d", substr(FILENAME,1,4), substr(FILENAME,5,2), substr(FILENAME,7,2), a[1], a[2], a[3]); 
      $1 = strftime("%FT%T", mktime(ds))
    } 
    1
  ' 20150101.txt 
2015-01-01T00:00:13 001528
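
If GNU awk is not available, a similar result can probably be had without the time functions, since the date only needs to be re-punctuated rather than converted. The following is a minimal sketch, not the command above: `stamp_file` is a hypothetical helper name, and it assumes the file name starts with an 8-digit date and every line looks like "HH:MM:SS -> value".

```shell
# Hypothetical helper: print each line of a YYYYMMDD.txt file prefixed with its ISO date.
# Assumes the base name starts with YYYYMMDD and lines look like "HH:MM:SS -> value".
stamp_file() {
    awk -F ' -> ' '
        FNR == 1 {
            # Strip any leading directory, then slice the date out of the name.
            name = FILENAME
            sub(/.*\//, "", name)
            date = substr(name, 1, 4) "-" substr(name, 5, 2) "-" substr(name, 7, 2)
        }
        # Join date and time with "T"; the comma inserts the default OFS (a space).
        { print date "T" $1, $2 }
    ' "$1"
}
```

Calling `stamp_file 20150101.txt` prints the converted lines to standard output and leaves the file untouched. Unlike the mktime approach, this does no validation of the date or time.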
steeldriver

This will give you the desired output using sed:

for files in *.txt; do
sed -e "s/^./$files&/;s/./&-/4;s/./&-/7;s/.txt/T/;s/ -> / /" "$files"
done

To write the result back into each file, you do not need the redirect and mv from your loop. You can simply use sed's -i (edit in place) option instead of -e.
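
A minimal sketch of that in-place variant, assuming GNU sed (BSD/macOS sed expects a backup-suffix argument after -i). `stamp_in_place` is a hypothetical helper name; it uses the same substitutions as the command above, except that the base name is taken with `${f##*/}` so a path with directories also works, and the dot in `.txt` is escaped so it only matches a literal dot.

```shell
# Hypothetical helper: rewrite each given YYYYMMDD.txt file in place.
# Assumes GNU sed and lines of the form "HH:MM:SS -> value".
stamp_in_place() {
    for f in "$@"; do
        name=${f##*/}   # drop any leading directory, keep e.g. 20150101.txt
        # Double quotes so the shell expands $name inside the sed script.
        sed -i "s/^./$name&/;s/./&-/4;s/./&-/7;s/\.txt/T/;s/ -> / /" "$f"
    done
}
```

It could be called as `stamp_in_place *.txt`. Running it twice on the same file would prepend the date twice, so it is worth trying on a copy first.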

  • the s (substitute) command uses the following syntax: s/regexp/replacement/flags
  • . matches any character and ^. matches the first character of a line
  • & back-references the whole matched portion of the pattern space
  • s/^./$files&/ replaces the first character with the filename followed by that same character, which inserts the filename at the start of the line
  • s/./&-/4 uses the number flag 4 to act on the 4th match of . (the 4th character), replacing it with itself followed by -, which inserts - after the year
  • s/./&-/7 does the same for the 7th character, inserting - after the month (the 6th character of the original text is now the 7th because of the - inserted after the 4th).

And of course,

  • s/.txt/T/ substitutes .txt with T and
  • s/ -> / / substitutes -> with a single blank space.

This is the output:

2015-01-01T00:00:13 001528
2015-01-02T00:00:13 001528
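
To see what each substitution contributes, the line can be printed after every step. This is only an illustrative sketch: `show_steps` is a hypothetical helper, and its two arguments stand in for `$files` and one line of the file.

```shell
# Hypothetical demo: print the line after each of the five substitutions.
# $1 stands in for the file name, $2 for one line of the file.
show_steps() {
    printf '%s\n' "$2" | sed -n \
        -e "s/^./$1&/p" \
        -e 's/./&-/4p' \
        -e 's/./&-/7p' \
        -e 's/.txt/T/p' \
        -e 's/ -> / /p'
}
```

Running `show_steps 20150101.txt '00:00:13 -> 001528'` prints:

```
20150101.txt00:00:13 -> 001528
2015-0101.txt00:00:13 -> 001528
2015-01-01.txt00:00:13 -> 001528
2015-01-01T00:00:13 -> 001528
2015-01-01T00:00:13 001528
```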
mchid
  • This example output uses the same content for both files but will work as long as the first field is time and the second field is any value. Also, it is important to use double-quotes `"` with the `sed` command instead of single quotes `'` because `$files` is used in the command. – mchid Dec 16 '19 at 09:24