I save and archive a lot of PayPal receipts, and the saved files come with a lot of script code that isn't needed for archival purposes. I would like a way to strip all that code so I can delete the accompanying JavaScript files to save space without breaking the look of the receipt.

I tried the 'Find and Replace' function in Notepad++ with the following regular expression (leaving the 'Replace' field empty):

<script.*?/script>

This mostly takes care of the issue, but it leaves a blank line wherever code was deleted. Is there a better way to do this?

Matthew S
  • Why not just print it as a PDF? Chrome can do it by default or you could download a free pdf printer. – Spencer5051 Mar 30 '15 at 20:32
  • I thought exactly what @Spencer5051 said... Why reinvent the wheel over a format issue when you can print it digitally and preserve the format (even as your archive retrieval tools and display change over the years?) – Austin T French Mar 30 '15 at 20:55
  • Saving as PDF definitely is the best option. But to solve your regex problem, you can of course include a newline into your search parameter, and replace that final char too. Then the replace will not just remove the script part, but also the line-ending afterwards. – LPChip Mar 30 '15 at 22:05
  • PDF is not optimal because of file size (we're talking about lots of files on a daily basis). HTML is also better in case you need to convert the file to another format later, because all the original images are kept. – Matthew S Mar 30 '15 at 22:26

1 Answer

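You can use (\r\n)*<script.*?/script>(\r\n)* to remove the script tags along with their leading and trailing line breaks. If a script block spans several lines, also tick '. matches newline' in the Notepad++ Replace dialog so that .*? can cross line breaks.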
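For a large number of files, the same idea could be scripted instead of run by hand in Notepad++. The following is only a minimal sketch in Python, not something from the original answer: the function name `strip_scripts` and the sample markup are hypothetical, and the pattern is a slightly adapted form of the one above (it accepts both \r\n and \n line endings and uses `re.DOTALL` so multi-line script blocks match). A regex is a pragmatic tool here rather than a full HTML parser, so unusual markup may need extra care.

```python
import re

# Adapted from the answer's pattern: a <script>...</script> block plus
# any line breaks directly before and after it.
# re.DOTALL lets ".*?" span several lines; re.IGNORECASE also catches <SCRIPT>.
SCRIPT_BLOCK = re.compile(
    r"(?:\r?\n)*<script\b.*?</script\s*>(?:\r?\n)*",
    re.DOTALL | re.IGNORECASE,
)


def strip_scripts(html: str) -> str:
    """Return html with every script block and its surrounding line breaks removed."""
    return SCRIPT_BLOCK.sub("", html)


# Hypothetical sample standing in for a saved receipt.
sample = '<p>Receipt</p>\r\n<script src="a.js"></script>\r\n<p>Total</p>'
print(strip_scripts(sample))  # prints: <p>Receipt</p><p>Total</p>
```

Because the line breaks on both sides are removed, the lines around a script block end up joined. If the neighbouring lines should stay separate, replacing each match with a single line break instead of an empty string is one alternative.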

Ayan