
I am trying to download only the pages whose URLs end in a certain phrase. I looked through the documentation and couldn't find a way to do this. If there is a way, please tell me how; if not, please let me know.

EDIT: Say, for example, I am trying to get these pages:

example.com/sdfsdfs/awrf235/sdgsdg/important_page.html
example.com/sdfsasdasddfs/awrfg235/sdgsdg/important_page.html
example.com/sdfsdfsdfs/awrf235g/sdsagsdg/important_page.html

There are 100 more like these that end in /important_page.html, plus 1000 more pages of useless stuff. How can I download only the ones that end in /important_page.html?

Ford Smith
  • Can you provide more information? Describe in more detail what you are trying to accomplish (with an example), the obstacles, and what you have tried so far. – Tom Ruh Apr 15 '15 at 20:22
  • In general, is the format of what you're trying to scrape `example.com/RANDOM/RANDOM/important_page.html` (i.e., `example.com` and `important_page.html` are fixed and the other parts of the path can vary)? – meatspace Apr 15 '15 at 20:36
  • Yes, that is exactly right. – Ford Smith Apr 15 '15 at 20:40

1 Answer


Go to Options / Scan Rules, click Include link(s) and then add a scan rule for the filename you want to match:
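Assuming the tool here is HTTrack (its Options / Scan Rules dialog matches this description), the rule for the URLs in the question would look something like the following sketch; the exact pattern is an assumption based on the example URLs:

```
+*/important_page.html
```

The `+` prefix marks an include rule and `*` matches any sequence of characters, so this accepts any URL whose path ends in /important_page.html.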

[Screenshot: the Scan Rules dialog with a rule being added. The image shows an exclusion rule, but the UI is the same for inclusion rules.]

See the documentation for filters/scan rules and for advanced filters.
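If you prefer the command line, the same scan rules can be passed directly as arguments. A minimal sketch, again assuming HTTrack; the start URL and output directory are placeholders:

```
# Mirror example.com, keeping only pages whose path ends in /important_page.html.
# Scan rules are evaluated in order and later ones take precedence:
# "-*" excludes everything, then "+*/important_page.html" re-includes
# the pages we actually want.
httrack "http://example.com/" -O "./mirror" "-*" "+*/important_page.html"
```

One caveat: the crawler can only discover links on pages it downloads, so if the important pages are only reachable through intermediate pages, you may need an extra include rule (for example `+*.html`) so those intermediate pages can be fetched and parsed.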

Karan