3

How can I download an archive of the Ubuntu community documentation so that I can use it offline?

  • 1
    I would just use some wget magic to grab all of the site's pages and a program to print a PDF from them, or programmatically build a stack of those pages. – ashutosh May 01 '12 at 20:11
  • I might be wrong, but I am guessing that Akshar is relatively new to Linux, hence his desire to download the documentation. Is it possible you can provide some "how to" details in your answer? If my assumption is wrong, I apologize. – stephenmyall May 01 '12 at 20:23

3 Answers

2

One simple way of doing this: when you are on your chosen page of the community documentation website, just select Print, then Print to File, and save it as a PDF in your home folder.

stephenmyall
  • 9,855
  • 15
  • 46
  • 67
1

Just use `wget -p --convert-links -r website.com -o logfile` to download the whole site and convert the links so you can view it offline in HTML format. You can then convert individual pages to PDFs (or whatever suits you) afterwards, as it is a hassle to convert them on the fly while downloading.
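
For example, a minimal sketch of that command pointed at the community documentation site (the https://help.ubuntu.com/community URL is the one used in the answer below; the log file name is just an example):

```sh
# Mirror the community documentation for offline browsing.
#   -r               follow links recursively
#   -p               also fetch each page's requisites (images, CSS)
#   --convert-links  rewrite links so the local copy works offline
#   -o wget.log      write wget's progress to a log file instead of the terminal
wget -r -p --convert-links -o wget.log https://help.ubuntu.com/community
```

The mirror ends up in a `help.ubuntu.com/` directory under wherever you run the command; open the saved start page in a browser to read it offline.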

ashutosh
  • 1,282
  • 11
  • 15
  • This is missing some parameters that would avoid robots.txt issues, downloading extra pages, etc. – nanofarad May 02 '12 at 11:47
  • 1
    I didn't take robots.txt into account, sorry for that. The only thing that really matters is the webpages we want to download; the extra stuff can be deleted later. Does the above command download the pages you need without the extra stuff? If so, please post that as an answer. – ashutosh May 02 '12 at 11:50
  • The command I gave downloads just the necessary pages, not the pages used for editing, login, page discussion, and so on. Your command downloads all of the linked pages, including redundant edit pages. – nanofarad May 02 '12 at 17:41
1

Use `wget -r -l 0 -np -e robots=off -U "Mozilla/5.0 (Windows NT x.y; rv:10.0) Gecko/20100101 Firefox/10.0" -k -p -R "*action=*" https://help.ubuntu.com/community` in your target directory. @ashutosh's command would work in principle, but you would run into robots.txt issues, extra downloaded pages, etc. I have tested this and it downloads all of the HTML. If you want to get the images as well, you may need to do a bit of tweaking.
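
For readability, here is the same command with one option per line and a comment on what each flag does (the reject pattern and the user agent are quoted so the shell passes them through unchanged; the `x.y` in the user agent is a placeholder):

```sh
# Run this from the directory where you want the mirror to be created.
wget -r -l 0 -np \
     -e robots=off \
     -U "Mozilla/5.0 (Windows NT x.y; rv:10.0) Gecko/20100101 Firefox/10.0" \
     -k -p \
     -R "*action=*" \
     https://help.ubuntu.com/community
# -r -l 0          recurse with no depth limit
# -np              never ascend to the parent directory (stay under /community)
# -e robots=off    ignore robots.txt, which would otherwise block parts of the crawl
# -U "..."         send a browser-like user-agent string
# -k               convert links so the local copy works offline
# -p               also fetch each page's requisites (images, CSS)
# -R "*action=*"   reject URLs containing action= (the wiki's edit, login, and discussion pages)
```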

nanofarad
  • 20,597
  • 12
  • 65
  • 91
  • Open up a terminal by searching for it in the launcher, then type (or paste) the command. You need to right-click and choose Paste in the terminal, as the terminal does not recognize `Ctrl+V` as paste. – nanofarad May 02 '12 at 11:47