6

I'm looking for free software that can convert OpenDocument to HTML or markdown.

Pandoc can convert HTML to OpenDocument, but not the reverse.

odt2html.py failed to install using both pip and easy_install.

LibreOffice can reportedly do the conversion; however, I couldn't get it to work with the following command:

soffice --convert-to --outdir . htm:HTML my.odt
evan.bovie
the

2 Answers

8

You're using --convert-to without giving it a value: in your command, --outdir . sits where the output format (htm:HTML) should be.

The correct syntax is:

soffice --headless --convert-to htm:HTML --outdir . my.odt
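
If the call needs to be scripted, a minimal sketch like the following can make a silent failure visible. The function name odt_to_html is hypothetical, and the sketch assumes soffice is on the PATH. It falls back to the plain html filter name, which a comment below reports needing when htm:HTML produced no output file.

```shell
# Hypothetical helper: convert one .odt file to HTML in the given directory.
# Assumes soffice is on PATH. Prints the path of the file that was written.
odt_to_html() {
  local src="$1" outdir="${2:-.}"
  local base
  base="$(basename "$src" .odt)"
  mkdir -p "$outdir"

  # First attempt: the htm:HTML filter, which writes BASE.htm.
  # soffice's own messages go to stderr so stdout carries only the result path.
  soffice --headless --convert-to htm:HTML --outdir "$outdir" "$src" >&2
  if [ -f "$outdir/$base.htm" ]; then
    echo "$outdir/$base.htm"
    return 0
  fi

  # Fallback: some installations reportedly need plain "html" (writes BASE.html).
  soffice --headless --convert-to html --outdir "$outdir" "$src" >&2
  if [ -f "$outdir/$base.html" ]; then
    echo "$outdir/$base.html"
    return 0
  fi

  echo "conversion failed: $src" >&2
  return 1
}
```

Usage:

$ odt_to_html my.odt .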

Or try to use the following script:

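#!/bin/bash
# Convert every .doc/.odt file under SOURCE_DIR ($1) to tidied HTML in TARGET_DIR ($2).

CONFIG=/path/to/tidy_options.conf
# rm -rv "$2"
mkdir -p "$2"

# Group the -name tests so -type f applies to both; -print0 keeps file names with spaces intact.
find "$1" -type f \( -name "*.doc" -o -name "*.odt" \) -print0 |
while IFS= read -r -d '' F; do
  BASE=$(basename "$F" .doc); BASE=$(basename "$BASE" .odt)
  # </dev/null stops soffice from consuming the file list on stdin.
  soffice --headless --convert-to htm:HTML --outdir "$2" "$F" </dev/null
  # Tidy the output, log tidy's messages to BASE.err, and strip the generated class="cN" attributes.
  tidy -q -config "$CONFIG" -f "$2/$BASE.err" -i "$2/$BASE.htm" | sed 's/ class="c[0-9]*"//g' > "$2/$BASE.html"
done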

Usage:

$ convert_doc_to_html.sh SOURCE_DIR TARGET_DIR

kenorb
  • Right! I copied the command from somewhere else and I hadn't noticed the problem with syntax because of the `/Applications/LibreOffice.app/Contents/MacOS/` part. Thanks. – the Mar 04 '15 at 14:40
  • I had to change `--convert-to htm:HTML` to just `--convert-to html`. Otherwise, I got no output file at all. Plus 1, though! Thanks! – JellicleCat May 10 '20 at 22:06
3

Newer versions of pandoc, the open-source universal document converter, can now read OpenDocument files:

pandoc -s -t html input.odt -o output.html
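
Since the question also asks about Markdown, here is a hedged sketch of a batch conversion with pandoc. The function name odt_dir_to_md is hypothetical, and the sketch assumes a pandoc version recent enough to accept ODT input.

```shell
# Hypothetical helper: convert every .odt file in a directory to Markdown.
# Assumes a pandoc version that can read ODT input.
odt_dir_to_md() {
  local src_dir="$1" out_dir="$2"
  local f base
  mkdir -p "$out_dir"
  for f in "$src_dir"/*.odt; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    base="$(basename "$f" .odt)"
    pandoc -f odt -t markdown -o "$out_dir/$base.md" "$f"
  done
}
```

Usage:

$ odt_dir_to_md SOURCE_DIR TARGET_DIR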
Daniel-KM
  • Update 2022: my `pandoc` command insists that I also add `--metadata pagetitle="..."` (even though there _is_ a `title` property set in the docx file) – knb May 10 '22 at 12:40
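
Following the comment above, one way to always supply a page title is to pass it explicitly. This is only a sketch: the function name odt_to_html_titled is hypothetical, and defaulting the title to the file name without its extension is an assumption, not something the answer prescribes.

```shell
# Hypothetical helper: standalone HTML from an .odt file with an explicit page title.
# The title defaults to the input file name without its .odt extension (an assumption).
odt_to_html_titled() {
  local src="$1" out="$2"
  local title="${3:-$(basename "$src" .odt)}"
  pandoc -s -f odt -t html --metadata pagetitle="$title" -o "$out" "$src"
}
```

Usage:

$ odt_to_html_titled input.odt output.html "My title"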