12

I received a ZIP file from a Japanese customer.

When I try to unzip it, the file and folder names are garbled:

$ unzip ~/Downloads/【新入荷ECM】資料.zip
...
 inflating: БyРVУ№Й╫ECMБzОСЧ┐/123_ГЖБ[ГXГPБ[ГX.xlsx

What is the problem, and how can I avoid it?

muru
Nicolas Raoul

2 Answers

19

The problem is that most ZIP files circulating in Japan store their file names in Shift JIS, which unzip on Ubuntu does not decode correctly by default.
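To see how this kind of garbling arises, here is a minimal Python sketch. It assumes the wrong decoder was CP866; that is an inference from the output shown in the question (the bytes match), not something unzip reports, and other systems may pick a different code page.

```python
# Reproduce the garbling from the question: Shift JIS bytes read as CP866.
# CP866 is an assumption inferred from the question's output.
original = "ユースケース"
raw = original.encode("shift_jis")   # bytes as stored in the ZIP entry name
garbled = raw.decode("cp866")        # what a tool assuming CP866 displays
print(garbled)                       # prints ГЖБ[ГXГPБ[ГX

# The damage is reversible as long as no byte was lost along the way.
restored = garbled.encode("cp866").decode("shift_jis")
print(restored)                      # prints ユースケース
```

The same two-step round trip is what any repair tool has to perform: recover the original bytes, then decode them with the right encoding.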

The solution is to use the -O shift-jis option in your command:

$ unzip -O shift-jis ~/Downloads/【新入荷ECM】資料.zip
...
 inflating: 【新入荷ECM】資料/123_ユースケース.xlsx

This way, the extracted file and folder names are displayed correctly on Ubuntu.
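Where the installed unzip does not accept -O, the same result can probably be had with Python's standard zipfile module. The following is only a sketch: the function name extract_with_encoding and the archive path in the usage line are made up for illustration, and it assumes that entries without the UTF-8 flag carry Shift JIS names.

```python
import shutil
import zipfile
from pathlib import Path


def extract_with_encoding(zip_path, dest_dir, encoding="shift_jis"):
    """Extract zip_path into dest_dir, decoding legacy names with `encoding`."""
    dest = Path(dest_dir).resolve()
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            name = info.filename
            # Bit 11 (0x800) marks names stored as UTF-8. zipfile decodes
            # all other names as CP437, so undo that and decode again.
            if not info.flag_bits & 0x800:
                name = name.encode("cp437").decode(encoding)
            target = (dest / name).resolve()
            # Refuse entries that would land outside dest_dir.
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in archive: {name!r}")
            if info.is_dir():
                target.mkdir(parents=True, exist_ok=True)
            else:
                target.parent.mkdir(parents=True, exist_ok=True)
                with zf.open(info) as src, open(target, "wb") as out:
                    shutil.copyfileobj(src, out)
            extracted.append(name)
    return extracted
```

A call such as extract_with_encoding("archive.zip", "out") returns the decoded names. Path.is_relative_to needs Python 3.9 or later. On Python 3.11 or later, zipfile.ZipFile also accepts a metadata_encoding argument for reading such archives.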

Nicolas Raoul
  • What version of `unzip` are you using? I am using Info-ZIP v6.0.0 on Cygwin. The option `-O` is not supported. – kevinarpe Mar 03 '22 at 03:26
  • @kevinarpe: The option might have disappeared? In which case you might have to patch unzip using https://gist.github.com/hamano/573753 or use the repository suggested by Sadaharu Wakisaka. – Nicolas Raoul Mar 07 '22 at 08:39
4

A simple answer:

$ sudo apt install unar
$ unar ~/Downloads/【新入荷ECM】資料.zip

unar can automatically detect which encoding is used. It can only extract archives, not create them.

If the file names are still in the wrong encoding after extraction, use convmv to convert them:

$ convmv -f shift_jis -t utf8 БyРVУ№Й╫ECMБzОСЧ┐/123_ГЖБ[ГXГPБ[ГX.xlsx --notest

Conversely, to convert file names from UTF-8 to Shift JIS for use on Windows:

$ convmv -f utf8 -t shift_jis <filename> --notest
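Repairing names after extraction can also be sketched in Python. This is not what convmv does internally; it is an illustrative helper (repair_names is a made-up name) that assumes the names were mis-decoded as CP866, as the output in the question suggests, and that the real encoding is Shift JIS.

```python
import os


def repair_names(root, wrong="cp866", right="shift_jis", dry_run=True):
    """Rename entries under root whose names were decoded with the wrong codec."""
    renamed = []
    # Walk bottom-up so children are renamed before their parent directory.
    for parent, dirs, files in os.walk(root, topdown=False):
        for name in dirs + files:
            try:
                fixed = name.encode(wrong).decode(right)
            except UnicodeError:
                continue  # not convertible with these codecs: leave it alone
            if fixed == name:
                continue  # pure ASCII names come out unchanged
            pair = (os.path.join(parent, name), os.path.join(parent, fixed))
            renamed.append(pair)
            if not dry_run:
                os.rename(*pair)
    return renamed
```

By default the function only reports what it would rename, much like convmv without --notest. That matters because a legitimately Cyrillic name could also pass the conversion and be renamed by mistake, so review the list before calling it with dry_run=False.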

Alternatively, the Ubuntu Japanese Team provides a build of unzip that detects the encoding automatically, but you have to add their repository to use it.

Nicolas Raoul
Sadaharu Wakisaka