
All the CSV-to-TSV tutorials suggest a simple:

tr ',' '\t'

though some CSVs look like this:

1,310,"IntAct,PINA"

in which case I would like to keep "IntAct,PINA" together as a single field:

1   310 "IntAct,PINA"
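To make the problem concrete, here is a minimal sketch of what the plain tr approach does to that line. The function name naive_tsv is only an illustrative label, not something from the tutorials; the field count is taken with awk so the tabs do not have to be read by eye.

```shell
# Naive conversion: every comma becomes a tab, including the one inside quotes.
# naive_tsv is a hypothetical wrapper name used only for this illustration.
naive_tsv() {
  tr ',' '\t'
}

# Count the tab-separated fields that result for the sample line.
printf '%s\n' '1,310,"IntAct,PINA"' | naive_tsv | awk -F'\t' '{ print NF }'
# prints 4
```

The line should have three fields, but the quoted comma is split as well, so tr alone cannot distinguish separators from commas inside quotes.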

How could I parameterize the tr command (or sed, etc.) in order to do that?

I appreciate any suggestions.

Eliah Kagan
tr3quart1sta

3 Answers


Use csvformat from csvkit:

csvformat -d, -D$'\t' file

or shorter:

csvformat -T file

-d input delimiter (not needed here, as , is the default input delimiter)

-D output delimiter

-T set tabs as output delimiter

It will remove the quotes, as they are not needed in a TSV where the embedded comma is no longer a separator.


You should be able to install csvkit via pip:

sudo apt install python3-pip
pip3 install csvkit
pLumo
  • ... FWIW I had a bit of a play with the quoting options and it seems you can use `-U2` to force quoting of the non-numeric field in the output, but it only works if you use `-u2` for the input format - which appears to force floating-point conversion as well – steeldriver Jun 25 '19 at 13:39

If csvkit (which I recommend) is not available, you could use Perl's Text::CSV module:

perl -MText::CSV -lne '
  BEGIN{$p = Text::CSV->new} print join "\t", $p->fields() if $p->parse($_)
' file

If you insist on retaining the quoting (which is unnecessary, since the embedded , is no longer a separator), then you could do something like

print join "\t", map { $_ =~ s/.*,.*/"$&"/r } $p->fields() if $p->parse($_)
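If neither csvkit nor Text::CSV can be installed, one possible fallback is a small quote-aware state machine in plain awk. This is only a sketch under stated assumptions, not a replacement for a real CSV parser: csv_to_tsv is a hypothetical helper name, fields are assumed to contain no newlines or tabs, and a doubled quote inside a quoted field is treated as one literal quote. Like csvformat, it drops the surrounding quotes.

```shell
# csv_to_tsv: hypothetical helper, reads CSV on stdin and writes TSV on stdout.
# Assumptions: no newlines or tabs inside fields; "" inside a quoted field
# stands for one literal quote character.
csv_to_tsv() {
  awk '
  {
    out = ""   # converted line built up character by character
    inq = 0    # 1 while the scan is inside a quoted field
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "\"") {
        if (inq && substr($0, i + 1, 1) == "\"") {
          out = out "\""   # doubled quote inside quotes: emit one quote
          i++              # and skip the second quote of the pair
        } else {
          inq = !inq       # opening or closing quote: toggle, emit nothing
        }
      } else if (c == "," && !inq) {
        out = out "\t"     # separator outside quotes becomes a tab
      } else {
        out = out c        # ordinary character, including quoted commas
      }
    }
    print out
  }'
}

# Usage with the sample line from the question:
printf '%s\n' '1,310,"IntAct,PINA"' | csv_to_tsv
```

For the sample line this yields three tab-separated fields, with IntAct,PINA kept intact as the third. Quoted fields that span several lines are not handled, which is one reason a proper parser remains the better choice.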
steeldriver

Using your CSV, which has no header row,

1,310,"IntAct,PINA"

and Miller (https://github.com/johnkerl/miller)

mlr --nidx --ifs "," --ofs "\t" cat input.csv

gives you back

1       310     "IntAct PINA"
aborruso