18

I have a string like this:

"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"

I want to be able to split it like this:

aString that may haveSpaces IN IT
bar
foo
bamboo  
bam boo

How do I do that? (preferably using a one-liner)

foxneSs
  • Stack Overflow duplicate: [Split a string only by spaces that are outside quotes](http://stackoverflow.com/q/12821302) – DavidPostill Apr 17 '16 at 09:56
  • @DavidPostill the questions are quite different actually. – foxneSs Apr 17 '16 at 10:50
  • Not really, it's the same general problem. – DavidPostill Apr 17 '16 at 11:01
  • @DavidPostill - This is a much simpler problem: all it needs is `for l in "aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"; do echo $l; done` – AFH Apr 17 '16 at 11:24
  • @AFH lol. I just posted a much longer answer. The only difference in the output was that mine preserved the `"`s. I missed the fact that the OP doesn't need them in the output. – DavidPostill Apr 17 '16 at 11:27
  • @AFH You should post your comment as the answer. – DavidPostill Apr 17 '16 at 11:27
  • @DavidPostill - It's more complicated if the string is in a variable. If the string is in `$s`, then `for l in $s; do echo $l; done` takes the quotes as literals and breaks at the spaces. I need to go out now, so feel free to work it out. – AFH Apr 17 '16 at 11:40
  • it's called tokenizing a string.. e.g. modern programming/scripting languages / libraries, have a string tokenizer facility. For bash http://stackoverflow.com/questions/5382712/bash-how-to-tokenize-a-string-variable – barlop Apr 17 '16 at 11:57
  • @barlop The tokening in the linked question splits on every space not just the ones outside quotes. – DavidPostill Apr 17 '16 at 12:05

7 Answers

9

The simplest solution is to make an array of the quoted args, which you can then loop over or pass directly to a command.

eval "array=($string)"

for arg in "${array[@]}"; do echo "$arg"; done   

p.s. Please comment if you find a simpler way without eval.

Edit:

Building on @Hubbitus' answer, here is a fully sanitized and properly quoted version. Note: this is overkill, and it will leave extra backslashes before most punctuation inside double- or single-quoted sections, but it is meant to be invulnerable to attack.

declare -a "array=($( echo "$string" | sed 's/[][`~!@#$%^&*():;<>.,?/\|{}=+-]/\\&/g' ))"

I leave it to the interested reader to modify as they see fit: http://ideone.com/FUTHhj

  • I have a testcase that breaks this: `string="bash -c 'echo \$USER'"` which "leaves" the backslash. You sometimes need this for e.g `ssh` – Flamefire Jun 06 '19 at 15:18
6

When I saw David Postill's answer, I thought "there must be a simpler solution". After some experimenting I found the following works:-

string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"'
echo $string
eval 'for word in '$string'; do echo $word; done'

This works because eval expands the line (removing the quotes and expanding string) before executing the resultant line (which is the in-line answer):

for word in "aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"; do echo $word; done

An alternative which expands to the same line is:

eval "for word in $string; do echo \$word; done"

Here string is expanded within the double-quotes, but the $ must be escaped so that word is not expanded before the line is executed (in the other form the use of single-quotes has the same effect). The results are:-

[~/]$ string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"'
[~/]$ echo $string
"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"
[~/]$ eval 'for word in '$string'; do echo $word; done'
aString that may haveSpaces IN IT
bar
foo
bamboo
bam boo
[~/]$ eval "for word in $string; do echo \$word; done"
aString that may haveSpaces IN IT
bar
foo
bamboo
bam boo
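
To see why both forms behave the same, it can help to print the command line before handing it to eval. This is only a minimal sketch; the variable names `cmd` and `output` are made up for illustration and are not part of the answer above:

```shell
#!/bin/bash
string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"'

# Build the command text the same way the double-quoted eval form does:
# $string is expanded now, \$word stays a literal $word for later.
cmd="for word in $string; do echo \$word; done"

# Inspect the line that eval is about to execute.
printf '%s\n' "$cmd"

# Run it and capture what it prints.
output=$(eval "$cmd")
printf '%s\n' "$output"
```

Because eval runs whatever the string contains, this is only reasonable when the contents of `string` are trusted.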
AFH
4

It looks like xargs can do it pretty well:

$ a='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"'
$ printf "%s" "$a" | xargs -n 1 printf "%s\n"
aString that may haveSpaces IN IT
bar
foo
bamboo
bam boo
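
If the tokens are needed in a bash array rather than on the terminal, the same pipeline can feed mapfile. This is a sketch under a few assumptions: bash 4 or newer (for mapfile), no token containing a newline, and an array name `tokens` chosen only for illustration:

```shell
#!/bin/bash
a='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"'

# xargs does the quote-aware splitting and prints one token per line;
# mapfile -t reads those lines into the array, dropping the newlines.
mapfile -t tokens < <(printf '%s' "$a" | xargs -n 1 printf '%s\n')

for t in "${tokens[@]}"; do
    printf '[%s]\n' "$t"
done
```

The quoting rules applied here are those of xargs, not of bash, so backslash escapes inside quotes are not interpreted.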
Olivier
  • Excellent! Though as a developer you might want to document `xarg quoting applies` for your end users. xargs interprets double quotes slightly different than bash. bash interprets some `\ ` escape sequences inside `"..."` (e.g. `"a \" b"` -bash-> 1. `a " b`) while `xargs` treats `"..."` and `'...'` equally (`\ ` has no special meaning in either of them. E.g. `"a \" b"` -xargs-> 1. `a \ ` and 2. a broken string starting with `b` and missing a closing quote. To get `a " b` you could write `'a " b'` or `a\ \"\ b` or `"a "\"" b"`). – Socowi Nov 30 '22 at 23:33
2

How do I do that?

$ for l in "aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"; do echo $l; done
aString that may haveSpaces IN IT
bar
foo
bamboo
bam boo

What do I do if my string is in a bash variable?

The simple approach of using the bash string tokenizer will not work, as it splits on every space not just the ones outside quotes:

DavidPostill@Hal /f/test
$ cat ./test.sh
#! /bin/bash
string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"'
for word in $string; do echo "$word"; done

DavidPostill@Hal /f/test
$ ./test.sh
"aString
that
may
haveSpaces
IN
IT"
bar
foo
"bamboo"
"bam
boo"

To get around this the following shell script (splitstring.sh) shows one approach:

#! /bin/bash 
string=$(cat <<'EOF'
"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo" 
EOF
)
echo Source String: "$string"
results=()
result=''
inside=''
for (( i=0 ; i<${#string} ; i++ )) ; do
    char=${string:i:1}
    if [[ $inside ]] ; then
        if [[ $char == \\ ]] ; then
            if [[ $inside == '"' && ${string:i+1:1} == '"' ]] ; then
                let i++
                char=$inside
            fi
        elif [[ $char == $inside ]] ; then
            inside=''
        fi
    else
        if [[ $char == ["'"'"'] ]] ; then
            inside=$char
        elif [[ $char == ' ' ]] ; then
            char=''
            results+=("$result")
            result=''
        fi
    fi
    result+=$char
done
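if [[ $inside ]] ; then
    echo Error parsing "$result"
    exit 1
fi
# Flush the last token when the input does not end with a space.
[[ -n $result ]] && results+=("$result")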

echo "Output strings:"
for r in "${results[@]}" ; do
    echo "$r" | sed "s/\"//g"
done

Output:

$ ./splitstring.sh
Source String: "aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"
Output strings:
aString that may haveSpaces IN IT
bar
foo
bamboo
bam boo

Source: StackOverflow answer Split a string only by spaces that are outside quotes by choroba. Script has been tweaked to match the requirements of the question.
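
For the simple case in the question (bare words and double-quoted runs, no escapes, no single quotes), a shorter pure-bash variant is possible with the `[[ =~ ]]` regex operator. This is a different technique from the character-by-character loop above and only a sketch; the function name `split_quoted` and the array name `tokens` are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: splits $1 into the global array "tokens".
# Only bare words and "double-quoted" runs are understood.
split_quoted() {
    local s=$1
    # Group 2 captures the inside of a "..." run, group 3 a bare word.
    local re='^[[:space:]]*("([^"]*)"|([^[:space:]"]+))'
    local rest_re='^[[:space:]]*$'
    tokens=()
    while [[ $s =~ $re ]]; do
        if [[ -n ${BASH_REMATCH[3]} ]]; then
            tokens+=("${BASH_REMATCH[3]}")
        else
            tokens+=("${BASH_REMATCH[2]}")
        fi
        # Drop the matched prefix and continue with the rest.
        s=${s:${#BASH_REMATCH[0]}}
    done
    # Succeed only if nothing but whitespace is left (e.g. no unclosed quote).
    [[ $s =~ $rest_re ]]
}

string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"'
if split_quoted "$string"; then
    printf '%s\n' "${tokens[@]}"
else
    echo "Error parsing: $string" >&2
fi
```

Unlike the shell's own parser, this treats `foo"bar"` as two tokens, so it is not a drop-in replacement for real shell quoting.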

DavidPostill
2

You may do it with declare instead of eval, for example:

Instead of:

string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"'
echo "Initial string: $string"
eval 'for word in '$string'; do echo $word; done'

Do:

declare -a "array=($string)"
for item in "${array[@]}"; do echo "[$item]"; done

But please note: it is not much safer if the input comes from a user!

So, if you try it with, say, a string like:

string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo" `hostname`'

You get hostname evaluated (and of course it could be something like rm -rf / instead)!

A very simple attempt to guard against this is to replace characters like the backtick ` and $:

string='"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo" `hostname`'
declare -a "array=( $(echo "$string" | tr '`$<>' '????') )"
for item in "${array[@]}"; do echo "[$item]"; done

Now you get output like:

[aString that may haveSpaces IN IT]
[bar]
[foo]
[bamboo]
[bam boo]
[?hostname?]

You can find more details about the methods and their pros and cons in this good answer: https://stackoverflow.com/questions/17529220/why-should-eval-be-avoided-in-bash-and-what-should-i-use-instead/17529221#17529221

But this still leaves an attack vector. I would really like bash to have a way of quoting a string that works like double quotes (") but without interpreting the content.

Hubbitus
1

Expanding on Olivier's answer: using xargs and declare, the list can be translated into an assignment expression that is safe to eval:

echo "1 2 '3 4' 5" |  xargs bash -c 'declare -a array=("$@"); declare -p array' --
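
To actually get the array into the current shell, the printed declare statement still has to be evaluated. A minimal sketch of that last step, where the variable names `input` and `decl` are assumptions made for illustration:

```shell
#!/bin/bash
input="1 2 '3 4' 5"

# xargs splits the input with its own quoting rules; the child bash stores
# the words in an array and prints it as a re-readable declare statement.
decl=$(printf '%s\n' "$input" | xargs bash -c 'declare -a array=("$@"); declare -p array' --)
printf '%s\n' "$decl"

# Evaluating that statement recreates the array in the current shell.
eval "$decl"
printf '[%s]\n' "${array[@]}"
```

The text passed to eval is generated by declare -p rather than taken from the input directly, which is what the safety argument rests on. Inside a function, the evaluated declare would create a local array, so this sketch assumes it runs at the top level of a script.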
Facundo
  • This seems to be the only answer which gives the right result, and seems safe. Of course, you still have to eval the result, which feels scary: `eval "$(echo "1 2 '3 4' 5" | xargs bash -c 'declare -a array=("$@"); declare -p array' --)"`... but as @Facundo notes, it should be safe... – Paul Molodowitch Nov 01 '22 at 22:38
-1

Use awk (this relies on FPAT, which is a GNU awk extension):

echo '"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"' | awk 'BEGIN {FPAT = "([^ ]+)|(\"[^\"]+\")"}{for(i=1;i<=NF;i++){gsub("\"","",$i);print $i} }'
aString that may haveSpaces IN IT
bar
foo
bamboo
bam boo

Or convert each space inside a token to "%20" or "_", so that the result can be processed by the next command in a pipeline:

echo '"aString that may haveSpaces IN IT" bar foo "bamboo" "bam boo"' | awk 'BEGIN {FPAT = "([^ ]+)|(\"[^\"]+\")"}{for(i=1;i<=NF;i++){gsub("\"","",$i);gsub(" ","_",$i)} print }'
aString_that_may_haveSpaces_IN_IT bar foo bamboo bam_boo

Reference: Awk consider double quoted string as one token and ignore space in between

tinyhare