Capturing contents of loop into a variable

Question

I have a for loop that outputs everything from /etc/trueuserowners into a du. I'm trying to figure out how to get the output of this for loop and place ALL of it into a variable without fitting it into an array.

for i in $(grep dook /etc/trueuserowners | cut -d : -f 1); do du -sh /home/$i; done | sort -nr

The primary purpose of this is so I can take the output of the du -sk, put it into variable p and take p to calculate the human readable forms and get a total at the very end.

I'm trying to keep this down into a one liner but can/will convert to bash script if I ABSOLUTELY have to.

This is being run on a server with cPanel users.

The contents of /etc/trueuserowners looks like this.

user1: reseller
user2: reseller
user3: reseller

The output from the command I'm using looks like the following.

for i in $(grep root /etc/trueuserowners | cut -d : -f 1); do du -sh /home/$i; done | sort -nr
992K    /home/demoabusecenter
820K    /home/lakshdljkashdlkj
151M    /home/hyg8
58M /home/bugsabusecenter
21G /home/esportsoverlay
3.3G    /home/yourabuse
2.8M    /home/kf30ls08f2k0
1.4M    /home/perm2term
1.2M    /home/justcheck
1.1M    /home/myinfo1

The overall objective here is to get it to provide the output as it is now and give me the total of each user.

I think I've got a solution for that but I can't figure out how to get that output into a variable.

This is what happens when I try.

for i in $(grep root /etc/trueuserowners | cut -d : -f 1); do p=$(du -sh /home/$i); done | sort -nr | echo $p
20981044 /home/esportsoverlay

For some reason it converts the output for space into MB. Which is fine because I can convert that to GB. I just can't get it to display everything. It only shows me the last user output from the for loop.

well, you said objective is to `give...the total of each user`, but `du -sh` already does that for each directory, so there's kinda lack of clarity here on what you actually mean by total. Also, nice to see you updated the question with the example of output. As I understand you want each line to be `1234 /home/user` and at the end have something like `4567 total`, is that correct ? — Sergiy Kolodyazhnyy, Dec 01 '17 at 22:45
That is correct. And I do apologize as I was editing and didn't see your answer until I had made the edit. I have written a few different variants though can't seem to get the results I was after. The output is to get for example: 24MB /home/user1 37MB /home/user2 1000MB /home/user3 Total: 1.61GB Or at the very least to just handle the total in storage specifics. — Binary Accepted, Dec 01 '17 at 22:53
Very well. I've polished up my answer a bit. The very last code segment should give you exactly what you want, the human readable + directory name for each user, and at the end ` total` line. Let me know if you have any questions about any part of it. — Sergiy Kolodyazhnyy, Dec 01 '17 at 23:01

Sergiy Kolodyazhnyy · Accepted Answer · 2017-12-01T23:00:41.647

The answer is going to be a bit long, but I've to address quite a few parts, so stick with me here.

Well, let's start off with the fact that your approach for iterating over grep output isn't quite right. The for i in $(); do ...done approach breaks on lines that contain spaces (due to word splitting), and generally isn't recommended (see this).

However, what typically is done is command | while IFS=<word separator> read -r variable; do...done (which as additional bonus is portable between Bourne-like shells). Also, I see that you're making use of cut -d : -f 1 to get first item out of colon-separated list in each line. We can make use of IFS=":" then to get rid of cut part. Your original command is thus transformed as so:

grep 'dook' /etc/trueuserowners | while IFS=":" read -r first_word everything_else
do
    # use the field that read extracted; $i no longer exists in this version
    du -sh /home/"$first_word"
done | sort -nr

The first_word variable obviously will contain only the first field, and the unused everything_else will contain...everything else.
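
To see the field splitting in isolation, here is a minimal sketch that feeds made-up lines in the same "user: owner" layout through the loop. The sample names and the names variable are assumptions for illustration only; nothing is read from the real /etc/trueuserowners.

```shell
# Made-up sample data in the same layout as /etc/trueuserowners.
sample='user1: reseller
user2: reseller
user3: reseller'

# Print only the first colon-separated field of each line and
# collect everything the loop prints into one variable.
names=$(printf '%s\n' "$sample" | while IFS=":" read -r first_word everything_else
do
    printf '%s\n' "$first_word"
done)

printf '%s\n' "$names"
```

Because the space is not part of IFS here, everything_else keeps its leading space, which does not matter since that variable is never used.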

Note that I also quoted the 'dook' part. Although grep understands the first non-option item as PATTERN, it really is good practice to protect the pattern with single quotes, because if you use * or other characters that are special to the shell, bash will unintentionally perform filename expansion, so grep may receive file names from your current working directory instead of the pattern and will attempt to read them. See this one for a detailed example.
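
A small sketch of that expansion, assuming a throwaway directory that contains two files named dook1 and dook2. The show_args helper is hypothetical and only prints each argument it receives in angle brackets, standing in for whatever command would get the pattern.

```shell
# Hypothetical helper: print every argument it receives, wrapped in <>.
show_args() { printf '<%s>' "$@"; }

# Throwaway directory with two files whose names match the pattern.
demo_dir=$(mktemp -d)
touch "$demo_dir/dook1" "$demo_dir/dook2"

# Unquoted: the shell replaces dook* with the matching file names.
unquoted=$(cd "$demo_dir" && show_args dook*)
# Quoted: the pattern is passed through untouched.
quoted=$(cd "$demo_dir" && show_args 'dook*')

echo "unquoted: $unquoted"   # unquoted: <dook1><dook2>
echo "quoted:   $quoted"     # quoted:   <dook*>

rm -rf "$demo_dir"
```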

Next, let's address arrays. We surely can add new item to arrays with the while loop I showed above, and += operator:

$ declare -a my_array
$ while IFS=":" read -r name null; do my_array+=("$name"); done < /etc/passwd
$ echo "${my_array[0]}"
root

This can be convenient if you want to process the items you're extracting later. However, considering the fact that you have pipe there, variables disappear once subshell in which while runs exits. See my old question about that.
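
To illustrate, here is a minimal sketch with made-up input (the letters a, b, c stand in for user names). It shows a counter being lost after a piped while loop under bash's default settings, and then the whole output of a loop being captured into a variable p with command substitution, which is one way to get all of a loop's output into a variable without an array.

```shell
# The while loop runs in a subshell because it is on the right side of a pipe,
# so the increments never reach the parent shell's copy of count.
count=0
printf 'a\nb\nc\n' | while read -r line; do count=$((count + 1)); done
echo "after pipe: $count"    # after pipe: 0 (bash default settings)

# Command substitution captures everything the loop writes to stdout.
p=$(printf 'a\nb\nc\n' | while read -r line; do echo "seen $line"; done)
echo "$p"
```

The same wrapping should apply to a for loop followed by sort: put the whole pipeline inside p=$( ... ) rather than assigning p inside the loop body.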

What I'd recommend is processing everything in the while loop and use du -sb for precision in calculations. Thus, you could do:

# read what grep finds line by line and print total once there's no more lines from grep
grep 'dook' /etc/trueuserowners | while IFS=":" read -r first_word everything_else || { echo "$total"; break;} 
do 
    user_usage=$( du -sb /home/"$first_word" | awk '{print $1}' )
    # output the usage for current user
    printf "%s\t/home/%s\n" "$user_usage"  "$first_word"
    total=$(($total+$user_usage))
done

Notice how I used || { echo "$total"; break;}. Once there's nothing to read from stdin ( which in this case comes from pipe ), read command returns exit status of 1, so when read returns 1 we know it's done reading and processing, and we can output the total usage we calculated.
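
The same pattern can be tried without touching du at all. This is a sketch under the assumption that three made-up numbers stand in for per-user byte counts; the result variable is only there so the loop's output can be inspected afterwards.

```shell
# total is set before the pipeline; the subshell running the loop gets its own copy.
total=0

# When read hits end of input it fails, so the right side of || runs:
# print the total, then break out of the loop.
result=$(printf '10\n20\n30\n' | while read -r n || { echo "total $total"; break; }
do
    echo "item $n"
    total=$((total + n))
done)

echo "$result"
```

Initializing total up front is an addition on my part; it avoids an empty total line when the input has no lines at all.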

As for outputting human-readable data, we could make use of numfmt or some other utilities. Something like numfmt --to=iec-i --suffix=B $user_usage would suffice.
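
For example, a quick sketch with two byte counts picked so the conversion is exact (this assumes a coreutils build that ships numfmt; the variable names are only for illustration):

```shell
# Convert raw byte counts to IEC units with a trailing "B".
four_kib=$(numfmt --to=iec-i --suffix=B 4096)
one_mib=$(numfmt --to=iec-i --suffix=B 1048576)

echo "$four_kib"   # 4.0KiB
echo "$one_mib"    # 1.0MiB
```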

Overall, this can be used as one-liner if we trim variable names to something short, but there's no particular advantage in having a one-liner. Just do things as correctly as possible and don't worry about code length.

Putting it all together, the complete solution should be:

grep 'dook' /etc/trueuserowners | while IFS=":" read -r username trash || { printf "%s\ttotal\n" $( numfmt --to=iec-i --suffix=B "$total"); break;} 
do 
    # get usage in bytes
    user_usage=$( du -sb /home/"$username" | awk '{print $1}' )
    # get human-readable
    usage_human_readable=$( numfmt --to=iec-i --suffix=B "$user_usage" )
    # output the usage for current user
    printf "%s\t/home/%s\n" "$usage_human_readable"  "$username"
    total=$(($total+$user_usage))
done
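
If numfmt is not available on a system, one possible substitute is a small awk-based function. This is only a sketch and not a drop-in replacement: human_readable is a hypothetical helper name, it stops at TiB, and its rounding may differ slightly from what numfmt prints.

```shell
# Hypothetical helper: print a byte count with a binary-unit suffix.
human_readable() {
    awk -v bytes="$1" 'BEGIN {
        split("B KiB MiB GiB TiB", unit, " ")
        i = 1
        # divide by 1024 until the value fits the current unit
        while (bytes >= 1024 && i < 5) { bytes /= 1024; i++ }
        printf "%.1f%s\n", bytes, unit[i]
    }'
}

human_readable 4096      # 4.0KiB
human_readable 1048576   # 1.0MiB
```

In the loop above, the two numfmt calls could then be swapped for human_readable "$user_usage" and human_readable "$total".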

Can you tell me what numfmt is? I tried getting the man page for it and it appears the command doesn't exist ` man numfmt No manual entry for numfmt [root@pro:/root]$ numfmt -bash: numfmt: command not found ` — Binary Accepted, Dec 01 '17 at 23:10
@TheDocs `numfmt` is command, part of `coreutils` package, just like standard `mv` and `cp` utilities are. Check your `coreutils` package version, via `apt-cache show coreutils | grep 'Version'`. The `numfmt` utility is there since version 8.21 ( i.e. since February 2013 ) — Sergiy Kolodyazhnyy, Dec 01 '17 at 23:13
Would there be an alternative to using numfmt? I wouldn't be able to install this on production servers. This would be running primarily on CentOS 6/7 servers. The idea is to use this to help me find resold users going over a certain threshold. Additionally I must thank you for showing me a different way to use the while loop. I have been looking into this and it has changed my perspective a great deal and have learned a lot from it. I really appreciate it. — Binary Accepted, Dec 04 '17 at 17:21
