3

With rsync --info=progress2, the percent-complete figure shown while copying a large directory is non-uniform: the last 10% seems to take longer than the first 90%.

Why is this, and is there a way to get a more uniform progress indicator?

eyn
  • 1
    If there's anything that Windows has taught us, "proper" progress indication is *hard*. – muru Jul 20 '18 at 01:11
  • 1
    @muru especially when a filesystem cache is in action that pretends high writing speeds until it is full. – PerlDuck Jul 20 '18 at 07:28

3 Answers

7

The reason for the observed behaviour is likely a file system cache:

When a program such as rsync writes files, the data usually goes to an in-memory cache first and the write operation returns almost instantly. The data is then written to disk in the background while the user can already do other things.

If the cache is large enough to hold all the data being written, the apparent write speed is very high.

If the data to be written doesn't fit into the file system cache, the excess has to wait for actual disk writes before the write operation completes, and writing to disk is slower than writing to an in-memory cache.

The excess data doesn't bypass the cache; rather, it waits until earlier content has been moved from the cache to the disk, which frees space in the cache for the new data.

So the first part of the data (90% in your case) appears to be written almost instantly (to the cache), while the last 10% takes more time because actual disk operations kick in.
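
To make the effect concrete, here is a minimal sketch of a toy model of a write cache. All the numbers (cache size, cache speed, disk speed) are made up for illustration, and the function name is hypothetical. Real kernels manage dirty pages in a more complicated way, so treat this as an approximation only.

```python
def seconds_until_accepted(mb, cache_mb, cache_rate, disk_rate):
    """Time until `mb` megabytes have been accepted by the OS (toy model).

    Assumptions (all hypothetical): the cache accepts data at `cache_rate`
    MB/s and is drained to disk at `disk_rate` MB/s at the same time.
    Once the cache is full, new data is accepted only as fast as the
    disk drains it.
    """
    if cache_rate <= disk_rate:
        # The cache never fills up, so it is never the bottleneck.
        return mb / cache_rate
    fill_time = cache_mb / (cache_rate - disk_rate)  # cache fills at the net rate
    fast_mb = cache_rate * fill_time                 # data accepted before it is full
    if mb <= fast_mb:
        return mb / cache_rate
    return fill_time + (mb - fast_mb) / disk_rate


if __name__ == "__main__":
    total_mb = 10_000  # made-up copy size
    for percent in (50, 90, 100):
        t = seconds_until_accepted(total_mb * percent / 100,
                                   cache_mb=8_100, cache_rate=1_000, disk_rate=100)
        print(f"{percent:3d}% reported after {t:.1f} s")
```

With these invented values, the first 90% is reported after 9 seconds and the last 10% takes another 10 seconds, which matches the shape of the behaviour described in the question.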

PerlDuck
  • 1
    See also https://askubuntu.com/questions/5051/how-to-switch-off-caching-for-usb-device-when-writing-to-it – muru Jul 20 '18 at 09:21
  • 1
    This makes total sense. Thanks. If I had enough points to upvote, I would. But, it's my first questions here. ;) – eyn Jul 24 '18 at 22:22
  • @eyn Oh, I wasn't aware that we need 15 reputation points to upvote. But you don't need any reputation to _accept_ an answer if you think it solved your problem. By far I don't want to talk you into that. Take your time and perhaps read [_What should I do when someone answers my question?_](https://askubuntu.com/help/someone-answers). :-) – PerlDuck Jul 25 '18 at 09:33
  • yep. Gotcha and thanks. I hadn't even realized I could accept an answer. – eyn Jul 26 '18 at 02:51
2

In addition to PerlDuck's answer, it's worth noting that writing one large file is faster than writing a huge number of files that add up to the same size.

For example: you copy a 4 GB file and a directory containing 100,000 files that add up to 1 GB. If the single file is transferred first, the first 80% will be much faster than the last 20%.
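
A rough back-of-the-envelope version of this example, assuming a hypothetical sustained throughput of 100 MB/s and a hypothetical fixed cost of 1 ms per file (for creating the file, setting metadata and so on). Neither number comes from a measurement; they only illustrate how per-file overhead can dominate.

```python
GB = 10**9


def copy_seconds(total_bytes, file_count, throughput, per_file_overhead):
    """Estimated copy time: raw data transfer plus a fixed cost per file."""
    return total_bytes / throughput + file_count * per_file_overhead


# Hypothetical values, chosen only for illustration.
THROUGHPUT = 100 * 10**6   # bytes per second
OVERHEAD = 0.001           # seconds per file

big_file = copy_seconds(4 * GB, 1, THROUGHPUT, OVERHEAD)
small_files = copy_seconds(1 * GB, 100_000, THROUGHPUT, OVERHEAD)

print(f"first 80% of the bytes (one 4 GB file):       {big_file:.0f} s")
print(f"last 20% of the bytes (100,000 small files):  {small_files:.0f} s")
```

Under these assumptions the single large file takes about 40 seconds, while the small files take about 110 seconds even though they hold only a quarter as much data.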

danzel
1

The percentage shown is only the percentage of the total size of the files rsync has already scanned. In the output from --info=progress2, for instance:

71,256,901,358  99%   36.30MB/s    0:31:12 (xfr#173389, ir-chk=1000/361047)

the last number, 361047, is the number of files scanned so far. When you recursively copy a large directory with lots of subdirectories and files, this number will typically keep growing until the operation is almost complete. The files that have already been scanned but haven't been copied yet will typically be only a small fraction of the total number of files. So unless an unusually large file has been scanned but not copied, most of the data in the scanned files has already been copied, and the percentage will therefore be above 90% most of the time.
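
To watch this happening, the fields of a progress line like the one above can be pulled apart with a small parser. This is only a sketch: the regular expression is written against the sample line shown here, not against a formal specification of rsync's output, and the function and key names are hypothetical. As far as I know, rsync prints `ir-chk` while it is still scanning incrementally and `to-chk` once the file list is complete.

```python
import re

# Pattern fitted to the sample line above; other rsync versions may differ.
PROGRESS_RE = re.compile(
    r"(?P<bytes>[\d,]+)\s+(?P<percent>\d+)%\s+(?P<rate>\S+)\s+(?P<time>[\d:]+)"
    r"(?:\s+\(xfr#(?P<xfr>\d+),\s*(?P<kind>ir|to)-chk="
    r"(?P<remaining>\d+)/(?P<total>\d+)\))?"
)


def parse_progress2(line):
    """Return a dict of fields from one progress2 line, or None if no match."""
    m = PROGRESS_RE.search(line)
    if m is None:
        return None
    info = {
        "bytes": int(m["bytes"].replace(",", "")),
        "percent": int(m["percent"]),
    }
    if m["kind"] is not None:
        info.update(
            transferred_files=int(m["xfr"]),
            still_scanning=(m["kind"] == "ir"),  # file list not complete yet
            remaining=int(m["remaining"]),       # files still to be checked
            known_total=int(m["total"]),         # files scanned so far
        )
    return info


sample = "71,256,901,358  99%   36.30MB/s    0:31:12 (xfr#173389, ir-chk=1000/361047)"
print(parse_progress2(sample))
```

If `known_total` keeps growing between lines, the percentage is still relative to a partial file list. rsync's `--no-inc-recursive` option (short form `--no-i-r`) makes it build the whole file list before transferring, so the percentage should then be relative to the full total, at the cost of a longer start-up scan.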

joriki