I am trying to reduce image size using dd conv=sparse, but no matter what I do, the image size is still huge.

I have a CentOS image which is around 136GB. The main reason it is so big is that it has a 128GB swap partition.

When I ran du -sh, this was the output:

# du -sh centos_raw.img
2.0G    centos_raw.img

So what I have attempted to do is to use virt-sparsify to reduce it. Here are the results:

# virt-sparsify centos_raw.img centos_raw_sparse.img
[   0.2] Create overlay file in /tmp to protect source disk
[   0.2] Examine source disk
- 25% [###############################################----------------------------------------------------------------------------------------------------------------------------------------------] --:--
 100% [#############################################################################################################################################################################################] 00:00
[  32.1] Fill free space in /dev/sda2 with zero
 100% [#############################################################################################################################################################################################] 00:00
[  43.5] Clearing Linux swap on /dev/sda3
 100% [#############################################################################################################################################################################################] 00:00
[ 955.3] Fill free space in /dev/sda4 with zero
 100% [#############################################################################################################################################################################################] 00:00
[1054.3] Copy to destination and make sparse
[1068.3] Sparsify operation completed with no errors.
virt-sparsify: Before deleting the old disk, carefully check that the
target disk boots and works correctly.

However, the size is still the same. It is still 136GB.

When I ran du -sh again, the size did get reduced.

# du -sh centos_raw_sparse.img
1.3G    centos_raw_sparse.img

I am planning to use this as an OpenStack bare-metal image, and the image is really big to be transferred through the network.

CSLser

2 Answers

You say the image is about 136GB. The output from du

2.0G    centos_raw.img

says its size on disk is 2G. You can see both numbers in the output of

ls -hls centos_raw.img

This is exactly how sparseness works: the file appears large but it takes less disk space. It looks like the original image is sparse in the first place.

virt-sparsify managed to create an even sparser file: the new size on disk is only 1.3G. The new file still reports 136GB, which is expected. If the file were truncated, it would most probably become corrupted, so it had to keep its old size.

I think everything worked as it should have. The new file is sparse indeed.
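The difference between the two numbers can be reproduced on a small scale. This is only a sketch, assuming GNU coreutils (`truncate`, `stat -c`, `du --apparent-size`) and a filesystem that supports sparse files; the file name `sparse.img` is made up for the illustration.

```shell
# Create a 1 GiB file that consists only of a hole (no data blocks are written).
tmp=$(mktemp -d)
truncate -s 1G "$tmp/sparse.img"

# Apparent size: what ls -l reports, and what a plain read of the file would produce.
apparent=$(stat -c %s "$tmp/sparse.img")

# Allocated size: number of allocated blocks times the unit stat uses for them.
blocks=$(stat -c %b "$tmp/sparse.img")
unit=$(stat -c %B "$tmp/sparse.img")
allocated=$((blocks * unit))

echo "apparent:  $apparent bytes"
echo "allocated: $allocated bytes"

# The same two views through du.
du -h --apparent-size "$tmp/sparse.img"
du -h "$tmp/sparse.img"

rm -r "$tmp"
```

The first `du` line corresponds to the 136GB figure in the question, the second to the 2.0G and 1.3G figures.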


the image is really big to be transferred through the network

This is your real problem, but in general it doesn't depend directly on sparseness. Blocks of zeros can help. The point of sparseness is to effectively "compress" such blocks, so that they take virtually no disk space. Your files are very sparse, so long sequences of zeros will appear when you read them. Still, a non-sparse file with explicit blocks of zeros (that do take disk space) can be transferred through the network just as efficiently, if you know how to do it. That's why I said "it doesn't depend directly on sparseness". And blocks of zeros (sparse or not) may not help you if you transfer the file with a "wrong" method.
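To see why explicit zeros are cheap to send once they are compressed, here is a small sketch that involves no sparse file at all. It assumes `head -c`, `gzip` and `wc` are available; the 16 MiB size is an arbitrary choice for the illustration.

```shell
# 16 MiB of explicit zero bytes, streamed from /dev/zero rather than read from a sparse file.
raw_bytes=$((16 * 1024 * 1024))

# Count how many bytes would actually travel if the stream were gzip-compressed.
compressed_bytes=$(head -c "$raw_bytes" /dev/zero | gzip -c | wc -c)

echo "raw: $raw_bytes bytes, gzip-compressed: $compressed_bytes bytes"
```

The compressed stream is a tiny fraction of the raw one, and that holds whether the zeros come from holes or from real blocks on disk.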

A few "right" methods:

  1. Using a protocol that transparently compresses data (e.g. SSH with proper settings). This doesn't depend on the source file being sparse. Note you will most likely need an additional step (possibly on the fly) to make the resulting file sparse.
  2. Compressing the file (possibly on the fly) and decompressing at the other end (possibly on the fly). This also doesn't depend on sparseness of the source file; and the additional step is needed to make the resulting file sparse. Example (with dd conv=sparse as the additional step):

    • At the destination:

      nc -l … | gzip -dc | dd bs=512 conv=sparse of=centos_raw_sparse.img
      

      (bs=512 because of this).

    • At the source:

      <centos_raw_sparse.img gzip -c | nc …
      
  3. Archiving the file (possibly on the fly) with tar -S and unarchiving at the other end. The -S option does rely on sparseness of the source file. This is one of the very few scenarios where tarring a single file makes sense. The advantage is that you don't need the additional step. If supported, tar will make the file sparse at the destination by default; all you need is -S at the source.

    In my workflow I would use SSH anyway. This seems reasonable:

    tar -cS centos_raw_sparse.img | ssh destination 'cd /dest/dir && tar -x'
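Methods 2 and 3 can be tried locally before involving the network. The following is a sketch only, assuming GNU tar, GNU dd and GNU coreutils; the local pipes stand in for `nc` or `ssh`, and the names `disk.img` and `via_dd.img` are hypothetical.

```shell
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"

# A 64 MiB test file that is one big hole except for 4 bytes at offset 1 MiB.
truncate -s 64M "$work/src/disk.img"
printf 'data' | dd of="$work/src/disk.img" bs=1 seek=1M conv=notrunc status=none

# Method 2: compress, decompress, and let dd conv=sparse re-create the holes.
gzip -c <"$work/src/disk.img" | gzip -dc |
  dd bs=512 conv=sparse of="$work/dst/via_dd.img" status=none

# Method 3: tar -S records the holes at the source, tar -x restores them.
(cd "$work/src" && tar -cSf - disk.img) | (cd "$work/dst" && tar -xf -)

# Both copies should be byte-for-byte identical to the source.
dd_same=$(cmp -s "$work/src/disk.img" "$work/dst/via_dd.img" && echo yes || echo no)
tar_same=$(cmp -s "$work/src/disk.img" "$work/dst/disk.img" && echo yes || echo no)
echo "dd copy identical: $dd_same, tar copy identical: $tar_same"

# Compare apparent size with size on disk for the copies.
dd_size=$(stat -c %s "$work/dst/via_dd.img")
tar_size=$(stat -c %s "$work/dst/disk.img")
du -h --apparent-size "$work/dst/"*.img
du -h "$work/dst/"*.img

rm -r "$work"
```

If the second `du` output is much smaller than the first, the copies ended up sparse at the destination.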
    
Kamil Maciorowski

Manual and sure way without virt-sparsify (guestfs-tools via libguestfs)

load nbd driver:

sudo modprobe nbd

bind qcow2 image to /dev/nbd0 :

sudo qemu-nbd --connect=/dev/nbd0 tosparsify.qcow2 

shrink partition:

sudo gparted /dev/nbd0 (resize to 4GB - apply) 

unbind from /dev/nbd0:

sudo qemu-nbd --disconnect /dev/nbd0 

resize the qcow2 file:

sudo qemu-img resize --shrink tosparsify.qcow2 4.5G 

after the shrink, the backup GPT is missing from the end of the disk; relocate it to the end of the disk using gdisk's expert menu:

sudo qemu-nbd --connect=/dev/nbd0 tosparsify.qcow2 
sudo gdisk /dev/nbd0

In gdisk, press x (expert menu), then e (relocate backup data structures to the end of the disk), then w (write the table and confirm).

sudo qemu-nbd --disconnect /dev/nbd0

Now you can compress it:

qemu-img convert -c -p -f qcow2 -O qcow2 tosparsify.qcow2 compressed_sparsified.qcow2 

As a result, the image shrinks from 138 GB to probably about 0.9 GB, partly because the file is also compressed.

I have written all this from memory, so there may be some flaws, but it's the general idea. Do not use it before testing on a test qcow2 disk.

Gediz GÜRSU