You can use the busybox dd applet with its bs, count and skip arguments to split a large file into parts.

The dd section of the busybox manpage:
dd [if=FILE] [of=FILE] [ibs=N] [obs=N] [bs=N] [count=N] [skip=N]
        [seek=N] [conv=notrunc|noerror|sync|fsync]

Copy a file with converting and formatting

        if=FILE         Read from FILE instead of stdin
        of=FILE         Write to FILE instead of stdout
        bs=N            Read and write N bytes at a time
        ibs=N           Read N bytes at a time
        obs=N           Write N bytes at a time
        count=N         Copy only N input blocks
        skip=N          Skip N input blocks
        seek=N          Skip N output blocks
        conv=notrunc    Don't truncate output file
        conv=noerror    Continue after read errors
        conv=sync       Pad blocks with zeros
        conv=fsync      Physically write data out before finishing
So basically you would do something like this:
$ dd if=bigfile of=part.0 bs=1024 count=1024 skip=0
$ dd if=bigfile of=part.1 bs=1024 count=1024 skip=1024
$ dd if=bigfile of=part.2 bs=1024 count=1024 skip=2048
For each part.X file, dd writes count * bs bytes, skipping the first skip input blocks (that is, skip * bs bytes) of the input file.
A very basic one-liner (combining busybox's seq, xargs and dd applets) could look like this:
seq 0 19 | xargs -n1 sh -c 'dd if=bigfile of=part.$0 bs=1024 count=1024 skip=$(expr $0 \* 1024)'
producing 20 part.X files of at most 1048576 bytes each.
Example of splitting bigfile:
$ ls -l
total 2940
-rw-rw-r-- 1 user user 3000000 Apr 27 13:21 bigfile
$ seq 0 19 | xargs -n1 sh -c 'dd if=bigfile of=part.$0 bs=1024 count=1024 skip=$(expr $0 \* 1024)'
1024+0 records in
1024+0 records out
1024+0 records in
1024+0 records out
881+1 records in
881+1 records out
0+0 records in
0+0 records out
[...]
$ ls -l
total 5968
-rw-rw-r-- 1 user user 3000000 Apr 27 13:21 bigfile
-rw-rw-r-- 1 user user 1048576 Apr 27 13:43 part.0
-rw-rw-r-- 1 user user 1048576 Apr 27 13:43 part.1
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.10
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.11
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.12
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.13
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.14
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.15
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.16
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.17
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.18
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.19
-rw-rw-r-- 1 user user 902848 Apr 27 13:43 part.2
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.3
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.4
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.5
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.6
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.7
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.8
-rw-rw-r-- 1 user user 0 Apr 27 13:43 part.9
Restoration can easily be done with cat (or with dd again, using the seek parameter). The 0-byte files can be skipped:
$ cat part.0 part.1 part.2 > bigfile.res
$ diff bigfile bigfile.res
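The dd variant mentioned above could look like the following sketch. It assumes the same bs=1024 and 1024-blocks-per-part layout as the example, and creates its own demo input; conv=notrunc is needed so each dd call does not truncate what the previous one wrote:

```shell
# Demo setup: create a 3000000-byte input and split it as in the example above.
head -c 3000000 /dev/urandom > bigfile
dd if=bigfile of=part.0 bs=1024 count=1024 skip=0
dd if=bigfile of=part.1 bs=1024 count=1024 skip=1024
dd if=bigfile of=part.2 bs=1024 count=1024 skip=2048

# Rebuild by writing each part at its block offset in the output file;
# conv=notrunc keeps dd from truncating the file between writes.
dd if=part.0 of=bigfile.res bs=1024 seek=0    conv=notrunc
dd if=part.1 of=bigfile.res bs=1024 seek=1024 conv=notrunc
dd if=part.2 of=bigfile.res bs=1024 seek=2048 conv=notrunc

cmp bigfile bigfile.res && echo "identical"
```

Unlike cat, this writes each part at its absolute offset, so the parts could even be restored out of order.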
Depending on your needs, you could drop seq, compute the actual size of your bigfile, and do all of this in a shell script.
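A minimal sketch of such a script (the input name and part size are placeholders; the demo input line just makes the example self-contained). Deriving the part count from the file size avoids the empty part.X files seen above:

```shell
#!/bin/sh
# Demo input: a 3000000-byte file, as in the example above.
head -c 3000000 /dev/urandom > bigfile

FILE=bigfile          # placeholder input name
BS=1024               # block size in bytes
BLOCKS=1024           # blocks per part
PARTSIZE=$((BS * BLOCKS))

# Compute the number of parts from the real file size, rounding up,
# so no empty part.X files are produced.
SIZE=$(wc -c < "$FILE")
PARTS=$(( (SIZE + PARTSIZE - 1) / PARTSIZE ))

i=0
while [ "$i" -lt "$PARTS" ]; do
    dd if="$FILE" of="part.$i" bs="$BS" count="$BLOCKS" skip=$((i * BLOCKS))
    i=$((i + 1))
done
```

For a 3000000-byte input this produces exactly three parts (part.0 through part.2), the last one holding the 902848-byte remainder.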