Intro To 'split' Command In Linux
2023-12-20 - By Robert Elder
I use the 'split' command to split a file into a series of smaller files. For example, if I start with the following file 'example.txt':
Hello World!
This message will
be split into
multiple files.
ls
example.txt
Then, when I use the following 'split' command:
split -l 1 example.txt
I get 4 additional files, each containing one line of the original file:
ls
example.txt xaa xab xac xad
cat xaa
Hello World!
cat xab
This message will
cat xac
be split into
cat xad
multiple files.
Using 'split' To Split Large Email Attachments
Here, I have a .zip file that I want to send as an email attachment:
ls -l
total 14164
-rw-rw-r-- 1 robert robert 14500000 Nov 20 15:53 my-docs.zip
I can only send attachments that are up to 5 megabytes in size, but this file is almost 15 megabytes. I can use the 'split' command with the -b flag to split the file into three smaller files:
split -b 5MB my-docs.zip
ls -l
total 28344
-rw-rw-r-- 1 robert robert 14500000 Nov 20 15:53 my-docs.zip
-rw-rw-r-- 1 robert robert 5000000 Nov 20 15:54 xaa
-rw-rw-r-- 1 robert robert 5000000 Nov 20 15:54 xab
-rw-rw-r-- 1 robert robert 4500000 Nov 20 15:54 xac
Now, I can attach each file part to a separate email.
On the receiving end, I can use the cat command to reconstruct the original file from the three parts like this:
cat xaa xab xac > restored-my-docs.zip
The hash checksums show a match for both files, indicating that the original file was successfully reconstructed:
md5sum *.zip
45079b805f40208dde4d401dd74b6543 my-docs.zip
45079b805f40208dde4d401dd74b6543 restored-my-docs.zip
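The same round trip can be tried on a small throwaway file. This is a minimal sketch, not the exact session above: the scratch directory, the file name 'sample.bin' and the 1000-byte part size are assumptions chosen to keep it quick, and 'cmp' stands in for 'md5sum' as the equality check.

```shell
#!/bin/sh
# Sketch: split a file, reassemble it, and confirm the copy is identical.
set -e

workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

# Create a 2500-byte stand-in for my-docs.zip (hypothetical test data).
head -c 2500 /dev/urandom > sample.bin

# Split into 1000-byte parts: xaa and xab (1000 bytes each), xac (500 bytes).
split -b 1000 sample.bin

# The shell expands x?? in sorted order, so the parts are joined in sequence.
cat x?? > restored.bin

# cmp -s exits with status 0 only when both files are byte-for-byte identical.
if cmp -s sample.bin restored.bin; then
    result="match"
else
    result="mismatch"
fi
echo "$result"
```

Comparing with 'cmp' only works when both files are on the same machine. When the parts travel by email, comparing checksums as shown above is the practical option.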
Custom Prefixes For File Parts
You can also add a prefix to each file part by providing a second argument to the 'split' command:
split -b 5MB my-docs.zip 'part-'
ls -l
total 28344
-rw-rw-r-- 1 robert robert 14500000 Nov 20 15:53 my-docs.zip
-rw-rw-r-- 1 robert robert 5000000 Nov 20 15:55 part-aa
-rw-rw-r-- 1 robert robert 5000000 Nov 20 15:55 part-ab
-rw-rw-r-- 1 robert robert 4500000 Nov 20 15:55 part-ac
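One place where a prefix seems useful is when several files are split in the same directory, since the default 'x' prefix would make the second split overwrite the parts of the first. The sketch below assumes two small hypothetical files, 'a.txt' and 'b.txt', and the made-up prefixes 'a-part-' and 'b-part-'.

```shell
#!/bin/sh
# Sketch: give each input file its own prefix so the parts do not collide.
set -e

workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

# Two small hypothetical input files.
printf 'one\ntwo\nthree\n' > a.txt
printf 'four\nfive\n' > b.txt

# One line per part, with a distinct prefix for each input file.
split -l 1 a.txt a-part-
split -l 1 b.txt b-part-

# Lists a-part-aa, a-part-ab, a-part-ac, b-part-aa and b-part-ab
# alongside the two input files.
ls

# The prefix also makes it easy to pick out one file's parts with a glob.
cat a-part-* > a-restored.txt
cmp a.txt a-restored.txt && echo "a.txt restored"
```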
Size Units With '-b' Flag
The '-b' flag supports both decimal (powers of 1000) and binary (powers of 1024) size suffixes, as the 'info' page shows:
info split
...
‘-b SIZE’
‘--bytes=SIZE’
Put SIZE bytes of INPUT into each output file. SIZE may be, or may
be an integer optionally followed by, one of the following
multiplicative suffixes:
‘b’ => 512 ("blocks")
‘KB’ => 1000 (KiloBytes)
‘K’ => 1024 (KibiBytes)
‘MB’ => 1000*1000 (MegaBytes)
‘M’ => 1024*1024 (MebiBytes)
‘GB’ => 1000*1000*1000 (GigaBytes)
‘G’ => 1024*1024*1024 (GibiBytes)
and so on for ‘T’, ‘P’, ‘E’, ‘Z’, and ‘Y’.
...
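The difference between the decimal and binary suffixes is easy to overlook, so here is a rough sketch that puts them side by side. It assumes GNU 'split' (the two-letter 'KB' suffix comes from the info page quoted above), and the 3000-byte file 'sample.bin' and the 'kb-' and 'kib-' prefixes are hypothetical.

```shell
#!/bin/sh
# Sketch: compare the decimal 'KB' suffix with the binary 'K' suffix.
set -e

workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

# A 3000-byte hypothetical test file.
head -c 3000 /dev/zero > sample.bin

# 'KB' means 1000 bytes: three parts of 1000 bytes each.
split -b 1KB sample.bin kb-

# 'K' means 1024 bytes: two parts of 1024 bytes and a final part of 952 bytes.
split -b 1K sample.bin kib-

kb_first=$(wc -c < kb-aa)
kib_first=$(wc -c < kib-aa)
kib_last=$(wc -c < kib-ac)
echo "KB part: $kb_first bytes, K part: $kib_first bytes, last K part: $kib_last bytes"

# The same arithmetic for the 5 megabyte email example:
echo $((5 * 1000 * 1000))   # 5MB -> 5000000
echo $((5 * 1024 * 1024))   # 5M  -> 5242880
```

If an attachment limit is measured in decimal megabytes, parts made with '5M' would be 242880 bytes over it, which is a reason to prefer '5MB' in that case.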
Split Files Into A Target Number Of Parts
You can also use the '-n' flag to split the file into a target number of (approximately) equal parts:
split -n 8 my-docs.zip
ls -l
total 28344
-rw-rw-r-- 1 robert robert 14500000 Nov 20 15:53 my-docs.zip
-rw-rw-r-- 1 robert robert 1812500 Nov 20 15:57 xaa
-rw-rw-r-- 1 robert robert 1812500 Nov 20 15:57 xab
-rw-rw-r-- 1 robert robert 1812500 Nov 20 15:57 xac
-rw-rw-r-- 1 robert robert 1812500 Nov 20 15:57 xad
-rw-rw-r-- 1 robert robert 1812500 Nov 20 15:57 xae
-rw-rw-r-- 1 robert robert 1812500 Nov 20 15:57 xaf
-rw-rw-r-- 1 robert robert 1812500 Nov 20 15:57 xag
-rw-rw-r-- 1 robert robert 1812500 Nov 20 15:57 xah
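In the listing above, 14500000 divides evenly by 8, so every part is exactly 1812500 bytes. When the size does not divide evenly, the parts cannot all be the same size, and how the leftover bytes are distributed may depend on the version of 'split'. The sketch below assumes GNU 'split' and a hypothetical 1003-byte file, and only checks what should hold either way: the number of parts and the total size.

```shell
#!/bin/sh
# Sketch: split into a fixed number of parts when the size is not a
# multiple of the part count.
set -e

workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

# 1003 bytes is not divisible by 8 (hypothetical test data).
head -c 1003 /dev/urandom > sample.bin

split -n 8 sample.bin

# Show the individual part sizes.
wc -c x??

# Regardless of how the remainder is distributed, there should be
# exactly 8 parts that add up to the original size.
part_count=$(ls x?? | wc -l)
total_bytes=$(cat x?? | wc -c)
echo "parts: $part_count, total bytes: $total_bytes"
```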
Split Files By Number Of Lines
The '-l' flag can be used to split text files by line quantity. For example, if the file 'foo.txt' contains the following text:
Line One.
Line Two.
Line Three.
Line Four.
Line Five.
Line Six.
Line Seven.
ls -l
total 4
-rw-rw-r-- 1 robert robert 76 Nov 20 16:02 foo.txt
Running the following 'split' command:
split -l 3 foo.txt
will produce the following 3 additional files:
ls -l
total 16
-rw-rw-r-- 1 robert robert 76 Nov 20 16:02 foo.txt
-rw-rw-r-- 1 robert robert 32 Nov 20 16:03 xaa
-rw-rw-r-- 1 robert robert 32 Nov 20 16:03 xab
-rw-rw-r-- 1 robert robert 12 Nov 20 16:03 xac
And the contents of these files will be the following:
cat xaa
Line One.
Line Two.
Line Three.
cat xab
Line Four.
Line Five.
Line Six.
cat xac
Line Seven.
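The number of parts follows from a ceiling division: 7 lines at 3 lines per part gives 3 parts, with the last part holding the single leftover line. As a small sketch, the same 'foo.txt' can be recreated in a scratch directory (an assumption made so that nothing real is overwritten) and checked with 'wc -l':

```shell
#!/bin/sh
# Sketch: confirm the line counts produced by 'split -l 3' on a 7-line file.
set -e

workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

# Recreate the 7-line foo.txt from the example above.
printf 'Line One.\nLine Two.\nLine Three.\nLine Four.\nLine Five.\nLine Six.\nLine Seven.\n' > foo.txt

split -l 3 foo.txt

# Expected part count: ceiling(7 / 3) = 3.
expected_parts=$(( (7 + 3 - 1) / 3 ))
actual_parts=$(ls x?? | wc -l)

# Line counts of the first and last parts: 3 and 1.
lines_first=$(wc -l < xaa)
lines_last=$(wc -l < xac)

echo "expected parts: $expected_parts, actual parts: $actual_parts"
echo "first part: $lines_first lines, last part: $lines_last lines"
```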
Filter File Parts Through External Command
The '--filter' flag allows you to filter each output file through a shell command. After running the following command:
split -l 3 --filter='gzip > $FILE.gz' foo.txt
These 3 new files will be created:
ls -l
total 16
-rw-rw-r-- 1 robert robert 76 Nov 20 16:02 foo.txt
-rw-rw-r-- 1 robert robert 42 Nov 20 16:08 xaa.gz
-rw-rw-r-- 1 robert robert 43 Nov 20 16:08 xab.gz
-rw-rw-r-- 1 robert robert 32 Nov 20 16:08 xac.gz
The resulting binary contents of these files will look something like this:
xxd xaa.gz
00000000: 1f8b 0800 0000 0000 0003 f3c9 cc4b 55f0 .............KU.
00000010: cf4b d5e3 f201 b142 caf3 61ac 8ca2 54a0 .K.....B..a...T.
00000020: 2800 9359 250c 2000 0000 (..Y%. ...
xxd xab.gz
00000000: 1f8b 0800 0000 0000 0003 f3c9 cc4b 5570 .............KUp
00000010: cb2f 2dd2 e3f2 0133 33cb 52a1 cce0 cc0a ./-....33.R.....
00000020: 3d2e 00fd 0611 cc20 0000 00 =...... ...
xxd xac.gz
00000000: 1f8b 0800 0000 0000 0003 f3c9 cc4b 5508 .............KU.
00000010: 4e2d 4bcd d3e3 0200 587f 7cdc 0c00 0000 N-K.....X.|.....
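To get the original text back from the compressed parts, one approach is to concatenate them in order and decompress the result in a single pass, since 'gzip' treats a series of concatenated compressed members as one stream. The sketch below assumes GNU 'split' (for '--filter') and that 'gzip' is installed. It recreates 'foo.txt' in a scratch directory, and the name 'restored.txt' is hypothetical.

```shell
#!/bin/sh
# Sketch: compress each part with --filter, then restore the original text.
set -e

workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

# Recreate the 7-line foo.txt from the example above.
printf 'Line One.\nLine Two.\nLine Three.\nLine Four.\nLine Five.\nLine Six.\nLine Seven.\n' > foo.txt

# The single quotes matter: $FILE must reach split's filter command
# unexpanded, because split sets it to the name of each output part.
split -l 3 --filter='gzip > $FILE.gz' foo.txt

# Concatenate the compressed parts in order and decompress them together.
cat x??.gz | gzip -dc > restored.txt

cmp foo.txt restored.txt && echo "foo.txt restored"
```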
And that's why the 'split' command is my favourite Linux command.