| Title: How to split a file into small parts | |
| Author: Solène | |
| Date: 21 March 2021 | |
| Tags: openbsd unix | |
| Description: | |
| # Introduction | |
| Today I will present the userland program "split" that is used to split | |
| a single file into smaller files. | |
| OpenBSD split(1) manual page | |
| # Use case | |
| split creates several smaller files from a single file. The original | |
| file can be recreated by running cat on all the small files, in the | |
| correct order. | |
| There are several use cases for this: | |
| - store a single file (like a backup) on multiple media (floppies, | |
| 700 MB CDs, DVDs, etc.) | |
| - parallelize the processing of a file, for example: split a huge log | |
| file into small parts and run the analysis on each part | |
| - distribute a file across a few people (I have no idea what the use | |
| would be, but I like the idea) | |
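The log analysis use case could look like the following sketch. Everything in it is made up for illustration: the file name app.log, its generated content, the part. prefix and the ERROR pattern are assumptions, not taken from a real system.

```shell
#!/bin/sh
# Sketch: split a log file, then analyze every part in parallel.
# app.log, its content and the "part." prefix are hypothetical.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Generate 2500 sample lines; every 5th line is an ERROR.
awk 'BEGIN {
    for (i = 1; i <= 2500; i++) {
        level = (i % 5 == 0) ? "ERROR" : "INFO"
        print level, "line", i
    }
}' > app.log

# 1000 lines per part: part.aa, part.ab and part.ac
split -l 1000 app.log part.

# Count the ERROR lines of each part in a background job.
for f in part.*; do
    grep -c ERROR "$f" > "$f.count" &
done
wait

# Sum the per-part counts.
total=$(cat part.*.count | awk '{ s += $1 } END { print s }')
echo "total errors: $total"
```

Counting lines is cheap enough that the parallelism brings nothing here. The pattern only pays off when the per-part work is expensive.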
| # Usage | |
| Its usage is very simple: run split on a file or feed it data on its | |
| standard input, and it will create files of 1000 lines each by | |
| default. Use -b to give a size in kB or MB for the new files, or -l to | |
| change the default of 1000 lines. split can also start a new file each | |
| time a line matches a regex given with -p. | |
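As a small illustration of -l and of reading from standard input, here is a sketch. The file name notes.txt and the prefixes chunk. and piped. are arbitrary names chosen for this example. It sticks to -l because -p is not part of the POSIX split specification and may be missing on other systems.

```shell
#!/bin/sh
# Sketch: line-based splitting, from a file and from standard input.
# notes.txt and both prefixes are hypothetical names.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Generate 25 sample lines.
awk 'BEGIN { for (i = 1; i <= 25; i++) print "line", i }' > notes.txt

# 10 lines per file, using "chunk." as prefix instead of the default "x".
split -l 10 notes.txt chunk.
wc -l chunk.*

# Same thing from standard input: "-" stands for stdin.
# Only the last 15 lines are fed, so two files are created.
tail -n 15 notes.txt | split -l 10 - piped.
ls piped.*
```

The prefix is given as the last argument, so the input file (or "-") has to be spelled out whenever a prefix is used.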
| Here is a simple example splitting a file into 1300 kB parts and then | |
| reassembling the file from the parts, using sha256 to compare the | |
| checksums of the original and reconstructed files. | |
| ```split and reassemble example | |
| solene@kongroo ~/V/pmenu> split -b 1300k pmenu.mp4 | |
| solene@kongroo ~/V/pmenu> ls | |
| pmenu.mp4 xab xad xaf xah xaj xal xan | |
| xaa xac xae xag xai xak xam | |
| solene@kongroo ~/V/pmenu> cat x* > concat.mp4 | |
| solene@kongroo ~/V/pmenu> sha256 pmenu.mp4 concat.mp4 | |
| SHA256 (pmenu.mp4) = e284da1bf8e98226dc78836dd71e7dfe4c3eb9c4172861bafcb1e2afb… | |
| SHA256 (concat.mp4) = e284da1bf8e98226dc78836dd71e7dfe4c3eb9c4172861bafcb1e2afb… | |
| solene@kongroo ~/V/pmenu> ls -l x* | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xaa | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xab | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xac | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xad | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xae | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xaf | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xag | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xah | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xai | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xaj | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xak | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xal | |
| -rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xam | |
| -rw-r--r-- 1 solene wheel 810887 Mar 21 16:50 xan | |
| ``` | |
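The same round trip can be scripted without relying on the default x prefix. This is only a sketch: original.bin, rebuilt.bin and the part. prefix are hypothetical names, the sample data is generated on the spot, and cmp is used in place of sha256 because it only has to tell whether both files are identical.

```shell
#!/bin/sh
# Sketch: split by size, reassemble, and check the result.
# original.bin, rebuilt.bin and the "part." prefix are hypothetical.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# 500 lines of 10 bytes each (9 digits + newline) = 5000 bytes.
awk 'BEGIN { for (i = 0; i < 500; i++) printf "%09d\n", i }' > original.bin

# 2k means 2048 bytes per part; -a 3 asks for 3-letter suffixes,
# giving part.aaa, part.aab and part.aac.
split -b 2k -a 3 original.bin part.
ls -l part.*

# The shell expands part.* in sorted order, which matches the order
# in which split wrote the parts.
cat part.* > rebuilt.bin

if cmp -s original.bin rebuilt.bin; then
    echo "rebuilt.bin is identical to original.bin"
else
    echo "rebuilt.bin differs from original.bin"
fi
```

Writing the reassembled file under a name that does not match the part.* pattern keeps it out of a later cat part.* run.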
| # Conclusion | |
| If you ever need to split files into small parts, think about the | |
| command split. | |
| For more advanced splitting requirements, the program csplit can be | |
| used. I won't cover it here, but I recommend reading its manual page. | |
| csplit manual page |