I need to split large text files (around 10 GB each) into multiple text files (mostly 1 GB files), and later join those same pieces back into one file.
If you have the `split` command, then try this.
Example:
split -b1024 your_large_file.txt sample_prefix
It will split the large file into pieces of 1024 bytes each, named with the given prefix.
Join:
cat sample_prefixaa sample_prefixab sample_prefixac > final_org_largefile.txt
It will concatenate the contents of the split files and produce a single file.
Note: Linux has the `split` command, but I don't know whether GNUwin32 provides it.
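A minimal sketch of the 1 GB split-and-rejoin case from the question (the file names are placeholders, and the `1G` size suffix assumes GNU coreutils `split`):

```shell
# Make a sample "large" file (just a few KB here, for illustration).
head -c 4096 /dev/urandom > your_large_file.txt

# Split into 1 GB pieces; split names them sample_prefixaa, sample_prefixab, ...
split -b 1G your_large_file.txt sample_prefix

# Rejoin: the shell expands globs in sorted order, which matches the
# aa, ab, ac... suffixes, so the pieces concatenate in the right order
# without listing each file by hand.
cat sample_prefix* > final_org_largefile.txt

# Verify the rejoined file is byte-identical to the original.
cmp your_large_file.txt final_org_largefile.txt && echo OK
```

Because the glob expands in sorted order, this also covers the case in the comments where you don't know in advance how many pieces `split` produced.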

sat
- Thanks! Can we do this same scenario with `csplit`? – Bobby Feb 20 '14 at 07:27
- For the join, is it mandatory to give file names like `sample_prefixaa`? I mean, I want to join with `*.*`, because I don't know how many files the `split` command may produce. – Bobby Feb 20 '14 at 07:42
- @Raj, not like that; those are the names of the split files, and they must be given in order. If you look at the example, the last argument of the `split` command is the prefix for the split files. The prefix can be anything. – sat Feb 20 '14 at 07:45
- I got that, but I want to join with `*.*`, because I don't know how many files the `split` command may produce. – Bobby Feb 20 '14 at 07:47
- @Raj, if you split the file into 1 GB pieces, you will have only 10 files. – sat Feb 20 '14 at 08:06
- Yes, but if a file sometimes comes to around 12 GB, how can I get the correct result at runtime? – Bobby Feb 20 '14 at 08:10
- @Raj, like this: `sample_prefix*` – sat Feb 20 '14 at 08:20
- Use the `-u` or `--unbuffered` option on your `sed` with huge files to avoid memory problems. – NeronLeVelu Feb 20 '14 at 09:01