
I need to split large text files (around 10 GB) into multiple smaller text files (mostly 1 GB each), and then join those same files back into one file.

Bobby

1 Answer

If you have the `split` command, then try this.

Example:

split -b1024 your_large_file.txt sample_prefix

It will split the large file into pieces of 1024 bytes each, named with the given prefix (`sample_prefixaa`, `sample_prefixab`, ...).
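For the 1 GB pieces the question asks about, GNU `split` also accepts a size suffix. A minimal sketch (the file names here are placeholders, and a tiny sample file stands in for the real 10 GB file):

```shell
# Create a small stand-in for the large file, then split it.
printf 'line %s\n' $(seq 1 5) > sample.txt

# -b 1G : each piece is at most 1 gigabyte
# -d    : numeric suffixes (00, 01, ...) instead of aa, ab, ...
split -b 1G -d sample.txt sample_prefix

# The sample is far smaller than 1 GB, so a single piece is produced:
ls sample_prefix*
```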

Join:

cat sample_prefixaa sample_prefixab sample_prefixac > final_org_largefile.txt

It will concatenate the contents of the split files and produce a single file.

Note: Linux has the `split` command, but I don't know whether GNUwin32 provides it.

sat
  • Thanks! Can we do the same thing with csplit? – Bobby Feb 20 '14 at 07:27
  • For the join, is it mandatory to give file names like `sample_prefixaa`? I mean, I want to join with something like `*.*`, because I don't know how many files the split command may produce – Bobby Feb 20 '14 at 07:42
  • @Raj, Not like that. Those are the names of the split files, but they must be given in order. If you look at the example, the last argument of the `split` command is the file name prefix for the split files. The prefix can be anything. – sat Feb 20 '14 at 07:45
  • I got that! But I want to join with something like `*.*`, because I don't know how many files the split command may produce – Bobby Feb 20 '14 at 07:47
  • @Raj, If you split the file by 1GB, then you will have only 10 files. – sat Feb 20 '14 at 08:06
  • Yes, but if the file sometimes comes to around 12 GB, how can I get the correct result at runtime? – Bobby Feb 20 '14 at 08:10
  • @Raj, Like this: `sample_prefix*` – sat Feb 20 '14 at 08:20
  • Use the `-u` or `--unbuffered` option on your sed with huge files to avoid memory problems – NeronLeVelu Feb 20 '14 at 09:01
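The glob-based join discussed in the comments can be sketched as a full round trip. The file names are illustrative; this relies on the shell expanding globs in sorted order, which matches the order of `split`'s generated suffixes:

```shell
# Round trip: split a file, then rejoin it without listing every piece.
printf 'some data %s\n' $(seq 1 100) > original.txt

split -b 256 original.txt piece_     # produces piece_aa, piece_ab, ...
cat piece_* > rebuilt.txt            # glob expands in sorted order, preserving piece order

cmp original.txt rebuilt.txt         # exits 0 (no output) when the files are identical
```

This works no matter how many pieces `split` produced, which is the point raised in the comments.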