
I am trying to use the csplit command on a 700 MB file. I would like to split the file into 30 smaller files while respecting the tag name I use to start a new file.

<head>
 <test>1 </test>
</head>

<head>
 <test>2 </test>
</head>

<head>
 <test>3 </test>
</head>
...
<head>
 <test> 80 </test>
 </head>

Suppose I have 80 tag groups and I would like to generate exactly 30 files. How can I do this using csplit? What I have so far is:

csplit output_prefix File '/<head>/' '{*}'
iceman225

1 Answer


Discovering csplit is half the battle! This should work:

% csplit --prefix File --elide-empty-files foo.xml '/<head>/' '{*}'
33
33
...

% ls
File00  File01  File02  ...  foo.xml

The option/argument order is important. Try csplit --help to see all of its options.

% csplit --help
Usage: csplit [OPTION]... FILE PATTERN...
Micah Elliott
  • Thank you. The problem I face is producing an exact number of generated files. How can I fix the number of generated files? Depending on the user, I can have 40 groups or more. – iceman225 Jul 29 '15 at 08:09
  • All `csplit` can do is split on a pattern. If you only want 30 files but there are 40 groups, then your last file will have 10 extra groups once you’ve limited it. You can limit the number of splits with the last argument: in this case, change `{*}` to `{30}`. – Micah Elliott Jul 29 '15 at 13:11