5

I work as a data warehouse programmer, so I have to put numerous flat files through an ETL process. Before loading a file I need to know what it contains. The problem is that the majority of the files are larger than 1 GB, and I cannot open them with my dear old friend "notepad". Kidding. I usually use VIM or Notepad++, but it still takes a while to open the file. Could I perform a "partial" read of the file using VIM or some other editor?

P.S. I know that I could write a 10-line script to "data sample" the file, but it would be simpler to convince team members to use a feature of an editor than a script that I wrote.

Thank you for any insight you might have.

a_person

6 Answers

3

If you want to stick with using vim, you could have a look at the LargeFile script.

Alternatively, I've always found that UltraEdit opens large files extremely quickly.

Chad Birch
3

You said you have VIM, which makes me wonder whether you have a Unix environment as well.

If so, you can use the Unix utility head to display the first lines of the raw input on your screen, like this:

EDIT: (thanks Honk)

terminal$> head -n 15 file.csv

(The 15 means you want to see only the first 15 lines.)
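
If it helps, here is a rough sketch that wraps the same idea in a small shell function, showing both the start and the end of a file without opening it in an editor. The function name `peek` and the default of 15 lines are my own assumptions for illustration, not anything standard.

```shell
# peek FILE [N]: print the first N and the last N lines of FILE (default 15).
# head stops reading once it has printed N lines, so the size of the file
# does not matter for the first part of the output.
peek() {
    file=$1
    n=${2:-15}
    head -n "$n" "$file"
    echo "..."              # visual separator between the two ends
    tail -n "$n" "$file"
}
```

For example, `peek file.csv 5` would print the first five and the last five lines of file.csv.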

rlb.usa
2

Pretty sure there are loads of similar questions, but hey, Textpad is a good choice for this.

Simon
  • Verified & Confirmed. Textpad opened a 1.3 GB file flawlessly in 6 seconds for me (although saving it took much, much longer). – rlb.usa Apr 01 '10 at 19:32
  • TextPad ended up being way too slow when tasked with opening the file, taking quite a bit longer than Notepad++. – a_person Apr 01 '10 at 20:22
2

Use the head command.
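
head only shows the start of the file. If a sample spread across the whole file is wanted, something along these lines might do as a starting point. This is only a sketch: the function name `sample_lines` and the default step of 1000 are assumptions of mine, and it relies on awk being available.

```shell
# sample_lines FILE [STEP]: print line 1 (often the header) and then every
# STEP-th line after it (default 1000). awk streams the file line by line,
# so memory use stays small, but the whole file is still read once.
sample_lines() {
    file=$1
    step=${2:-1000}
    awk -v step="$step" '(NR - 1) % step == 0' "$file"
}
```

For example, `sample_lines file.csv 100000 | head -n 15` would show at most 15 lines taken at 100000-line intervals.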

frankc
1

Use 'less' on Solaris, or the same through Cygwin on Windows. On mainframes this problem doesn't appear; the ISPF editor handles it pretty well.

ankur
0

UltraEdit claims to handle files over 4GB...

Dave Swersky