
Sometimes I edit large files, over 100GB. (My PC has 128GB of physical memory and an NVMe SSD.)

  1. Small change vs. fast save: when I make a small change to the file, like deleting the first line, is there a more efficient way to save it? A 200GB file takes half an hour to save.

  2. Sometimes EmEditor detects JSON or CSV error rows. Is there an easy way to mark these rows as bookmarks? That would make it easy to extract or delete them.

  3. Can sequence-number auto-filling be used in Replace?

When editing more than 100M rows, as far as I know, the normal approach is to switch into CSV mode, insert a new column, and then fill it with sequence numbers. These steps are also time-consuming.

Can these steps be done with the Replace function? An example is below.

Example:

{"Genres":"Drama","Product":"Ice Cream - Super Sandwich","Title":"White Lightnin'"} {"Genres":"Drama|War","Product":"Raspberries - Frozen","Title":"Leopard, The (Gattopardo, Il)"} {"Genres":"Crime|Drama|Film-Noir","Product":"Cookie Dough - Chunky","Title":"Limits of Control, The"} {"Genres":"Drama|Mystery","Product":"Watercress","Title":"Echoes from the Dead (Skumtimmen)"} {"Genres":"Drama|Thriller","Product":"Cumin - Whole","Title":"Good People"}

This needs to be converted into:

{"id":1,"Genres":"Drama","Product":"Ice Cream - Super Sandwich","Title":"White Lightnin'"} {"id":2,"Genres":"Drama|War","Product":"Raspberries - Frozen","Title":"Leopard, The (Gattopardo, Il)"} {"id":3,"Genres":"Crime|Drama|Film-Noir","Product":"Cookie Dough - Chunky","Title":"Limits of Control, The"} {"id":4,"Genres":"Drama|Mystery","Product":"Watercress","Title":"Echoes from the Dead (Skumtimmen)"} {"id":5,"Genres":"Drama|Thriller","Product":"Cumin - Whole","Title":"Good People"}

Data created by Mockaroo.
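(For comparison outside the editor: the conversion above can also be done in a single streaming pass, so the whole file never has to fit in memory. Below is a minimal sketch in Python; it assumes one JSON object per line, and the file names are placeholders.)

    import json

    # Stream the file line by line and prepend an incrementing "id" field.
    # File names are placeholders; adjust to the real paths.
    with open("movies.jsonl", "r", encoding="utf-8") as src, \
         open("movies_with_id.jsonl", "w", encoding="utf-8") as dst:
        for line_number, line in enumerate(src, start=1):
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            numbered = {"id": line_number, **record}   # "id" becomes the first key
            dst.write(json.dumps(numbered, ensure_ascii=False) + "\n")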

Jon Shaw
  • Putting aside the whole text editor stuff for a moment, it looks like you have essentially a flat list that you're using as a database. Perhaps a database is what you actually need (e.g. SQLite), since such technology is designed to handle large amounts of data. Second question: are these names, IPs and email addresses from real people? You should not be publishing that information here. – paddy Aug 17 '21 at 03:35
  • This is random data created by Mockaroo; it's fake data. – Jon Shaw Aug 17 '21 at 07:09
  • Are you using the latest version of EmEditor? Which version are you using? – Yutaka Aug 17 '21 at 15:54
  • using the latest version 21.0.0 – Jon Shaw Aug 18 '21 at 01:00
  • How to remove certain lines of a large file (>5G) using Linux commands: https://stackoverflow.com/questions/40431149/how-to-remove-certain-lines-of-a-large-file-5g-using-linux-commands – Jon Shaw Aug 18 '21 at 01:09
  • From the above link: because of the way files are stored on standard filesystems (NTFS, EXTFS, ...), you cannot remove parts of a file in place. The only things you can do in place are appending to the end of a file (append mode) and modifying data in a file (read-write mode). Other operations must use a temporary file, or temporary memory, to read the file fully and write it back modified. (A sketch of this full-rewrite approach follows these comments.) – Jon Shaw Aug 18 '21 at 01:11
  • How many lines exist in your 100GB (or 200GB) file? Is it an ANSI or UTF-8 file? What is your CPU? How many minutes does it take to delete the first line, and how many minutes to save? I will try to reproduce the speed issue. – Yutaka Aug 18 '21 at 16:41
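(To illustrate the constraint quoted in the comments above: deleting just the first line still means rewriting everything after it into a new file. A rough sketch in Python, with placeholder file names:)

    # Copy everything except the first line to a new file in large chunks;
    # the new file can then replace the original. This full rewrite is
    # essentially what any editor or command-line tool has to do on NTFS/ext4.
    with open("big.jsonl", "rb") as src, open("big.trimmed.jsonl", "wb") as dst:
        src.readline()                           # skip the first line
        while True:
            chunk = src.read(16 * 1024 * 1024)   # 16 MB at a time
            if not chunk:
                break
            dst.write(chunk)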

1 Answer


Assuming you are running a relatively recent version of EmEditor:

  1. Find (Ctrl+F): {
  2. Options: Match Case, Close when Finished, (None)
  3. Click [Select All] (every { in your file should now be selected)
  4. Edit menu - Advanced - Numbering (or Alt+N)
  5. First Line: {"id":1,
  6. Increment: 1
  7. Make sure Decimal is selected
  8. Click [OK]
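To double-check the result afterwards (or, for question 2, to find rows that are not valid JSON so they can be extracted or deleted), a similar streaming check works. A small sketch in Python, with a placeholder file name:

    import json

    # Report the line numbers that fail to parse as JSON.
    bad_lines = []
    with open("movies_with_id.jsonl", "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            try:
                json.loads(line)
            except json.JSONDecodeError:
                bad_lines.append(line_number)

    print(len(bad_lines), "invalid lines:", bad_lines[:10])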

Venturer