Background
I am not a programmer or technical person. I have a project where I need to convert a large text file to an Access database. The text file is not in a traditional flat-file format, so I need some help preprocessing it. The files are large (millions of records, between 100 MB and 1 GB each) and seem to choke every editor I have tried (WordPad, Notepad, Vim, EmEditor).
The following is a sample of the source text file:
product/productId:B000H9LE4U
product/title: Copper 122-H04 Hard Drawn Round Tubing, ASTM B75, 1/2" OD, 0.436" ID, 0.032" Wall, 96" Length
product/price: 22.14
review/userId: ABWHUEYK6JTPP
review/profileName: Robert Campbell
review/helpfulness: 0/0
review/score: 1.0
review/time: 1339113600
review/summary: Either 1 or 5 Stars. Depends on how you look at it.
review/text: Either 1 or 5 Stars. Depends on how you look at it.1 Star because they sent 6 feet of 2" OD copper pipe.0 Star because they won't accept returns on it.5 stars because I figure it's actually worth $12-15/foot and since they won't take a return I figure I can sell it and make $40-50 on this deal
product/productId: B000LDNH8I
product/title: Bacharach 0012-7012 Sling Psychrometer, 25?F to 120?F, red spirit filled
product/price: 84.99
review/userId: A19Y7ZIICAKM48
review/profileName: T Foley "computer guy"
review/helpfulness: 3/3
review/score: 5.0
review/time: 1248307200
review/summary: I recommend this Sling Psychrometer
review/text: Not too much to say. This instrument is well built, accurate (compared) to a known good source. It's easy to use, has great instructions if you haven't used one before and stores compactly.I compared prices before I purchased and this is a good value.
Each line holds one attribute, and each new record starts at the "product/productId:" line.
What I need
I need to convert this file to a character-delimited format (I think the @ symbol would work) by stripping out each of the field codes (i.e. product/productId:, product/title:, etc.), replacing them with @, and replacing the line feeds so that each record ends up on a single line.
I also want to eliminate the review/text: line.
The output would look like this:
B000H9LE4U@Copper 122-H04 Hard Drawn Round Tubing, ASTM B75, 1/2" OD, 0.436" ID, 0.032" Wall, 96" Length@22.14@ABWHUEYK6JTPP@Robert Campbell@0/0@1.0@1339113600@Either 1 or 5 Stars. Depends on how you look at it.
B000LDNH8I@Bacharach 0012-7012 Sling Psychrometer, 25?F to 120?F, red spirit filled@84.99@A19Y7ZIICAKM48@T Foley "computer guy"@3/3@5.0@1248307200@I recommend this Sling Psychrometer
B000LDNH8I@Bacharach 0012-7012 Sling Psychrometer, 25?F to 120?F, red spirit filled@84.99@A3683PMJPFMAAS@Spencer L. Cullen@1/1@5.0@1335398400@A very useful tool
I would then have a flat file delimited with "@" that I can easily import into Access.
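For illustration, the conversion described above could be sketched in Python roughly as follows. This is only a sketch under some assumptions: every field sits on its own line, every record begins with product/productId, the file can be read as UTF-8 (undecodable bytes are replaced), and no value contains the @ character. The function names convert and convert_file are made up for this example. The file is read one line at a time, so its size should not matter the way it does in an editor.

```python
# Field codes in the order they should appear in each output row.
# review/text is deliberately not listed, so that line is dropped.
FIELDS = [
    "product/productId",
    "product/title",
    "product/price",
    "review/userId",
    "review/profileName",
    "review/helpfulness",
    "review/score",
    "review/time",
    "review/summary",
]
RECORD_START = "product/productId"
DELIMITER = "@"  # assumption: "@" never occurs inside a value


def _write_row(record, dst):
    # Missing fields become empty strings so every row has the same column count.
    dst.write(DELIMITER.join(record.get(name, "") for name in FIELDS) + "\n")


def convert(src, dst):
    """Read 'code: value' lines from src and write one delimited row per record.

    src and dst are open text file objects. Returns the number of rows written.
    """
    record = {}
    rows = 0
    for line in src:
        # Split at the first colon only, so colons inside a value are kept.
        key, sep, value = line.rstrip("\r\n").partition(":")
        if not sep:
            continue  # blank line, or a line without any field code
        if key == RECORD_START and record:
            # A new record begins: flush the one collected so far.
            _write_row(record, dst)
            rows += 1
            record = {}
        if key in FIELDS:
            record[key] = value.strip()
    if record:
        # Flush the final record, which has no following productId line.
        _write_row(record, dst)
        rows += 1
    return rows


def convert_file(in_path, out_path, encoding="utf-8"):
    """Convert in_path to out_path, streaming line by line."""
    with open(in_path, encoding=encoding, errors="replace") as src, \
         open(out_path, "w", encoding=encoding) as dst:
        return convert(src, dst)
```

With Python installed, calling convert_file("reviews.txt", "reviews_flat.txt") (both file names are placeholders) would write the flat file and return the number of rows. If any value did contain an @, a different delimiter such as a tab could be set in DELIMITER before importing.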
Sorry for the ramble. I am open to suggestions, but I don't understand programming well enough to write this in an editor's scripting language. Thanks in advance.