-1

I have some data in a text file in the following format:

1079,40,011,1,301 17,310 4,668 6,680 1,682 1,400 7,590 2,591 139,592 332,565 23,568 2,569 2,595 1,471 1,470 10,481 12,540 117,510 1,522 187,492 9,533 41,558 15,555 12,556 9,558 27,546 1,446 1,523 4000,534 2000,364 1,999/
1083,40,021,1,301 4,310 2,680 1,442 1,400 2,590 2,591 90,592 139,595 11,565 6,470 2,540 66,522 4,492 1,533 19,546 3,505 1,523 3000,534 500,999/

These examples represent what would be two rows in a spreadsheet. The first four values (in the first example, "1079,40,011,1") each go into their own column. The rest of the data are in a paired format, first listing a name of a column, designated by a number, then a space followed by the value that should appear in that column. So again, example: 301 17,310 4,668 6: in this row, column 301 has a value of 17, column 310 has value of 4, column 668 has value of 6, etc. Then 999/ indicates an end to that row. Any suggestions on how I can transform this text file format into a usable spreadsheet would be greatly appreciated. There are thousands of "rows" and so can't just manually convert them and I don't possess the coding skills to execute such a transformation myself.

Progman
  • 16,827
  • 6
  • 33
  • 48
  • 1
    Welcome to Stack Overflow. Please take the [tour] to learn how Stack Overflow works and read [ask] on how to improve the quality of your question. Then check the [help/on-topic] to see which questions are on-topic on this site. Please show your attempts you have tried and the problems/error messages you get from your attempts. – Progman Jun 06 '22 at 14:35

1 Answers1

0

This is messy but since there is a pattern it should be doable. What software are you using?

My first idea would be to identify when the delimeter changes from comma to space. Is it based on a fixed width, like always after 14 characters? Or is it based on the delimiter, like it is always after the 4th comma?

Once you've done that, you could make two passes at the data. The first pass imports the first four values from the beginning of the line which are separated by comma. The second pass imports the remaining values which are separated by space.

If you include a row number when importing you can then use it to join first and second passes at importing.

Brooke
  • 38
  • 5
  • Thanks for your reply and suggestion! I can try it in Python or R, but I'm not super adept in either honestly. And yes, it's always after the 4th comma. The part I'm finding tricky is that after the first four values, the format changes to "Column_Name Column_Value" and I don't know how to get those values read into their proper columns. – KlangWiley Jun 06 '22 at 15:45
  • How about Excel? If you open the text file from Excel, with each row in one cell, you can use the FIND function to locate the 4th comma. You can then use MID to separate the left and right sides into 2 new columns based on location of fourth comma. Then take the right column and copy it only into a new text file. That text file can then be opened with space used as delimiter. – Brooke Jun 07 '22 at 10:49
  • I just tried this in Excel, and I used these formulas: Find length of string: =LEN(B2) Find 4th comma: =FIND(CHAR(160),SUBSTITUTE(B2,",",CHAR(160),4)) Extract left side: =MID(B2,1,D2-1) Extract right side: =MID(B2,D2+1,C2-D2) Giving credit where credit due: the 4th comma formula came from https://exceljet.net/formula/find-nth-occurrence-of-character This gave me a "right side" column which could then be pasted to a new text file and imported into Excel as delimited by space. Is this what you're looking for? – Brooke Jun 07 '22 at 12:07
  • Hi, Thanks again for your reply. yes, I think that helps with the first part of the problem. However, the main problem is working with the data on the "right side." Those are number pairs in which the first number indicates the column number and the second number represents the value that should appear in that column. That's the main tricky part I'm having trouble with. Thanks again! – KlangWiley Jun 08 '22 at 15:14
  • What do you want the right side of the data to look like - can you post an example? Should each pair of numbers represent a row? Or do you want to add lots and lots of columns? – Brooke Jun 08 '22 at 20:30
  • Yeah, so the right hand side goes as follows from original example: 301 17,310 4,668 6: in this row, column 301 has a value of 17, column 310 has value of 4, column 668 has value of 6, etc. Then 999/ indicates an end to that row. So, yes, hundreds of columns. Hope that makes sense! – KlangWiley Jun 09 '22 at 20:42