0

Here is a link to a sample spreadsheet with 2 tabs, one for my data output and another for the desired result

I'm importing text data into Sheets and need to split it via a defined fixed width. I know the column width limit, which is where I'm trying to define column breaks as the data will not be wider than that for each column, however it does not have a consistent delimiter.

I've done this previously in Excel using the following VBA code:

Workbooks.OpenText Filename:=sFileName, Origin:=437, StartRow:=41, 
DataType:=xlFixedWidth, FieldInfo:=Array( _
    Array(0, 1), _
    Array(23, 1), _
    Array(34, 1), _
    Array(59, 1), _
    Array(70, 1), _
    Array(79, 1), _
    Array(87, 1), _
    Array(98, 1), _
    Array(114, 1), _
    Array(123, 1)), _
    TrailingMinusNumbers:=True

I've got a code setup to pull a text file into the workbook and I attempted to split using a SLICE function but it gave an error that the range had ~4000 rows by the data had 0 rows so it couldn't generate.

function importTXTfromDrive() {

  var fileName = Browser.inputBox("Enter the name of the text file in your 
Google Drive to import (e.g. myFile.csv):");

  var searchTerm = "title = '"+fileName+"'";

  var files = DriveApp.searchFiles(searchTerm)
  var csvFile = "";


  while (files.hasNext()) {
    var file = files.next();
    if (file.getName() == fileName) {
        csvFile = file.getBlob().getDataAsString();
    break;
    }
  }

  var csvData = Utilities.parseCsv(csvFile);
  ////var col1 = csvData.slice(23,1);
  ////This didn't work so I removed it for now
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  if (ss.getSheetByName(fileName) != null){
   Browser.msgBox("That file has already been opened");
    return 0;
  }

  var create = ss.insertSheet(fileName);

  create.getRange(1, 1, csvData.length, 
csvData[0].length).setValues(csvData); 

////create.getRange(1, 1, csvData.length, 
////csvData[0].length).setValues(col1); 
////This last line was a test in using the SLICE but gave an error  
}
Tommy
  • 65
  • 8
  • 1
    Please share your spreadsheet. – Tedinoz Jun 18 '19 at 21:59
  • Provide sample csvFile and expected output for that file. – TheMaster Jun 19 '19 at 05:16
  • @Tedinoz - I have edited in a link to a sample sheet which includes an example of raw data as well as finished output. https://docs.google.com/spreadsheets/d/1dFGYYx5hUZ2zzKOmwIMb9mZpHATmWXGCqkRwFOqXh4k/edit?usp=sharing – Tommy Jun 19 '19 at 12:40
  • _I know how wide the data can be for each column, but it's variable by row and does not have a consistent delimiter._ I struggle to understand how one could write code for column widths when your prediction for Row1, Column A (24 characters) might not hold true for the rest of the rows. – Tedinoz Jun 19 '19 at 13:37
  • @Tedinoz I worded that poorly, it's the same for every row. I know that Column A will be 24 characters wide all the way down, but some of those characters will be wasted spaces so I will trim them after the fact. However, because the text data is variable, the amount of spaces is not consistent and my data has spaces so I'm not able to do a column split based on the space character. – Tommy Jun 19 '19 at 13:57
  • You could try using split('character'), but unless you manually write some special character between each column you want, I don't think it's possible. – Jescanellas Jun 19 '19 at 14:10

2 Answers2

1

Here's a starting point. There's probably some trimming and possibly some conversion of strings to numbers, but this covers the basic parsing.

function so5665682501() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheetname = "DataOutputTest";
  var sheet = ss.getSheetByName(sheetname);
  var LR = sheet.getLastRow();
  var range = sheet.getRange(1, 1, LR);
  //Logger.log("DEBUG: range: "+range.getA1Notation());
  var values = range.getValues();

  var startrow = 4;
  var outputrange = sheet.getRange(startrow,9, LR-startrow,9); 
  var output = []


  for (var i=startrow-1;i<LR;i++){
    var outputrow=[];
    var acctname = values[i][0].substring(0,23);
    var acctcode = values[i][0].substring(24,34);
    var acctline = values[i][0].substring(35,59);
    var acctdate = values[i][0].substring(60,70);
    var acctordered = values[i][0].substring(71,79);
    var acctshipped = values[i][0].substring(80,87);
    var acctcarrier= values[i][0].substring(88,98);
    var acctnumber1= values[i][0].substring(99,114);
    var acctnumber2= values[i][0].substring(115,123);
    var acctdate2= values[i][0].substring(124,133);
    // Logger.log("DEBUG: acctname:"+acctname+",acctcode:"+acctcode+",acctline:"+acctline+",acctdate:"+acctdate+"\n, acctordered:"+acctordered+",acctshipped:"+acctshipped+",acctcarrier:"+acctcarrier+",acctnumber1:"+acctnumber1+",acctnumber2:"+acctnumber2+",acctdate2:"+acctdate2);
    outputrow.push(acctname);
    outputrow.push(acctcode);
    outputrow.push(acctline);
    outputrow.push(acctdate);
    outputrow.push(acctordered);
    outputrow.push(acctdate);
    outputrow.push(acctcarrier);
    outputrow.push(acctnumber1); 
    outputrow.push(acctnumber2); 
    outputrow.push(acctdate2); 
    output.push(outputrow);
  }
  var outputrange = sheet.getRange(startrow,9, LR-startrow+1,10);
  outputrange.setValues(output);
}

BEFORE (One Column)

Screenshot-BEFORE

AFTER (Ten Columns)

Screenshot-after

Tedinoz
  • 5,911
  • 3
  • 25
  • 35
  • Thank you SO MUCH! I really appreciate it! I will give that a try and then work on getting the data trimmer up. I was having a hard time understanding how exactly to process the ````SUBSTRING```` command in the script so that really helps me see how it functions. – Tommy Jun 20 '19 at 12:52
0

To split e.g. A1:A10 into 4 columns of widths 20, 30, 40, and the remaining width, use:

=SPLIT(ARRAYFORMULA(REGEXREPLACE(TO_TEXT(A1:A10),
"\A(.{1,"&JOIN("})?(.{1,",{20,30,40})&"})?(.*)\z",
"$"&JOIN("<tab>$",SEQUENCE(COLUMNS({20,30,40})+1)))),
"<tab>",0,0)

With this formula all leading and trailing whitespace will be trimmed, and all strings in the expected format will be converted to corresponding numbers, dates, etc., So e.g. 00000123 will become just 123, and 1/1 will become 1 January of the current year, or 43862 for 2020.

If you need to format all result values as text, then use this:

=ARRAYFORMULA(REGEXREPLACE(SPLIT(REGEXREPLACE(TO_TEXT(A1:A10),
"\A(.{1,"&JOIN("})?(.{1,",{20,30,40})&"})?(.*)\z",
"$"&JOIN("<as_text><tab><as_text>$",SEQUENCE(COLUMNS({20,30,40})+1))),
"<tab>",0,0),"<as_text>",""))

<tab> and <as_text> are used here as unique strings not present in your text, so if it contains them, then you need to use something different, e.g. <column_break> and <format_values_as_text>.

vstepaniuk
  • 667
  • 6
  • 14