1

I usually use VBA in Excel to automate laborious tasks like cleaning up a table of data, but I want to try out Excel Scripts. Being less familiar with JavaScript, I thought I'd try ChatGPT to do the heavy lifting. I think it'll work apart from a couple of errors that I need help with.

My table of data contains the following (among other columns):

Location 1 Location 2 Location 3 Location 4 Location 5 Location 6 Location 7 Product ID Attribute 1 Attribute 2 Attribute 3 Attribute 4 Attribute 5 Attribute 6 Attribute 7 Attribute 8 Attribute 9
some text 1 value value value value value value value value value
more text 1 value value value value value
different text 1 value2 value2 value2 value value value
text 1 value3 value2
some text 1 value value value value value value value value value
lots of text 1 value3 value3 value2 value
text text 1 value value2 value3 value3 value value4 value4

I have a product with attributes but the product is duplicated across locations. For a product ID, all the attributes should be the same across all locations but they're not. The script finds the groups of duplicate Product ID's and working through locations in an order of preference, picks the row that should have good attributes (location 1 in this example). It then copies those attribute values to the other rows with the same duplicate product ID (i.e all 7 rows should have the same attribute values).

Here's the full Excelscript that ChatGPT produced.

function main(workbook: ExcelScript.Workbook) {
  // Step 1: Find columns with text in row 1 and store the text with associated column index in an array
  const currentSheet = workbook.getActiveWorksheet();
  const headerRow = currentSheet.getRange("1:1").getValues()[0]; // Get values of row 1
  const columnIndexes: { [key: string]: number } = {};

  for (let i = 0; i < headerRow.length; i++) {
    const columnName = headerRow[i].toString().trim();
    if (columnName !== "") {
      columnIndexes[columnName] = i + 1; // Store the column index
    }
  }

  // Step 2: Find the index for the column called "Product ID"
  const productIDIndex = columnIndexes["Product ID"];
  if (!productIDIndex) {
    console.log("Column 'Product ID' not found.");
    return;
  }

  // Step 3: Find duplicate rows based on "Product ID" column
  const dataRange = currentSheet.getUsedRange();
  const dataValues = dataRange.getValues();
  const duplicateGroups: { [key: string]: ExcelScript.Range[] } = {};

  for (let i = 1; i < dataValues.length; i++) {
    const productID = dataValues[i][productIDIndex - 1].toString().trim();
    if (productID !== "") {
      if (!duplicateGroups[productID]) {
        duplicateGroups[productID] = [currentSheet.getRangeByIndexes(i, 0, 1, currentSheet.getLastColumn())];
      } else {
        duplicateGroups[productID].push(currentSheet.getRangeByIndexes(i, 0, 1, currentSheet.getLastColumn()));
      }
    }
  }

  // Step 4: Loop through each group of matching duplicate rows
  for (const productID in duplicateGroups) {
    const rowRanges = duplicateGroups[productID];

    // Step 5: Find the first row with text in the specified columns in the given order
    let chosenRowRange: ExcelScript.Range | null = null;
    const locationColumnsOrder = [
      "Location 1",
      "Location 2",
      "Location 3",
      "Location 4",
      "Location 5",
      "Location 6",
      "Location 7",
    ];

    for (const rowRange of rowRanges) {
      for (const locationCol of locationColumnsOrder) {
        const colIndex = columnIndexes[locationCol];
        const cellValue = rowRange.getCell(0, colIndex - 1).getValue().toString().trim();
        if (cellValue !== "") {
          chosenRowRange = rowRange;
          break;
        }
      }

      if (chosenRowRange !== null) {
        break;
      }
    }

    // Step 6: If no text found, display a message and exit the script
    if (chosenRowRange === null) {
      const productIDValue = rowRanges[0].getCell(0, productIDIndex - 1).getValue().toString().trim();
      console.log(`Rows with Product ID '${productIDValue}' did not have a Location description.`);
      continue;
    }

    // Step 7: Copy cell values from specific columns of the chosen row to other rows in the group
    const chosenRowValues = chosenRowRange.getValues()[0];
    for (const rowRange of rowRanges) {
      if (!rowRange.getAddress().equals(chosenRowRange.getAddress())) {
        for (const colName of ["Attribute 1", "Attribute 2", "Attribute 3", "Attribute 4", "Attribute 5", "Attribute 6", "Attribute 7", "Attribute 8", "Attribute 9"]) {
          const colIndex = columnIndexes[colName];
          const cellValue = chosenRowValues[colIndex - 1];
          rowRange.getCell(0, colIndex - 1).setValue(cellValue);
        }
      }
    }
  }
}


Excel complains that in this section, the getLastColumn property does not exist on type Worksheet. I think I should be able to replace currentSheet.getLastColumn() with headerRow.length if all it wants is the total number of columns.

// Step 3: Find duplicate rows based on "Product ID" column
  const dataRange = currentSheet.getUsedRange();
  const dataValues = dataRange.getValues();
  const duplicateGroups: { [key: string]: ExcelScript.Range[] } = {};

  for (let i = 1; i < dataValues.length; i++) {
    const productID = dataValues[i][productIDIndex - 1].toString().trim();
    if (productID !== "") {
      if (!duplicateGroups[productID]) {
        duplicateGroups[productID] = [currentSheet.getRangeByIndexes(i, 0, 1, currentSheet.getLastColumn())];
      } else {
        duplicateGroups[productID].push(currentSheet.getRangeByIndexes(i, 0, 1, currentSheet.getLastColumn()));
      }

In Step 4, it cannot infer the data type of the variable const rowRanges = duplicateGroups[productID];.

In Step 7, Excel says that in the line if (!rowRange.getAddress().equals(chosenRowRange.getAddress())) {, .equals does not exist on type string. I tried replacing it with a = instead but the expected syntax was wrong.

Plus it complains that read methods inside a loop could lead to slow performance but I don't think that will stop the script from working.

If you're interested, this is the statement that I gave to ChatGPT:

Write an Excel Office Script that will find all cells on row 1 of the current sheet that contain text and store the text with associated column index in an array so that column indexes can be referenced by text description later in the script. With the index for the column called "Product ID", find all duplicate rows in the sheet based on this column. For each group of duplicate rows with identical "Product ID", loop through each row and evaluate the cell values from the following columns, in this specific order, "Location 1" or "Location 2" to find the first row within the group to contain text in one of these cells. Store the range of the row that evaluates to true. If no cell value contains text, search the following columns, in order, "Location 3" or "Location 4" or "Location 5" or "Location 6" or "Location 7". Store the range of the row that evaluates to true. If no cell value contains text, copy the cell value from the "Product ID" column and use it within a message that says "did not have an Location description." and exit the script. With the row that evaluated to true, copy the cell values from the columns called "Attribute 1" and "Attribute 2" and "Attribute 3" and "Attribute 4" and Attribute 5" and "Attribute 6" and "Attribute 7" and "Attribute 8" and "Attribute 9". Paste the copied values into cells, in the same column locations, for the other rows in the group that have identical "Product ID". Annotate the script to explain what each step in the script does.

Mayukh Bhattacharya
  • 12,541
  • 5
  • 21
  • 32
Dominic
  • 69
  • 1
  • 7

1 Answers1

1

Code generated by AI often requires refinement and debugging. The code sample provided was a good starting point, not a complete solution.

I aimed to follow the overall logic and flow of your code, but fixed the errors to get it working properly.

function main(workbook: ExcelScript.Workbook) {
    // Step 1: Find columns with text in row 1 and store the text with associated column index in an array
    const currentSheet = workbook.getActiveWorksheet();
    // Get values of row 1
    const headerRow = currentSheet.getUsedRange().getRow(0).getValues()[0];
    // getRange("1:1").getValues()[0]; 
    const columnIndexes: { [key: string]: number } = {};

    for (let i = 0; i < headerRow.length; i++) {
        const columnName = headerRow[i].toString().trim();
        if (columnName !== "") {
            columnIndexes[columnName] = i + 1; // Store the column index
        }
    }
    // console.log(columnIndexes)

    // Step 2: Find the index for the column called "Product ID"
    const productIDIndex = columnIndexes["Product ID"];
    if (!productIDIndex) {
        console.log("Column 'Product ID' not found.");
        return;
    }
    // console.log(productIDIndex);

    // Step 3: Find duplicate rows based on "Product ID" column
    const dataRange = currentSheet.getUsedRange();
    const dataValues = dataRange.getValues();
    const duplicateGroups: { [key: string]: ExcelScript.Range[] } = {};
    let lastCol = currentSheet.getUsedRange().getLastColumn().getColumnIndex() + 1;
    for (let i = 1; i < dataValues.length; i++) {
        const productID = dataValues[i][productIDIndex - 1].toString().trim();
        if (productID !== "") {
            if (!duplicateGroups[productID]) {
              duplicateGroups[productID] = [currentSheet.getRangeByIndexes(i, 0, 1, lastCol)];
            } else {
              duplicateGroups[productID].push(currentSheet.getRangeByIndexes(i, 0, 1, lastCol));
            }
        }
    }
    // console.log(duplicateGroups['1'][0].getAddress())

    // Step 4: Loop through each group of matching duplicate rows
    for (const productID in duplicateGroups) {
        const rowRanges = duplicateGroups[productID];

        // Step 5: Find the first row with text in the specified columns in the given order
        let chosenRowRange: ExcelScript.Range | null = null;
        const locationColumnsOrder = [
            "Location 1",
            "Location 2",
            "Location 3",
            "Location 4",
            "Location 5",
            "Location 6",
            "Location 7",
        ];

        for (const rowRange of rowRanges) {
            for (const locationCol of locationColumnsOrder) {
                const colIndex = columnIndexes[locationCol];
                const cellValue = rowRange.getCell(0, colIndex - 1).getValue().toString().trim();
                if (cellValue !== "") {
                    chosenRowRange = rowRange;
                    break;
                }
            }

            if (chosenRowRange !== null) {
                break;
            }
        }

        // Step 6: If no text found, display a message and exit the script
        if (chosenRowRange === null) {
            const productIDValue = rowRanges[0].getCell(0, productIDIndex - 1).getValue().toString().trim();
            console.log(`Rows with Product ID '${productIDValue}' did not have a Location description.`);
            continue;
        }

        // Step 7: Copy cell values from specific columns of the chosen row to other rows in the group
        const chosenRowValues = chosenRowRange.getValues()[0];
        for (const rowRange of rowRanges) {
            if (rowRange.getAddress() !== chosenRowRange.getAddress()) {
                for (const colName of ["Attribute 1", "Attribute 2", "Attribute 3", "Attribute 4", "Attribute 5", "Attribute 6", "Attribute 7", "Attribute 8", "Attribute 9"]) {
                    const colIndex = columnIndexes[colName];
                    const cellValue = chosenRowValues[colIndex - 1];
                    rowRange.getCell(0, colIndex - 1).setValue(cellValue);
                }
            }
        }
    }
}

enter image description here


Update

The previous code sample relies heavily on repeated calls to getCell to read / write data from the worksheet. This can become a performance bottleneck especially on large sheets.

Below is an optimized version that processes the data through arrays instead. By loading values into arrays upfront and minimizing cell-by-cell access, the operations can be vectorized for better efficiency.

function main(workbook: ExcelScript.Workbook) {
  // Step 1: Find columns with text in row 1 and store the text with associated column index in an array
  const currentSheet = workbook.getActiveWorksheet();
  // Get values of row 1
  const dataRange = currentSheet.getUsedRange();
  const headerRow = dataRange.getRow(0).getValues()[0];
  // getRange("1:1").getValues()[0]; 
  const columnIndexes: { [key: string]: number } = {};
  for (let i = 0; i < headerRow.length; i++) {
    const columnName = headerRow[i].toString().trim();
    if (columnName !== "") {
      columnIndexes[columnName] = i + 1; // Store the column index
    }
  }
  // Step 2: Find the index for the column called "Product ID"
  let productIDIndex = headerRow.indexOf("Product ID")
  if (productIDIndex === -1) {
    console.log("Column 'Product ID' not found.");
    return;
  }
  // console.log(productIDIndex, headerRow.length);
  // Step 3: Find duplicate rows based on "Product ID" column
  const dataValues = dataRange.getValues();
  const duplicateGroups: { [key: string]: String[] } = {};
  let lastCol = dataRange.getLastColumn().getColumnIndex() + 1;
  for (let i = 1; i < dataValues.length; i++) {
    const productID = dataValues[i][productIDIndex].toString().trim();
    if (productID !== "") {
      if (!duplicateGroups[productID]) {
        duplicateGroups[productID] = [i.toString()];
      } else {
        duplicateGroups[productID].push(i.toString());
      }
    }
  }
  // console.log(duplicateGroups['2'])
  // Step 4: Loop through each group of matching duplicate rows
  for (const productID in duplicateGroups) {
    const rowRanges = duplicateGroups[productID];
    // Step 5: Find the first row with text in the specified columns in the given order
    let chosenRowRange: ExcelScript.Range | null = null;
    const attList = ["Attribute 1", "Attribute 2", "Attribute 3",
      "Attribute 4", "Attribute 5", "Attribute 6",
      "Attribute 7", "Attribute 8", "Attribute 9"];
    const locationColumnsOrder = [
      "Location 1", "Location 2", "Location 3",
      "Location 4", "Location 5", "Location 6",
      "Location 7",
    ];
    // rewrite step 5
    const resultRow: { [key: string]: String } = {};
    for (let rowRange of rowRanges) {
      for (const locationCol of locationColumnsOrder) {
        // let locationCol = locationColumnsOrder[i]
        const colIndex = columnIndexes[locationCol] - 1;
        const cellValue = dataValues[parseInt(rowRange.toString())][colIndex];
        if (cellValue !== "") {
          // chosenRowRange = rowRange;
          resultRow[productID] = rowRange;
          // console.log(cellValue)
          console.log(Array(productID, rowRange, colIndex.toString()).join("-"))
          break;
        }
      }
      if (resultRow[productID]) { break; }
    }
    // Step 6: If no text found, display a message and exit the script
    if (resultRow[productID]) {
      // Step 7: Copy cell values from specific columns of the chosen row to other rows in the group
      for (const rowRange of rowRanges) {
        if (resultRow[productID] !== rowRange) {
          for (const colName of attList) {
            const colIndex = columnIndexes[colName] - 1;
            dataValues[parseInt(rowRange.toString())][colIndex] =
              dataValues[parseInt(resultRow[productID].toString())][colIndex]
          }
        }
      }
    }
    else {
      console.log(`Rows with Product ID '${productID}' did not have a Location description.`);
      continue;
    } // E6
  }
  // console.log(dataValues)
  dataRange.setValues(dataValues);
}
taller_ExcelHome
  • 2,232
  • 1
  • 2
  • 12
  • Brilliant, thankyou for taking the time. It will help me to spot future errors as I'm scanning code. So all it needed was a better way to get the headerRow values, to set a last column variable that used the correct get property statements and the If statement syntax to be corrected to not equals. I'm still quite impressed with ChatGPT to generate a script with a number of logic steps and get it 97% correct. – Dominic Jul 27 '23 at 08:45
  • @Dominic I have optimized the code a little bit. – taller_ExcelHome Jul 27 '23 at 17:09
  • Wow! That is so much faster. One error, which was my fault is that I actually wanted to check all the duplicate rows for Location 1, before checking them all again for Location 2, etc. Your optimized code made it very easy in step 5 to make locationCol the first For loop and rowRange the second to do exactly what I wanted. With rowRange as the first For loop, it just takes the first record of a duplicate set and rotates through the locations. All the locations would need to be blank before it moves onto the next row. – Dominic Aug 02 '23 at 13:57