1

I am using Microsoft.Office.Interop.Word and want to output a given database as a table in a Word document. But after my first attempts (example below), I quickly noticed that the time to iterate over each table row increases rapidly with the number of rows.

using Word = Microsoft.Office.Interop.Word;

private void InsertTableWithRandomValue(Word.Document doc)
{
    var par = doc.Paragraphs.Add();
    string guid = Guid.NewGuid().ToString();

    Word.Table firstTable = doc.Tables.Add(par.Range, 1000, 5);
    firstTable.Borders.Enable = 1;

    foreach (Word.Row row in firstTable.Rows)
    {
        foreach (Word.Cell cell in row.Cells)
        {
            cell.Range.Text = guid;
        }
    }
}

In a table with 200 rows it takes about 100ms to write each entry. With 2000 rows however, it is already up to 400ms per entry, which is not fast enough for my use case.

Is there a way to iterate over each row faster?

If not, my next approach would be to temporarily save the Word document as HTML, edit the text, and export it again as a Word document.

lkjasd2
  • 64
  • 1
  • 5
  • Maybe a wrong question, but: "The type name 'Document' does not exist in the type 'Word'" ? – Luuk Apr 26 '22 at 13:31
  • Sorry for the confusion, I added the namespace as "Word" to avoid duplicate assignments. I added it in the example. – lkjasd2 Apr 26 '22 at 14:05

2 Answers2

2

Office Interop is really slow. You should consider using Open XML to create documents.

I've created an example from your code. First, add DocumentFormat.OpenXml NuGet package to your project.

using var document = WordprocessingDocument.Create(path, WordprocessingDocumentType.Document); // or WordprocessingDocument.Open() if you want to open an existing one

// Since we create a new document, we need to add main document part, as well as body etc.
var mainDocumentPart = document.AddMainDocumentPart();

mainDocumentPart.Document = new Document();

var body = mainDocumentPart.Document.AppendChild(new Body());

// Adding borders
var borderType = new EnumValue<BorderValues>(BorderValues.Single);
UInt32Value borderSize = 16;

var table = new Table(new TableProperties(new TableBorders(
        new TopBorder
        {
            Val = borderType,
            Size = borderSize
        },
        new RightBorder
        {
            Val = borderType,
            Size = borderSize
        },
        new BottomBorder
        {
            Val = borderType,
            Size = borderSize
        },
        new LeftBorder
        {
            Val = borderType,
            Size = borderSize
        },

        new InsideVerticalBorder
        {
            Val = borderType,
            Size = borderSize
        },
        new InsideHorizontalBorder
        {
            Val = borderType,
            Size = borderSize
        }
    )
));

var guid = Guid.NewGuid().ToString();

const int rowCount = 1000;
const int columnCount = 5;

// Instead of creating a row with same cell every single time, creating a cell and then populating the rows with it would be better
var tableCell = new TableCell();
tableCell.Append(new Paragraph(new Run(new Text(guid))));

var tableRow = new TableRow();
for (var i = 0; i < columnCount; i++)
{
    tableRow.Append(new TableCell(tableCell.OuterXml)); // You cannot add same element twice, because you cannot add something that's part of a tree. So we clone it before adding
}

for (var i = 0; i < rowCount; i++)
{
    table.Append(new TableRow(tableRow.OuterXml));
}

body.Append(table);
cemahseri
  • 397
  • 2
  • 12
  • Thanks for pointing out Open XML, it's really fast. Even with creating a new cell and Guid each time it took only 60ms! – lkjasd2 Apr 26 '22 at 15:50
  • Of all the research the past hours, this code sample differentiated in that it illustrated the implied fact that these docs appear to be hashed/unique. Hence the need to instantiate new objects. Simple copying whether code or text won't work. Those attributes really mean something! [he said, kicking himself] – Rodney Aug 28 '23 at 22:54
2

I created a text containing the stuff you are inserting into the table, and then converting this text to a table:

Document document = new Document();
int rows = 1000;
int cols = 5;

//document.Application.Visible = true;
var par = document.Paragraphs.Add();
string guid = Guid.NewGuid().ToString();

var start = DateTime.Now;

StringBuilder sb = new StringBuilder();
for (int i = 0; i < rows; i++)
{
    for (int j = 0; j < cols; j++)
    {
        sb.Append(guid);
        if (j<cols-1) sb.Append('\t');
    }
    sb.AppendLine("");
}
par.Range.Text = sb.ToString();
var r = document.Range(0,par.Range.End);
Table t = r.ConvertToTable(NumRows:rows,NumColumns:cols);
t.Borders.Enable = 1;

document.SaveAs2(@"d:\temp\test.docx");
document.Application.Quit();
Console.WriteLine($"{(DateTime.Now-start).TotalMilliseconds/1000}");
Console.ReadLine();

This takes about 1.7 second, and your method was about 38 seconds on my system.

Luuk
  • 12,245
  • 5
  • 22
  • 33
  • Great solution! I didn't know about the `Range.ConvertToTable` method yet. On my system it takes 4.1s to execute. – lkjasd2 Apr 26 '22 at 15:52