I am trying to remove some rows from a table in an MS Word document. This is how the table looks before processing: [image: table before processing]

I analyzed this table to understand its Open XML representation. Below is how the InnerText property comes out for each row:

Items Description null
Classroom empty Interactive Classroom...
empty empty Case Study Classrooms ...
empty empty Auditoria Lecture Classrooms ...
Computers empty Mainframe Computer...
empty empty Supercomputer...
empty empty Workstation Computer...

The middle (empty) column is where the image is inserted. The image and the description are in two different cells, with an invisible border between them.
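
A listing like the one above can be produced with a small helper along these lines. This is only a sketch: the names `TableDump` and `Describe` are made up for illustration, and it assumes the document is available as a `byte[]` and that the first table in the body is the one of interest. Besides the InnerText of each cell, it also counts the `Drawing` elements inside the cell, which makes it easier to see which cells actually hold an image.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper (not part of the Open XML SDK): describes each row of
// the first table as "[cell text] drawings=N | [cell text] drawings=N | ...".
public static class TableDump
{
    public static List<string> Describe(byte[] pageData)
    {
        var lines = new List<string>();

        using (var stream = new MemoryStream(pageData))
        // Open read-only: this helper only inspects the document.
        using (var wordDoc = WordprocessingDocument.Open(stream, false))
        {
            var table = wordDoc.MainDocumentPart.Document.Body
                .Elements<Table>()
                .FirstOrDefault();
            if (table == null)
            {
                return lines; // no table in the body
            }

            // Elements<TableRow>() skips non-row children such as table properties.
            foreach (var row in table.Elements<TableRow>())
            {
                // Elements<TableCell>() skips row properties, so no index arithmetic is needed.
                var cells = row.Elements<TableCell>()
                    .Select(c => $"[{c.InnerText}] drawings={c.Descendants<Drawing>().Count()}");
                lines.Add(string.Join(" | ", cells));
            }
        }

        return lines;
    }
}
```

Calling `TableDump.Describe(pageData)` before and after the removal code runs gives two listings that can be compared row by row.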

Below is the code that removes the items "Case Study Classrooms", "Supercomputer", "Workstation Computer", "Personal Computer" and "Tablet".

var itemsToBeExcluded = new List<string>{"Case Study Classrooms", "Supercomputer", "Workstation Computer","Personal Computer","Tablet"};

using (MemoryStream stream = new MemoryStream())
{
    //pageData is a byte[] to represent the word file
    stream.Write(pageData, 0, (int)pageData.Length);
    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(stream, true))
    {
        var table = wordDoc.MainDocumentPart.Document.Body.OfType<Table>().FirstOrDefault();
        int rowCount = 0;
        string firstColumnInnerXml = string.Empty;

        for (int t = 0; t<table.ChildElements.Count; t++)
        {
            if(table.ChildElements[t] is TableRow)
            {
                // Skip the header
                if (rowCount++ != 0)
                {
                    // Remember the inner XML of the first column; if the first column is empty in a later row, copy the remembered XML into it
                    if (table.ChildElements[t].ChildElements[1].InnerText.Length > 0) 
                    {
                        firstColumnInnerXml = table.ChildElements[t].ChildElements[1].InnerXml;
                    }
                    else
                    {
                        table.ChildElements[t].ChildElements[1].InnerXml = firstColumnInnerXml;
                    }
                    
                    foreach (var removableItem in itemsToBeExcluded)
                    {
                        if (table.ChildElements[t].ChildElements[3].InnerText.ToLower().StartsWith(removableItem.ToLower()))
                        {
                            table.ChildElements[t].Remove();
                            t--;
                            goto OUTERCONTINUE;
                        }
                    }
                    OUTERCONTINUE:;
                }
            }
        }
        wordDoc.MainDocumentPart.Document.Save();
        wordDoc.Close();
    }
}

However, after execution, this is what I get: [image: table after processing]

The images are clearly missing. Even though I am only removing the necessary rows, the images in the rows that should be left untouched also seem to be corrupted or removed. Can someone explain why this happens and how to solve it?

user2129013
In addition to what @user2129013 mentioned above, I tried to add the images manually by following the approach described [here](https://stackoverflow.com/questions/63529065/inserting-image-corrupts-open-xml-sdk-generated-word-file), including the **http** fix; however, the images are still missing. – Isham Mohamed May 04 '21 at 10:09
