-1

I have a folder full of Word forms that I want to extract a few key pieces of information from: A name, a number, a date, and the contents of a single cell. I've seen solutions for importing entire Word tables, but I don't need most of the information on the tables.

All the forms in word are the same, save for the data in the cells, as people have used the template to make these forms. All the cells are labeled to the right of the cells I actually want. I have a folder full of these Word document forms.

I've tried modifying this script (Extract Data from Word Document to an Excel SpreadSheet) but my VBA skills are terrible and I can't get more than a line of data before it crashes out.

How do I extract the specific cells containing the data I want from these word documents? I'd even be happy just to get the entire line that data is on.

  • Your question is too broad. For example, you don't say how all the required content is to be identified, which table(s), or how many tables, from each document are involved. – macropod Apr 02 '19 at 22:01
  • The form is one continuous table. I don't know how Word treats its tables, but all the data is in the same place on every document. If it were in Excel it'd be (for example) Cell D3, B12, and D25. – Sean McDonough Apr 03 '19 at 12:50

1 Answers1

0

Try the following Excel macro, which extracts the Word data from cells D3, B12, and D25 in the first table of each Word document in the selected folder. The document name is output to column A and the remaining data are output to columns B-D. That's only 3 items from each file, but your reference to "A name, a number, a date, and the contents of a single cell" implies there are 4.

Sub GetTableData()
'Note: this code requires a reference to the Word object model.
'See under the VBE's Tools|References.
Application.ScreenUpdating = False
Dim strFolder As String, strFile As String, WkSht As Worksheet, r As Long
strFolder = GetFolder: If strFolder = "" Then GoTo ErrExit
Dim wdApp As New Word.Application, wdDoc As Word.Document
Set WkSht = ActiveSheet: r = WkSht.Cells(WkSht.Rows.Count, 1).End(xlUp).Row
strFile = Dir(strFolder & "\*.doc", vbNormal)
With wdApp
  'Hide our Word session
  .Visible = False
  'Disable any auto macros in the documents being processed
  .WordBasic.DisableAutoMacros
  While strFile <> ""
    Set wdDoc = .Documents.Open(Filename:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
    r = r + 1: WkSht.Range("A" & r) = Split(strFile, ".doc")(0)
    With wdDoc
      If .Tables.Count > 0 Then
        With .Tables(1)
          WkSht.Range("B" & r).Value = Split(.Cell(3, 4).Range.Text, vbCr)(0)
          WkSht.Range("C" & r).Value = Split(.Cell(12, 2).Range.Text, vbCr)(0)
          WkSht.Range("D" & r).Value = Split(.Cell(25, 4).Range.Text, vbCr)(0)
        End With
      End If
      .Close SaveChanges:=False
    End With
    strFile = Dir()
  Wend
  .Quit
End With
Set wdDoc = Nothing: Set wdApp = Nothing: Set WkSht = Nothing    Application.ScreenUpdating = True
End Sub
 
Function GetFolder() As String
    Dim oFolder As Object
    GetFolder = ""
    Set oFolder = CreateObject("Shell.Application").BrowseForFolder(0, "Choose a folder", 0)
    If (Not oFolder Is Nothing) Then GetFolder = oFolder.Items.Item.Path
    Set oFolder = Nothing
End Function
macropod
  • 12,757
  • 2
  • 9
  • 21