I want to create an application that can parse doc/docx files. The structure of such a file is shown below:
par-000.01 - some content
par-000.21 - some content
par-000.31 - some content
par-001.32 - some content
The content can span multiple lines and is not regular. What I want to do is put this content into a database: for the first record, par-000.01 goes into the code column and some content goes into the text column.
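For illustration, the mapping described above could be sketched roughly as follows. This assumes the document text has already been extracted into a plain string (reading the doc/docx file itself is not shown), and that every record starts on its own line with a code of the form par-NNN.NN followed by a dash. The names RECORD_RE and parse_records are hypothetical, made up for this example.

```python
import re

# Hypothetical pattern: a code such as "par-000.01" at the start of a line,
# followed by a dash and the content. The content may run over several
# lines, up to the next code at the start of a line or the end of the text.
RECORD_RE = re.compile(
    r"^(par-\d{3}\.\d{2})\s*-\s*(.*?)(?=^par-\d{3}\.\d{2}\s*-|\Z)",
    re.MULTILINE | re.DOTALL,
)


def parse_records(text):
    """Return a list of (code, text) pairs found in the plain text."""
    return [(code, body.strip()) for code, body in RECORD_RE.findall(text)]


# Sample input modelled on the structure shown in the question,
# with one record deliberately spread over two lines.
sample = """par-000.01 - some content
par-000.21 - some content
that continues on a second line
par-001.32 - some content
"""

for code, body in parse_records(sample):
    print(code, "->", repr(body))
```

Each resulting pair could then be written to the code and text columns with whatever parameterized insert the chosen database driver provides.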
The reason I cannot do this manually is that I have about 15 documents, each containing about 10 pages of paragraphs that I want to put into my database.
I cannot find any article on how to parse a whole doc file, but I believe it should be possible if I write a proper regular expression. Can anyone point me to an article on how to do this? I can't find anything that suits me; I am probably using the wrong keywords.