I need to create a program that can search a document (e.g. a candidate's resume) and fill in metadata from it, such as the candidate's experience, skills, location, etc.

For this I would like to use Oracle's indexing mechanism (Oracle Text search), because it indexes all the data in a document. When it indexes a document, I would like to first update my metadata fields from the indexed data, and only then let the content server update its indexes. Can anyone explain how the indexer works, and which event I can trap in order to update my metadata?

I need to update the metadata because the requirements are:

Extensive choices for search filter criteria (searching within resumes, not just form keywords):

- Boolean search across multiple parameters
- Search on skills, years of experience, particular company, educational qualification, geography/location, and submission date of the profile
- Search on who referred the candidate, name, team, BU, etc.
- Result window with an adequate size of results, and filters
- Predefined resume filter criteria to assist screening when a candidate applies on the job portal
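
The criteria above amount to boolean combinations over structured metadata fields. As a rough illustration only, here is a minimal Python sketch of such a filter. The `ResumeMetadata` class, its field names, and the `matches` helper are all hypothetical, and nothing here uses Oracle Text or the content server; it only shows why the fields need to exist as metadata before this kind of search is possible.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ResumeMetadata:
    # Hypothetical fields, chosen to mirror the criteria listed above.
    name: str
    skills: set[str] = field(default_factory=set)
    years_experience: float = 0.0
    location: str = ""
    referred_by: str = ""
    submitted: date = date.min


def matches(meta, required_skills=(), min_years=0.0, locations=(), submitted_after=None):
    """AND together every supplied criterion; OR within the list of locations."""
    have = {s.lower() for s in meta.skills}
    if not {s.lower() for s in required_skills} <= have:
        return False
    if meta.years_experience < min_years:
        return False
    if locations and meta.location.lower() not in {loc.lower() for loc in locations}:
        return False
    if submitted_after is not None and meta.submitted <= submitted_after:
        return False
    return True


# Made-up sample records, for illustration only.
candidates = [
    ResumeMetadata("A", {"Java", "SQL"}, 5, "Pune", "", date(2013, 5, 1)),
    ResumeMetadata("B", {"Python"}, 2, "Delhi", "", date(2013, 6, 1)),
]

# Java AND at least 3 years AND (Pune OR Mumbai)
shortlist = [
    c.name
    for c in candidates
    if matches(c, required_skills=["java"], min_years=3, locations=["Pune", "Mumbai"])
]
```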

2 Answers

You are looking at this problem from the wrong end. The indexer (Oracle Text) is a powerful and complex tool embedded in the workings of the database. If I am not mistaken, you are suggesting interpreting the results of text indexing and using them as metadata for your content. Oracle Text generates huge amounts of data and literally "chops" documents up word by word, so making meaningful metadata from that would be a huge task.

Instead, you should look at capturing the metadata as close to the source as possible. If you are using MS Office, this could be done with a Word macro (VBA) that runs when the user saves to the repository or filesystem. I believe you can fully manipulate a document's metadata at save time. You will, of course, need to install the Oracle WebCenter Content Desktop Integration Suite.
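
To make "capture the metadata close to the source" concrete, here is a minimal Python sketch that pulls a few fields out of a resume's plain text before check-in, instead of reconstructing them from index tokens. It is an illustration under stated assumptions, not the Word macro approach described above and not a WebCenter or Oracle Text API: the `KNOWN_SKILLS` vocabulary, the `Location:` line convention, and the `extract_metadata` function are all hypothetical.

```python
import re

# Hypothetical skill vocabulary; a real deployment would maintain its own list.
KNOWN_SKILLS = {"java", "sql", "python", "oracle"}

# Assumed phrasing such as "6 years", "3.5 yrs", or "10+ years".
YEARS_RE = re.compile(r"(\d+(?:\.\d+)?)\s*\+?\s*(?:years?|yrs?)", re.IGNORECASE)

# Assumes the resume contains a line of the form "Location: <place>".
LOCATION_RE = re.compile(r"^\s*Location\s*:\s*(.+?)\s*$", re.IGNORECASE | re.MULTILINE)


def extract_metadata(text):
    """Return a dict of metadata fields guessed from plain resume text."""
    words = set(re.findall(r"[a-z+#]+", text.lower()))
    years = [float(value) for value in YEARS_RE.findall(text)]
    location = LOCATION_RE.search(text)
    return {
        "skills": sorted(KNOWN_SKILLS & words),
        # Take the largest figure mentioned as a crude estimate of total experience.
        "years_experience": max(years) if years else None,
        "location": location.group(1) if location else None,
    }


sample = "Location: Pune\n6 years of experience in Java and Oracle SQL.\n"
metadata = extract_metadata(sample)
```

The resulting dictionary would then be supplied as metadata values when the document is checked in; that step depends on the repository and is not shown here.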

OraNob

Look into Oracle WebCenter Capture. It can scan a document and automatically tag it with metadata. WebCenter Capture integrates with WebCenter Content (WCC), so scanned documents can be checked in directly to WebCenter Content.

http://www.oracle.com/technetwork/middleware/webcenter/content/index-090596.html

Jonathan Hult