
I am trying to search data in Elasticsearch that comes from another application, which dumps it in JSON format. Here is the format:

{
    "currentRow": 100,
    "fields": [
        { "name": "dDocName" }, { "name": "dDocTitle" }, { "name": "dDocType" }, { "name": "dSecurityGroup" },
        { "name": "dInDate" }, { "name": "xColor" }, { "name": "xPersonType" }, { "name": "xRegionDefinition" },
        { "name": "xLibraryGUID" }, { "name": "dDocLastModifiedDate" }, { "name": "xIdentityNum" },
        { "name": "xLonTriggeyu" }, { "name": "xIntField" }, { "name": "dRevClassID" }, { "name": "xFFTest" },
        { "name": "xWCWorkflowAssignment" }, { "name": "dDocClass" }, { "name": "xWebsiteObjectType" }, 
        { "name": "xCustomerCode" }, { "name": "xInvoiceNum" }, { "name": "AlternateFormat" }, { "name": "dDocAuthor" },
        { "name": "xfruit" }, { "name": "xSupplierNum" }, { "name": "xEBSParam" }, { "name": "xTestTree" },
        { "name": "xVideoRenditions" }, { "name": "xStorageRule" }, { "name": "xstatecitymemo" },
        { "name": "xPOHeaderId" }, { "name": "xTREELOCATION" }, { "name": "xDamConversionType" },
        { "name": "xInvoiceAmount" }, { "name": "xDiscussionType" }, { "name": "dDocFunction" },
        { "name": "xModifiedBy" }, { "name": "xCustomerTaxPayerId" }, { "name": "dOutDate" },
        { "name": "xIPMSYS_BATCH_SEQ" }, { "name": "dDocLastModifier" }, { "name": "dFormat" },
        { "name": "dRendition2" }, { "name": "dRendition1" }, { "name": "xCustomerName" }, { "name": "xHideThread" },
        { "name": "xGender" }, { "name": "xWCTags" }, { "name": "xExtURL" }, { "name": "xTestFolder1" },
        { "name": "xPackagedConversions" }, { "name": "xClbraRoleList" }, { "name": "xFFTest1" },
        { "name": "xInvoiceCurrency" }, { "name": "dDocCreatedDate" }, { "name": "xWebsites" },
        { "name": "xTestFiddler" }, { "name": "xDontShowInListsForWebsites" }, { "name": "dDocAccount" },
        { "name": "URL" }, { "name": "xClbraUserList" }, { "name": "xAvaya_Region" }, { "name": "dCreateDate" },
        { "name": "dID" }, { "name": "xSri2" }, { "name": "dExtension" }, { "name": "xSri1" },
        { "name": "xfwm_cat_Mercados" }, { "name": "dWebExtension" }, { "name": "xcateg1" }, { "name": "xChecksum" },
        { "name": "xPONum" }, { "name": "dDocCreator" }, { "name": "VaultFileSize" }, { "name": "dRevLabel" },
        { "name": "xFirstName" }, { "name": "xCMUTest" }, { "name": "xDiscussionCount" }, { "name": "xClbraAliasList" },
        { "name": "xPartitionId" }, { "name": "dGif" }, { "name": "xIPMSYS_APP_ID" }, { "name": "dFullTextFormat" },
        { "name": "xTest1" }, { "name": "xFamilyName" }, { "name": "xInvoice" }, { "name": "xInvoiceDate" },
        { "name": "dRevisionID" }, { "name": "xWebsiteSection" }, { "name": "xWCWorkflowApproverUserList" },
        { "name": "WebFileSize" }, { "name": "xComments" }, { "name": "xWebFlag" }, { "name": "xNewtest" },
        { "name": "xOptionListIssue" }, { "name": "xtest" }, { "name": "xIPMSYS_BATCH_ID1" }, { "name": "xIdcProfile" },
        { "name": "dOriginalName" }, { "name": "dDocOwner" }, { "name": "dPublishType" }, { "name": "otsFormat" },
        { "name": "otsCharset" }, { "name": "otsLanguage" }, { "name": "SCORE" }, { "name": "srfDocSnippet" }
    ],
    "rows": [
        ["WCCPS7_024401", "test1", "EBSAttachment", "AOK-Public", "5/3/167:18AM", "", "", "IDCNULL", "", "5/3/167:19AM", "", "", "0", "24401", "", "", "", "", "", "", "", "wccuser", "", "", "", "0", "", "DispByContentId", "", "", "", "", "", "N/A", "", "", "", "", "", "wccuser", "Application/unknown", "", "", "", "", "", "", "", "", "", "", "", "", "5/3/167:19AM", "", "", "", "", "/cs/groups/aok-public/documents/ebsattachment/czdf/mdi0/~edisp/wccps7_024401", "", "", "5/3/167:19AM", "24801", "", "", "", "", "", "", "2cd4124073fed81c624af0101ba28bda16db650fee35cbc8fd629904dead1b09/SHA-256", "", "wccuser", "377", "1", "", "", "0", "", "", "archiv.gif", "", "", "", "", "", "", "1", "", "", "377", "", "", "", "", "", "0", "EBSProfile", "UntitledDocument", "wccuser", "", "", "", "", "3", ""], 
        ["WCCPS7_024202", "DLEASE_RAW_response", "Document", "AOK-Public", "4/19/1611:11AM", "", "", "IDCNULL", "", "4/19/1611:11AM", "", "", "0", "24202", "", "", "", "", "", "", "", "weblogic", "", "", "", "0", "", "DispByContentId", "", "", "", "", "", "N/A", "", "", "", "", "", "weblogic", "text/plain", "", "", "", "", "", "", "", "", "", "", "", "", "4/19/1611:11AM", "", "", "", "", "/cs/groups/aok-public/documents/document/czdf/mdi0/~edisp/wccps7_024202.txt", "", "", "4/19/1611:11AM", "24402", "", "txt", "", "", "txt", "", "2e2a98a3af833032d4f2b5ec3a8c62b80edeb13ac417d472744c713e4cae27e5/SHA-256", "", "weblogic", "594", "1", "", "", "0", "", "", "ucm_document.png", "", "txt", "", "", "", "", "1", "", "", "594", "", "", "", "", "", "0", "", "DLEASE_RAW_response.txt", "weblogic", "", "", "", "", "3", ""]
        // ...

I have dumped this data to a JSON file and uploaded it to Elasticsearch.

I am unable to create a query that lists items based on specific values of each field.

For example, how should I set up a query that returns all items where dDocAuthor is weblogic?

OrangeDog
Srinath Menon

1 Answer


First, the data has to be in a proper JSON format before it can be indexed into Elasticsearch. For indexing, use the Bulk API, which is described in the Elasticsearch documentation.

Then you can query based on your requirement.
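As a rough sketch of what "proper JSON format" could mean here: the dump stores column names in `fields` and values in `rows`, so one option is to turn each row into its own document keyed by field name and send those through the Bulk API. The example below only builds the newline-delimited request body and the query body. It uses the simplified sample the asker posts in the comments, and the index name `wcc_docs` and the helper name `rows_to_bulk_ndjson` are hypothetical. Older Elasticsearch versions also expect a type, either in the action line or in the request URL, which is left out here.

```python
import json


def rows_to_bulk_ndjson(dump, index_name):
    """Build a Bulk API body with one document per entry in dump["rows"]."""
    # Column names, in the same order as the values inside each row.
    field_names = [field["name"] for field in dump["fields"]]
    lines = []
    for row in dump["rows"]:
        # Pair each value with its column name: {"dDocName": "...", ...}
        doc = dict(zip(field_names, row))
        # Action line first, then the document source on the next line.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk body is newline-delimited and must end with a newline.
    return "\n".join(lines) + "\n"


# Simplified sample taken from the asker's comment.
sample = {
    "currentRow": 5,
    "fields": [
        {"name": "dDocName"}, {"name": "dDocTitle"}, {"name": "dDocType"},
        {"name": "dSecurityGroup"}, {"name": "dDocAuthor"},
    ],
    "rows": [
        ["WCCPS7_024401", "test1", "EBSAttachment", "AOK-Public", "weblogic"],
    ],
}

# "wcc_docs" is a hypothetical index name.
bulk_body = rows_to_bulk_ndjson(sample, "wcc_docs")

# Query body for "all items where dDocAuthor is weblogic".
query = {"query": {"match_phrase": {"dDocAuthor": "weblogic"}}}

print(bulk_body)
print(json.dumps(query))
```

With documents shaped like this, `dDocAuthor` is an ordinary top-level field, so the `bulk_body` string can be posted to the `_bulk` endpoint and the `query` body to the index's `_search` endpoint.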

  • The format is correct and the upload to Elasticsearch worked fine. But specifically for this kind of JSON input, how do we search for all the items that have dDocAuthor = 'weblogic'? – Srinath Menon May 31 '16 at 09:57
  • Right now you have name=dDocAuthor and you want to search for weblogic, right? But where is the field name for the weblogic value? – KARTHEEK GUMMALURI May 31 '16 at 10:07
  • GET INDEX_NAME/INDEX_TYPE/_search { "query": { "match_phrase": { "dDocAuthor": "weblogic" } } } I think this would help. – KARTHEEK GUMMALURI May 31 '16 at 10:07
  • Here is the simplified JSON format of an existing item in Elasticsearch: {"currentRow":5,"fields":[{"name":"dDocName"},{"name":"dDocTitle"},{"name":"dDocType"},{"name":"dSecurityGroup"},{"name":"dDocAuthor"}], "rows":[["WCCPS7_024401","test1","EBSAttachment","AOK-Public","weblogic"] . This corresponds to a list of metadata fields, one of which is dDocAuthor, whose value here is "weblogic". Likewise, there will be any number of records (values) for different items. So how do I create a query that returns all the items that have dDocAuthor = 'weblogic'? – Srinath Menon May 31 '16 at 13:28
  • Hi Srinath, I tried indexing your sample JSON into Elasticsearch. It was indexed, but I am not able to search it. I think the data needs to be sanitized before indexing. – KARTHEEK GUMMALURI May 31 '16 at 15:20
  • Thank you very much for all the input. This data comes from an external application and is streamed as JSON to a file, which in turn is uploaded to Elasticsearch. I think the question needs to be rephrased as "would Elasticsearch be able to retrieve results from this kind of JSON data?" – Srinath Menon May 31 '16 at 16:03
  • Hi Srinath, please help me when I need your help. I work at a startup company located in Visakhapatnam, where I work on Couchbase and Elasticsearch. As an experienced person, please help me build my career. I worked on your query and it seems to be somewhat tricky. – KARTHEEK GUMMALURI Jun 01 '16 at 03:16
  • Hi Srinath, you can also post your question in the Elasticsearch forum, which would be helpful. I generally use the Elasticsearch forum to solve my issues. – KARTHEEK GUMMALURI Jun 01 '16 at 06:51