I have implemented the following logic in Scala so far:
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

val hadoopConf = new Configuration(sc.hadoopConfiguration)
//hadoopConf.set("textinputformat.record.delimiter", "2016-")
hadoopConf.set("textinputformat.record.delimiter", "^([0-9]{4}.*)")
val accessLogs = sc.newAPIHadoopFile("/user/root/sample.log", classOf[TextInputFormat], classOf[LongWritable], classOf[Text], hadoopConf).map(x => x._2.toString)
I want to use a regex so that a line starting with a date begins a new record, and any other line is appended to the current record.
But this is not working. If I pass the date prefix manually as a literal string, it works fine. This is the line I want to replace with a regex:
//hadoopConf.set("textinputformat.record.delimiter", "2016-")
Please help with this. Thanks in advance.
Below is a sample of the log format:
2016-12-23 07:00:09,693 [jetty-51 - /app/service] INFO org.apache.cxf.interceptor.LoggingOutInterceptor S:METHOD_NAME=METHNAME : WebAppSessionId= : ChannelSessionId=web-xxx-xxx-xxx : ClientIp=xxxxxxx : - Outbound Message
---------------------------
ID: 1978
Address: https://sample.domain.com/SampleService.xxx/basic
Encoding: UTF-8
Content-Type: text/xml
Headers: {Accept=[*/*], SOAPAction=["WebDomain.Service/app"]}
Payload: <soap:Envelope>
</soap:Envelope>
2016-12-26 08:00:01,514 [jetty-1195 - /app/service/serviceName] ERROR com.testservices.cache.impl.ActiveSpaceCacheHandler S:METHOD_NAME=ServiceInquiryWithBands : WebAppSessionId= : ChannelSessionId=SERVICE : ClientIp=client-ip : - ActiveSpaceCacheHandler:getServiceResponseFromCache(); exception: java.lang.Exception: getServiceResponseData: com.tibco.as.space.RuntimeASException: field key is not nullable and is missing in tuple for cachekey:Request.US
2016-12-26 08:00:01,624 [jetty-979 - /app/service/serviceName] ERROR com.testservices.cache.impl.ActiveSpaceCacheHandler S:METHOD_NAME=ServiceInquiryWithBands : WebAppSessionId= : ChannelSessionId=SERVICE : ClientIp=client-ip : - ActiveSpaceCacheHandler:setServiceResponseInCache(); exception: com.test.as.space.RuntimeASException: field key is not nullable and is missing in tuple for cachekey
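
To make the intended grouping rule concrete, here is a minimal sketch in plain Scala (no Spark) of "a line starting with a date opens a new record, anything else is appended to the current one". As far as I know, `textinputformat.record.delimiter` is matched as a literal string and not as a regex, which would be consistent with "2016-" working and the pattern not working. The names `GroupLogRecords`, `groupRecords`, and `recordStart` are hypothetical, and the `yyyy-MM-dd` prefix is an assumption based on the sample above.

```scala
import scala.collection.mutable.ArrayBuffer

object GroupLogRecords {
  // Assumption: a record starts with a date like "2016-12-23 " at the beginning of the line.
  private val recordStart = """^\d{4}-\d{2}-\d{2} """.r

  // Folds a stream of lines into multi-line records, joined with '\n'.
  def groupRecords(lines: Iterator[String]): Vector[String] = {
    val records = ArrayBuffer.empty[String]
    val current = new StringBuilder
    for (line <- lines) {
      // A date-prefixed line closes the record collected so far.
      if (recordStart.findPrefixOf(line).isDefined && current.nonEmpty) {
        records += current.toString
        current.clear()
      }
      if (current.nonEmpty) current.append('\n')
      current.append(line)
    }
    // Flush the last record.
    if (current.nonEmpty) records += current.toString
    records.toVector
  }

  def main(args: Array[String]): Unit = {
    // Shortened stand-ins for the sample log lines above.
    val sample = Seq(
      "2016-12-23 07:00:09,693 INFO first record",
      "---------------------------",
      "ID: 1978",
      "2016-12-26 08:00:01,514 ERROR second record"
    )
    groupRecords(sample.iterator).foreach(r => println(s"[record]\n$r"))
  }
}
```

The same function could in principle be applied to the lines of each partition of an RDD, but a record that spans a partition boundary would then be split in two, so this only illustrates the rule and is not a complete solution.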