index=aws* Method response body after transformations: sourcetype="aws:apigateway" business_unit=XX aws_account_alias="xXXX" network_environment=test source="API-Gateway-Execution-Logs*" | rex field=_raw "Method response body after transformations: (?<json>[^$]+)" | spath input=json path="header.messageGUID" output=messageGUID | spath input=json path="payload.statusType.code" output=status | spath input=json path="payload.statusType.text" output=text
| where status=200
| rename _time as request_time | fieldformat request_time=strftime(request_time, "%F %T")
| join type=inner messageGUID [ search kubernetes_cluster="eks-XXXXX*" index="awsXXXX" sourcetype = "kubernetes_logs" source = *XXXX* "sendData" | rex field=_raw "sendData: (?<json>[^$]+)"
| spath input=json path="header.messageGUID" output=messageGUID
| table messageGUID, _time ]
|table messageGUID, request_time, _time
_time is coming as Null as output
Also how can I rename this field also ?
If you want to use fieldformat, use it as the very last thing you do in the pipeline - when you fieldformat inside the join, it's likely that the format is lost in the join process - fieldformat is simply for rendering data, while keeping the underlying data type.
The stats mechanism can be used on any number of datasets - you need to understand your data to be able to process different datasets in the same pipeline, using coalesce() to get a common ID to organise the results is as you can see in my example.
It is good to try to avoid join, as it has limitations and if you hit those limitations, the results will silently ignore any effect of that limit.
If _time is empty then it's probably not finding the data, or it may be hitting join limitations. If you run the inner search and look for ONE of the messageGUID values, does it come up with a result? If you run the search without looking at a particular messageGUID how many results do you get - join has a limit of 50,000 results from the inner search.
join has many limitations and you are often best off using stats to do the same thing.
This is a stats version and may help.
(index=aws* Method response body after transformations: sourcetype="aws:apigateway" business_unit=XX aws_account_alias="xXXX" network_environment=test source="API-Gateway-Execution-Logs*") OR
(kubernetes_cluster="eks-XXXXX*" index="awsXXXX" sourcetype = "kubernetes_logs" source = *XXXX* "sendData")
``` Search both data sets above ```
``` Now get fields from data set 1 ```
| rex field=_raw "Method response body after transformations: (?<json>[^$]+)"
| spath input=json path="header.messageGUID" output=messageGUID1
| spath input=json path="payload.statusType.code" output=status
| spath input=json path="payload.statusType.text" output=text
``` Now get fields from data set 2 ```
| rex field=_raw "sendData: (?<json>[^$]+)"
| spath input=json path="header.messageGUID" output=messageGUID2
``` Find the common GUID ```
| eval messageGUID=coalesce(messageGUID1, messageGUID2)
``` Filter first data set to status=200 and all of second data set where we have a guid ```
| where status=200 OR isnotnull(messageGUID2)
``` Get request time ```
| eval request_time=if(status=200, _time, null())
| stats values(_time) as times values(request_time) as request_time by messageGUID
| fieldformat request_time=strftime(request_time, "%F %T")
| fieldformat times=strftime(times, "%F %T")
@bowesmana , Thanks for sharing one way of doing it .
I did some changes and trying to get result it is returning it in epoch form ( that value is coming from outer _time) because when I am again converting it to readable format it is getting converted to request_time .
index=aws_XXX Method response body after transformations: sourcetype="aws:apigateway" business_unit=XX aws_account_alias="XXXXX" network_environment=qa source="API-Gateway-Execution-Logs*" (application="XXX" OR application="XXXX") | rex field=_raw "Method response body after transformations: (?<json>[^$]+)" | spath input=json path="header.messageGUID" output=messageGUID | spath input=json path="payload.statusType.code" output=status | spath input=json path="payload.statusType.text" output=text
| where status=200 and messageGUID="af2ee9ec-9b02-f163-718a-260e83a877f0"
| rename _time as request_time | fieldformat request_time=strftime(request_time, "%F %T")
| table messageGUID,request_time
| join type=inner messageGUID [ search kubernetes_cluster="XXXX*" index="aws_xXXX" sourcetype = "kubernetes_logs" source = *XXXX* | rex field=_raw "sendData: (?<json>[^$]+)"
| spath input=json path="header.messageGUID" output=messageGUID
| where messageGUID="af2ee9ec-9b02-f163-718a-260e83a877f0"
| rename _time as pubsub_time | fieldformat pubsub_time=strftime(pubsub_time, "%F %T")
| table messageGUID, pubsub_time ]
|table messageGUID, request_time, pubsub_time
kubernetes_cluster="eks-XXXX" index="aws_XXXX" sourcetype = "kubernetes_logs" source = *da_XXXXX* " "sendData" | rex field=_raw "sendData: (?<json>[^$]+)"
| spath input=json path="header.messageGUID" output=messageGUID
| where messageGUID="af2ee9ec-9b02-f163-718a-260e83a877f0"
| rename _time as pubsub_time | fieldformat pubsub_time=strftime(pubsub_time, "%F %T")
| table messageGUID, pubsub_time
When I am running inner search value I am getting
Also I would like to understand option you have provided ,can I run it for multiple dataset ?
If you want to use fieldformat, use it as the very last thing you do in the pipeline - when you fieldformat inside the join, it's likely that the format is lost in the join process - fieldformat is simply for rendering data, while keeping the underlying data type.
The stats mechanism can be used on any number of datasets - you need to understand your data to be able to process different datasets in the same pipeline, using coalesce() to get a common ID to organise the results is as you can see in my example.
It is good to try to avoid join, as it has limitations and if you hit those limitations, the results will silently ignore any effect of that limit.