0

We have about 10 instances across 5 deployments running in Azure, logging to Azure Diagnostics (WADLogsTable). I need to retrieve these logs every few minutes for local analysis by a 3rd-party tool. I already have a simple version that reads logs from the table, saves the last partition and row keys, and next time runs the query PartitionKey >= SavedPartitionKey. The problem is that this way not all logs are retrieved: WAD buffers logs and writes them to the table in bulk once every 5 minutes (per instance). The RowKey of a logged event starts with the deployment id (which is a GUID). For example:

  1. At time 00:05:30, InstanceA with DeploymentId=999... writes its logs for the last 5 minutes:

         PK         RK   Message
         00:01:00   999  msg1
         00:01:00   999  msg2
         00:02:00   999  msg3
         00:02:00   999  msg4
         00:05:00   999  msg5
  2. At time 00:06:00, the Transfer Script runs, gets all logs, and saves LastPK=00:05:00.
  3. At time 00:06:30, InstanceB with DeploymentId=111... writes its logs for the last 5 minutes:

         PK         RK   Message
         00:02:00   111  msg6
         00:03:00   111  msg7
         00:05:00   111  msg8
         00:06:00   111  msg9
  4. At time 00:07:00, the Transfer Script runs, queries logs with PK >= LastPK = 00:05:00, and actually retrieves only msg8 and msg9 (msg6 and msg7 are lost).

The solution I'm thinking about is to have the Transfer Script retrieve all logs for the last 6 minutes each time (5 minutes for WAD sync + 1 minute for buffer), but this can greatly increase the amount of data transferred (roughly 5x), and I'd also need to somehow filter out already-retrieved logs, which can be problematic. In addition, I thought about adding Timestamp > LastSeenTimestamp, but I'm not sure whether that solves the problems of data volume and duplication, or whether I'd still lose messages that way. Any ideas? Thanks
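The loss scenario in the steps above can be reproduced with a small simulation (pure Python, using an in-memory list as a stand-in for WADLogsTable; the "HH:MM:SS" partition keys are illustrative, real WAD keys are tick counts):

```python
# In-memory stand-in for WADLogsTable: rows are (PartitionKey, DeploymentId, Message).

def query(table, last_pk):
    """Mimic the transfer script's filter: PartitionKey >= last_pk."""
    return [row for row in table if row[0] >= last_pk]

table = []

# 00:05:30 - InstanceA (deployment 999) flushes its 5-minute buffer.
table += [("00:01:00", "999", "msg1"), ("00:01:00", "999", "msg2"),
          ("00:02:00", "999", "msg3"), ("00:02:00", "999", "msg4"),
          ("00:05:00", "999", "msg5")]

# 00:06:00 - transfer script runs and remembers the highest PK it saw.
fetched = query(table, "00:00:00")
last_pk = max(row[0] for row in fetched)          # "00:05:00"

# 00:06:30 - InstanceB (deployment 111) flushes rows with older PartitionKeys.
table += [("00:02:00", "111", "msg6"), ("00:03:00", "111", "msg7"),
          ("00:05:00", "111", "msg8"), ("00:06:00", "111", "msg9")]

# 00:07:00 - transfer script runs again with PK >= last_pk.
second = query(table, last_pk)
print(sorted(m for _, _, m in second))   # ['msg5', 'msg8', 'msg9']
```

Note the two problems in one run: msg6 and msg7 are silently skipped, and msg5 is fetched twice because the filter uses >= rather than >.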

Stam
  • 151
  • 5

2 Answers

1

Does the transfer to the third-party tool need to be realtime or "as soon as possible"? Can you transfer only data that is 5 minutes or older and NOT transfer data that is younger than 5 minutes? This will ensure that you only transfer completed partitions.
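One way to sketch this: WADLogsTable partition keys are "0" followed by the event time's .NET tick count, so a cutoff key for "now minus 5 minutes" can be computed and used as an upper bound in the query filter. The helper below is a minimal sketch under that assumption (function names are illustrative):

```python
from datetime import datetime, timedelta

TICKS_PER_SECOND = 10_000_000          # .NET ticks are 100-ns units
DOTNET_EPOCH = datetime(1, 1, 1)       # DateTime.Ticks counts from 0001-01-01

def wad_partition_key(dt):
    """Build a WADLogsTable-style PartitionKey: '0' + UTC tick count."""
    ticks = int((dt - DOTNET_EPOCH).total_seconds()) * TICKS_PER_SECOND
    return "0" + str(ticks)

def transfer_window(last_pk, now, lag=timedelta(minutes=5)):
    """Fetch only completed partitions: last_pk <= PK < (now - lag)."""
    upper = wad_partition_key(now - lag)
    return f"(PartitionKey ge '{last_pk}') and (PartitionKey lt '{upper}')"
```

Because all keys have the same digit count, lexical comparison on the key strings matches chronological order, which is what makes the range filter safe.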

Igorek
  • 15,716
  • 3
  • 54
  • 92
  • That's an option. The only problematic issue could be time sync between Azure storage and the host that pulls the logs. Thanks – Stam Aug 03 '12 at 10:35
1

Another possibility could be to include "DeploymentId" in your query along with "PartitionKey" to fetch diagnostics data for the last "n" minutes, if you have this information available.
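In Azure Table query syntax that combination could look like the sketch below, assuming DeploymentId is available as a property on the WADLogsTable rows (the helper name is illustrative):

```python
def wad_filter(last_pk, deployment_id):
    """OData-style filter combining a per-deployment PartitionKey watermark
    with a DeploymentId match, so each deployment is tracked independently."""
    return (f"(PartitionKey ge '{last_pk}') and "
            f"(DeploymentId eq '{deployment_id}')")

# Example: resume deployment 999 from its own saved watermark.
print(wad_filter("0634795488000000000", "999"))
```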

Gaurav Mantri
  • 128,066
  • 12
  • 206
  • 241
  • DeploymentId and InstanceId actually, since the time-sync problem also exists between instances of the same deployment. Thanks, that can be an option. – Stam Aug 03 '12 at 10:31
  • What you could do is create a dictionary of deployment id to the last fetched PartitionKey value, loop over this collection periodically, and fetch the data (using both PartitionKey and deployment id). If you get data for a particular deployment id, update the PartitionKey value; otherwise leave it as is. – Gaurav Mantri Aug 03 '12 at 10:38
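The per-deployment watermark idea from the comment above can be sketched in a few lines (pure Python, with an in-memory list standing in for the table; names and the "HH:MM:SS" keys are illustrative):

```python
# Rows are (PartitionKey, DeploymentId, Message).

def fetch_new(table, watermarks):
    """For each known deployment, fetch rows strictly after its own watermark,
    and advance that watermark only when rows were actually returned."""
    new_rows = []
    for dep_id, last_pk in watermarks.items():
        rows = [r for r in table if r[1] == dep_id and r[0] > last_pk]
        if rows:
            watermarks[dep_id] = max(r[0] for r in rows)
            new_rows += rows
    return new_rows

table = [("00:01:00", "999", "msg1"), ("00:05:00", "999", "msg5")]
marks = {"999": "00:00:00", "111": "00:00:00"}

first = fetch_new(table, marks)            # picks up msg1 and msg5

# InstanceB later flushes rows with old PartitionKeys; they are still caught,
# because deployment 111's watermark never moved past them.
table += [("00:02:00", "111", "msg6"), ("00:06:00", "111", "msg9")]
second = fetch_new(table, marks)           # msg6 and msg9
```

Using a strict > per deployment avoids both the loss from the original single-watermark approach and the duplicate fetch that >= causes.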