When I try to write a record with a missing (null) measure value to AWS Timestream, the write fails with the error below. How can I ingest NULL measure values into AWS Timestream?

Error: ValidationException: An error occurred (ValidationException) when calling the WriteRecords operation: Timestream only supports finite IEEE Standard 754 floating point precision for double measure value type.

Sample code below:

from datetime import datetime

import awswrangler as wr
import boto3
import pandas as pd

df = pd.DataFrame(
    {
        "time": [datetime.now(), datetime.now(), datetime.now()],
        "dim0": ["foo", "boo", "bar"],
        "dim1": [1, 2, 3],
        "measure": [1.0, 1.12345678, None],
    })

rejected_records = wr.timestream.write(
    df=df,
    database="tsdb",
    table="tstable1",
    time_col="time",
    measure_col="measure",
    dimensions_cols=["dim0", "dim1"],
    boto3_session=boto3.Session(),
)
print(rejected_records)
adds

2 Answers

IEEE Standard 754 supports NaN ("not a number").

In Python, use float('nan') instead of None to meet this standard.

If you are using NumPy, you can also use np.nan.
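
As a quick illustration of how NaN behaves in Python, here is a minimal sketch using only the standard library. The variable names are illustrative. Note that NaN is a valid IEEE 754 value but not a finite one, which matters because the error message in the question asks specifically for finite values.

```python
import math

# float("nan") produces an IEEE 754 NaN; numpy's np.nan is the same kind of float value.
nan = float("nan")

is_nan = math.isnan(nan)        # NaN is recognised as "not a number"
is_finite = math.isfinite(nan)  # NaN is a valid IEEE 754 value, but it is not finite
equals_itself = nan == nan      # NaN never compares equal, not even to itself

print(is_nan, is_finite, equals_itself)  # prints: True False False
```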

kaptan
Lior Cohen

This solution works for dimension values, but not for measure values. AWS Support told me that Timestream does not support NULL values on ingest. A workaround is therefore to substitute a 'dummy value' for NULL measure values.
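
One possible way to apply the dummy-value workaround with pandas is sketched below. The helper name `fill_missing_measure` and the sentinel `-1000.0` are assumptions for illustration, not part of any AWS API. A sentinel is only safe when real measurements can never take that value.

```python
import pandas as pd

# Hypothetical sentinel: choose a value that real measurements can never take.
SENTINEL = -1000.0


def fill_missing_measure(df: pd.DataFrame, measure_col: str,
                         sentinel: float = SENTINEL) -> pd.DataFrame:
    """Return a copy of df with missing measure values replaced by the sentinel."""
    out = df.copy()
    out[measure_col] = out[measure_col].fillna(sentinel)
    return out


df = pd.DataFrame(
    {
        "dim0": ["foo", "boo", "bar"],
        "measure": [1.0, 1.12345678, None],  # pandas stores None as NaN in a float column
    }
)

filled = fill_missing_measure(df, "measure")
print(filled)
```

The filled frame can then be passed to `wr.timestream.write` exactly as in the question. Queries that read the data back need to filter out or re-map the sentinel.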

dxggl
  • What does it mean to use a dummy value for null values? – falsePockets Feb 19 '23 at 22:14
  • @falsePockets, in my use case, the measurement value should ALWAYS be positive, so I use `-1000` as the 'dummy value' for the NULL values of measurements. – dxggl Feb 21 '23 at 11:38