
I have data like this:

[
    {
        "name": "Apple",
        "price": 1,
        "type": "Food"
    },
    {
        "name": "Apple",
        "price": 0.90,
        "type": "Food"
    },
    {
        "name": "Apple",
        "price": 1000,
        "type": "Computer"
    },
    {
        "name": "Apple",
        "price": 900,
        "type": "Computer"
    }
]

Using the Great Expectations automatic profiler, a valid range for price would be 0.90 to 1,000. Is it possible to have it slice on the type dimension, so Food would be 0.90 to 1 and Computer would be 900 to 1,000? Or would I need to transform the data first using dbt? I know which column defines the dimension, but I don't know the particular values it will contain.
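
Roughly, the per-slice validation I'm hoping for would look something like the following sketch (using the PandasDataset-style ge.from_pandas API, with the type values hard-coded, which is exactly what I'd like to avoid):

import great_expectations as ge
import pandas as pd

df = pd.DataFrame([
    {"name": "Apple", "price": 1, "type": "Food"},
    {"name": "Apple", "price": 0.90, "type": "Food"},
    {"name": "Apple", "price": 1000, "type": "Computer"},
    {"name": "Apple", "price": 900, "type": "Computer"},
])

# One expectation per value of the "type" column; the ranges below are the
# ones the profiler would ideally discover per slice on its own.
food = ge.from_pandas(df[df["type"] == "Food"])
food.expect_column_values_to_be_between("price", min_value=0.90, max_value=1)

computers = ge.from_pandas(df[df["type"] == "Computer"])
computers.expect_column_values_to_be_between("price", min_value=900, max_value=1000)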

Also, the same question applies to differences between rows: if the rows had a timestamp, then instead of validating that the price falls between 900 and 1,000, it would validate the change in value (here, -100).
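
For that second part, the transform I have in mind would be something like this sketch (the timestamp column is hypothetical, and the -100 to 0 range is just an illustrative bound):

import great_expectations as ge
import pandas as pd

# Hypothetical feed with a timestamp column.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2022-06-01", "2022-06-02"]),
    "price": [1000, 900],
})

# Validate the row-to-row change rather than the absolute price.
df = df.sort_values("timestamp")
df["price_change"] = df["price"].diff()
df = df.dropna(subset=["price_change"])  # drop the leading NaN from diff()

changes = ge.from_pandas(df)
changes.expect_column_values_to_be_between("price_change", min_value=-100, max_value=0)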

steve76
  • How many types do you have that you'd need to group by? 2, 10, 100? – sgdata Jun 06 '22 at 15:26
  • You've tagged dbt; are you using Great Expectations in Python, or the dbt port, https://github.com/calogica/dbt-expectations? – tconbeer Jun 06 '22 at 19:19
  • @tconbeer GE in Python; dbt is part of the DAG – steve76 Jun 07 '22 at 00:36
  • @sgdata I don't know. It comes from a feed that I poll periodically, and I'm looking for drastic changes. Perhaps it would be best to do some more transforming, and create a table of percentage change. – steve76 Jun 07 '22 at 02:14

1 Answer


I used the approach described here to first load the data into a pandas DataFrame:

https://discuss.greatexpectations.io/t/how-can-i-use-the-return-format-unexpected-index-list-to-select-row-from-a-pandasdataset/70/2
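
For completeness, a minimal sketch of what that can look like once the data is in a DataFrame (feed.json is a placeholder path, and this uses the older ge.from_pandas / PandasDataset API rather than the newer Validator workflow). The distinct type values are discovered with groupby, so only the slicing column needs to be known up front:

import great_expectations as ge
import pandas as pd

df = pd.read_json("feed.json")  # placeholder path for the polled feed

for type_value, group in df.groupby("type"):
    dataset = ge.from_pandas(group)
    # Derive the allowed range per slice; here it is just the observed
    # min/max of the slice, as an illustration of per-type bounds.
    result = dataset.expect_column_values_to_be_between(
        "price",
        min_value=group["price"].min(),
        max_value=group["price"].max(),
    )
    print(type_value, result.success)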

steve76