I need to create a pd.DataFrame with a multiindex. The first index level is a simple range from 1 to n. The second level is a datetime index. All columns contain floats. Here's my example for n=2:

from datetime import date

import pandas as pd
from hypothesis import given
from hypothesis import strategies as st
from hypothesis.extra.pandas import columns, data_frames, indexes


@given(
    df1=data_frames(
        columns=columns(
            ["asset1", "asset2", "asset3", "cash_asset"],
            elements=st.floats(allow_nan=False, allow_infinity=False),
        ),
        index=indexes(
            elements=st.dates(
                date.fromisoformat("2000-01-01"), date.fromisoformat("2020-12-31")
            ),
            min_size=10,
            unique=True,
        ).map(sorted),
    ),
    df2=data_frames(
        columns=columns(
            ["asset1", "asset2", "asset3", "cash_asset"],
            elements=st.floats(allow_nan=False, allow_infinity=False),
        ),
        index=indexes(
            elements=st.dates(
                date.fromisoformat("2000-01-01"), date.fromisoformat("2020-12-31")
            ),
            min_size=10,
            unique=True,
        ).map(sorted),
    ),
)
def test_index_level(df1, df2):
    # Stack the two frames; the keys become the first index level.
    df = pd.concat([df1, df2], keys=["df1", "df2"])
    assert df.index.nlevels == 2

How can I create the multiindex directly with the hypothesis library? Clearly I can't define df1, df2, etc. manually as in my toy example.

Another constraint is that the level 2 index needs to be equal for all level 1 occurrences.
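
To make the target shape concrete, here is a minimal sketch of one possible direction, not a confirmed solution: draw n and a single list of dates inside an st.composite strategy, build the index with pd.MultiIndex.from_product so every level 1 value shares the same level 2 dates, then draw one row of floats per index entry. The names make_index and multiindex_frames, the bounds max_n=3 and max_size=12, and the settings values are assumptions made up for illustration.

```python
from datetime import date

import pandas as pd
from hypothesis import given, settings
from hypothesis import strategies as st

ASSETS = ["asset1", "asset2", "asset3", "cash_asset"]
finite_floats = st.floats(allow_nan=False, allow_infinity=False)


def make_index(n, dates):
    # Hypothetical helper: the cartesian product guarantees that every
    # level 1 value (1..n) is paired with the same level 2 dates.
    return pd.MultiIndex.from_product([range(1, n + 1), pd.to_datetime(dates)])


@st.composite
def multiindex_frames(draw, max_n=3):
    # Hypothetical strategy; the size bounds are arbitrary and kept small
    # so each generated example stays cheap.
    n = draw(st.integers(min_value=1, max_value=max_n))
    dates = sorted(
        draw(
            st.lists(
                st.dates(date(2000, 1, 1), date(2020, 12, 31)),
                min_size=10,
                max_size=12,
                unique=True,
            )
        )
    )
    index = make_index(n, dates)
    # One tuple of floats per index entry, one float per column.
    row = st.tuples(*[finite_floats] * len(ASSETS))
    rows = draw(st.lists(row, min_size=len(index), max_size=len(index)))
    return pd.DataFrame(rows, index=index, columns=ASSETS)


@settings(max_examples=25, deadline=None)
@given(df=multiindex_frames())
def test_index_level(df):
    assert df.index.nlevels == 2
    # The level 2 index must be identical for every level 1 value.
    first_key = df.index.get_level_values(0)[0]
    expected = df.xs(first_key, level=0).index
    for key in df.index.get_level_values(0).unique():
        assert df.xs(key, level=0).index.equals(expected)
```

Drawing the dates once and taking the product is what enforces the shared level 2 index; whether data_frames from hypothesis.extra.pandas can accept such an index directly is not something this sketch relies on.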