
I'm trying to switch my code from pandas to polars.

I have a list of dicts, as below:

data = [{
    "MA5": 91.128,
    "MA10": 95.559,
    "MA20": 103.107,
    "MA30": 109.3803,
    "MA60": 114.0822
}, 
{
    "MA5": 13.776,
    "MA10": 14.027,
    "MA20": 13.768,
    "MA30": 13.6417,
    "MA60": 14.0262
}]

I want to create a Series from this list in polars, and then add it to an existing DataFrame.

I have tried many approaches, but all of them lose the key names and leave only the values, like below:

Series: 'ma' [struct[5]]
[
    {14.426,13.718,12.672,12.7723,14.1927}
    {14.59,13.898,12.735,12.7497,14.1378}
    {14.352,13.951,12.7785,12.721,14.0727}
    {14.134,13.967,12.857,12.7493,14.0027}
    {13.966,14.062,12.979,12.7987,13.9532}
]

But I need to keep the dict structure in the Series. I would like the Series to look like this:

Series: 'ma' [struct[5]]
[
    {
        "MA5": 91.128,
        "MA10": 95.559,
        "MA20": 103.107,
        "MA30": 109.3803,
        "MA60": 114.0822
    },
    {
        "MA5": 13.776,
        "MA10": 14.027,
        "MA20": 13.768,
        "MA30": 13.6417,
        "MA60": 14.0262
    }
]

What is the right way to achieve this goal in polars or in pyarrow?

letit
  • Why do you need to keep the dict type? What are you trying to do exactly? The information is still there e.g. `print(pl.Series(data)[0])` or `pl.Series(data).to_frame().unnest("")` – jqurious May 01 '23 at 11:19

2 Answers

0

You can check out the polars documentation for pl.from_dicts.

pl.from_dicts takes a list of dictionaries and creates a pl.DataFrame, where each key in the dictionaries becomes a named column. Calling to_struct on that DataFrame then gives a struct Series that keeps the keys as field names:

import polars as pl

data = [{"name": "Alice", "age": 15}, {"name": "Bob", "age": 30}, {"name": "Charlie", "age": 25}]
df = pl.from_dicts(data)  # DataFrame with columns "name" and "age"
series = df.to_struct("person")  # struct Series; the keys become field names
print(series)
0

You say you want to convert your list of dicts into a Series with dicts so that you can append it to another dataframe. You don't need to make it a Series in order to append it to an existing dataframe.

If you just want to append it to an existing dataframe, and that dataframe has the same columns, then you'd just do

df = pl.concat([
     df,
     pl.from_dicts(data)
     ])

If you want a series of structs, as in your example output then you'd do:

pl.from_dicts(data).select(pl.struct(pl.all())).to_series()

When you print it, it won't look like your example above, because the printed output shows only the values, but the field names are still stored in the struct. For example, do

print(pl.from_dicts(data).select(pl.struct(pl.all())).to_series()[0])
{'MA5': 91.128, 'MA10': 95.559, 'MA20': 103.107, 'MA30': 109.3803, 'MA60': 114.0822}

print(pl.from_dicts(data).select(pl.struct(pl.all())).to_series()[1])
{'MA5': 13.776, 'MA10': 14.027, 'MA20': 13.768, 'MA30': 13.6417, 'MA60': 14.0262}

Each element comes back as a dict, so you can see that the column (key) information is retained.

Dean MacGregor