0

I have a dataframe with person names with these fields - last, first and middle names, i'm trying to concatenating these fields to get a full_name column in a dataframe as below.

dfl.with_columns(
    pl.concat_str([pl.col('last_name'),
                   pl.col('first_name_or_initial'),
                   pl.col('middle_name_or_initial')],' ').alias('full_name')).select([
                   pl.col('full_name'),pl.col('last_name'),
                   pl.col('first_name_or_initial'),
                   pl.col('middle_name_or_initial')])

Here is the output:

enter image description here

Why I'm getting null in full_name after concatenating last, first and middle_names ? Here If any of field holds null while concatenating strings the result will be NULL like wise in above example last_name is Lee, first_name is James, middle_names is NULL hence full_name is NULL how to ignore NULL field when concatenating the fields.

myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
  • Please, provide a minimal reproducible example https://stackoverflow.com/help/minimal-reproducible-example – alec_djinn Aug 29 '22 at 12:20
  • I think it has to do with the NaN (null) values. You are probably trying to add NaN to a string. It is hard to tell since you have not share any reproducible code. – alec_djinn Aug 29 '22 at 12:21

1 Answers1

1

We can apppend fill_null to the Expressions passed to concat_str:

dfl.with_columns(
    pl.concat_str(
        [
            pl.col("last_name").fill_null(""),
            pl.col("first_name_or_initial").fill_null(""),
            pl.col("middle_name_or_initial").fill_null(""),
        ],
        " ",
    ).str.rstrip().alias("full_name")
).select(
    [
        pl.col("full_name"),
        pl.col("last_name"),
        pl.col("first_name_or_initial"),
        pl.col("middle_name_or_initial"),
    ]
)
shape: (5, 4)
┌─────────────────────┬────────────┬───────────────────────┬────────────────────────┐
│ full_name           ┆ last_name  ┆ first_name_or_initial ┆ middle_name_or_initial │
│ ---                 ┆ ---        ┆ ---                   ┆ ---                    │
│ str                 ┆ str        ┆ str                   ┆ str                    │
╞═════════════════════╪════════════╪═══════════════════════╪════════════════════════╡
│ Dionne Thomas Roger ┆ Dionne     ┆ Thomas                ┆ Roger                  │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Lee James           ┆ Lee        ┆ James                 ┆ null                   │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Lee James           ┆ Lee        ┆ James                 ┆ null                   │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Latella Domenic C.  ┆ Latella    ┆ Domenic               ┆ C.                     │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Trumpauska Arvydas  ┆ Trumpauska ┆ Arvydas               ┆ null                   │
└─────────────────────┴────────────┴───────────────────────┴────────────────────────┘

I've filled the null values with an empty string "", but you can choose whatever value you need.

  • 1
    OK. but if if fill with " ", I'm getting an extra length to the full name as "Lee James " whereas it should be "Lee James". Of course I can trim it here. but in excel we have a function textjoin which can ignore empty values. – myamulla_ciencia Aug 29 '22 at 12:32
  • then it will be like this: "LeeJames" it might not be looked good if i don't give in any separator – myamulla_ciencia Aug 29 '22 at 12:38
  • Oh, I see. (I thought you were talking about adding " " to the `fill_null` values.) Yeah, you'll want to add a `str.rstrip()` after the `concat_str`. I've added that to the answer. –  Aug 29 '22 at 12:48
  • Yeah. I will do that for now. but it will be great if we had an option in concat_str itself to ignore NULL – myamulla_ciencia Aug 29 '22 at 12:51
  • There was some discussion of doing that in this issue: https://github.com/pola-rs/polars/issues/3534 –  Aug 29 '22 at 12:54