
I am new to Python and pandas.

I am trying to read product names from a CSV column and write them to a new CSV as slugs of the corresponding names, but I cannot see what is wrong with my code.

Input is: product-feed.csv

product_name
V-Neck T-Shirt
Hoodie with Logo
Long Sleeve T-Shirt
Hoodie with Pocket
Hoodie with Zipper
Long Sleeve Tee
Polo Neck Tee
V-Neck T-Shirt - Red
V-Neck T-Shirt - Green
V-Neck T-Shirt - Blue

Expected output (slugged-output.csv) when I run the .py file in the VS Code terminal:

product_name
v-neck-t-shirt
hoodie-with-logo
long-sleeve-t-shirt
hoodie-with-pocket
hoodie-with-zipper
long-sleeve-tee
polo-neck-tee
v-neck-t-shirt-red
v-neck-t-shirt-green
v-neck-t-shirt-blue

parse_code.py looks like this. Note: I am using the python-slugify module (https://pypi.org/project/python-slugify/) to generate the slugs:

import pandas as pd

from slugify import slugify

df = pd.read_csv("product-feed.csv", dtype="str")

df["product_name"] = slugify(str(df["product_name"]))

# VIEW TO DEBUG ONLY
print(df["product_name"])

df["product_name"].to_csv(path_or_buf="slugged-output.csv", index=False, sep=";", quoting=1, encoding="UTF-8")

The problem: my output looks like this:

slugged-output.csv (the console print shows the same):

"product_name"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"

Any ideas what I am missing in the code? Thank you :)

marc_s
TheMissingNTLDR
  • I don't know anything about slugify, but from what I see in your output, I would say that the slugify function expects a single string value as its argument. You are passing slugify the whole Series df['product_name'] and are therefore getting the entire list for each entry. – itprorh66 Nov 28 '21 at 20:23
  • Yes @itprorh66, you may be correct. My approach to using slugify was completely incorrect. However, in several earlier attempts I also tried assigning the result to df["product_name"] via a variable and got the same result. I just posted the answer I found, so I hope it helps someone. Thanks for your input! – TheMissingNTLDR Nov 28 '21 at 21:02
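The diagnosis in the comment above can be checked without slugify at all. The following is a minimal sketch, assuming a two-row stand-in for the real feed (the `broadcast` column name is made up for illustration): `str()` on a Series returns its printed representation as one string, and assigning one string to a column copies it into every row.

```python
import pandas as pd

# Two-row stand-in for product-feed.csv (illustrative data only).
df = pd.DataFrame({"product_name": ["V-Neck T-Shirt", "Hoodie with Logo"]})

# str() on a Series gives ONE string: the printed view of the whole column,
# including the index labels and the trailing "Name: ..., dtype: ..." line.
as_text = str(df["product_name"])
print(repr(as_text))

# Assigning a single string to a column broadcasts it to every row,
# which is why each row of the output CSV held the same long value.
df["broadcast"] = as_text
print(df["broadcast"].nunique())  # 1
```

This is where the leading `0-`, `1-`, ... and the trailing `name-product-name-dtype-object` in the broken output come from: they are the slugified index labels and the Series footer.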

1 Answer


Eventually, after a bit of searching on the internet, I came across this page, and this one line of code resolved everything:

df["product_name"] = df["product_name"].fillna('').apply(lambda x: slugify(x))

I DELETED this line from my original code and replaced it with the line above:

df["product_name"] = slugify(str(df["product_name"]))
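For anyone who wants to try the fix end to end without touching files, here is a small sketch under a few assumptions: the CSV is read from an in-memory string instead of product-feed.csv, only three sample rows are used, and if python-slugify is not installed a simplified ASCII-only stand-in (not the real library) takes its place. Passing `slugify` directly to `apply` is equivalent to the lambda version.

```python
import io
import re

import pandas as pd

try:
    from slugify import slugify  # python-slugify, as used in the question
except ImportError:
    def slugify(text):
        """Hypothetical ASCII-only stand-in, NOT the real python-slugify."""
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

# In-memory stand-in for product-feed.csv (three sample rows only).
feed = io.StringIO(
    "product_name\n"
    "V-Neck T-Shirt\n"
    "Hoodie with Logo\n"
    "V-Neck T-Shirt - Red\n"
)
df = pd.read_csv(feed, dtype="str")

# apply() calls slugify once per cell, so each row gets its own slug.
# fillna("") guards against missing values, which slugify cannot handle.
df["product_name"] = df["product_name"].fillna("").apply(slugify)

# Write to a buffer instead of slugged-output.csv so the result can be printed.
out = io.StringIO()
df["product_name"].to_csv(out, index=False)
print(out.getvalue())
```

Note that the `quoting=1` argument in the original `to_csv` call corresponds to `csv.QUOTE_ALL`, which is why every value in the original output was wrapped in double quotes. Leaving it out, as in this sketch, gives the unquoted form shown in the expected output.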


TheMissingNTLDR