Apply function to URL and write in new column in CSV

Question

I'm a newbie and surprised there isn't already a clear answer to what I'm asking; apologies if this is a duplicate.

I have a list of URLs in a CSV file that I'm trying to shorten. I want to loop through the file and write a new column with the shortened URL right next to the original URL.

from pyshorteners import Shortener
import csv

def generate_short(url):
    x = shortener.short(url)
    return x
with open('Links_Test.csv') as csvfile:
    my_date = csv.reader(csvfile, dialect = 'excel')
    for row in my_data: 
        x = shortener.short(row)
        print(X)

EDIT: I keep getting the error "ValueError: Please enter a valid url" and don't know how to proceed from here. I'm sure I'm the problem.

Here's what my input data looks like:

URL
http://www.google.com
http://www.facebook.com
http://www.twitter.com
http://www.linkedin.com

and here's what I want my output to look like:

URL                        Short_URL
http://www.google.com      http://goo.gle
http://www.facebook.com    http://goo.g3c
http://www.twitter.com     http://goo.g3a
http://www.linkedin.com    http://goo.g2q

Thank you for your help. I was surprised there isn't already a clear answer posted (at least I couldn't find one), so I'm sorry if this is a duplicate.

The `row` is a list, with a value for every column. If your file has only 1 column, it's a 1-element list, `row[0]` being the url. — 9000, Jul 18 '17 at 01:06
@9000, thanks for the tip, I updated my code but still having problems. — Programming_Learner_DK, Jul 18 '17 at 01:17

djokester · Accepted Answer · 2017-07-18T15:10:09.553

1

Apply the function to row[0] or row['URL']. Also, you have to iterate over my_data.iterrows(), not over my_data itself (iterating over a DataFrame directly yields its column names, not its rows).

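from pyshorteners import Shortener
import pandas as pd

# Assumption: the 2017-era pyshorteners API, where a provider name is passed in.
# 'Tinyurl' is only an example provider; use whichever one you have set up.
shortener = Shortener('Tinyurl')

def generate_short(url):
    return shortener.short(url)

# The sep argument is optional. It can be a semicolon, a tab, etc.
# Check your CSV file to see which separator it actually uses.
my_data = pd.read_csv('Link-Tests.csv', sep="\t")
for index, row in my_data.iterrows():
    x = generate_short(row['URL'])
    print(x)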

You can always store the shortened URLs in a separate list, convert it into a DataFrame, and then join it with the original DataFrame on the index.

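# Continues from the snippet above (same imports and `shortener`).
lst = []
my_data = pd.read_csv('Link-Tests.csv', sep="\t")
for index, row in my_data.iterrows():
    x = shortener.short(row['URL'])
    lst.append(x)
# Both frames share the default 0..n-1 index, so the join lines rows up in order.
df = pd.DataFrame(lst, columns=["Short-Url"])
my_data = my_data.join(df, how='outer')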
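
As a variation on the join above, the new column can also be assigned directly and the result written back out as CSV, which keeps the original row order. This is only a sketch under stated assumptions: `fake_shorten` is a hypothetical stand-in for the real `shortener.short` call (so the example runs without network access), `add_short_column` is an illustrative helper name, and the in-memory CSV text stands in for the input file from the question.

```python
import io
import pandas as pd

def fake_shorten(url):
    # Hypothetical stand-in for shortener.short(url), so the sketch runs offline.
    return "http://sho.rt/" + url.split(".")[1]

def add_short_column(df, shorten, source="URL", target="Short_URL"):
    # Apply the shortening function to every value of the source column
    # and store the results in a new column; the input frame is not modified.
    out = df.copy()
    out[target] = out[source].apply(shorten)
    return out

# In-memory CSV standing in for the one-column input file from the question.
csv_text = "URL\nhttp://www.google.com\nhttp://www.facebook.com\n"
my_data = pd.read_csv(io.StringIO(csv_text))

result = add_short_column(my_data, fake_shorten)

# index=False keeps the pandas row index out of the CSV output.
# With no path argument, to_csv returns the CSV text as a string.
output_csv = result.to_csv(index=False)
print(output_csv)
```

To write to disk instead, pass a path, for example `result.to_csv('Links_Short.csv', index=False)` (the file name here is made up), and replace `fake_shorten` with your real shortening function.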

edited Jul 18 '17 at 15:10

answered Jul 18 '17 at 01:18


I tried your code and changed 'my_date' to 'my_data', but I keep getting the error "AttributeError: '_csv.reader' object has no attribute 'iterrows'". Perhaps I'm doing something wrong? I like the idea of joining it back up in pandas, where I can write to CSV. It's important that I keep the original order for this. Thank you. – Programming_Learner_DK Jul 18 '17 at 08:54
@SDS you made another mistake. Instead of csv.reader, use pd.read_csv(file_name). – djokester Jul 18 '17 at 11:37
@SDS you can now check the latest edit to the answer. Alternatively you can use `pd.read_excel` instead of `pd.read_csv` – djokester Jul 18 '17 at 15:11

score 0 · Answer 2 · answered Jul 18 '17 at 01:47

First try doing this:

from pyshorteners import Shortener
import csv

def generate_short(url):
    x = shortener.short(url)
    return x
with open('Links_Test.csv') as csvfile:
    my_data = csv.reader(csvfile, dialect = 'excel')
    for row in my_data: 
        print(row) # prints ['URL'], then ['http://www.google.com'], ...

You probably want to use next() or maybe look at this thread to ignore the header. Also, you probably want to use row[0] to get the first item in the list. So your final code might be

from pyshorteners import Shortener
import csv

def generate_short(url):
    x = shortener.short(url)
    return x
with open('Links_Test.csv') as csvfile:
    next(csvfile) # skip the header row
    my_data = csv.reader(csvfile, dialect = 'excel')
    for row in my_data: 
        print(row[0]) # prints 'http://www.google.com', ...
        # do the link shortener stuff here
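
To get from printing to the two-column output the question asks for, the same loop can feed a `csv.writer`. The following is a rough sketch, not a definitive implementation: `fake_shorten` is a hypothetical placeholder for the real `shortener.short` call (so it runs offline), `add_short_urls` is an illustrative name, and `io.StringIO` objects stand in for the real input and output files.

```python
import csv
import io

def fake_shorten(url):
    # Hypothetical stand-in for shortener.short(url), so the sketch runs offline.
    return "http://sho.rt/" + url.split(".")[1]

def add_short_urls(infile, outfile, shorten):
    reader = csv.reader(infile)
    writer = csv.writer(outfile, lineterminator="\n")
    header = next(reader)                    # first row, e.g. ['URL']
    writer.writerow(header + ["Short_URL"])  # extend the header with the new column
    for row in reader:
        if not row:                          # skip blank lines
            continue
        # row is a list; row[0] is the URL string in the first column
        writer.writerow(row + [shorten(row[0])])

# In-memory files standing in for the input CSV and the output CSV.
source = io.StringIO("URL\nhttp://www.google.com\nhttp://www.twitter.com\n")
target = io.StringIO()
add_short_urls(source, target, fake_shorten)
print(target.getvalue())
```

With real files, open the input and a separate output file with `newline=''` (as the csv module documentation recommends) and pass those two file objects instead of the `StringIO` ones. Rows are written in the order they are read, so the original order is preserved.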
