6

I have the following problem:

I want to convert a tab-delimited text file to a CSV file. The text file is the SentiWS dictionary, which I want to use for sentiment analysis (https://github.com/MechLabEngineering/Tatort-Analyzer-ME/tree/master/SentiWS_v1.8c).

The code I used to do this is the following:

txt_file = r"SentiWS_v1.8c_Positive.txt"
csv_file = r"NewProcessedDoc.csv"

in_txt = csv.reader(open(txt_file, "r"), delimiter = '\t')
out_csv = csv.writer(open(csv_file, 'w'))

out_csv.writerows(in_txt)

This code writes everything in one row, but I need the data to be in three rows, as intended in the file itself. There is also a blank line under each line of data and I don't know why.

I want the data to be in this form:

Row1 Row2 Row3

Word Data Words

Word Data Words

instead of

Row1

Word,Data,Words

Word,Data,Words

Can anyone help me?

gHOsTaManTe
  • what is the problem? your script seems to work fine for me. can you include a few lines of the **actual** output of your script (not just "row1 row2 row3") and then the same few lines in your desired format? – maxymoo Mar 14 '17 at 02:37

2 Answers

9
import pandas

This reads the tab-delimited text file into a DataFrame:

dataframe = pandas.read_csv("SentiWS_v1.8c_Positive.txt",delimiter="\t")

Then write the DataFrame out as CSV:

dataframe.to_csv("NewProcessedDoc.csv", encoding='utf-8', index=False)
4

Try this:

import csv

txt_file = r"SentiWS_v1.8c_Positive.txt"
csv_file = r"NewProcessedDoc.csv"

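# Open both files with newline='' so the csv module handles line endings itself.
with open(txt_file, "r", newline='') as in_text:
    in_reader = csv.reader(in_text, delimiter='\t')
    with open(csv_file, "w", newline='') as out_csv:
        # newline='' is an argument to open(), not to csv.writer()
        out_writer = csv.writer(out_csv)
        for row in in_reader:
            out_writer.writerow(row)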
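
As a quick sanity check, here is a minimal sketch that runs the same tab-to-comma copy on a made-up two-line sample and reads the result back. The sample rows, the temporary file names, and the assumption that the third field can itself contain commas are illustrative only; they are not taken from the actual SentiWS file.

```python
import csv
import os
import tempfile

# Made-up sample in the assumed layout: word, score, comma-separated word forms.
sample = "Word|NN\t0.0040\tWords,Wordy\nOther|ADJX\t0.0810\tOthers\n"

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "sample.txt")
    csv_path = os.path.join(tmp, "sample.csv")

    with open(txt_path, "w", newline="", encoding="utf-8") as f:
        f.write(sample)

    # Convert tab-delimited input to comma-delimited output.
    with open(txt_path, "r", newline="", encoding="utf-8") as src, \
         open(csv_path, "w", newline="", encoding="utf-8") as dst:
        csv.writer(dst).writerows(csv.reader(src, delimiter="\t"))

    # Look at the raw text of the result ...
    with open(csv_path, "r", newline="", encoding="utf-8") as f:
        raw_lines = f.read().splitlines()

    # ... and at how a CSV parser splits it into fields.
    with open(csv_path, "r", newline="", encoding="utf-8") as f:
        parsed = list(csv.reader(f))

print(raw_lines)
print(parsed)
```

Each parsed row has three fields. The field that contains commas is wrapped in quotes in the raw output, so it is still read back as a single field.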

There is also a blank line under each line of data and I don't know why.

This usually happens when the script runs on Windows and the output file is opened without newline=''. According to the Python 3 csv module docs:

If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n linendings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
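
To make the effect visible on any platform, the sketch below writes the same rows twice: once with newline='\r\n', which makes open() translate every '\n' to '\r\n' as Windows text mode does by default, and once with newline=''. The sample rows and the helper name write_rows are made up for illustration.

```python
import csv
import os
import tempfile

# Made-up rows for illustration.
rows = [["Word", "0.004", "Words"], ["Other", "0.1", "Others"]]


def write_rows(path, newline):
    """Write rows with csv.writer, then return the raw bytes of the file."""
    with open(path, "w", newline=newline, encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    # newline="\r\n" translates each "\n" on write, like Windows text mode.
    translated = write_rows(os.path.join(tmp, "translated.csv"), "\r\n")
    # newline="" leaves the csv module's own "\r\n" line endings untouched.
    untouched = write_rows(os.path.join(tmp, "untouched.csv"), "")

print(translated)
print(untouched)
```

The csv writer already ends each row with '\r\n'. When the file object translates the '\n' again, every row ends in '\r\r\n', which many programs display as an extra blank line.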

Dan
  • You're welcome, @gHOsTaManTe - please upvote and mark as the accepted answer if this resolves your issue. – Dan Mar 14 '17 at 19:25