I'm scraping the contents of a subreddit. The subreddit is AR.
I need to get the post ID, title, post content, author, post date, score, comments, and comment ID, then write them to a txt file.
The problems I'm facing now are:
(1) Can I combine the comments and comment IDs into the same file as the post data? Each record would then contain post ID, title, post content, author, post date, score, comments, and comment ID.
(2) The selftext I get contains line breaks, so the content in my output.txt is split across several lines, like this:
blablabla
blablabla
blablabla
For example, [this Reddit post][1] has multiple line breaks. I want all of the content on one line because the data will be imported into CSV/Excel for later analysis.
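To make the goal in (2) concrete, here is a minimal sketch of the transformation I am after. The helper name `flatten_text` and the sample string are made up for illustration; in the real script the input would be `submission.selftext`.

```python
def flatten_text(text):
    """Collapse line breaks and any other whitespace runs into single spaces."""
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, \n, \r\n) and drops leading/trailing whitespace.
    return " ".join(str(text).split())


# Made-up sample standing in for a selftext with several line breaks.
sample = "blablabla\n\nblablabla\r\nblablabla"
print(flatten_text(sample))
```

This joins paragraphs with a single space, so the paragraph boundaries are lost; a different separator could be used if they need to stay recoverable.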
My code:
import praw, datetime, os

reddit = praw.Reddit('bot1')
subreddit = reddit.subreddit('AR')

for submission in subreddit.top(limit=1):
    date = datetime.datetime.utcfromtimestamp(submission.created_utc)

    # Comments are written to their own file
    for comment in submission.comments:
        print("Comment author: ", comment.author)
        print("Comments: ", comment.body)
        indexFile_comment = open('path' + 'index_comments.txt', 'a+')
        indexFile_comment.write('"' + str(comment.author) + '"' + ', ' + '"' + str(comment.body) + '"' + '\n')

    print("Post ID: ", submission.id)
    print("Title: ", submission.title)
    print("Post Content: ", submission.selftext)
    print("User Name: ", submission.author)
    print("Post Date: ", date)
    print("Point: ", submission.score)

    # Post data is written to a second file
    indexFile = open('path' + 'index.txt', 'a+')
    indexFile.write('"' + str(submission.id) + '"' + ', ' + '"' + str(submission.title) + '"' + ', ' + '"' + str(submission.selftext) + '"' + ', ' + '"' + str(submission.author) + '"' + ', ' + '"' + str(date) + '"' + ', ' + '"' + str(submission.score) + '"' + '\n')
    print("Successfully writing in file")
    indexFile.close()
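For (1), the layout I have in mind is one row per comment, with the post fields repeated on each row. Below is a rough sketch of that layout using the standard `csv` module instead of manual string concatenation. Everything here is an assumption for illustration: the names `HEADER`, `one_line`, `build_rows`, and `write_rows` are hypothetical, and plain dicts with made-up values stand in for the PRAW submission and comment objects.

```python
import csv
import io

# Hypothetical column layout: post fields first, then comment fields.
HEADER = ["post_id", "title", "content", "author", "date", "score",
          "comment_id", "comment_author", "comment"]


def one_line(text):
    """Collapse line breaks and other whitespace runs into single spaces."""
    return " ".join(str(text).split())


def build_rows(post, comments):
    """Return one row per comment, repeating the post fields on every row."""
    post_fields = [post["id"], one_line(post["title"]), one_line(post["selftext"]),
                   post["author"], post["date"], post["score"]]
    if not comments:
        # Keep posts without comments as a single row with empty comment cells.
        return [post_fields + ["", "", ""]]
    return [post_fields + [c["id"], c["author"], one_line(c["body"])]
            for c in comments]


def write_rows(rows, handle):
    """Write the header and rows; csv.writer handles quoting and escaping."""
    writer = csv.writer(handle, quoting=csv.QUOTE_ALL)
    writer.writerow(HEADER)
    writer.writerows(rows)


# Made-up sample data standing in for a PRAW submission and its comments.
post = {"id": "abc123", "title": "Example title",
        "selftext": "line one\n\nline two", "author": "some_user",
        "date": "2018-01-01 00:00:00", "score": 10}
comments = [{"id": "c1", "author": "user_a", "body": "first\ncomment"},
            {"id": "c2", "author": "user_b", "body": "second comment"}]

rows = build_rows(post, comments)
buffer = io.StringIO()  # a real file opened with newline="" would work the same way
write_rows(rows, buffer)
print(buffer.getvalue())
```

In the real script the dict values would presumably come from `submission.id`, `submission.title`, `submission.selftext`, and so on, and from `comment.id`, `comment.author`, and `comment.body`. Letting `csv.writer` do the quoting also means a title or comment containing `"` or `,` should not break the columns.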