
I want to combine the values (rows) of a certain column. These values are texts, and I want to combine them so that I can do a word count and find the most common words.

The dataframe is called df and has 30 columns. I want to combine all the rows of the first column (labeled 'text') into one row, one list, or anything similar. The container doesn't matter as long as I can run FreqDist on it. I am not interested in grouping the values by some key; I just want all the values in this column to become one block.

I looked around a lot and I couldn't find what I am looking for.

Thanks a lot.

Talal Ghannam
  • df['text'].tolist() ? – ycx Jan 07 '19 at 00:32
  • Please add an input sample and the expected output; this will help clarify your question. – Dani Mesejo Jan 07 '19 at 00:35
  • Well, what I want is to use something like this: fd_words = FreqDist(fd['text']), where fd is the data frame and text is the column I want to run FreqDist on. I want to find the most frequent word for the whole column. However, this does not work. What does work is choosing a specific row: fd_words = FreqDist(fd['text'].iloc[#]). ycx's suggestion above worked in the sense that I was able to get all the rows of the text column into one list, but FreqDist does not work on this list either. What I want is the most common word in the column as a whole, not in individual rows. – Talal Ghannam Jan 07 '19 at 18:24
  • I was able to do it by using tolist together with extend to add all the texts together and then searching the whole list. I was hoping there was a more straightforward way to search multiple texts at the same time and find the word frequencies. Anyway, thanks a lot. – Talal Ghannam Jan 10 '19 at 01:43
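
The tolist-plus-extend approach described in the last comment can be written in one pass by flattening every row into a single stream of tokens before counting. The following is only a minimal sketch under stated assumptions: the helper names tokens_of and count_words are hypothetical, the sample rows stand in for df['text'], and a row may be either a raw string (split on whitespace here) or an already tokenized list. It uses collections.Counter from the standard library so that it runs without extra packages.

```python
from collections import Counter
from itertools import chain


def tokens_of(row):
    """Return the tokens of one row (hypothetical helper).

    Assumption: a row is either a raw string or an iterable of tokens.
    Strings are lowercased and split on whitespace; token lists pass through.
    """
    if isinstance(row, str):
        return row.lower().split()
    return list(row)


def count_words(rows):
    """Count words across all rows at once instead of one row at a time."""
    return Counter(chain.from_iterable(tokens_of(row) for row in rows))


# Hypothetical sample standing in for df['text'].
sample_rows = ["the cat sat", "the dog sat down", ["the", "end"]]
word_counts = count_words(sample_rows)
print(word_counts.most_common(2))  # [('the', 3), ('sat', 2)]
```

With the real dataframe, the call would presumably be count_words(df['text']), since a pandas Series can be iterated row by row; missing values would need to be dropped first, for example with df['text'].dropna(). Because nltk's FreqDist is built on Counter, passing the same flattened token stream to FreqDist should give the same counts, though that substitution is not tested here.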

0 Answers