I have 100+ text files, each containing paragraphs of text. I would like to manually add headers such as "year", "id", "place", and "body" to the individual files batch-wise, and then compile them into a CSV. Is there a way to do that in Python? What might the code look like? The aim is to eventually collate something like this:

doc_id;"speech_type";"author";"date";"text"
1;"speech";"speaker";"yyyy-mm-dd";"speech text"
user3081750
  • Is your question whether this is possible? If so, then yes, it can be done. – shuberman Aug 19 '19 at 14:01
  • Thanks, added question to clarify - looking for the code to see how that might be done. – user3081750 Aug 20 '19 at 03:13
  • `I would like to add headers to the text files, such as "year", "id", "place", "body" to the text files, and then compiling them into a csv.` First of all, .txt files have no structure, so I suggest keeping your data in Excel or .csv form. Doing that will be much easier for you and is the suggested approach. – shuberman Aug 20 '19 at 03:18
  • Thanks, that is the question actually: is there a quick way to add, for example, headers to .txt files? – user3081750 Aug 20 '19 at 03:24
  • So before answering, I have another question: in these 100+ text files, do these pieces of information appear at a specific location, or is each file just a paragraph from which you want the program to extract them by identifying which text belongs to `doc_id;"speech_type";"president";"date";"text"`? – shuberman Aug 20 '19 at 03:33
  • No, the documents have no headers, and I will add them batchwise. The documents are just paragraph text for now. – user3081750 Aug 20 '19 at 09:06
  • Then the above sounds like something that would need machine learning, because the script needs to know that if it sees `George Washington`, it needs to categorize it as a `president`. – shuberman Aug 20 '19 at 09:29
  • Thanks @mishx for your questions. I was going to add these headers batch-wise. So no need for machine-learning. Thanks! – user3081750 Aug 21 '19 at 07:58
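
For readers landing here, the batch-wise approach described in the question and comments could be sketched roughly as follows. This is only a minimal illustration, not an accepted answer: it assumes each batch of files lives in its own folder, that every file in a batch shares the same manually supplied metadata, and that `doc_id` is simply a running counter. The function name `compile_batches`, the `FIELDS` list, and the folder layout are all hypothetical.

```python
import csv
from pathlib import Path

# Column order taken from the target layout in the question.
FIELDS = ["doc_id", "speech_type", "author", "date", "text"]


def compile_batches(batches, out_path):
    """Write one CSV row per .txt file.

    `batches` is a list of (folder, metadata) pairs, where metadata is a
    dict with the keys "speech_type", "author" and "date" that applies to
    every file in that folder (an assumption made for this sketch).
    Returns the number of rows written.
    """
    rows = []
    doc_id = 1
    for folder, meta in batches:
        # Sort so that doc_id assignment is reproducible between runs.
        for path in sorted(Path(folder).glob("*.txt")):
            text = path.read_text(encoding="utf-8")
            # Collapse line breaks and repeated spaces so each document
            # stays on a single CSV line.
            text = " ".join(text.split())
            rows.append({"doc_id": doc_id, **meta, "text": text})
            doc_id += 1

    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(
            f,
            fieldnames=FIELDS,
            delimiter=";",
            # Quote every string field; integers such as doc_id stay bare.
            quoting=csv.QUOTE_NONNUMERIC,
        )
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

A call might look like `compile_batches([("speeches_2019", {"speech_type": "speech", "author": "speaker", "date": "2019-08-19"})], "speeches.csv")`, with one pair per batch. Note that with `csv.QUOTE_NONNUMERIC` the header row is quoted as well (`"doc_id"` rather than `doc_id`), which differs slightly from the sample layout in the question.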

0 Answers