
I have a text file like the one below:

Education: 

askdjbnakjfbuisbrkjsbvxcnbvfiuregifuksbkvjb.iasgiufdsegiyvskjdfbsldfgd

Technical skills : 
 java,j2ee etc.,

work done: 

oaugafiuadgkfjwgeuyrfvskjdfviysdvfhsdf,aviysdvwuyevfahjvshgcsvdfs,bvisdhvfhjsvjdfvshjdvhfjvxjhfvhjsdbvfkjsbdkfg

I would like to extract only the heading names, such as Education, Technical skills, etc.

My code is:

with open("aks.txt") as infile, open("fffm",'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Technical Skills":
            copy =True
        elif line.strip() == "Workdone":
            copy = True


        elif line.strip() ==  "Education":
            copy = False
        elif copy:
            outfile.write(line)
        fh = open("fffm.txt", 'r')
        contents = fh.read()
        len(contents)
SAKETH
    Your lines end with `:` characters. `.strip()` does not remove `:`, so those comparisons may never match. – R Nar Nov 30 '15 at 17:45
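
To illustrate the point in the comment above, here is a minimal sketch of a line-by-line check that removes the trailing colon before using the heading text. The helper name `heading_name` and the sample lines are assumptions made for illustration, and the sketch assumes a heading is any line that ends with `:` once whitespace is stripped.

```python
def heading_name(line):
    """Return the heading text if the line ends with ':', else None."""
    stripped = line.strip()           # removes whitespace only, ':' stays
    if not stripped.endswith(":"):
        return None
    return stripped[:-1].strip()      # drop the colon, then any space before it


# Sample lines modelled on the file in the question (illustrative only).
lines = ["Education: \n", "Technical skills : \n", " java,j2ee etc.,\n", "work done: \n"]
for line in lines:
    print(repr(line.strip()), "->", heading_name(line))
```

Because `"Education: \n".strip()` yields `"Education:"`, an exact comparison with `"Education"` fails unless the colon is removed first.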

2 Answers


To get just the headings from your text file, you could use the following:

import re

with open('aks.txt') as f_input:
    headings = re.findall(r'(.*?)\s*:', f_input.read())
    print(headings)

This would display the following:

['Education', 'Technical skills', 'work done']
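
One caveat with a pattern like this is that it matches the text before a colon anywhere in the file, so a body line that happens to contain a colon would also be reported. Below is a minimal sketch of a stricter variant, assuming a heading is a line whose last non-blank character is `:`. The names `HEADING_RE` and `extract_headings` and the sample text are made up for illustration.

```python
import re

# Assumption: a heading occupies a whole line and ends with ':'
# (spaces or tabs may surround the colon).
HEADING_RE = re.compile(r'^[ \t]*(.+?)[ \t]*:[ \t]*$', re.MULTILINE)


def extract_headings(text):
    """Return the heading names found in text, without the trailing colon."""
    return HEADING_RE.findall(text)


# Illustrative sample; the second line has a colon inside the body text.
sample = (
    "Education: \n\n"
    "some text, note: this colon is inside the body\n\n"
    "Technical skills : \n java,j2ee etc.,\n\n"
    "work done: \n\nmore text\n"
)
print(extract_headings(sample))
# ['Education', 'Technical skills', 'work done']
```

Anchoring with `^` and `$` under `re.MULTILINE` restricts each match to a single line, which is what keeps the mid-line colon from being treated as a heading.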
Martin Evans

If you are sure that each title name occurs before a colon (:), you can write a regex to search for that pattern.

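    import re

    with open("aks.txt") as infile:
        # ^ with re.MULTILINE anchors each match at the start of a line,
        # so a heading on the very first line of the file is found as well.
        for s in re.finditer(r'^.*?(?=:)', infile.read(), re.MULTILINE):
            print(s.group())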

The output will look like this:

   Education
   Technical skills 
   work done
Rohan Amrute