How can I count paragraphs of a text file using Python?

Question

I'm trying to write a book cipher decoder, and the following is what I have so far.

code = open("code.txt", "r").read()
book = open("book.txt", "r").read()
code_lines = code.split('\n')
code_line = 0
while code_line < 6:
    # each line of the code file holds "paragraph line word", separated by spaces
    parts = code_lines[code_line].split(' ')
    paragraph_num = parts[0]
    line_num = parts[1]
    word_num = parts[2]
    code_line = code_line + 1

The loop updates the paragraph, line, and word variables, and everything is working just fine.

What I need now is a way to select the paragraph, then the line, then the word. A for loop inside the while loop would work perfectly.

So, from paragraph number "paragraph_num" and line number "line_num", I want to get word number "word_num".

This is the format of my code file, which I'm trying to convert into words:

"paragraph number","line number","word number"

I want my output to look something like this:

word 
word  
word 
word 
word 
word

My book (the file that I need to get the words from) looks something like this:

word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word

word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word

word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word

yes :-| and i read something about (\n\n) but i'm failing with describing \ implementing that ""pythonatically"" — user7451333, Mar 22 '17 at 12:45
What do you say as 'line' here ? Like sentences or what?? I mean are you seperating lines by.. uhm.. lemme show an example. Ex. "Hey Xyz. I am going to ABC. Are you coming with me" line 1 - Hey Xyz line 2 - I am going to ABC line 3 - Are you coming with me — Naveen Honest Raj K, Mar 22 '17 at 13:18

Eric Duminil · Accepted Answer · 2017-03-22T12:59:44.363

Theory

If you want to get paragraphs out of your text, you could split by "\n\n" :

>>> "word\n\nword\nword\n\nword".split("\n\n")
['word', 'word\nword', 'word']

You now have a list of paragraphs. For each paragraph, you can split by "\n" and get a list of lines.

For each line, you can split without argument and get a list of words.
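
As a small sketch of the three splitting steps above, the snippet below counts the paragraphs and then drills down to one line and its words. The sample string and the variable names are made up for illustration, and it assumes paragraphs are separated by exactly one blank line.

```python
# Hypothetical sample text: two paragraphs separated by one blank line.
sample = "one two three\nfour five\n\nsix seven\neight nine ten"

paragraphs = sample.split("\n\n")      # list of paragraphs
print(len(paragraphs))                 # number of paragraphs

lines = paragraphs[1].split("\n")      # lines of the second paragraph
print(lines)

words = lines[1].split()               # words of that paragraph's second line
print(words)
```

If the file may contain several blank lines in a row, the split would produce empty paragraphs, so the count is only as reliable as the separator assumption.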

Nested loops

text = """word word word word word word word word word
word word word word word word word
word word word word word word word word word word word word word word word word word word word word word
word word word word word word word word word word word word word word word word word word

word word word word boat word word word word word
word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word

word word word word word word word word word word word
word word word word word word word word word word word word word word
word word word word word word word word word word word word word word word word word word word word word word"""

for i, paragraph in enumerate(text.split("\n\n")):
    for j, line in enumerate(paragraph.split("\n")):
        for k, word in enumerate(line.split()):
            print("%d, %d, %d : %s" % (i,j,k,word))

The output begins with :

0, 0, 0 : word
0, 0, 1 : word
0, 0, 2 : word
0, 0, 3 : word
0, 0, 4 : word
0, 0, 5 : word
0, 0, 6 : word
0, 0, 7 : word
0, 0, 8 : word
0, 1, 0 : word
0, 1, 1 : word
0, 1, 2 : word
0, 1, 3 : word
0, 1, 4 : word
0, 1, 5 : word
0, 1, 6 : word
0, 2, 0 : word
0, 2, 1 : word
0, 2, 2 : word
0, 2, 3 : word
0, 2, 4 : word
0, 2, 5 : word
0, 2, 6 : word
0, 2, 7 : word
0, 2, 8 : word
0, 2, 9 : word
0, 2, 10 : word
0, 2, 11 : word
0, 2, 12 : word
0, 2, 13 : word
0, 2, 14 : word
0, 2, 15 : word
0, 2, 16 : word
0, 2, 17 : word
0, 2, 18 : word
0, 2, 19 : word
0, 2, 20 : word
0, 3, 0 : word
0, 3, 1 : word
0, 3, 2 : word
0, 3, 3 : word
0, 3, 4 : word
0, 3, 5 : word
0, 3, 6 : word
0, 3, 7 : word
0, 3, 8 : word
0, 3, 9 : word
0, 3, 10 : word
0, 3, 11 : word
0, 3, 12 : word
0, 3, 13 : word
0, 3, 14 : word
0, 3, 15 : word
0, 3, 16 : word
0, 3, 17 : word
1, 0, 0 : word
1, 0, 1 : word
1, 0, 2 : word
1, 0, 3 : word
1, 0, 4 : boat
1, 0, 5 : word
1, 0, 6 : word

The loops are useful to see what the required indices are.

Nested list comprehensions

If you want fast lookup, you can use a nested list comprehension to create a "3D-list" :

table = [[line.split() for line in paragraph.split("\n")] for paragraph in text.split("\n\n")]

The resulting table is :

[[['word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word'], ['word', 'word', 'word', 'word', 'word', 'word', 'word'], ['word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word'], ['word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word']], [['word', 'word', 'word', 'word', 'boat', 'word', 'word', 'word', 'word', 'word'], ['word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word']], [['word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word'], ['word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word'], ['word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word', 'word']]]

You can get to the desired word this way :

table[1][0][4]
# "boat"
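
These indices are zero-based, and an index past the end raises IndexError. A minimal sketch of a guarded lookup follows. The names lookup, base and small_table are hypothetical and used only for this example, and whether your cipher counts from 0 or from 1 is an assumption you would need to check against your code file.

```python
def lookup(table, paragraph, line, word, base=0):
    """Return the word at the given position, or None if it does not exist.

    `base` is the number the cipher starts counting from (assumed 0 or 1).
    """
    indices = (paragraph - base, line - base, word - base)
    if any(index < 0 for index in indices):
        return None  # avoid Python's negative indexing wrapping around
    i, j, k = indices
    try:
        return table[i][j][k]
    except IndexError:
        return None


# Hypothetical miniature table with the same paragraph/line/word nesting.
small_table = [[["a", "b"], ["c"]], [["d", "e", "boat"]]]

print(lookup(small_table, 1, 0, 2))          # zero-based lookup
print(lookup(small_table, 2, 1, 3, base=1))  # same word, one-based
print(lookup(small_table, 5, 0, 0))          # out of range
```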

If you have a list of tuples, you can look up each word in a loop :

codes = [
        (1, 0, 4),
        (2, 1, 3)
        ]

for i,j,k in codes:
    print(table[i][j][k])
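
If the codes come from a text file with one "paragraph line word" triple per line, as the question's code.txt appears to be, they could be parsed into such tuples first. This is only a sketch: parse_codes is a hypothetical helper, and the sample string stands in for the contents of the code file.

```python
def parse_codes(code_text):
    """Turn lines like "1 0 4" into tuples like (1, 0, 4)."""
    codes = []
    for line in code_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and lines without exactly three fields
        codes.append(tuple(int(part) for part in parts))
    return codes


# Stand-in for open("code.txt").read(); the values are made up.
code_text = "1 0 4\n2 1 3\n\n"

codes = parse_codes(code_text)
print(codes)
```

The resulting list has the same shape as the hand-written codes list, so it can be fed to the same lookup loop.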

my text is already in paragraphs, but what i'm asking about is how to print/select a specific one so i can then select the line and print the word that i'm looking for. — user7451333, Mar 22 '17 at 12:41
this var 'paragraph_num' is the paragraph number that i want to select , so my issue is how to define a paragraph then a line then a specific word .. note : my text is like 90 paragraphs long — user7451333, Mar 22 '17 at 12:43
how about making the output having the words that i'm looking for only ? '70 1 3' '50 2 2' '21 2 9' '28 1 6' '71 2 2' '27 1 4' — user7451333, Mar 22 '17 at 12:53
@user7451333: Answer updated, even though you had enough information to try it for yourself. — Eric Duminil, Mar 22 '17 at 13:00

score 0 · Answer 2 · answered Oct 28 '20 at 13:14

In case someone would like a slightly different approach:

I strongly believe this question is connected with the "book cipher" discussed in arnold/book cipher with python*

I am posting my code from that link here. If my understanding is wrong, please tell me.

# Replace "document1.txt" with whatever your book / document's name is.

BOOK = "document1.txt"  # This file contains your "Word Word Word Word ...." text (presumably real, distinct words)

# Read the book and return it as a flat list of words
def GetBookContent(BOOK):
    with open(BOOK, "r") as ReadBook:
        txtContent = ReadBook.read()
    return txtContent.split()


boktxt = GetBookContent(BOOK)

# Encode: print the index of the first occurrence of each entered word
words = input("input text: ").split()
print("\nyou entered these words:\n", words)

for word in words:
    if word in boktxt:
        print(boktxt.index(word))
    else:
        print(word, "is not in the book")

# Decode: print the word found at each entered index
klist = input("input key-sequence sep. With spaces: ").split()
for key in klist:
    print(boktxt[int(key)])
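
For reference, the same flat word-index idea can be sketched without input() so that it is easier to test. The function names and the sample word list below are hypothetical; in practice the list would come from GetBookContent(BOOK).

```python
def encode(words, book_words):
    """Return the index of the first occurrence of each word in the book."""
    # list.index raises ValueError if a word is not in the book
    return [book_words.index(word) for word in words]


def decode(keys, book_words):
    """Return the book word found at each index."""
    return [book_words[int(key)] for key in keys]


# Made-up stand-in for the word list returned by GetBookContent(BOOK).
book_words = "the quick brown fox jumps over the lazy dog".split()

keys = encode(["fox", "dog"], book_words)
print(keys)
print(decode(keys, book_words))
```

Note that this scheme numbers words across the whole book, not by paragraph and line as in the question.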
