0

I have been searching for an answer to this but cannot seem to find what I need. I would like a Python script that reads my text file from the top, works through it line by line, and writes all the matches to another text file. The file contains only 4-digit numbers such as 1234, one per line. Example: 1234 3214 4567 8963 1532 1234 ...and so on. I would like the output to look something like: 1234 : matches found = 2. I know there are matches in the file because it has almost 10000 lines. I would appreciate any help, even if someone could just point me in the right direction. Thank you.

Kali Berry
  • Please share your code. – Pirate X Aug 07 '16 at 18:49
  • Why do you think you need recursion here? – kylieCatt Aug 07 '16 at 21:19
  • I figured it would achieve the result I'm looking for. I want to take the .txt file, which has only 4-digit numbers (one per line), and compare each number against all other lines in the file to find matches, then output each 4-digit number along with how many times it occurs in the file. I am narrowing down the numbers so I can see how many duplicates a given combination has. – Kali Berry Aug 08 '16 at 01:10

3 Answers

0
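import re

# Read the whole file into one string; the with block closes the file afterwards.
with open("filename", 'r') as file:
    fileContent = file.read()

pattern = "1234"
# Count how many times the pattern occurs in the file.
print(len(re.findall(pattern, fileContent)))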
csabinho
  • Thank you for this. I used it to search the file for matches based on user input. This will work for my purposes. – Kali Berry Aug 08 '16 at 00:20
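For the user-input search mentioned in the comment above, here is a minimal sketch, assuming the file holds one number per line. The function name `count_exact` and the sample list are hypothetical, not from the thread. It compares whole lines instead of using `re.findall`, so a query is counted only where a line equals it exactly.

```python
def count_exact(lines, query):
    """Count the lines that equal query exactly, ignoring surrounding whitespace."""
    query = query.strip()
    return sum(1 for line in lines if line.strip() == query)


# Hypothetical sample data standing in for the lines of the text file.
# In a real script the query could come from input() and the lines from open(...).
sample = ["1234", "3214", "4567", "8963", "1532", "1234"]
print(count_exact(sample, "1234"))  # prints 2
```

Comparing whole lines avoids counting a query that merely appears inside a longer line, which a plain substring or regex search would do.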
0

If I were you, I would open the file, use the split method to create a list of all the numbers, and then use the Counter class from collections to count how many times each number appears, keeping only the duplicates.

from collections import Counter

filepath = 'original_file'
new_filepath = 'new_file'

# split() with no arguments splits on any whitespace and drops empty strings,
# so blank lines and the trailing newline are ignored.
with open(filepath, 'r') as file:
    numbers_list = file.read().split()

# Keep only the numbers that occur more than once.
dupes = ['{} : matches found = {}'.format(item, count)
         for item, count in Counter(numbers_list).items() if count > 1]

# Write one result per line.
with open(new_filepath, 'w') as new_file:
    for line in dupes:
        new_file.write(line + '\n')
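As a quick check of the Counter approach, here is a small sketch run on the sample numbers from the question instead of a file. The helper name `duplicate_report` is hypothetical and only there for illustration.

```python
from collections import Counter


def duplicate_report(numbers):
    """Return one 'N : matches found = K' line per number seen more than once."""
    counts = Counter(numbers)
    return ["{} : matches found = {}".format(n, c)
            for n, c in counts.items() if c > 1]


# Sample numbers taken from the question.
sample = "1234 3214 4567 8963 1532 1234".split()
print(duplicate_report(sample))  # ['1234 : matches found = 2']
```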

Matt Ellis
0

Thanks to everyone who helped me with this. Thank you to @csabinho for the code he provided, and to @IanAuld for asking "Why do you think you need recursion here?" That got me thinking that the solution was a simple one. I just wanted to know which 4-digit numbers had duplicates and how many, and which 4-digit combinations were unique. This is what I came up with, and it worked beautifully!

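import re

# Read the file once instead of reopening it for every candidate number.
with open("4digits.txt", 'r') as file:
    fileContent = file.read()

# Check every 4-digit number from 1000 to 9999.
for a in range(1000, 10000):
    pattern = str(a)
    result = len(re.findall(pattern, fileContent))
    if result > 1:
        print(a, "matches", result)
    elif result == 1:
        # Exactly one occurrence means the number has no duplicate.
        print(a, "This number is unique!")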
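The loop above runs one regex search over the whole file for each of the 9000 candidate numbers. A possible alternative, sketched here under the assumption that the numbers are already in a list of strings, is a single pass with `collections.Counter` that separates duplicates from numbers seen once. The name `split_counts` is hypothetical.

```python
from collections import Counter


def split_counts(numbers):
    """Return (duplicates, uniques): counts above 1 as a dict, numbers seen once as a sorted list."""
    counts = Counter(numbers)
    duplicates = {n: c for n, c in counts.items() if c > 1}
    uniques = sorted(n for n, c in counts.items() if c == 1)
    return duplicates, uniques


# Sample numbers taken from the question.
sample = ["1234", "3214", "4567", "8963", "1532", "1234"]
duplicates, uniques = split_counts(sample)
print(duplicates)  # {'1234': 2}
print(uniques)     # ['1532', '3214', '4567', '8963']
```

Numbers that never appear in the file are simply absent from both results, so they are not mistaken for unique ones.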
Kali Berry