I have this data in a text file. (The actual file doesn't have the blank lines; I added them here for clarity.)

I am using Python3:

orders = open('orders.txt', 'r')
lines = orders.readlines()

I need to loop through the lines variable, which contains all the lines of the data, and separate the CO groups as I've spaced them. CO lines are customers, and the lines below each CO line are the orders that customer placed.

Each CO line tells us how many order lines follow it: the count is stored at indices 7-9 of the CO string (that is, line[7:10]). I illustrate this below.

CO77812002D10212020       <---(002)
125^LO917^11212020        <---line 1
235^IL993^11252020        <---line 2

CO77812002S10212020
125^LO917^11212020
235^IL993^11252020

CO95307005D06092019    <---(005)
194^AF977^06292019    <---line 1 
72^L223^07142019       <---line 2
370^IL993^08022019    <---line 3
258^Y337^07072019     <---line 4
253^O261^06182019     <---line 5

CO30950003D06012019
139^LM485^06272019
113^N669^06192019
249^P530^07112019
CO37501001D05252020
479^IL993^06162020
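
To make the header layout concrete, here is a minimal sketch of reading the order count out of a single CO line. The helper name order_count is hypothetical and only used for illustration; it assumes the count always sits at indices 7-9, as in the sample above.

```python
def order_count(header):
    """Return the number of order lines announced by a CO header line."""
    header = header.strip()
    if not header.startswith("CO"):
        raise ValueError(f"not a customer line: {header!r}")
    # Indices 7-9 hold the zero-padded count, e.g. '002' or '005'
    return int(header[7:10])


print(order_count("CO77812002D10212020"))  # → 2
print(order_count("CO95307005D06092019"))  # → 5
```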

I have thought of a brute force way of doing this but it won't work against much larger datasets.

Any help would be greatly appreciated!

Faisal Malik
  • What does your current code look like? And why doesn’t it work for a larger dataset? – Heike Nov 29 '20 at 21:23
  • Because I hard-coded an empty list, created a counter for every time I came across a customer account line, and populated the empty list with order data. I'm pretty sure that is not an effective way to get this done. – Faisal Malik Nov 29 '20 at 22:57

2 Answers

You can use the fileinput module to "simultaneously" read and modify your file. The in-place functionality, which lets you modify a file while parsing it, is actually implemented through a second backup file. Specifically, as stated in the documentation:

Optional in-place filtering: if the keyword argument inplace=True is passed to fileinput.input() or to the FileInput constructor, the file is moved to a backup file and standard output is directed to the input file (...) by default, the extension is '.bak' and it is deleted when the output file is closed.

Therefore, you can format your file as specified this way:

import fileinput

with fileinput.input(files = ['orders.txt'], inplace=True) as orders_file:
    for line in orders_file:
        if line[:2] == 'CO':    # Detect customer line
            orders_counter = 0
            num_of_orders = int(line[7:10])    # Extract number of orders
        else:
            orders_counter += 1
            # If last order for specific customer has been reached
            # append a '\n' character to format it as desired
            if orders_counter == num_of_orders:
                line += '\n'
        # Since standard output is redirected to the file, print writes in the file
        print(line, end='')

Note: this assumes that the file with the orders is formatted exactly the way you specified:

CO...
(order_1)
(order_2)
...
(order_i)
CO...
(order_1)
...
lezaf
  • It's ordered like I stated, except without the spacing in between. I will try this solution, thanks very much! – Faisal Malik Nov 30 '20 at 18:38
  • I don't intend to change the orders.txt file by writing to it, rather I want to create a list of lists. The lists inside the list would be each customer and their order. So I can then process each one and print out the amount they owe. – Faisal Malik Nov 30 '20 at 19:03
  • I got the solution below, but only because of your help! So I accept this answer. – Faisal Malik Nov 30 '20 at 19:39
  • Great! I'm glad that I helped! – lezaf Nov 30 '20 at 19:50

This did what I was hoping to get done!

tot_customers = []

with open("orders.txt", "r") as a_file:
    customer = []
    for line in a_file:
        stripped_line = line.strip()

        if stripped_line[:2] == "CO":
            # Customer line: start a new group and read the order count
            customer.append(stripped_line)
            print("customers: ", customer)
            orders_counter = 0
            num_of_orders = int(stripped_line[7:10])
        else:
            # Order line: add it to the current customer's group
            customer.append(stripped_line)
            orders_counter += 1

            # Last order for this customer: store the group and reset
            if orders_counter == num_of_orders:
                tot_customers.append(customer)
                customer = []
                orders_counter = 0
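
The same list-of-lists grouping can also be sketched without a manual counter, by pulling the announced number of order lines straight off the iterator with itertools.islice. This is only an alternative sketch, not a replacement for the code above: the function name group_orders is hypothetical, and it assumes every CO line is followed by exactly the number of order lines given at indices 7-9.

```python
from itertools import islice


def group_orders(lines):
    """Return one [header, order, ...] list per customer."""
    rows = (line.strip() for line in lines)
    rows = (row for row in rows if row)  # ignore blank lines, if any
    customers = []
    for header in rows:
        if not header.startswith("CO"):
            raise ValueError(f"expected a customer line, got {header!r}")
        count = int(header[7:10])  # number of order lines that follow
        # islice consumes exactly `count` rows from the same iterator,
        # so the for loop resumes at the next CO line.
        customers.append([header, *islice(rows, count)])
    return customers


sample = [
    "CO30950003D06012019\n",
    "139^LM485^06272019\n",
    "113^N669^06192019\n",
    "249^P530^07112019\n",
    "CO37501001D05252020\n",
    "479^IL993^06162020\n",
]
for customer in group_orders(sample):
    print(customer)
```

Because a file object is itself an iterable of lines, the same function could be called as group_orders(a_file) inside a with open(...) block, which avoids loading the whole file with readlines().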
Faisal Malik