5

I have two files: one contains addresses (line numbers) and the other contains data, like this:

address file:

2
4
6
7
1
3
5

data file:

1.000451451
2.000589214
3.117892278
4.479511994
5.484514874
6.784499874
7.021239396

I want to reorder the lines of the data file according to the line numbers in the address file, so that I get:

2.000589214
4.479511994
6.784499874
7.021239396
1.000451451
3.117892278
5.484514874

I want to do this in either Python or Bash, but I haven't found a solution yet.

codeforester
hassan

3 Answers

3

If you don't mind sed, we can use process substitution to achieve this easily:

sed -nf <(sed 's/$/p/' addr.txt) data.txt
  • -n suppresses the default printing
  • -f makes sed read commands from the process substitution <(...)
  • <(sed 's/$/p/' addr.txt) appends p to every line number in addr.txt, producing sed print commands such as 2p and 4p

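Note that sed reads data.txt once, top to bottom, and tries every generated command against each line, so the selected lines come out in their original file order rather than in the order listed in addr.txt. For the sample files this gives:

1.000451451
2.000589214
3.117892278
4.479511994
5.484514874
6.784499874
7.021239396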
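
To make the ordering behaviour concrete, here is a minimal Python sketch. Both function names are hypothetical, and sed_like_select is only a rough model of how sed applies a script to each input line as it streams, not sed's actual implementation. It suggests that line-number print commands select lines without reordering them, whereas indexing into a list follows the address order.

```python
def sed_like_select(addresses, data_lines):
    """Rough model of `sed -n` with one `Np` command per address (hypothetical helper)."""
    out = []
    # The data is streamed once, top to bottom.
    for lineno, line in enumerate(data_lines, start=1):
        # Every generated command is tried against the current line, in script order.
        for addr in addresses:
            if addr == lineno:
                out.append(line)
    return out


def reorder(addresses, data_lines):
    """Pick lines in the order the addresses are listed (1-based addresses)."""
    return [data_lines[addr - 1] for addr in addresses]


addresses = [2, 4, 6, 7, 1, 3, 5]
data = [
    "1.000451451", "2.000589214", "3.117892278", "4.479511994",
    "5.484514874", "6.784499874", "7.021239396",
]

print("\n".join(sed_like_select(addresses, data)))  # original file order
print("\n".join(reorder(addresses, data)))          # address order
```

If the address order matters, the indexing approach (or the awk answer) is the safer choice.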
codeforester
2

With awk:

awk 'NR==FNR {a[NR]=$0; next} {print a[$0]}' data.txt addr.txt
  • NR==FNR {a[NR]=$0; next} builds an associative array a whose keys are record (line) numbers and whose values are the whole records. The condition NR==FNR holds only while awk is reading the first file, data.txt. next makes awk skip to the next record without processing the current one any further

  • {print a[$0]} runs for the second file (addr.txt) and prints the array value whose key is the content of the current line ($0), i.e. the line number to look up
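
For readers more comfortable with Python, the same two-pass idea can be sketched with a dictionary. This is only an illustration: lookup_by_address and its parameters are hypothetical names, and the sketch assumes every address line holds an integer.

```python
def lookup_by_address(data_lines, address_lines):
    """Mirror the awk two-file idiom with a dict (hypothetical helper)."""
    # First pass (NR==FNR): store each record under its 1-based line number.
    table = {}
    for nr, record in enumerate(data_lines, start=1):
        table[nr] = record

    # Second pass: use the content of each address line as the lookup key.
    result = []
    for addr in address_lines:
        # awk prints an empty line for a missing key; "" imitates that here.
        result.append(table.get(int(addr), ""))
    return result


data = ["1.000451451", "2.000589214", "3.117892278"]
addr = ["2", "3", "1"]
print("\n".join(lookup_by_address(data, addr)))
```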

Example:

% cat addr.txt 
2
4
6
7
1
3
5

% cat data.txt 
1.000451451
2.000589214
3.117892278
4.479511994
5.484514874
6.784499874
7.021239396

% awk 'NR==FNR {a[NR]=$0; next} {print a[$0]}' data.txt addr.txt
2.000589214
4.479511994
6.784499874
7.021239396
1.000451451
3.117892278
5.484514874
codeforester
heemayl
0

You can also do it in Python, for example:

with open("address_file", 'r') as f1, open("data_file", "r") as f2:
    data1 = f1.read().splitlines()
    data2 = f2.read().splitlines()

for k in data1:
    # Skip addresses that are not integers or that point past the end of the data
    try:
        print(data2[int(k)-1])
    except (ValueError, IndexError):
        pass

Edit: As suggested by @heemayl, here is another solution using only one list:

with open("address_file", 'r') as f1, open("data_file", 'r') as f2:
    data = f2.read().splitlines()

    # Iterate over the address file directly; int() ignores the trailing newline
    for k in f1:
        print(data[int(k)-1])

Both will output:

2.000589214
4.479511994
6.784499874
7.021239396
1.000451451
3.117892278
5.484514874
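
One caveat with data[int(k)-1]: an address of 0 becomes index -1, which Python treats as the last line instead of raising an error. A minimal sketch of a stricter variant follows, assuming 1-based addresses and that blank address lines should be skipped; reorder_lines is a hypothetical name.

```python
def reorder_lines(address_lines, data_lines):
    """Return data lines in address order, rejecting out-of-range addresses."""
    data = [line.rstrip("\n") for line in data_lines]
    out = []
    for raw in address_lines:
        raw = raw.strip()
        if not raw:
            continue  # assumption: blank address lines are ignored
        n = int(raw)  # raises ValueError for non-numeric addresses
        if not 1 <= n <= len(data):
            # Guard explicitly, because 0 or a negative number would
            # otherwise wrap around to the end of the list.
            raise IndexError(f"address {n} is outside 1..{len(data)}")
        out.append(data[n - 1])
    return out


print("\n".join(reorder_lines(["2\n", "1\n"], ["first\n", "second\n"])))
```

Because the function accepts any iterable of strings, open file objects can be passed in directly.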
Chiheb Nexus
  • 2
    You don't need to have two lists. Just create the list for datafile and iterate over the lines for the file with just line numbers. – heemayl Jun 09 '17 at 04:38
  • Yes, I know. But I think the two-list version makes it easier for the OP to follow what is happening in the code. I don't think he knew enough Python to answer his own question. Still, your comment is correct. – Chiheb Nexus Jun 09 '17 at 04:39