I have a large file (3*10^7 rows) of call detail records (CDRs) with 10 columns ("|" as delimiter). Each row is a communication instance with the following attributes:
Date|Time|Duration|Caller|Receiver|serviceType|junk|cellReceiver|cellCaller|CallerLAC
I need to split this file into smaller files, one per user. Each file should contain every communication involving that user, regardless of whether the user is the caller or the receiver (i.e., if A called B, the row should appear in two files: the file for user A and the file for user B).
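To make the requirement concrete, here is a minimal Python sketch of the intended output. It is only an illustration, not necessarily the efficient approach I am looking for. It assumes the file has no header row, that Caller and Receiver are the 4th and 5th fields as in the list above, and that user IDs are safe to use as file names. The names `split_by_user`, `flush`, `batch_size`, and the `.txt` suffix are hypothetical.

```python
import os
from collections import defaultdict

CALLER_COL = 3    # 0-based index of "Caller" in the field list above
RECEIVER_COL = 4  # 0-based index of "Receiver"


def flush(buffers, out_dir):
    """Append the buffered rows to one file per user, then empty the buffers."""
    for user, rows in buffers.items():
        # Assumption: the user ID is a valid file name (e.g. a number or hash).
        with open(os.path.join(out_dir, user + ".txt"), "a") as out:
            out.writelines(rows)
    buffers.clear()


def split_by_user(input_path, out_dir, batch_size=1_000_000):
    """Copy each row into the caller's file and the receiver's file."""
    os.makedirs(out_dir, exist_ok=True)
    buffers = defaultdict(list)  # user ID -> rows waiting to be written
    buffered = 0
    with open(input_path) as src:
        for line in src:
            fields = line.rstrip("\n").split("|")
            if len(fields) <= RECEIVER_COL:
                continue  # skip blank or malformed rows
            if not line.endswith("\n"):
                line += "\n"  # last row may lack a trailing newline
            # A set avoids writing the row twice when caller == receiver.
            for user in {fields[CALLER_COL], fields[RECEIVER_COL]}:
                buffers[user].append(line)
            buffered += 1
            if buffered >= batch_size:
                # Writing in batches keeps memory bounded and avoids
                # holding one open file handle per user.
                flush(buffers, out_dir)
                buffered = 0
    flush(buffers, out_dir)
```

Calling `split_by_user("cdr.txt", "by_user")` (both paths hypothetical) would produce files such as `by_user/A.txt` and `by_user/B.txt`, with rows in each file kept in their original order.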
What would be the best way to do this efficiently? (I am using OS X Yosemite).