0

i'm working on CDR (Call Details Record) data for Telecom company in a report with a billions of rows, I ask for a SQL query that let me analyzing a subscriber’s frequency of calls made to specific MSISDNs, to guide which subscribers should be offered family deals or identify how many peers in their network have competitor services.

my table have a huge amount of details, i got the useful ones that help me in getting this insight, the table contains:

  • PRI_IDENTITY (The subscriber calling number)
  • RECIPIENT_NUMBER (The called number)
  • Timestamp (The call date/time)
  • Call_Duration (in seconds)
  • CALLED_HOME_NETWORK_CODE (The Recipient number network code)
SELECT
  PRI_IDENTITY, RECIPIENT_NUMBER, 
  TO_timestamp(CUST_LOCAL_START_DATE,'yyyyMMddHH24miss')::Date AS Call_Date,
  ACTUAL_USAGE AS Call_Duration,
  CALLED_HOME_NETWORK_CODE
FROM TBL.CDR

the sample data

enter image description here

The output that I need to figure out is contain below:

  • Year
  • Month
  • PRI_IDENTITY
  • CALLED_HOME_NETWORK_CODE
  • Frequency 5 (The count of RECIPIENT_NUMBERS whom the PRI_IDENTITY called 5 times on the day)
  • Frequency 10 (The count of RECIPIENT_NUMBERS whom the PRI_IDENTITY called 10 times on the day)
  • Frequency 15 (The count of RECIPIENT_NUMBERS whom the PRI_IDENTITY called 15 times on the day)
  • Frequency 20 (The count of RECIPIENT_NUMBERS whom the PRI_IDENTITY called 20 times on the day)
    • Frequency More than 20 (The count of RECIPIENT_NUMBERS whom the PRI_IDENTITY called more than 20 times on the day)

HINT: the PRI_IDENTITY and RECIPIENT_NUMBERS on the sample data on the image is full in the real data and due to privacy I couldn't share the whole info online

1 Answers1

1

I assume you mean per month for the call frequencies, not per day. Regardless, you want two levels of aggregation:

select yyyymm, PRI_IDENTITY, CALLED_HOME_NETWORK_CODE,
       sum(case when cnt >= 5 then 1 else 0 end) as cnt_5,
       sum(case when cnt >= 10 then 1 else 0 end) as cnt_10,
       . . .
from (select date_trunc('month', call_date) as yyyymm,
             PRI_IDENTITY ,
             CALLED_HOME_NETWORK_CODE, 
             RECIPIENT_NUMBER, count(*) as cnt
      from t
      group by date_trunc('month', call_date) as yyyymm,
               PRI_IDENTITY, CALLED_HOME_NETWORK_CODE,  RECIPIENT_NUMBER
     ) x
group by yyyymm, PRI_IDENTITY, CALLED_HOME_NETWORK_CODE;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786