
I'll start by describing my problem.

I have n pages, each with its own popularity factor on a scale of 10. I now have the total page hits for each page, and I want to use them to recalculate the popularity factor, again on a scale of 10.

The total page hits value is an absolute number, and I have it for only 170,000 pages. In total I have 4,100,000 pages.

My problem is that I don't know how to normalize the total page hits across all of the pages.

I tried doing this:

Popularity factor for each page = total page hits for all the pages / total number of pages.

I assume that pages with no data have at least 1 page hit. But that makes my denominator a really big number, and I get lost when trying to map the result onto a scale of 10.

Can anyone please help me with how to approach this?

Bergi
dark_shadow
  • What does this have to do with JavaScript, PHP or HTML? – Tibos Jan 30 '14 at 12:11
  • If I understand you correctly you want popularity for a page = 10 * hits for that page / maximum number of hits for a page. – Peter Collingridge Jan 30 '14 at 12:13
  • @PeterCollingridge: But that way documents with only 1 hit will have a value of (10*1)/16000, which is very small. How can I handle that correctly? – dark_shadow Jan 30 '14 at 12:24
  • @Tibos: I thought maybe web developers would have some idea about it, as it is related to page hits. – dark_shadow Jan 30 '14 at 12:25
  • It seems to me that this is about a fairly simple algorithm. Unfortunately I have trouble understanding the specifics of your problem (language barrier), so I can't offer an answer. – Tibos Jan 30 '14 at 12:29
  • @mukul_gupta It depends on what you consider correct. It seems reasonable that a page with 1 hit should have a small value if other pages have 16,000 hits. You could use a log scale if you prefer. – Peter Collingridge Jan 30 '14 at 12:32

1 Answer


There are several ways to do it. Here are some examples:

Absolute popularity

Find the number of hits of the most popular page.

Assign a popularity score based on the number of hits compared to the most popular page:

0-10% = popularity 1, 10-20% = popularity 2 and so on.
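The bucketing above could be sketched roughly as follows in Python. This is only an illustration: the function name `absolute_popularity`, the `hits` dictionary (page name to hit count), and the choice to put a value that falls exactly on a boundary into the higher bucket are all assumptions, not part of the original description.

```python
def absolute_popularity(hits, buckets=10):
    """Score each page 1..buckets by its hits relative to the most popular page.

    `hits` is a hypothetical dict mapping page name -> hit count.
    """
    if not hits:
        return {}
    max_hits = max(hits.values())
    if max_hits <= 0:
        # No page has any hits, so every page gets the lowest score.
        return {page: 1 for page in hits}
    scores = {}
    for page, count in hits.items():
        # Integer arithmetic: 0-10% of max -> 1, 10-20% -> 2, and so on.
        # The most popular page (100%) would land in bucket 11, so clamp it.
        scores[page] = min(buckets, count * buckets // max_hits + 1)
    return scores


example = absolute_popularity({"home": 16000, "about": 8000, "old-post": 1})
print(example)
```

With these sample numbers, a page with a single hit ends up at the bottom of the scale while the page at half the maximum lands in the middle, which is the behaviour discussed in the comments on the question.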

Relative popularity

Sort all pages according to number of page hits.

Assign a popularity score based on the position in the list:

0-10% = popularity 1, 10-20% = popularity 2 and so on.
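A rank-based version might look like the sketch below. Again, the function name `relative_popularity` and the `hits` dictionary are hypothetical, and the handling of ties is an assumption: pages with equal hit counts keep their input order and may therefore fall into adjacent buckets.

```python
def relative_popularity(hits, buckets=10):
    """Score each page 1..buckets by its position in the sorted list of pages.

    `hits` is a hypothetical dict mapping page name -> hit count.
    """
    # Least popular page first, most popular page last.
    ranked = sorted(hits, key=hits.get)
    total = len(ranked)
    scores = {}
    for position, page in enumerate(ranked):
        # The first 10% of the list gets 1, the next 10% gets 2, and so on.
        scores[page] = position * buckets // total + 1
    return scores


sample = {"page%d" % i: i * i for i in range(1, 21)}
print(relative_popularity(sample))
```

Because only the ordering matters here, a handful of extremely popular pages cannot push everything else into the lowest bucket, which may suit heavily skewed hit counts better than the absolute approach.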

Popularity of pages without statistics

I can't give you any advice on how to handle these. If you don't know how many times a page has been accessed it is really hard to give it a popularity score.

Klas Lindbäck