0

I browsed summaries for all 500+ answers related to this question but found no apparent SQL solution to my problem. Maybe there is none.

I wish to display a Top10 Companies by Hits on Company Profile from nnn to nnn time period.

MySQL Table to track hits on company profile

CREATE TABLE `hit_company` (
  `id` mediumint(9) unsigned NOT NULL AUTO_INCREMENT,
  `hitdate` date NOT NULL DEFAULT '0000-00-00',
  `customerid` mediumint(9) unsigned NOT NULL DEFAULT '0',
  `organization` varchar(80) DEFAULT NULL,
  `hitstamp` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `remote_addr` varchar(20) DEFAULT NULL,
  `hittime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
   PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=123456 ;

The MySQL query for a 24 hour report looks like this:

SELECT hitstamp,hitdate,customerid,organization, COUNT(organization) AS 'count' FROM hit_company 
WHERE hitstamp >= '2014-01-12 21:23' AND hitstamp <= '2014-01-13 21:23' 
GROUP BY customerid ORDER BY count DESC LIMIT 10

This yields a Top10. The problem is that I am getting hits significantly inflated by varying remote_addr who refresh their company profile page and thus artificially increase their hit counter. I only want to count 1 vote per IP per company within the report time range.

I have considered but am not sure how to code something similar to:

WHERE COUNT(remote_addr < 2)
or
GROUP BY customerid, remote_addr HAVING 'votes' < 2

... so that the same IP address is not counted more than once per company in the final result.

What is the SQL to do this?

If there is no sql, what is the optimum solution in php/mysql ?

Thank you.

Note While this report spans 24 hours, other reports span from 12 hours to 30 days to all time.

1 Answers1

0
SELECT whatever FROM (
    SELECT whatever FROM hit_company
    WHERE hitstamp >= whatever
    GROUP BY organization, remoteaddr
) GROUP BY customerid ORDER BY count

Hand written not tested. The basic idea is use subquery. It'll be very slow if computing on large data.

msg7086
  • 461
  • 2
  • 11