2

I have a table papers

CREATE TABLE `papers` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `title` varchar(1000) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
  `my_count` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  FULLTEXT KEY `title_fulltext` (`title`),
) ENGINE=MyISAM AUTO_INCREMENT=1617432 DEFAULT CHARSET=utf8 COLLATE=utf8_bin

and another table link_table

CREATE TABLE `auth2paper2loc` (
  `auth_id` int(11) NOT NULL,
  `paper_id` int(11) NOT NULL,
  `loc_id` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

The id papers.id from the upper table is the same one like the link_table.paper_id in the second table. I want to iterate through every row in the upper table and count how many times this its id appears in the second table and store the "count" into the column "my_count" in the upper table.

Example: If The paper with tid = 1 = paper_id appears 5 times in the table link_table, then my_count = 5.

I can do that by a Python script but it results in too many querys and I have millions of entrys so it is really slow. And I can't figure out the right syntax to make this right inside of MySQL.

This is what I am iterating about in a for-loop in Python (too slow):

SELECT count(link_table.auth_id) FROM link_table
WHERE link_table.paper_id = %s

UPDATE papers SET auth_count = %s WHERE id = %s

Could someone please tell me how to create this one? There must be a way to nest this and put it directly in MySQL so it is faster, isn't there?

Juan Mellado
  • 14,973
  • 5
  • 47
  • 54
Aufwind
  • 25,310
  • 38
  • 109
  • 154

3 Answers3

2

How does this perform for you?

update papers a
   set my_count = (select count(*) 
                     from auth2paper2loc b
                    where b.paper_id = a.id);
Ronnis
  • 12,593
  • 2
  • 32
  • 52
1

Use either:

UPDATE PAPERS
   SET my_count = (SELECT COUNT(b.paper_id)
                     FROM AUTH2PAPERLOC b
                    WHERE b.paper_id = PAPERS.id)

...or:

   UPDATE PAPERS
LEFT JOIN (SELECT b.paper_id,
                  COUNT(b.paper_id) AS numCount
             FROM AUTH2PAPERLOC b
         GROUP BY b.paper_id) x ON x.paper_id = PAPERS.id
      SET my_count = COALESCE(x.numCount, 0)

The COALESCE is necessary to convert the NULL to a zero when there aren't any instances of PAPERS.id in the AUTH2PAPERLOC table.

OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
  • `Count()` returns 0 when no instances of `papers.id` are found in AUTH2PAPERLOC. – Ronnis Apr 03 '11 at 20:55
  • 1
    @Ronnis / exactly, it's not required in the 1st query. In the 2nd (which is normally faster), it is, but there was a syntax error – RichardTheKiwi Apr 03 '11 at 20:58
  • 1
    @Richard, yup. When I'm reading his answer once more I realized the comment was meant for the second query. – Ronnis Apr 03 '11 at 21:03
1
update papers left join 
 (select paper_id, count(*) total from auth2paper2loc group by paper_id) X
 on papers.id = X.paper_id
set papers.my_count = IFNULL(X.total, 0)
RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262