
There have been numerous questions about inconsistent results from the YouTube Data API: 1, 2, 3, 4, 5, 6. Most of them have accepted answers that seem to indicate there was a problem with the API request that was fixed by the instructions in the answers. But none of those situations apply to the API request discussed here.

There have also been two questions about duplicates in the API results: 1, 2. Both of them have the same answer, which says to use the next-page token. But both questions say the token was used, so that answer is not helpful.

Yesterday, I submitted a series of API requests to get the list of most-viewed videos about 3D printing. The first request in the series was:

https://www.googleapis.com/youtube/v3/search?q=3D print&type=video&maxResults=50&part=id,snippet&order=viewCount&key=<my key>

I ran that in a VBA sub, which took the next-page token from each result and resubmitted the URL with &pageToken=<nextPageToken> inserted.
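For readers who want to reproduce the pagination outside VBA, here is a minimal Python sketch of the same loop. It is not the original sub. The function names, the `get_json` hook, and the `max_pages` cap are assumptions made for illustration; the endpoint and query parameters are the ones from the URL above.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://www.googleapis.com/youtube/v3/search"


def build_url(api_key, page_token=None):
    """Build the search URL from the question, optionally with a page token."""
    params = {
        "q": "3D print",
        "type": "video",
        "maxResults": 50,
        "part": "id,snippet",
        "order": "viewCount",
        "key": api_key,
    }
    if page_token:
        params["pageToken"] = page_token
    return SEARCH_URL + "?" + urllib.parse.urlencode(params)


def http_get_json(url):
    """Fetch a URL and decode the JSON body (needs network access and a valid key)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def collect_video_ids(api_key, get_json=http_get_json, max_pages=100):
    """Follow nextPageToken until it is absent; return IDs in arrival order."""
    video_ids = []
    page_token = None
    for _ in range(max_pages):  # hypothetical safety cap, not part of the API
        page = get_json(build_url(api_key, page_token))
        video_ids.extend(item["id"]["videoId"] for item in page.get("items", []))
        page_token = page.get("nextPageToken")
        if not page_token:
            break
    return video_ids
```

The list is deliberately not de-duplicated, so any duplicates the API returns remain visible for inspection.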

The result was a list of 649 unique video IDs. So far so good.

After making some changes in the VBA code and seeing some duplicates in the result set, I went back today and ran the original VBA sub again. The result was again a list of 649 video IDs, but this time the list contained duplicates, included IDs that were not in yesterday's list, and was missing IDs that were in it. Here is a comparison of the first two pages and the last two pages of the two result sets:

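| Page | # on page | # overall | Run 1 | Run 2 | Same as | Seq | Dup |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | f2mdMcf-fJs | f2mdMcf-fJs | 1 | | |
| 1 | 2 | 2 | WSauz5KVKTU | WSauz5KVKTU | 2 | Seq | |
| 1 | 3 | 3 | zsSCUWs7k9Q | XYIUM5TkhMo | None | | |
| 1 | 4 | 4 | B5Q1J5c8oNc | zsSCUWs7k9Q | 3 | Seq | |
| 1 | 5 | 5 | cUxIb3Pt-hQ | B5Q1J5c8oNc | 4 | Seq | |
| 1 | 6 | 6 | 4yyOOn7pWnA | LDjE28szwr8 | None | | |
| 1 | 7 | 7 | 3N46jQ0Xi3c | cUxIb3Pt-hQ | 5 | Seq | |
| 1 | 8 | 8 | 08dBVz8_VzU | 4yyOOn7pWnA | 6 | Seq | |
| ... | | | | | | | |
| 1 | 13 | 13 | oeKIe1ik2O8 | e1rQ8YwNSDs | 11 | Seq | |
| 1 | 14 | 14 | FrG_eSECfps | RVB2JreIcoc | 12 | Seq | |
| 1 | 15 | 15 | pPQCwz2q96o | oeKIe1ik2O8 | 13 | Seq | |
| 1 | 16 | 16 | uo3KuoEiu3I | pPQCwz2q96o | 15 | NOT | |
| 1 | 17 | 17 | 0U6aIwd5h9s | uo3KuoEiu3I | 16 | Seq | |
| ... | | | | | | | |
| 1 | 47 | 47 | ShGsW68zbIo | iu9rhqsvrPs | 46 | Seq | |
| 1 | 48 | 48 | 0q0xS7W78KQ | ShGsW68zbIo | 47 | Seq | |
| 1 | 49 | 49 | UheJQsXOAnY | 0q0xS7W78KQ | 48 | Seq | Dup |
| 1 | 50 | 50 | H8AcqOh0wis | H8AcqOh0wis | 50 | NOT | Dup |
| 2 | 1 | 51 | EWq3-2VuqbQ | 0q0xS7W78KQ | 48 | NOT | Dup |
| 2 | 2 | 52 | scuTZza4f_o | H8AcqOh0wis | 50 | NOT | Dup |
| 2 | 3 | 53 | bJWJW-mz4_U | UheJQsXOAnY | 49 | NOT | |
| 2 | 4 | 54 | Ii4VYsh9OlM | EWq3-2VuqbQ | 51 | NOT | |
| 2 | 5 | 55 | r2-OGUu57pU | scuTZza4f_o | 52 | Seq | |
| 2 | 6 | 56 | 8KTnu18Mi9Q | bJWJW-mz4_U | 53 | Seq | |
| 2 | 7 | 57 | DconsfGsXyA | Ii4VYsh9OlM | 54 | Seq | |
| 2 | 8 | 58 | UttEvLJP3l8 | 8KTnu18Mi9Q | 56 | NOT | |
| 2 | 9 | 59 | GJOOLH9ZP2I | DconsfGsXyA | 57 | Seq | |
| 2 | 10 | 60 | ewgmg9Q5Ab8 | UttEvLJP3l8 | 58 | Seq | |
| ... | | | | | | | |
| 13 | 35 | 635 | qHpR_p8lA4I | FFVOzo7tSV8 | 639 | Seq | |
| 13 | 36 | 636 | DplwDDZNTRc | 76IBjdM9s6g | 640 | Seq | |
| 13 | 37 | 637 | 3AObqGsimr8 | qEh0uZuu7_U | None | | |
| 13 | 38 | 638 | 88keQ4PWH18 | RhfGJduOlrw | 641 | Seq | |
| 13 | 39 | 639 | FFVOzo7tSV8 | QxzH9QkirCU | 643 | NOT | |
| 13 | 40 | 640 | 76IBjdM9s6g | Qsgz4GbL8O4 | None | | |
| 13 | 41 | 641 | RhfGJduOlrw | BSgg7mEzfqY | 644 | Seq | |
| 13 | 42 | 642 | lVEqwV0Nlzg | VcmjbJ2q8-w | 645 | Seq | |
| 13 | 43 | 643 | QxzH9QkirCU | gOU0BCL-TXs | None | | |
| 13 | 44 | 644 | BSgg7mEzfqY | IoOXQUcW24s | 646 | Seq | |
| 13 | 45 | 645 | VcmjbJ2q8-w | o4_2_a6LzFU | 647 | Seq | Dup |
| 14 | 1 | 646 | IoOXQUcW24s | o4_2_a6LzFU | 647 | NOT | Dup |
| 14 | 2 | 647 | o4_2_a6LzFU | ijVPcGaqVjc | 648 | Seq | |
| 14 | 3 | 648 | ijVPcGaqVjc | nk3FlgEuG-s | 649 | Seq | |
| 14 | 4 | 649 | nk3FlgEuG-s | 27ZLFn8Dejg | None | | |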

The last three columns have the following meanings:

  • Same as: If an ID from Run 2 also appears in Run 1, this column gives its # overall in Run 1; otherwise it is None.
  • Seq: "Seq" if the number in column "Same as" is one more than the previous number in that column (rows with None are skipped), "NOT" if it is not.
  • Dup: Indicates whether an ID from Run 2 occurred more than once in that run.
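The three columns can be computed mechanically from the two ID lists. The following Python sketch is one way to do it, assuming the definitions above; it is not the spreadsheet logic actually used for the table, and the name `compare_runs` is hypothetical. It assumes that rows with None are skipped when looking up the "previous number".

```python
from collections import Counter


def compare_runs(run1, run2):
    """Return one (video_id, same_as, seq, dup) tuple per Run 2 entry."""
    # 1-based position of each ID in Run 1 (first occurrence wins).
    position_in_run1 = {}
    for index, video_id in enumerate(run1, start=1):
        position_in_run1.setdefault(video_id, index)

    counts_in_run2 = Counter(run2)
    rows = []
    previous = None  # last number that appeared in the "Same as" column
    for video_id in run2:
        same_as = position_in_run1.get(video_id)  # None if absent from Run 1
        if same_as is None:
            seq = ""  # nothing to compare for IDs missing from Run 1
        else:
            if previous is None:
                seq = ""  # first matched row has no predecessor
            elif same_as == previous + 1:
                seq = "Seq"
            else:
                seq = "NOT"
            previous = same_as
        dup = "Dup" if counts_in_run2[video_id] > 1 else ""
        rows.append((video_id, same_as, seq, dup))
    return rows
```

Called with the two runs' ID lists in page order, it yields one tuple per Run 2 row, which can be pasted next to the raw results to spot reordering and duplicates.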

Problems:

These results are troubling. They mean that no single search will return a reliable list of videos for a given value of q.

I mentioned at the beginning that previous questions about inconsistent results from the YouTube Data API had answers that seemed to resolve those inconsistencies. Is there a way to do that for this search? Is there something wrong with the way I'm composing the search that is causing the problem?

If there isn't a way to fix the search, then I suppose the only way to get a list of videos on the topic with high confidence that it is complete is to run the search multiple times and merge the results until a run returns no IDs that were not already in the merged set. But even then, one would not know whether other videos are lurking undetected.
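A rough Python sketch of that merge-until-stable idea follows. The function name and the `run_search`, `stable_runs`, and `max_runs` parameters are assumptions made for illustration; `run_search` stands for any callable that performs one full paginated search and returns its list of video IDs.

```python
def merge_until_stable(run_search, stable_runs=1, max_runs=20):
    """Union repeated search results until `stable_runs` consecutive runs add nothing new."""
    seen = {}  # dict used as an insertion-ordered set of video IDs
    quiet = 0  # consecutive runs that contributed no new ID
    runs = 0
    while runs < max_runs and quiet < stable_runs:
        before = len(seen)
        for video_id in run_search():
            seen.setdefault(video_id, None)  # also drops in-run duplicates
        runs += 1
        quiet = quiet + 1 if len(seen) == before else 0
    return list(seen), runs
```

Raising `stable_runs` makes the stopping rule stricter, but as noted above, a quiet run only shows that the last search added nothing, not that the merged list is complete.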

Benjamin Loison
NewSites
  • 1/3 Note that [`type="video"`](https://developers.google.com/youtube/v3/docs/search/list#type) should be `type=video` ([I modified your question accordingly](https://stackoverflow.com/posts/74450557/revisions)). I noticed that using [the YouTube UI](https://www.youtube.com/results?search_query=3D+print&sp=CAMSAhAB) I got the same results as **Run 2** up to [`LDjE28szwr8`](https://www.youtube.com/shorts/LDjE28szwr8) but the next video was [`29PNv3hG_IA`](https://www.youtube.com/shorts/29PNv3hG_IA) instead of [`cUxIb3Pt-hQ`](https://www.youtube.com/shorts/cUxIb3Pt-hQ). – Benjamin Loison Nov 15 '22 at 20:21
  • 2/3 Note that `29PNv3hG_IA` isn't in your results while it was published 2 years ago. A problem here is that for a given video we don't know if for a given query parameter value, even as a very high rank, the video should appear in the query results (related to [this problem](https://github.com/Benjamin-Loison/YouTube-operational-API/issues/4)). – Benjamin Loison Nov 15 '22 at 20:21
  • 3/3 Please deepen YouTube UI results and let me know if [proposing *all at a time*](https://github.com/Benjamin-Loison/YouTube-operational-API/blob/c309312ad2369ba9287a2c882cf48992865dd229/search.php#L105-L133) parameters [`q`](https://developers.google.com/youtube/v3/docs/search/list#q), `type=video` and [`order=viewCount`](https://developers.google.com/youtube/v3/docs/search/list#order) for a search query in my reverse-engineered YouTube UI [open-source](https://github.com/Benjamin-Loison/YouTube-operational-API) [API](https://yt.lemnoslife.com) would help you. – Benjamin Loison Nov 15 '22 at 20:21
  • I saw the same problem on the YouTube website. The best you can do is post a ticket on Issue Tracker or remove duplicates once detected. – Marco Aurelio Fernandez Reyes Nov 16 '22 at 13:44
  • @BenjaminLoison - I've checked out your API and done some test runs. Very interesting, and I can share the results with you. I've sent you a message through the Matrix contact link on your API and GitHub pages. – NewSites Nov 18 '22 at 16:11
  • I answered you on Matrix that *I think sharing your results by editing your StackOverflow question would be the better way to share with anybody the current state of the art concerning your problem.* – Benjamin Loison Nov 18 '22 at 16:51
  • It's been reported as a bug: [35177262](https://issuetracker.google.com/issues/35177262) – Linda Lawton - DaImTo Dec 16 '22 at 12:13

0 Answers