There have been numerous questions about inconsistent results from the YouTube Data API: 1, 2, 3, 4, 5, 6. Most of them have accepted answers that seem to indicate there was a problem with the API request that was fixed by the instructions in the answers. But none of those situations apply to the API request discussed here.
There have also been two questions about duplicates in the API results: 1, 2. Both of them have the same answer, which says to use the next-page token. But both questions say the token was used, so that answer is not helpful.
Yesterday, I submitted a series of API requests to get the list of most-viewed videos about 3D printing. The first request in the series was:
https://www.googleapis.com/youtube/v3/search?q=3D print&type=video&maxResults=50&part=id,snippet&order=viewCount&key=<my key>
I ran that in a VBA sub, which took the next-page token from each result and resubmitted the URL with &pageToken=<nextPageToken>
inserted.
The result was a list of 649 unique video IDs. So far so good.
After making some changes in the VBA code and seeing some duplicates in the result set, I went back today and ran the original VBA sub again. The result was again a list of 649 video IDs, but this time the list included duplicates and it also included IDs that were not in yesterday's list and was missing IDs that were there yesterday. Here is a comparison from the first two pages and the last two pages of the two result sets:
The last three columns have the following meanings:
- Same as: If an ID from Run 2 is the same as an ID from Run 1, then this column has the
# overall
for Run 1. - Seq: Indicates whether the number in column "Same as" is one more than the previous number in that column.
- Dup: Indicates whether an ID from Run 2 occurred more than once in that run.
Problems:
- The videos XYIUM5TkhMo, LDjE28szwr8, qEh0uZuu7_U, Qsgz4GbL8O4, gOU0BCL-TXs, and 27ZLFn8Dejg were returned as #3, 6, 637, 640, 643, and 649 in Run 2, but were not returned at all in Run 1.
- The videos FrG_eSECfps, r2-OGUu57pU, lVEqwV0Nlzg were returned as #14, 55, 642, in Run 1, but were not in Run 2.
- The videos 0q0xS7W78KQ, H8AcqOh0wis, and o4_2_a6LzFU were returned as #49, 50, and 645 in Run 2, but then each appears a second time in that run (as well as appearing in Run 1 as #48, 50, and 647).
These results are troubling. They mean that no single search will return a reliable list of videos for a given value of q
.
I mentioned at the beginning that previous questions about inconsistent results from the YouTube Data API had answers that seemed to resolve those inconsistencies. Is there a way to do that for this search? Is there something wrong with the way I'm composing the search that is causing the problem?
If there isn't a way to fix the search, then I suppose the only way to get a list of videos on the topic with high confidence of it being complete is to run the search multiple times and merge the results until no new IDs appear that were not in a previous result set. But even then, one would not know if there are other videos lurking undetected.