The Common Crawl index file used in the project below
https://github.com/trivio/common_crawl_index/blob/master/bin/remote_copy
mmap = BotoMap(s3_anon, src_bucket, '/common-crawl/projects/url-index/url-index.1356128792')
is a partial one.
I want the complete index file (April 2015 crawl data) for my own project, which is built on top of the project above.
Where can I download the entire index file?
Here Tom Morris states:
"The index files which are used by the index service are also available for download."