I assume you want to block /products/ID/purchase but allow /products/ID.
Your last suggestion would only block URLs whose path starts with /purchase:
User-agent: *
Disallow: /purchase
So this is not what you want.
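For illustration, here's a minimal Python sketch of the original spec's matching, where a Disallow rule is a plain path prefix (is_blocked is a hypothetical helper, not part of any real library):

```
# Minimal sketch of the original spec's matching: a Disallow rule
# is a plain path prefix. (is_blocked is a hypothetical helper,
# not part of any real robots.txt library.)
def is_blocked(path, rule="/purchase"):
    # The rule applies when the URL path starts with the rule's path.
    return path.startswith(rule)

print(is_blocked("/purchase"))               # True  -> blocked
print(is_blocked("/purchase/confirm"))       # True  -> blocked
print(is_blocked("/products/123/purchase"))  # False -> still crawlable
```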
You'd need your second suggestion:
User-agent: *
Disallow: /products/*/purchase
This would block all URLs whose path starts with /products/, followed by any sequence of characters, followed by /purchase.
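Here's a minimal sketch of that matching, assuming Google-style semantics where * matches any sequence of characters and a rule matches every URL path it prefixes (wildcard_blocked is a hypothetical helper):

```
import re

# Minimal sketch of Google-style wildcard matching, assuming "*"
# matches any sequence of characters and a rule matches every URL
# path it prefixes. (wildcard_blocked is a hypothetical helper.)
def wildcard_blocked(path, rule="/products/*/purchase"):
    pattern = "".join(".*" if ch == "*" else re.escape(ch) for ch in rule)
    return re.match(pattern, path) is not None

print(wildcard_blocked("/products/123/purchase"))  # True  -> blocked
print(wildcard_blocked("/products/123"))           # False -> still crawlable
```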
Note: This rule uses the wildcard *. In the original robots.txt "specification", this is not a character with special meaning. However, some search engines extended the spec and use it as a kind of wildcard. So it should work for Google and probably some other search engines, but you can't count on it working with all other crawlers/bots.
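You can see the difference with Python's standard-library parser, which follows only the original spec and gives the * no special meaning (a minimal sketch; the expected output is an assumption based on that behavior):

```
import urllib.robotparser

# Python's urllib.robotparser implements only the original spec,
# so "*" in a path gets no special meaning and the wildcard rule
# never matches.
rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /products/*/purchase",
])

# A wildcard-aware crawler (e.g. Googlebot) would treat this URL as
# blocked, but this spec-only parser reports it as fetchable:
print(rp.can_fetch("mybot", "https://example.com/products/123/purchase"))  # True
```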
So your robots.txt could look like:
User-agent: *
Disallow: /sign_in
Disallow: /products/*/purchase
Also note that some search engines (including Google) might still list a URL in their search results (without title/snippet) although it is blocked in robots.txt. This can happen when they find a link to the blocked page on a page that is allowed to be crawled. To prevent this, you'd have to noindex the document, e.g. with <meta name="robots" content="noindex"> in the page's HTML or an X-Robots-Tag: noindex HTTP header. Keep in mind that crawlers can only see the noindex if the page isn't blocked in robots.txt.