Questions tagged [search-engine-bots]

51 questions
1
vote
2 answers

How to convert JavaScript dynamic data into HTML and render?

We have developed a website that uses a JavaScript library to query the database and display the data in an HTML page. When you go to the website, you need to search for something in order to retrieve the data, so by default the website doesn't display any…
1
vote
1 answer

Disable Google indexing of website telephone numbers

I was given the task of hiding telephone numbers from Google. That means we want to display them on the website and have them clickable (href="tel:...") while ensuring Google does not index them and does NOT display them with the search…
1
vote
3 answers

New site going onto an old domain

I have a client who over the years has managed to get their product to the top of Google for many different search terms. They're adamant that the new site shouldn't have a detrimental effect on their Google ranking. The site will be replacing the…
sea_1987
1
vote
1 answer

Allow and disallow in robots.txt file

I want to disallow all files and folders on my site for search engine bots, except one special folder and the files in it. Can I use these lines in the robots.txt file? User-agent: * Disallow: / Allow: /thatfolder Is that right?
hd.
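The rules quoted in the question above can be checked with a minimal sketch like the following. It assumes the crawler follows the longest-match rule from RFC 9309 (the most specific matching path wins, and Allow wins a tie); crawlers that instead apply the first matching rule would behave differently. The function name `is_allowed`, the `RULES` list, and the test paths are hypothetical, and wildcards are not handled here.

```python
def is_allowed(rules, path):
    """Decide whether `path` may be crawled under `rules`.

    `rules` is a list of (directive, prefix) tuples for one user-agent group,
    where directive is "allow" or "disallow". This assumes RFC 9309 style
    evaluation: the longest matching prefix wins, Allow wins a tie, and a
    path matched by no rule is allowed.
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty or non-matching rules are ignored
        is_more_specific = len(prefix) > best_len
        is_tie_for_allow = len(prefix) == best_len and directive == "allow"
        if is_more_specific or is_tie_for_allow:
            best_len = len(prefix)
            allowed = directive == "allow"
    return allowed


# The rules from the question, in the order they were written.
RULES = [("disallow", "/"), ("allow", "/thatfolder")]

print(is_allowed(RULES, "/thatfolder/page.html"))
print(is_allowed(RULES, "/other/page.html"))
```

Because matching is by prefix, `Allow: /thatfolder` would also match a path such as `/thatfolder-old/`; writing `Allow: /thatfolder/` limits the rule to the folder itself.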
1
vote
2 answers

robots.txt in CodeIgniter - allow view/function

I read a little bit about robots.txt and I read that I should disallow all folders in my web application, but I would like to allow bots to read the main page and one view (the URL is for example: www.mywebapp/searchresults - it's a CodeIgniter route - it's…
Ridd
1
vote
2 answers

Web Crawling and Pagerank

I'm a computer science student and I am a bit inexperienced when it comes to web crawling and building search engines. At this time, I am using the latest version of Open Search Server and am crawling several thousand domains. When using the built…
1
vote
1 answer

Does the Baidu search spider understand JavaScript?

More specifically, if/how well does the Baidu search crawler index: content delivered via AJAX; History API (pushState())? I've looked through their documentation and webmaster tools but can't find a 'View as Baidu bot' tool (like Google & Bing…
user3982863
1
vote
1 answer

How to build human-like search-engine queries?

I am interacting with a search engine programmatically and I need to trick it into thinking that I am a human making queries, as opposed to a robot. This involves generating queries for which it seems plausible that any ordinary user would search…
Ponkadoodle
1
vote
1 answer

Remove pages from Google dynamic URL - robots.txt

I have a few links on Google that look like domain.com/results.php?name=a&address=b. The results page and its parameters have now been renamed, and I need to remove the existing links on Google etc. I tried User-agent: * Disallow: /results.php in robots.txt and then…
1
vote
3 answers

Tell search engines that page does not exist

I have checked the logs and found that search engines visit a lot of bogus URLs on my website. They are most likely from before a lot of the links were changed, and even though I have made 301 redirects some links have been altered in very…
Anders
1
vote
1 answer

The * character in the Disallow statement of the robots.txt file

How do different search bots interpret the * character in the Disallow statement of the robots.txt file? Do all of them treat it as "none, one or more than one character"? Let's take the following example: User-agent: * Disallow:…
CompilingCyborg
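As a rough illustration of the "none, one or more than one character" reading asked about above, the sketch below translates a robots.txt path pattern into a regular expression. It assumes the RFC 9309 interpretation, where `*` matches any sequence of characters (including none) and a trailing `$` anchors the end of the path; crawlers that do not implement these extensions may treat `*` literally. The helper names and the sample patterns are hypothetical, since the rule in the question is cut off.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumes RFC 9309 semantics: '*' matches zero or more characters and a
    trailing '$' anchors the end of the path. Everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece and join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rule_matches(pattern, path):
    """Return True if `path` is matched by `pattern`, starting at the path's beginning."""
    return pattern_to_regex(pattern).match(path) is not None


# Hypothetical patterns, not taken from the question.
print(rule_matches("/private*/", "/private/"))        # '*' matching no characters
print(rule_matches("/private*/", "/private-files/x")) # '*' matching several characters
print(rule_matches("/*.pdf$", "/docs/a.pdf?x=1"))     # '$' rejects the trailing query
```

Without a trailing `$`, a pattern behaves as a prefix match, so `/tmp` would also match `/tmpfile`.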
1
vote
3 answers

How to fix indexed pages that shouldn't be crawled by GoogleBot and other search engine crawlers?

On an existing .NET MVC3 site, we implemented paging where the URL looks something like www.mysite.com/someterm/anotherterm/_p/89/10, where 89 is the page number and 10 is the number of results per page. Unfortunately the rel="nofollow" was missing…
1
vote
2 answers

Search bot detection

Is it possible to prevent a site from being scraped by any scrapers, while at the same time allowing search engines to parse your content? Just checking the User-Agent is not the best option, because it's very easy to spoof. JavaScript checks…
user584397
0
votes
0 answers

How can I create a sitemap with changing IDs?

I developed a one-page site (mainly only the index) and I don't actually know how to make it SEO-friendly. For example, I have an open book library, and the books are opened via URLs like "index#LoTR" or "index#BookId". Google can't detect…
0
votes
0 answers

Bing refuses to index my blog; when I submit a URL it returns: "This site requires Javascript to work, please enable Javascript in your browser..."

I believe my problem is very much related to the one answered here: ByetHost server passing html values "Checking your browser" with JSON String. Every time I try to get Bing to index my site it fails, and when I check in webmaster tools for my…