YaCy search operators
Each search engine documents its own advanced search operators. Advanced searches are very useful for SEOs because they allow a closer inspection of SERPs.
For Google this documentation is available at: http://support.google.com/websearch/bin/answer.py?hl=fr&answer=136861
For Bing it is available at: http://msdn.microsoft.com/en-us/library/ff795620.aspx
By knowing how to search well you will:
- improve your search experience and find content you could not have found otherwise.
- check the SEO of a website more precisely: https://www.seo-footprints.com.
- improve the security of your information system: https://www.exploit-db.com/google-hacking-database/.
cache: shows the latest version of the page that the search engine holds in its index. Very useful for accessing a page that has since been taken offline.
related: shows websites identified as similar to the one you specify.
"keyword": shows only results that include this keyword. For example, "John Snow" returns pages containing those words in that exact order.
-keyword: shows results that do not include this keyword.
inurl: shows results whose URL contains the given words.
inlink: shows only URLs that have the <phrase> within the outbound links of the document.
link: shows all the websites that link to the website you specify.
site: shows all the pages of the given website that the search engine has indexed. This is very useful for analyzing a single website.
tld: restricts results to pages from a given top-level domain. For example, tld:.fr returns only websites whose domain name ends in .fr. A .fr domain does not guarantee a French website, but it at least indicates webmasters who intended to target France.
filetype: lists only results of the given document type. For example, filetype:pdf returns only PDF files. You can combine it with any other words.
author:<author> only pages annotated with <author> as author
on:<date> only pages with <date> in content
from:<date1> to:<date2> only pages with a date between <date1> and <date2> in content.
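The date modifiers above can be assembled programmatically, which avoids typos when testing many ranges. The following is only a minimal sketch: the helper names on_date and date_range are hypothetical, and the YYYY/MM/DD notation is an assumption, since the list above does not say how <date> must be written. Adjust DATE_FORMAT to whatever your peer accepts.

```python
from datetime import date

# Assumption: dates are written as YYYY/MM/DD. The operator list does not
# specify the notation, so change this constant if your peer expects another.
DATE_FORMAT = "%Y/%m/%d"


def on_date(day: date) -> str:
    """Return an on: modifier for a single day (hypothetical helper)."""
    return "on:" + day.strftime(DATE_FORMAT)


def date_range(start: date, end: date) -> str:
    """Return a from:/to: modifier pair (hypothetical helper)."""
    if start > end:
        raise ValueError("start date must not be after end date")
    return "from:{} to:{}".format(
        start.strftime(DATE_FORMAT), end.strftime(DATE_FORMAT)
    )


# Example: append the modifier to ordinary search words.
query = "search operators " + date_range(date(2020, 1, 1), date(2020, 12, 31))
```

Rejecting a reversed range early is a design choice: a query with from: later than to: would most likely just return nothing, which is harder to diagnose than an explicit error.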
/http: only resources from http or https servers
/ftp: only resources from ftp servers (they are rare, crawl them yourself)
/smb: only resources from smb servers (Intranet Indexing must be selected)
/file: only files from a local file system (Intranet Indexing must be selected)
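Several of the operators listed above can be combined in one query string. The sketch below shows one way to compose such a string. The function build_query and its parameters are hypothetical, example.org is a placeholder, and the code assumes that the trailing colon after /http, /ftp, /smb and /file in the list is only a separator, so the modifier is written without it. It builds text only and does not contact any search engine.

```python
from typing import Iterable, Optional

# Assumption: protocol modifiers are standalone tokens without a trailing colon.
PROTOCOLS = {"http": "/http", "ftp": "/ftp", "smb": "/smb", "file": "/file"}


def build_query(
    terms: Iterable[str] = (),
    phrase: Optional[str] = None,
    exclude: Iterable[str] = (),
    site: Optional[str] = None,
    filetype: Optional[str] = None,
    protocol: Optional[str] = None,
) -> str:
    """Compose a query string from plain words and operators (hypothetical helper)."""
    parts = list(terms)
    if phrase:
        # "keyword": exact sequence of words
        parts.append('"{}"'.format(phrase))
    # -keyword: exclude results containing the word
    parts.extend("-" + word for word in exclude)
    if site:
        parts.append("site:" + site)
    if filetype:
        # Accept both "pdf" and ".pdf"
        parts.append("filetype:" + filetype.lstrip("."))
    if protocol:
        if protocol not in PROTOCOLS:
            raise ValueError("unknown protocol: " + protocol)
        parts.append(PROTOCOLS[protocol])
    return " ".join(parts)


# Example: PDF files about search operators on one site, served over http(s).
query = build_query(
    terms=["search", "operators"],
    site="example.org",
    filetype="pdf",
    protocol="http",
)
```

Keeping each operator as a separate keyword argument makes it easy to write down the intended search first and then turn it into a request.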
Performing good searches takes time. It is best to first define on paper what you want to achieve, and then test it with a matching query.