Manual: Unobtanium Query syntax

For advanced searching, Unobtanium behaves similarly to other major search engines, using the <filter>:<argument> syntax. Quotes help with searching for exact phrases, and a minus (-) can be used to exclude a quoted phrase or negate a filter.

Technical Note: The unobtanium query syntax is implemented using the whydrogen crate.

Quoting

Example: The search query to be or not to be matches a lot of pages because that famous quote is made up of a lot of generic words, but putting it in quotes "to be or not to be" immediately yields pages that contain this exact quote.

Any text that starts with a quote after a space, and ends with a matching quote that is not mid-word is considered quoted and behaves as in the example.

Valid quote pairs are:

Technical Note: Inside quotes, a backslash can be used to escape a quote (i.e. \"), and \\ unambiguously represents a backslash. This is similar to how programming languages represent strings.
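The escaping rule can be illustrated with a small Python sketch. This is not the whydrogen implementation; the function name unescape_quoted is hypothetical, and the handling of a backslash before any other character (or at the very end of the text) is an assumption of this sketch, since the manual only specifies \" and \\.

```python
def unescape_quoted(body: str) -> str:
    """Resolve backslash escapes in the text between two quotes.

    Hypothetical helper: the manual only defines \\" and \\\\, so taking
    any other escaped character literally is an assumption.
    """
    out = []
    chars = iter(body)
    for ch in chars:
        if ch == "\\":
            # Take the next character literally. A lone trailing
            # backslash is kept as-is (assumption of this sketch).
            out.append(next(chars, "\\"))
        else:
            out.append(ch)
    return "".join(out)
```

For example, the quoted body say \"hi\" resolves to say "hi", and a\\b resolves to a\b.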

Excluding words and phrases

Example: Dolphin -"fish" searches for Dolphin but excludes all pages mentioning fish.

Example: Free Software -"Free Software Foundation" searches for Free and Software, but excludes any pages that mention the phrase Free Software Foundation.

Pages containing any quoted text that is prefixed with a minus (-) are excluded from the results.

Using a minus without quotes won't work as you might expect if you know that feature from other search engines. This is a deliberate choice to prevent excluding search terms by accident (e.g. when pasting an error message).
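As a rough illustration of the rule above, the following Python sketch extracts only those phrases that are both quoted and prefixed with a minus. It assumes plain double quotes as the quote pair and does not resolve backslash escapes; the name excluded_phrases is hypothetical and this is not how Unobtanium itself parses queries.

```python
import re

# A minus directly followed by a double-quoted phrase, at the start of the
# query or after whitespace. Escaped characters inside the quotes are
# skipped over but not unescaped (simplification of this sketch).
EXCLUDED = re.compile(r'(?:^|(?<=\s))-"((?:[^"\\]|\\.)*)"')


def excluded_phrases(query: str) -> list[str]:
    """Return the quoted phrases that the query asks to exclude."""
    return [match.group(1) for match in EXCLUDED.finditer(query)]
```

With this sketch, Dolphin -"fish" yields the single excluded phrase fish, while a pasted message such as error -1 at -line yields no exclusions at all, which is the accident the quoting requirement is meant to prevent.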

Filtering

Example: The search query site:example.org Dogs searches for Dogs, but limited to the site example.org.

Filtering is done using a pair of a filter keyword (e.g. site) and an argument (e.g. example.org). What exactly the argument does depends on the filter.

The most useful filters are:

site
Only return results from the given domain and its subdomains.
Example: site:example.org Includes example.org but also www.example.org.
host
Like site, but does not include subdomains.
Example: host:example.org Includes only results from example.org, but not www.example.org.
path
Only return results where the path in the URL (the part between the domain and the first ? or #) contains the given text or, if the given text starts with a /, starts with the given text.
Example: path:2020 All pages that have 2020 in their path, e.g. https://example.org/the-year-2020
Example: path:/2020 All pages where the path starts with /2020, e.g. https://example.net/2020-10-01
lang
Only return results in the given language, provided the site tagged itself correctly.
Example: lang:de only returns results of sites tagged as being in German.
url
Only return the one exact given URL as a result, if it is indexed. This is mainly useful to query the metadata that unobtanium knows about a URL.
Example: url:https://doc.unobtanium.rocks/

Filters can be negated by prefixing them with a minus (-).

Example: example -site:example.org searches for the word example but excludes example.org and all its subdomains from the results.
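To make the <filter>:<argument> shape and its negation concrete, here is a minimal Python sketch that splits a single query token into keyword, argument and a negation flag. The names Filter and parse_filter are hypothetical, and the sketch does not check whether the keyword is a filter Unobtanium actually knows, so it would also accept tokens such as 12:30.

```python
from typing import NamedTuple, Optional


class Filter(NamedTuple):
    keyword: str    # e.g. "site"
    argument: str   # e.g. "example.org"
    negated: bool   # True when the token was prefixed with a minus


def parse_filter(token: str) -> Optional[Filter]:
    """Split one whitespace-free token into a filter, or return None."""
    negated = token.startswith("-")
    body = token[1:] if negated else token
    # Split on the first colon only, so URLs survive as arguments.
    keyword, sep, argument = body.partition(":")
    if not sep or not keyword or not argument:
        return None
    return Filter(keyword, argument, negated)
```

Because only the first colon separates keyword and argument, url:https://doc.unobtanium.rocks/ keeps the full URL as its argument.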

The site filter

The site filter limits the search to a given domain and its subdomains, but it can do more than that.

site:<domain>/<path> is a shorthand notation for site:<domain> path:/<path>. This also works for all the following variations of the site filter.

Starting the domain part with a dot (.) only includes subdomains.

Example: site:.example.org will return results for subdomains like www.example.org and wiki.example.org but not from example.org itself.

If the domain does not include a dot (.), the filter matches all domains that contain the given text.

Example: site:wiki will include all sites that include the term wiki somewhere in their domain name. This can be useful if you forgot the full name of a site, but still remember part of it.
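The three variations of the site filter and the path shorthand can be summarised in a short Python sketch. The function names are hypothetical, and lowercasing both sides before comparing is an assumption of this sketch rather than documented behaviour.

```python
from typing import Optional, Tuple


def split_site_argument(argument: str) -> Tuple[str, Optional[str]]:
    """Split site:<domain>/<path> into the domain and a path filter."""
    domain, sep, path = argument.partition("/")
    return domain, ("/" + path if sep else None)


def site_matches(argument: str, domain: str) -> bool:
    """Check a domain against the domain part of a site filter."""
    argument = argument.lower()  # case-insensitivity is an assumption
    domain = domain.lower()
    if argument.startswith("."):
        # Leading dot: subdomains only, not the domain itself.
        return domain.endswith(argument)
    if "." not in argument:
        # No dot at all: match any domain containing the text.
        return argument in domain
    # Plain domain: the domain itself and all of its subdomains.
    return domain == argument or domain.endswith("." + argument)
```

For instance, site_matches(".example.org", "example.org") is false while site_matches(".example.org", "wiki.example.org") is true, and split_site_argument("example.org/blog") gives the domain example.org together with the path filter /blog.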

The path filter

The path filter applies to the path between the domain name and the first ? or # (whichever comes first) in the URL.

When not starting with a / the path filter searches for all paths that exactly contain the given text.

When starting with a / it applies to all paths that exactly start with the given text.

When starting with a *, the asterisk is ignored and a contains search is always performed. This is useful to match text anywhere in the path, even if it starts with a slash (/).
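The three cases of the path filter can be sketched in Python as follows. The name path_matches is hypothetical; urlsplit from the standard library is used here only to show which part of a URL counts as the path, and is not necessarily how Unobtanium extracts it.

```python
from urllib.parse import urlsplit


def path_matches(argument: str, path: str) -> bool:
    """Check a URL path against the argument of a path filter."""
    if argument.startswith("*"):
        # Leading asterisk: drop it and always do a contains match.
        return argument[1:] in path
    if argument.startswith("/"):
        # Leading slash: the path must start with the given text.
        return path.startswith(argument)
    # Otherwise: the path must contain the given text.
    return argument in path


# The path is the part before the first ? or #.
example_path = urlsplit("https://example.net/2020-10-01?page=2#top").path
```

Here example_path is /2020-10-01, which matches both path:2020 and path:/2020. A path such as /archive/2020 matches path:*/2020 but not path:/2020.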

Other filters

This section lists the other available filters. These are at best useful for debugging or for very specific, technical use cases. They are listed here for transparency and completeness.

scheme
Filters by URL scheme.
port
Filters on the port a site is available on.
etag
Contains match on the last ETag a site used.
mime-type
Exact match on a file's MIME type as given by the server.
mime-subtype
Exact match on a file's MIME subtype as given by the server.
mime-suffix
Exact match on a file's MIME suffix as given by the server.
http-status
Matches the HTTP status the file was returned with. If a single digit N is given, it matches the whole Nxx range.
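The single-digit rule of the http-status filter can be expressed as a small Python sketch. The name status_matches is hypothetical, and treating non-numeric arguments as never matching is an assumption of this sketch.

```python
def status_matches(argument: str, status: int) -> bool:
    """Check an HTTP status code against an http-status argument."""
    if not (argument.isascii() and argument.isdigit()):
        # Non-numeric arguments never match (assumption of this sketch).
        return False
    if len(argument) == 1:
        # A single digit N covers the whole Nxx range.
        return status // 100 == int(argument)
    return status == int(argument)
```

Under this reading, http-status:4 matches both 404 and 410, while http-status:404 matches only 404.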