Unobtanium Documentation

Unobtanium is a web crawler with a search frontend, or put more simply: it's a search engine.

The developer's instance runs at unobtanium.rocks and aims to be a search engine focused on technology and personal websites.

Unobtanium makes heavy use of SQLite.

Note: Most of this site is still under construction, so don't expect it to be complete.

Components

Unobtanium is made up of:

unobtanium (lib-unobtanium)
  Main application library that implements most data structures and database logic.
criterium
  Query framework for matching data in memory and in the database against criteria.
unobtanium-crawler
  Web crawling and summarizing application of Unobtanium.
unobtanium-viewer
  Web frontend for querying an Unobtanium summary database.
crawler database
  Database schema optimized for crawling.
summary database
  Database schema optimized for querying/searching.
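
To illustrate what "matching data in memory and in the database against criteria" can mean, here is a minimal Python sketch. It is not criterium's actual API: the `Contains` class, its `matches` and `to_sql` methods, and the `page` table are all hypothetical names made up for this example. The idea shown is only that one criterion object can be evaluated against an in-memory record and also be translated into an SQLite `WHERE` clause.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Contains:
    """Hypothetical criterion: the given field contains a substring."""
    field: str
    needle: str

    def matches(self, record: dict) -> bool:
        # In-memory evaluation against a plain dictionary.
        return self.needle in record[self.field]

    def to_sql(self):
        # Database evaluation: SQLite's instr() returns 0 when the
        # substring is absent. The field name is interpolated because
        # identifiers cannot be bound as parameters, so it must come
        # from trusted code; the needle is passed as a bound parameter.
        return f"instr({self.field}, ?) > 0", [self.needle]


# Hypothetical sample data, not taken from Unobtanium.
records = [
    {"url": "https://example.org/", "title": "SQLite notes"},
    {"url": "https://example.org/cats", "title": "Cat pictures"},
]
criterion = Contains("title", "SQLite")

# Apply the criterion in memory.
in_memory = [r["url"] for r in records if criterion.matches(r)]

# Apply the same criterion inside an SQLite database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE page (url TEXT, title TEXT)")
db.executemany("INSERT INTO page VALUES (:url, :title)", records)
where, params = criterion.to_sql()
in_db = [url for (url,) in db.execute(f"SELECT url FROM page WHERE {where}", params)]
```

Both `in_memory` and `in_db` end up holding the same URLs, which is the property such a query framework would need to guarantee.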

Data Pipeline

The main Unobtanium data pipeline consists of three steps:

  1. Crawling (Web to crawler database)
  2. Summarizing (crawler database to summary database)
  3. Querying/Searching (summary database to curious Creature)

Each step is independent of the previous one, so no huge setup is needed to get Unobtanium working.

Crawling and summarizing are decoupled to make iterating on code and configuration easier, because summarizing is quite a complex step.
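
The three steps and their decoupling can be sketched with two SQLite databases. This is a toy illustration in Python, not Unobtanium's code or schema: the `page` and `summary` tables, the `crawl`, `summarize` and `search` functions, the `max_words` setting and the `FAKE_WEB` dictionary (standing in for real HTTP requests) are all assumptions made for the example.

```python
import sqlite3

# Hypothetical stand-in for the web: URL -> raw page text.
FAKE_WEB = {
    "https://example.org/": "Hello SQLite world",
    "https://example.org/about": "About this personal website",
}


def crawl(crawler_db, web):
    """Step 1: store raw pages in the crawler database."""
    crawler_db.execute(
        "CREATE TABLE IF NOT EXISTS page (url TEXT PRIMARY KEY, body TEXT)"
    )
    crawler_db.executemany(
        "INSERT OR REPLACE INTO page (url, body) VALUES (?, ?)", web.items()
    )
    crawler_db.commit()


def summarize(crawler_db, summary_db, max_words=3):
    """Step 2: rebuild the summary database from the crawler database."""
    summary_db.execute("DROP TABLE IF EXISTS summary")
    summary_db.execute("CREATE TABLE summary (url TEXT PRIMARY KEY, words TEXT)")
    for url, body in crawler_db.execute("SELECT url, body FROM page"):
        # Toy "summary": the first max_words lowercased words of the page.
        words = " ".join(body.lower().split()[:max_words])
        summary_db.execute("INSERT INTO summary VALUES (?, ?)", (url, words))
    summary_db.commit()


def search(summary_db, term):
    """Step 3: answer a query using only the summary database."""
    rows = summary_db.execute(
        "SELECT url FROM summary WHERE words LIKE ? ORDER BY url",
        (f"%{term.lower()}%",),
    )
    return [url for (url,) in rows]


crawler_db = sqlite3.connect(":memory:")
summary_db = sqlite3.connect(":memory:")

crawl(crawler_db, FAKE_WEB)        # crawl once
summarize(crawler_db, summary_db)  # first summarizer configuration
first = search(summary_db, "website")

# Change the summarizer configuration and re-run it without crawling again.
summarize(crawler_db, summary_db, max_words=4)
second = search(summary_db, "website")
```

With the first configuration the word "website" is cut off and the search finds nothing; after re-summarizing with `max_words=4` the same crawler data yields a hit. The crawl ran only once, which is the kind of iteration the decoupling is meant to allow.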