Components
Unobtanium is made up of:
- unobtanium (lib-unobtanium): Main application library that implements most data structures and database logic.
- criterium: Query framework for matching data in memory and in the database against criteria.
- unobtanium-crawler: Web crawling and summarizing application of Unobtanium.
- unobtanium-viewer: Web frontend for querying an Unobtanium summary database.
- crawler database: Database schema optimized for crawling.
- summary database: Database schema optimized for querying/searching.
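To illustrate what "matching data in memory and in the database against criteria" can mean, here is a minimal sketch. It is not the criterium API: the `NumberCriterion` enum, its variants, and the `matches` and `to_sql` methods are hypothetical names made up for this example. The idea shown is that one criterion value can be evaluated directly against an in-memory number or rendered as a parameterized SQL condition.

```rust
/// Hypothetical criterion over integers (an assumption for illustration,
/// not a type from the criterium crate).
#[derive(Debug, Clone, PartialEq)]
pub enum NumberCriterion {
    Equals(i64),
    LessThan(i64),
    GreaterThan(i64),
    /// Matches when every inner criterion matches.
    All(Vec<NumberCriterion>),
    /// Matches when the inner criterion does not match.
    Not(Box<NumberCriterion>),
}

impl NumberCriterion {
    /// Evaluate the criterion against a value held in memory.
    pub fn matches(&self, value: i64) -> bool {
        match self {
            Self::Equals(n) => value == *n,
            Self::LessThan(n) => value < *n,
            Self::GreaterThan(n) => value > *n,
            Self::All(cs) => cs.iter().all(|c| c.matches(value)),
            Self::Not(c) => !c.matches(value),
        }
    }

    /// Render the criterion as a SQL condition on `column`.
    /// Values are pushed to `params` in placeholder order instead of
    /// being spliced into the SQL text.
    pub fn to_sql(&self, column: &str, params: &mut Vec<i64>) -> String {
        match self {
            Self::Equals(n) => {
                params.push(*n);
                format!("{} = ?", column)
            }
            Self::LessThan(n) => {
                params.push(*n);
                format!("{} < ?", column)
            }
            Self::GreaterThan(n) => {
                params.push(*n);
                format!("{} > ?", column)
            }
            Self::All(cs) => {
                if cs.is_empty() {
                    // An empty conjunction is always true, like `matches`.
                    return "1 = 1".to_string();
                }
                let parts: Vec<String> =
                    cs.iter().map(|c| c.to_sql(column, params)).collect();
                format!("({})", parts.join(" AND "))
            }
            Self::Not(c) => format!("NOT ({})", c.to_sql(column, params)),
        }
    }
}
```

With such a design, the same criterion can filter rows already loaded into memory via `matches` and narrow a database query via `to_sql`, so both paths apply the same rules.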
Data Pipeline
The main Unobtanium data pipeline consists of three steps:
- Crawling (Web to crawler database)
- Summarizing (crawler database to summary database)
- Querying/Searching (summary database to curious Creature)
Each step runs independently of the previous one, so no elaborate setup is needed to get Unobtanium working.
Crawling and summarizing are decoupled to make iterating on code and configuration easier, as summarizing is quite a complex step.
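The three steps can be sketched as three functions that only communicate through stored data. This is a simplified illustration, not Unobtanium's actual code: `CrawledPage`, `PageSummary`, `crawl`, `summarize`, and `query` are hypothetical names, the "web" is simulated by an in-memory map so the sketch runs offline, both databases are plain vectors, and the word count stands in for the real (much more complex) summarizing logic.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a row in the crawler database.
#[derive(Debug, Clone, PartialEq)]
pub struct CrawledPage {
    pub url: String,
    pub body: String,
}

/// Hypothetical stand-in for a row in the summary database.
#[derive(Debug, Clone, PartialEq)]
pub struct PageSummary {
    pub url: String,
    pub word_count: usize,
}

/// Step 1: Crawling (web to crawler database).
/// URLs missing from the simulated web are skipped.
pub fn crawl(web: &HashMap<String, String>, urls: &[&str]) -> Vec<CrawledPage> {
    urls.iter()
        .filter_map(|url| {
            web.get(*url).map(|body| CrawledPage {
                url: url.to_string(),
                body: body.clone(),
            })
        })
        .collect()
}

/// Step 2: Summarizing (crawler database to summary database).
/// Only reads stored pages, so it can be rerun without crawling again.
pub fn summarize(crawler_db: &[CrawledPage]) -> Vec<PageSummary> {
    crawler_db
        .iter()
        .map(|page| PageSummary {
            url: page.url.clone(),
            word_count: page.body.split_whitespace().count(),
        })
        .collect()
}

/// Step 3: Querying/Searching (summary database to the reader).
/// Returns summaries with at least `min_words` words.
pub fn query(summary_db: &[PageSummary], min_words: usize) -> Vec<&PageSummary> {
    summary_db
        .iter()
        .filter(|summary| summary.word_count >= min_words)
        .collect()
}
```

Because `summarize` takes only the crawler database as input, changing the summarizing code or configuration means rerunning step 2 on already crawled data, which is the benefit of the decoupling described above.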