The token index is an experimental way to store a full-text index in SQL tables in the summary database.
A token, in the sense of the token index, is roughly equivalent to a word after splitting up compound words and applying normalization.
In its current state, there is a statistics table that tracks how often a token appears in a given text pile.
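To make the idea of a per-pile token statistics table more concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the actual schema or tokenizer: the table name token_stats, its columns, and the functions tokenize, create_schema and index_text_pile are all hypothetical, and the tokenizer only normalizes case and accents without splitting compound words.

```python
import re
import sqlite3
import unicodedata
from collections import Counter


def tokenize(text):
    """Hypothetical tokenizer: case-fold, strip accents, split on non-word characters.

    Compound-word splitting, which the real index performs, is omitted here.
    """
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.findall(r"\w+", stripped)


def create_schema(conn):
    """Create an assumed statistics table: one row per (text pile, token)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS token_stats (
            text_pile_id INTEGER NOT NULL,
            token        TEXT    NOT NULL,
            occurrences  INTEGER NOT NULL,
            PRIMARY KEY (text_pile_id, token)
        )
        """
    )


def index_text_pile(conn, text_pile_id, text):
    """Count the tokens of one text pile and store the counts."""
    counts = Counter(tokenize(text))
    conn.executemany(
        "INSERT OR REPLACE INTO token_stats (text_pile_id, token, occurrences) "
        "VALUES (?, ?, ?)",
        [(text_pile_id, token, n) for token, n in counts.items()],
    )
    conn.commit()
```

With such a table, looking up how often a token occurs in a text pile is a single indexed SELECT on (text_pile_id, token).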
Open Problems
At least the following problems need to be solved for the token index to exit its experimental state:
- Page previews must be generated outside of fts5. See pull request #36.
- It must support at least BM25 or another result weighting mechanism.
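For orientation, the following sketch shows how a BM25 score could be computed from per-document token counts like the ones a statistics table holds. It is an illustration of the standard BM25 formula (with the non-negative IDF variant), not the project's implementation: the function name bm25_scores, the in-memory dictionary layout, and the default parameters k1=1.2 and b=0.75 are assumptions.

```python
import math


def bm25_scores(query_tokens, docs, k1=1.2, b=0.75):
    """Score documents against a query with BM25.

    docs maps a document id to a dict of token -> occurrence count
    (a hypothetical in-memory stand-in for the statistics table).
    """
    n_docs = len(docs)
    lengths = {doc_id: sum(counts.values()) for doc_id, counts in docs.items()}
    avg_len = sum(lengths.values()) / n_docs if n_docs else 0.0
    scores = {doc_id: 0.0 for doc_id in docs}

    for token in set(query_tokens):
        # Document frequency: number of documents containing the token.
        df = sum(1 for counts in docs.values() if token in counts)
        if df == 0:
            continue
        # Non-negative IDF variant: ln((N - df + 0.5) / (df + 0.5) + 1).
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        for doc_id, counts in docs.items():
            tf = counts.get(token, 0)
            if tf == 0:
                continue
            # Length normalization penalizes documents longer than average.
            norm = k1 * (1.0 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1.0) / (tf + norm)
    return scores
```

The inputs BM25 needs (term frequency per document, document length, document frequency, and document count) can all be derived from per-document token counts, which is why a statistics table is a natural basis for it.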
How to access the token index
Because the index is experimental, it isn't built by default. It can be (re)built using the unobtanium-crawler regenerate-token command.
It can be queried using the token: filter.