Data: Location Signature

The Location Signature is a data structure that describes where, semantically speaking, an element is located in an HTML document. It is implemented as a set of boolean flags that are set according to the element's parent elements.

See LocationSignature on docs.rs.

The fields are named after the corresponding HTML element.

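| Field | HTML tags | Notes |
|---|---|---|
| in_header | header | |
| in_footer | footer | |
| in_aside | aside | |
| in_nav | nav | |
| in_form | form | |
| in_main | main | |
| in_article | article | see note 1 |
| in_section | section | see note 2 |
| in_table | table | |
| in_figure | figure | |
| in_address | address | |
| in_code | code | |
| in_headline | h1 to h6 | |
| in_list | ul, ol, dl | |
| in_paragraph | p | |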
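
As an illustration of how such flags could be derived, here is a minimal sketch that builds a signature from the tag names of an element's parent elements. The field names come from the field list above; the helper names `note_ancestor` and `from_ancestors` are hypothetical and are not taken from the crate, whose actual API is documented on docs.rs.

```rust
/// Sketch of a location signature: one boolean flag per semantic context.
/// Field names follow the field list above; the methods are hypothetical.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct LocationSignature {
    pub in_header: bool,
    pub in_footer: bool,
    pub in_aside: bool,
    pub in_nav: bool,
    pub in_form: bool,
    pub in_main: bool,
    pub in_article: bool,
    pub in_section: bool,
    pub in_table: bool,
    pub in_figure: bool,
    pub in_address: bool,
    pub in_code: bool,
    pub in_headline: bool,
    pub in_list: bool,
    pub in_paragraph: bool,
}

impl LocationSignature {
    /// Hypothetical helper: sets the flag that corresponds to one parent tag.
    /// Tags without a corresponding field are ignored.
    pub fn note_ancestor(&mut self, tag: &str) {
        match tag.to_ascii_lowercase().as_str() {
            "header" => self.in_header = true,
            "footer" => self.in_footer = true,
            "aside" => self.in_aside = true,
            "nav" => self.in_nav = true,
            "form" => self.in_form = true,
            "main" => self.in_main = true,
            "article" => self.in_article = true,
            "section" => self.in_section = true,
            "table" => self.in_table = true,
            "figure" => self.in_figure = true,
            "address" => self.in_address = true,
            "code" => self.in_code = true,
            "h1" | "h2" | "h3" | "h4" | "h5" | "h6" => self.in_headline = true,
            "ul" | "ol" | "dl" => self.in_list = true,
            "p" => self.in_paragraph = true,
            _ => {}
        }
    }

    /// Hypothetical constructor: folds the tag names of all parent
    /// elements (in any order) into one signature.
    pub fn from_ancestors<'a, I>(tags: I) -> Self
    where
        I: IntoIterator<Item = &'a str>,
    {
        let mut sig = Self::default();
        for tag in tags {
            sig.note_ancestor(tag);
        }
        sig
    }
}
```

For a link nested as `html > body > nav > ul > li > a`, calling `LocationSignature::from_ancestors(vec!["html", "body", "nav", "ul", "li"])` in this sketch would set `in_nav` and `in_list` and leave every other flag false.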

The location signature is mainly used to extract the purpose of a link.

Note 1: Depending on the document structure, in_article might be a "good enough" alternative when no main element is present and in_main is therefore never set.

Note 2: section has no real semantic meaning, but it might help to separate useful content from fluff on sites that have little to no other semantic markup.

Serialized Representation

The location signature is always serialized as key-value pairs. Fields set to false are omitted from the serialized output.
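
The omit-false rule can be sketched as follows. This is only an illustration under stated assumptions: the struct is a reduced stand-in with three of the fields, `to_pairs` is a hypothetical helper, and the crate's actual serialization mechanism and output format are not described here. With serde, a comparable effect is commonly achieved through the `skip_serializing_if` field attribute, but whether the crate does this is an assumption.

```rust
/// Reduced stand-in for the location signature; only three of the
/// fields are included to keep the sketch short.
#[derive(Debug, Default)]
pub struct LocationSignature {
    pub in_nav: bool,
    pub in_main: bool,
    pub in_list: bool,
}

impl LocationSignature {
    /// Hypothetical helper: returns the key-value pairs that would be
    /// written out, skipping every field that is false.
    pub fn to_pairs(&self) -> Vec<(&'static str, bool)> {
        vec![
            ("in_nav", self.in_nav),
            ("in_main", self.in_main),
            ("in_list", self.in_list),
        ]
        .into_iter()
        .filter(|pair| pair.1) // keep only the flags that are set
        .collect()
    }
}
```

In this sketch, a signature with only `in_nav` and `in_list` set yields two pairs, and a default signature yields none, so an element outside any tracked context serializes to an empty set of pairs.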