The Location Signature is a data structure for describing where, semantically speaking, an element is located in an (HTML) document. It is implemented as a set of boolean flags that are set according to the element's parent elements (its ancestors).
See LocationSignature on docs.rs.
The fields are named after the corresponding HTML element.

Field | HTML-Tags | Notes |
---|---|---|
`in_header` | `header` | |
`in_footer` | `footer` | |
`in_aside` | `aside` | |
`in_nav` | `nav` | |
`in_form` | `form` | |
`in_main` | `main` | |
`in_article` | `article` | 1 |
`in_section` | `section` | 2 |
`in_table` | `table` | |
`in_figure` | `figure` | |
`in_address` | `address` | |
`in_code` | `code` | |
`in_headline` | `h1` – `h6` | |
`in_list` | `ul`, `ol`, `dl` | |
`in_paragraph` | `p` | |

The location signature is mainly used to extract the purpose of a link.
- 1: Depending on the document structure, if no `in_main` is present, `in_article` might be a "good enough" alternative.
- 2: `section` has no real semantic meaning, but it might be useful to separate useful content from fluff on sites that have little to no other semantic markup.
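
The flag-setting described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual implementation: the struct is a reduced stand-in with only some of the fields from the table, and `from_ancestors` and `in_main_content` are hypothetical helper names chosen for this example. The second helper shows the fallback from note 1.

```rust
/// Hypothetical, reduced stand-in for the crate's `LocationSignature`.
/// Only some of the flags from the table are modeled here.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct LocationSignature {
    pub in_header: bool,
    pub in_footer: bool,
    pub in_nav: bool,
    pub in_main: bool,
    pub in_article: bool,
    pub in_headline: bool,
    pub in_list: bool,
    pub in_paragraph: bool,
}

impl LocationSignature {
    /// Builds a signature from the tag names of an element's ancestors.
    /// The order of the ancestors does not matter: each one only sets a flag.
    /// (Assumed helper, not part of the documented API.)
    pub fn from_ancestors<'a, I>(ancestors: I) -> Self
    where
        I: IntoIterator<Item = &'a str>,
    {
        let mut sig = Self::default();
        for tag in ancestors {
            // HTML tag names are case-insensitive, so normalize first.
            match tag.to_ascii_lowercase().as_str() {
                "header" => sig.in_header = true,
                "footer" => sig.in_footer = true,
                "nav" => sig.in_nav = true,
                "main" => sig.in_main = true,
                "article" => sig.in_article = true,
                "h1" | "h2" | "h3" | "h4" | "h5" | "h6" => sig.in_headline = true,
                "ul" | "ol" | "dl" => sig.in_list = true,
                "p" => sig.in_paragraph = true,
                _ => {} // tags without a corresponding flag are ignored
            }
        }
        sig
    }

    /// Note 1: treat `in_article` as a fallback when `in_main` is not set.
    /// (Assumed helper, not part of the documented API.)
    pub fn in_main_content(&self) -> bool {
        self.in_main || self.in_article
    }
}
```

With this sketch, a link nested in `html > body > article > ul > li` would end up with `in_article` and `in_list` set, and `in_main_content()` would report `true` even though the page has no `main` element.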
Serialized Representation
The Location Signature is always serialized as key-value pairs; fields set to false are omitted from the output.
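
To illustrate the omission rule, here is a small sketch that collects only the set flags as key-value pairs. The concrete wire format is not specified here, so this does not claim to match the crate's output: the three-field struct and the `to_pairs` method are assumptions made for the example. With serde, a `skip_serializing_if` field attribute is one common way to get similar behaviour, though whether the crate does so is not stated.

```rust
/// Hypothetical, reduced stand-in for the crate's `LocationSignature`.
#[derive(Debug, Default)]
pub struct LocationSignature {
    pub in_nav: bool,
    pub in_footer: bool,
    pub in_list: bool,
}

impl LocationSignature {
    /// Returns the signature as key-value pairs, leaving out every
    /// field that is `false`. (Assumed helper for illustration only.)
    pub fn to_pairs(&self) -> Vec<(&'static str, bool)> {
        vec![
            ("in_nav", self.in_nav),
            ("in_footer", self.in_footer),
            ("in_list", self.in_list),
        ]
        .into_iter()
        .filter(|pair| pair.1) // keep only flags that are set
        .collect()
    }
}
```

A signature with only `in_nav` and `in_list` set would yield two pairs, and an all-false signature would yield an empty list.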