Triplestores are a form of graph database designed to store highly structured web-accessible data. They are widely used to store data that can be described in a consistent manner. The underlying data structure used in triplestores is also used to describe controlled vocabularies, such as DCAT, a vocabulary used to describe data catalogues. Knowledge of the structure of triplestores, and of the related use of controlled vocabularies to describe datasets, expands the set of tools an RSE can use to publish, access, and understand data that follows recognised standards. Triplestores and controlled vocabularies are key technologies in helping data become more Findable, Accessible, Interoperable, and Reusable (FAIR).
This walkthrough will briefly introduce the technology stack generally used for hosting triplestores, as well as the underlying data structure required to store data in a triplestore. I will then demonstrate the power of SPARQL, the query language used for triplestores. I will illustrate its use via a range of SPARQL endpoints that exist in the wild across several subject domains, including the triplestore representation of the structured elements of Wikipedia, as well as scientific and geospatial examples providing access to large, diverse datasets. I will show some of the more advanced features of SPARQL, and how these compare with other APIs that may be more familiar to the audience. Finally, I will discuss the idea of federated querying, where a single query can be used to retrieve data from different databases distributed over a network.
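
As a rough illustration of the underlying data structure, the sketch below models a triplestore as a plain Python list of (subject, predicate, object) triples and answers a simple pattern query against it. It is not a real triplestore or SPARQL engine: the `ex:` identifiers, the dataset titles, and the `match` helper are hypothetical and exist only for illustration, while `dcat:Dataset`, `dcat:dataset`, and `dct:title` are terms taken from the DCAT and Dublin Core vocabularies.

```python
from typing import Optional

# A triple is a (subject, predicate, object) statement.
Triple = tuple[str, str, str]

# Hypothetical example data describing a small data catalogue.
TRIPLES: list[Triple] = [
    ("ex:catalogue", "dcat:dataset", "ex:dataset1"),
    ("ex:catalogue", "dcat:dataset", "ex:dataset2"),
    ("ex:dataset1", "rdf:type", "dcat:Dataset"),
    ("ex:dataset1", "dct:title", "River levels 2020"),
    ("ex:dataset2", "rdf:type", "dcat:Dataset"),
    ("ex:dataset2", "dct:title", "Air quality 2021"),
]


def match(
    triples: list[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> list[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Find every dataset and its title", expressed as two triple patterns
# joined on the shared subject.
datasets = [s for s, _, _ in match(TRIPLES, p="rdf:type", o="dcat:Dataset")]
titles = {s: match(TRIPLES, s=s, p="dct:title")[0][2] for s in datasets}

for dataset, title in titles.items():
    print(dataset, title)
```

A SPARQL query expresses the same idea declaratively: triple patterns with variables in place of the wildcards, which the triplestore matches and joins on the caller's behalf.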