Mapping Sameness Identifier

Version 0.0.0

Authors
Affiliations

Jakob Voß

Verbundzentrale des GBV (VZG)

Charles Tapley Hoyt

RWTH Aachen University

Published

March 20, 2026

Abstract

This document specifies a content-based identifier for mappings.

1 Introduction

Mappings (also known as terminology mappings, correspondences, or semantic alignments) are used to integrate and compare data from diverse sources. If mappings are managed in distributed environments, individual mappings need to be referenced with an identifier. Two kinds of identifiers exist:

  • Assigned identifiers must explicitly be generated once by the source of a mapping and carried along with it.
  • Content-based identifiers can be derived automatically from the content of a mapping.

This specification defines a content-based identifier for mappings. One use case is the distributed review of mappings: if a mapping is reviewed, the review should apply to all other mappings with the same content, independent of when, where, and by whom the mapping was created.

This specification consists of a definition of mapping content and a method to derive a mapping sameness identifier from that content. The method does not depend on any specific data format or programming language. The normative part of this specification is limited to section 2, section 3, and their references, excluding examples and notes. The appendix contains examples and implementations for illustration.

2 Definitions

2.1 Terminology

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in BCP 14 (RFC 2119 and RFC 8174) when, and only when, they appear in all capitals, as shown here.

2.2 Mapping

A mapping is a 4-tuple of:

  • subjects: a non-empty set of IRIs
  • objects: a non-empty set of IRIs
  • predicate: an absolute IRI, also known as mapping type
  • negativity: either false for affirmative mappings or true for negative mappings

IRIs must be absolute as specified in RFC 3987. IRIs SHOULD be based on NFC-normalized Unicode strings (see RFC 3987 section 5.3.2.4).

A mapping is also called a 1-to-1 mapping if both subjects and objects contain exactly one IRI each.
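
The 4-tuple defined above could be modelled as follows. This is only a minimal sketch: the class name Mapping, the property is_one_to_one, and the regular expression ABSOLUTE_IRI are assumptions made for illustration, and the IRI check is a rough approximation (a scheme followed by a colon and non-space characters), not a full RFC 3987 parser.

```python
import re
from dataclasses import dataclass

# Assumption: a rough stand-in for "absolute IRI" (scheme, colon, non-space rest).
# This is NOT a complete RFC 3987 validation.
ABSOLUTE_IRI = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:\S+")


@dataclass(frozen=True)
class Mapping:
    subjects: frozenset
    objects: frozenset
    predicate: str
    negativity: bool = False

    def __post_init__(self):
        # Store subjects and objects as sets, so order and duplicates do not matter
        object.__setattr__(self, "subjects", frozenset(self.subjects))
        object.__setattr__(self, "objects", frozenset(self.objects))
        if not self.subjects or not self.objects:
            raise ValueError("subjects and objects must be non-empty")
        for iri in (*self.subjects, *self.objects, self.predicate):
            if not ABSOLUTE_IRI.fullmatch(iri):
                raise ValueError(f"not an absolute IRI: {iri!r}")

    @property
    def is_one_to_one(self):
        # True if subjects and objects contain exactly one IRI each
        return len(self.subjects) == 1 and len(self.objects) == 1


feline = Mapping(
    {"http://example.org/feline"},
    {"http://example.com/cat"},
    "http://www.w3.org/2002/07/owl#sameAs",
)
print(feline.is_one_to_one)  # True
```

Using frozen sets reflects that subjects and objects are sets in the definition; an application that restricts or extends mappings would need a different model.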

Note

Applications may restrict or extend mappings, for instance by:

  • limitation to 1-to-1-mappings
  • limitation of the set of allowed predicates
  • limitation to affirmative mappings
  • allowing subjects and/or objects to be empty sets

Handling these models of mappings is out of the scope of this specification. The same applies to resolving different IRIs to one form if they refer to the same concept.

The definition of a mapping roughly corresponds to an edge in a directed hypergraph. The original use case of this specification was mappings between terminologies, but there may be other uses in applications of directed hypergraphs, such as chemical reaction networks and transport routing.

Example 1: An affirmative 1-to-1 mapping
  • subjects: http://example.org/feline
  • objects: http://example.com/cat
  • predicate: http://www.w3.org/2002/07/owl#sameAs
  • negativity: false
Example 2: A more complex mapping
  • subjects: http://example.org/red and http://example.org/blue
  • objects: http://example.com/green
  • predicate: http://www.w3.org/2004/02/skos/core#closeMatch
  • negativity: true

3 Identifier

A mapping sameness identifier is created from a Unicode string consisting of three elements, separated by space (U+0020):

  1. mapping subjects sorted by Unicode Code Point and separated by vertical bar (U+007C)
  2. mapping predicate
  3. mapping objects sorted by Unicode Code Point and separated by vertical bar (U+007C)

The resulting mapping elements string is encoded in UTF-8 and then hashed with SHA-256 (RFC 6234). The mapping sameness identifier then consists of three parts concatenated:

  1. URI namespace string mapping:
  2. the SHA-256 digest in hexadecimal form using lowercase characters (0 to 9 and a to f)
  3. the negativity indicator, being a tilde (~, U+007E) if the mapping is negative, or an empty string otherwise
Warning

The URI prefix mapping: has not been registered at IANA yet, so this may change in the final version of this specification.

Identifier of example 1

mapping elements string: http://example.org/feline http://www.w3.org/2002/07/owl#sameAs http://example.com/cat

mapping sameness identifier: mapping:95a088082ab2b2a68638aebbcc3fe3e0f229da75a8b5bdbb9f3f8cd5e1e4286e

Identifier of example 2

mapping elements string: http://example.org/blue|http://example.org/red http://www.w3.org/2004/02/skos/core#closeMatch http://example.com/green

mapping sameness identifier: mapping:424e7a86ea29d5a0aaf1d3d7da9a864b48121ac465c67163aef56f6f87bb1ba8~
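
The construction of the mapping elements string for example 2 can be traced with a short sketch. The function names mapping_elements_string and sameness_identifier are assumptions chosen for illustration. The sketch relies on Python comparing strings by Unicode code point and removes duplicate IRIs first, because subjects and objects are sets.

```python
import hashlib


def mapping_elements_string(subjects, predicate, objects):
    # sorted() compares Python strings by Unicode code point;
    # set() drops duplicates because subjects and objects are sets
    return " ".join([
        "|".join(sorted(set(subjects))),
        predicate,
        "|".join(sorted(set(objects))),
    ])


def sameness_identifier(subjects, predicate, objects, negativity=False):
    elements = mapping_elements_string(subjects, predicate, objects)
    digest = hashlib.sha256(elements.encode("utf-8")).hexdigest()
    # A trailing tilde marks a negative mapping
    return "mapping:" + digest + ("~" if negativity else "")


close_match = "http://www.w3.org/2004/02/skos/core#closeMatch"
green = ["http://example.com/green"]

# The order in which subjects are given does not matter
a = ["http://example.org/red", "http://example.org/blue"]
b = ["http://example.org/blue", "http://example.org/red"]

print(mapping_elements_string(a, close_match, green))
# http://example.org/blue|http://example.org/red http://www.w3.org/2004/02/skos/core#closeMatch http://example.com/green

print(sameness_identifier(a, close_match, green, True)
      == sameness_identifier(b, close_match, green, True))  # True
```

The affirmative and the negative variant of a mapping share the same digest and differ only in the trailing tilde.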

Applications MAY use a mapping sameness identifier to construct an assigned identifier of a mapping. Applications MUST NOT refer to these identifiers as sameness identifiers but MAY use the term mapping identifier. If the constructed mapping identifier is a URI, applications SHOULD:

  • use the mapping sameness identifier as the last part of the path element of the constructed URI
  • optionally append a URI query component with information not included in any of the four mapping elements
  • optionally append a URI fragment identifier to add information included in any of the four mapping elements
Possible mapping identifiers based on mapping sameness identifiers
http://example.org/mapping:95a088082ab2b2a68638aebbcc3fe3e0f229da75a8b5bdbb9f3f8cd5e1e4286e?reviewed=true
http://example.org/api/mapping:424e7a86ea29d5a0aaf1d3d7da9a864b48121ac465c67163aef56f6f87bb1ba8~#skos:closeMatch
http://example.org/set/2/mapping:424e7a86ea29d5a0aaf1d3d7da9a864b48121ac465c67163aef56f6f87bb1ba8
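
As an illustration, such URIs could be built and taken apart as follows. This is a sketch under assumptions: the function names build_mapping_uri and extract_sameness_id and the pattern SAMENESS_ID are hypothetical, and the pattern depends on the mapping: prefix, which may still change.

```python
import re
from urllib.parse import urlencode, urlsplit

# "mapping:", 64 lowercase hexadecimal characters, optional tilde for negative mappings
SAMENESS_ID = re.compile(r"mapping:[0-9a-f]{64}~?")


def build_mapping_uri(base, sameness_id, query=None, fragment=None):
    if not SAMENESS_ID.fullmatch(sameness_id):
        raise ValueError(f"not a mapping sameness identifier: {sameness_id!r}")
    # The sameness identifier becomes the last part of the path
    uri = base.rstrip("/") + "/" + sameness_id
    if query:
        uri += "?" + urlencode(query)
    if fragment:
        uri += "#" + fragment
    return uri


def extract_sameness_id(uri):
    # Look only at the last path segment, ignoring query and fragment
    last = urlsplit(uri).path.rsplit("/", 1)[-1]
    return last if SAMENESS_ID.fullmatch(last) else None


sid = "mapping:95a088082ab2b2a68638aebbcc3fe3e0f229da75a8b5bdbb9f3f8cd5e1e4286e"
uri = build_mapping_uri("http://example.org/", sid, query={"reviewed": "true"})
print(uri)
# http://example.org/mapping:95a088082ab2b2a68638aebbcc3fe3e0f229da75a8b5bdbb9f3f8cd5e1e4286e?reviewed=true
print(extract_sameness_id(uri) == sid)  # True
```

Keeping the sameness identifier as the last path segment lets a consumer recover it without knowing anything else about the URI layout.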

4 References

4.1 Normative References

4.2 Informative References

5 Appendix

The following information is non-normative.

Examples

SSSOM

Mappings can be expressed in the Simple Standard for Sharing Ontological Mappings (SSSOM) in a tabular format:

# curie_map:
#   ex: https://example.org/
#   owl: http://www.w3.org/2002/07/owl#
#   semapv: https://w3id.org/semapv/vocab/
# mapping_set_id: https://example.org/example.sssom.tsv
subject_id  predicate_id    object_id   mapping_justification
ex:feline   owl:sameAs  ex:cat  semapv:UnspecifiedMatching

Negative mappings can be specified in SSSOM with the predicate_modifier column:

# curie_map:
#   ex: https://example.org/
#   owl: http://www.w3.org/2002/07/owl#
#   semapv: https://w3id.org/semapv/vocab/
# mapping_set_id: https://example.org/example.sssom.tsv
subject_id  predicate_id    predicate_modifier  object_id   mapping_justification
ex:feline   owl:sameAs  Not ex:dog  semapv:UnspecifiedMatching

SSSOM only supports 1-to-1 mappings.
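
A row of such a table could be turned into the four mapping elements roughly as follows. This is a sketch with hypothetical helper names (expand_curie, sssom_row_to_mapping). It assumes that the row is already parsed into a dictionary, that every identifier is a CURIE with a prefix listed in the curie_map, and that the ex prefix expands to https://example.org/ with a trailing slash.

```python
def expand_curie(curie, curie_map):
    # Split at the first colon and replace the prefix with its namespace
    prefix, _, local = curie.partition(":")
    return curie_map[prefix] + local


def sssom_row_to_mapping(row, curie_map):
    return {
        "subjects": [expand_curie(row["subject_id"], curie_map)],
        "predicate": expand_curie(row["predicate_id"], curie_map),
        "objects": [expand_curie(row["object_id"], curie_map)],
        # predicate_modifier "Not" marks a negative mapping
        "negativity": row.get("predicate_modifier") == "Not",
    }


curie_map = {
    "ex": "https://example.org/",  # assumption: namespace with trailing slash
    "owl": "http://www.w3.org/2002/07/owl#",
}
row = {
    "subject_id": "ex:feline",
    "predicate_id": "owl:sameAs",
    "predicate_modifier": "Not",
    "object_id": "ex:dog",
}
print(sssom_row_to_mapping(row, curie_map))
```

The resulting dictionary has the fields subjects, predicate, objects, and negativity, which is the JSON shape expected by the scripts in the Implementations section.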

JSKOS

Mappings can be expressed in JSKOS data format with its Mapping Object:

{
  "type": [
    "http://www.w3.org/2004/02/skos/core#mappingRelation",
    "http://www.w3.org/2002/07/owl#sameAs"
  ],
  "from": {
    "memberSet": [ { "uri": "http://example.org/feline" } ]
  },
  "to": {
    "memberSet": [ { "uri": "http://example.com/cat" } ]
  }
}

Note that in this example http://www.w3.org/2004/02/skos/core#mappingRelation does not encode the mapping predicate but the fact that this JSON object is a JSKOS mapping!

JSKOS supports multiple mapping subjects and multiple mapping objects, but negative mappings are not supported yet (see this issue). A negative mapping will likely be expressed like this:

{
  "type": [
    false,
    "http://www.w3.org/2004/02/skos/core#closeMatch"
  ],
  "from": {
    "memberSet": [ 
      { "uri": "http://example.org/red" },
      { "uri": "http://example.org/blue" }
    ]
  },
  "to": {
    "memberSet": [ { "uri": "http://example.org/green" } ]
  }
}
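
The four mapping elements could be read from such a JSKOS object along the following lines. This is a sketch only: the function name jskos_to_mapping is hypothetical, only memberSet is handled, and the treatment of false in the type array follows the tentative form shown above, which is not part of JSKOS yet.

```python
MAPPING_RELATION = "http://www.w3.org/2004/02/skos/core#mappingRelation"


def jskos_to_mapping(obj):
    types = obj.get("type", [])
    # mappingRelation only marks the object as a mapping, it is not the predicate
    predicates = [t for t in types if isinstance(t, str) and t != MAPPING_RELATION]
    if len(predicates) != 1:
        raise ValueError("expected exactly one mapping predicate in 'type'")
    return {
        "subjects": [concept["uri"] for concept in obj["from"]["memberSet"]],
        "predicate": predicates[0],
        "objects": [concept["uri"] for concept in obj["to"]["memberSet"]],
        # Assumption: a literal false in "type" marks a negative mapping
        "negativity": any(t is False for t in types),
    }


negative = {
    "type": [False, "http://www.w3.org/2004/02/skos/core#closeMatch"],
    "from": {"memberSet": [{"uri": "http://example.org/red"},
                           {"uri": "http://example.org/blue"}]},
    "to": {"memberSet": [{"uri": "http://example.org/green"}]},
}
print(jskos_to_mapping(negative)["negativity"])  # True
```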

RDF

A non-negative 1-to-1 mapping can be expressed in the form of an RDF triple, with mapping subjects being the RDF triple subject, mapping objects being the RDF triple object, and mapping predicate being the RDF triple predicate. An RDF triple can only represent a mapping if all of its elements are absolute IRIs.

For example, the mapping of example 1 can be expressed in RDF Turtle syntax:

<http://example.org/feline> <http://www.w3.org/2002/07/owl#sameAs> <http://example.com/cat> .
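
Reading such a statement back into mapping elements might look like the following sketch. It assumes a single line consisting of three IRIs in angle brackets followed by a dot, without prefixes or escape sequences; real Turtle or N-Triples should be parsed with an RDF library. The names TRIPLE and triple_to_mapping are hypothetical.

```python
import re

# One IRI in angle brackets, starting with a scheme (rough check for absoluteness)
IRI = r"<([A-Za-z][A-Za-z0-9+.-]*:[^<>\s]*)>"
TRIPLE = re.compile(rf"\s*{IRI}\s+{IRI}\s+{IRI}\s*\.\s*")


def triple_to_mapping(line):
    match = TRIPLE.fullmatch(line)
    if match is None:
        # Literals, blank nodes, or relative IRIs cannot represent a mapping
        return None
    subject, predicate, obj = match.groups()
    return {
        "subjects": [subject],
        "predicate": predicate,
        "objects": [obj],
        "negativity": False,  # a plain triple is always affirmative
    }


line = ("<http://example.org/feline> "
        "<http://www.w3.org/2002/07/owl#sameAs> "
        "<http://example.com/cat> .")
print(triple_to_mapping(line)["predicate"])
# http://www.w3.org/2002/07/owl#sameAs
```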

Expressing negative mappings and mappings beyond 1-to-1 mappings in RDF requires more complex features and conventions, such as reification or triple terms (also known as RDF-star, introduced in RDF 1.2). The choice of these features to express mappings is out of the scope of this document. The combination of mapping sameness identifiers and RDF statements to express mappings beyond non-negative 1-to-1 mappings is an anti-pattern:

# Don't do this, it's a contradiction because the mapping is not negative:
<mapping:95a088082ab2b2a68638aebbcc3fe3e0f229da75a8b5bdbb9f3f8cd5e1e4286e> a ontology:NegativeMapping .

Implementations

The following scripts expect a JSON object with fields subjects, predicate, objects, and negativity from standard input.

Python


import hashlib
import json
import sys


def mapping_sameness_identifier(mapping):
    # sorted() compares Python strings by Unicode code point
    subjects = "|".join(sorted(mapping["subjects"]))
    objects = "|".join(sorted(mapping["objects"]))
    # mapping elements string: subjects, predicate, and objects separated by spaces
    string = f"{subjects} {mapping['predicate']} {objects}"
    digest = hashlib.sha256(string.encode("utf-8")).hexdigest()
    # a trailing tilde marks a negative mapping
    return f"mapping:{digest}~" if mapping["negativity"] else f"mapping:{digest}"


if __name__ == "__main__":
    mapping = json.load(sys.stdin)
    print(mapping_sameness_identifier(mapping))

Command line tools


#!/bin/bash
export LANG=C.UTF-8

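# jq sorts strings by Unicode code point; neg is empty for affirmative mappings
read -r s p o neg <<<"$(jq -r '[
    (.subjects|sort|join("|")),
    .predicate,
    (.objects|sort|join("|")),
    (if .negativity then "~" else "" end)
  ]|join(" ")')"

# printf avoids the trailing newline and any option handling of echo
digest=$(printf '%s' "$s $p $o" | sha256sum | cut -f1 -d' ')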
echo "mapping:$digest$neg"

JavaScript

Note

The script assumes an additional method String.codePointCompare to compare strings by Unicode Code Point. See this TC39 proposal for an implementation.

import crypto from "crypto"
import fs from "fs"

const { subjects, objects, predicate, negativity } = JSON.parse(fs.readFileSync(0, 'utf-8'))
subjects.sort(String.codePointCompare)
objects.sort(String.codePointCompare)

const str = [subjects.join("|"), predicate, objects.join("|")].join(" ")
const bytes = new TextEncoder().encode(str)
const digest = crypto.createHash('sha256').update(Buffer.from(bytes)).digest("hex")
  
console.log(`mapping:${digest}${negativity ? "~" : ""}`)

Changes

This document is managed in a revision control system at https://github.com/ts4nfdi/mapping-sameness-identifier, including an issue tracker.

  • Version 0.0.0

    Working draft

Acknowledgements

The groundwork for this specification was laid at the second TS4NFDI Community Hands-On Workshop, March 2026 in Leipzig. We thank all participants and organizers.