How does it work?
In the tutorial, we learnt how to wrap a document and issue it into a document store. However, we didn't explain what these actions do or why they are necessary.
Wrapping a document
As a reminder, wrapping a document works on a JSON object. A single wrapped document will look like this:
A few interesting transformations happened that we will dive into below:
- A data key has been created and its value holds the contents of the file previously provided when wrapping, along with some weird-looking extra (hexadecimal) data.
- A signature object has been created.
The data object
The first step of wrapping consists of transforming all the object properties provided as input using the following algorithm:
- For each property:
  - Generate a salt using uuid v4 in order to prevent rainbow table attacks.
  - Determine the type of the original property.
  - Transform the original value to <salt>:<original-type>:<original-value>.
The shape of the input object remains untouched.
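To make the salting step concrete, here is a minimal Python sketch. It is an illustration under assumptions, not OpenAttestation's implementation: the function names (salt_data, salt_value, type_name), the type labels and the sample document are all hypothetical.

```python
import uuid


def type_name(value):
    """Map a leaf value to a type label (the labels are an assumption)."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported leaf type: {type(value).__name__}")


def salt_value(value):
    """Return '<salt>:<original-type>:<original-value>' for one leaf value."""
    salt = uuid.uuid4()  # a fresh random salt for every property
    text = str(value).lower() if isinstance(value, bool) else str(value)
    return f"{salt}:{type_name(value)}:{text}"


def salt_data(node):
    """Recursively salt every leaf while keeping the shape of the input."""
    if isinstance(node, dict):
        return {key: salt_data(child) for key, child in node.items()}
    if isinstance(node, list):
        return [salt_data(child) for child in node]
    return salt_value(node)


# Hypothetical input document.
document = {"name": "Certificate", "issuers": [{"tokenRegistry": "0x1234"}]}
salted = salt_data(document)
```

Because the salt is random, wrapping the same input twice produces different salted values, which is what makes precomputed (rainbow table) lookups impractical.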
The signature object
targetHash
Once the data object has been computed, we can create a unique hash for the document, which we set into targetHash:
- List each property's path from the data object and associate its value. The path follows the flatley path convention, for instance: name, issuers.0.tokenRegistry, etc.
- For each property's path, compute a hash using the property's path and value. To compute the hash we use keccak256.
- Sort all the hashes from the previous step alphabetically and hash them all together: this will provide the targetHash of the document. To compute the targetHash we also use keccak256.
The targetHash of a document is a unique identifier.
Later on, during verification of the document, the exact same steps are performed again to assert that the contents of the document have not been tampered with. This works because the final targetHash will be completely different if any part of the wrapped document differs from the original.
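The targetHash steps can be sketched in Python as follows. This is only an illustration: SHA-256 from the standard library stands in for keccak256, the way a path and its value are serialized before hashing (json.dumps) is an assumption, and the salts in the sample data are placeholders.

```python
import hashlib
import json


def flatten(node, prefix=""):
    """Yield (path, value) pairs with dot-separated paths, e.g. issuers.0.tokenRegistry."""
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = enumerate(node)
    else:
        yield prefix, node
        return
    for key, child in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        yield from flatten(child, path)


def digest(text):
    # Stand-in hash: the real computation uses keccak256, not SHA-256.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def target_hash(data):
    """Hash each path/value pair, sort the hashes, then hash them together."""
    leaf_hashes = [digest(json.dumps({path: value})) for path, value in flatten(data)]
    return digest(json.dumps(sorted(leaf_hashes)))


# Hypothetical salted data object.
data = {
    "name": "salt-1:string:Certificate",
    "issuers": [{"tokenRegistry": "salt-2:string:0x1234"}],
}
original_hash = target_hash(data)

# Changing a single value yields a different hash, which is how tampering is detected.
tampered = {**data, "name": "salt-1:string:Diploma"}
```

Sorting the per-property hashes before the final hash makes the result independent of the order in which the properties are listed.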
The document store is a smart contract on the Ethereum network that records the issuance and revocation status of OpenAttestation documents. It stores the hashes of wrapped documents, which are the records of the owner of the document store having issued the documents.
Imagine that you wrapped thousands of files and had to issue the targetHash for each of them. It would be extremely inefficient. That's where the merkleRoot comes in handy.
merkleRoot
Once the targetHash of a document is computed, OpenAttestation will determine the merkleRoot. The merkleRoot value is the merkle root hash computed from the merkle tree using the targetHash of all the documents wrapped together. Each targetHash is a leaf in the tree. After computing the merkle tree, the merkleRoot associated with a document is added to it, along with the proofs (intermediate hashes) needed to show that the targetHash was used to compute the merkleRoot. The proofs are added into the proof property.
In the document above we can notice that the targetHash and the merkleRoot are identical and that the proof is empty. This is normal and happens when you wrap only one document at a time. Try to wrap at least 2 documents at the same time, and you will see a difference between the targetHash and the merkleRoot, and you will see proofs appended.
The merkleRoot will always be the same for all the documents wrapped together (in a batch). It will be different for documents wrapped separately.
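The following Python sketch shows how a merkle root and per-document proofs can be derived from a batch of target hashes. It rests on several assumptions: SHA-256 stands in for keccak256, pairs are hashed in sorted order so that proofs need no left/right markers, an unpaired node is promoted unchanged to the next layer, and all function names are hypothetical. The real tree construction may differ in these details.

```python
import hashlib


def digest(data: bytes) -> bytes:
    # Stand-in hash: the real computation uses keccak256.
    return hashlib.sha256(data).digest()


def combine(a: bytes, b: bytes) -> bytes:
    """Hash a pair in sorted order, so a proof needs no left/right flags."""
    return digest(min(a, b) + max(a, b))


def build_layers(leaves):
    """Return every layer of the tree, from the sorted leaves up to the root."""
    layers = [sorted(leaves)]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([
            combine(prev[i], prev[i + 1]) if i + 1 < len(prev) else prev[i]
            for i in range(0, len(prev), 2)
        ])
    return layers


def proof_for(leaf, layers):
    """Collect the sibling hashes needed to rebuild the root from one leaf."""
    proof, node = [], leaf
    for layer in layers[:-1]:
        sibling = layer.index(node) ^ 1  # index of the other node in the pair
        if sibling < len(layer):
            proof.append(layer[sibling])
            node = combine(node, layer[sibling])
        # Otherwise the node has no sibling and is promoted unchanged.
    return proof


def verify(leaf, proof, root):
    node = leaf
    for sibling in proof:
        node = combine(node, sibling)
    return node == root


# Three hypothetical target hashes wrapped together as one batch.
targets = [digest(f"document-{i}".encode()) for i in range(3)]
layers = build_layers(targets)
merkle_root = layers[-1][0]
proofs = {t: proof_for(t, layers) for t in targets}

# A batch of one: the root equals the target hash and the proof is empty.
single_layers = build_layers(targets[:1])
single_root = single_layers[-1][0]
single_proof = proof_for(targets[0], single_layers)
```

The single-document case reproduces what we observed above: with one leaf there is nothing to combine, so the root is the leaf itself and no proof is needed.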
Now that our batch of documents has a common identifier and that we can prove (thanks to the merkle tree algorithm) that the targetHash of a document was used to create a specific merkleRoot, we can use the merkleRoot in our document store and issue it.
Data Obfuscation
Due to the way we compute targetHash, OpenAttestation allows one to obfuscate data they don't want to make public. For this we can simply compute the hash of a specific field and add it into the document. Let's try it with the CLI and the document above:
The content of output.json will be:
The name field is not available anymore in the data object, and the hash associated with it is added into privacy.obfuscatedData.
More importantly, the document remains valid.
The hash added into privacy.obfuscatedData is the one used when computing the targetHash. To verify that a document remained untouched, OpenAttestation computes the targetHash of the provided document and compares it to signature.targetHash. There is one subtle difference during verification: all the hashes available in privacy.obfuscatedData are added to the list of computed hashes. So for verification the steps are as follows:
- List each property's path from the data object and associate its value.
- For each property's path, compute a hash using the property's path and value.
- Append the hashes from privacy.obfuscatedData to the list of computed hashes from the previous step.
- Sort all the hashes from the previous step alphabetically and hash them all together: this will provide the targetHash of the document.
The only difference from the targetHash computation is step 3.
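The verification steps with obfuscated data can be sketched in Python as below. As before, this is a hedged illustration rather than the library's code: SHA-256 stands in for keccak256, the serialization of a path and its value is an assumption, and the sample data and function names are hypothetical.

```python
import hashlib
import json


def digest(text):
    # Stand-in hash: the real computation uses keccak256.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def leaf_hash(path, value):
    return digest(json.dumps({path: value}))


def flatten(node, prefix=""):
    """Yield (path, value) pairs with dot-separated paths."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else key)
    else:
        yield prefix, node


def target_hash(data, obfuscated=()):
    """Steps 1 and 2 hash the visible fields, step 3 appends the obfuscated hashes."""
    hashes = [leaf_hash(path, value) for path, value in flatten(data)]
    hashes += list(obfuscated)
    return digest(json.dumps(sorted(hashes)))


# Hypothetical salted data object and its target hash at wrapping time.
original = {
    "name": "salt-1:string:Alice",
    "course": {"title": "salt-2:string:Maths"},
}
document_hash = target_hash(original)

# Obfuscate "name": remove the field and keep only its hash.
name_hash = leaf_hash("name", original["name"])
redacted = {"course": original["course"]}

verified = target_hash(redacted, [name_hash]) == document_hash
```

The redacted document still verifies because the hash of the removed field takes the place the field itself would have contributed.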
With the help of data obfuscation, users can selectively disclose only the subset of data they want to share.
Document Store
As discussed above, issuance of documents can happen individually or by batch. Issuing a batch of documents is by far the more efficient way.
When it comes to revocation, both values can also be used:
- targetHash will allow for the revocation of a specific document.
- merkleRoot will allow for the revocation of the whole batch of documents.
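To illustrate the difference between revoking a targetHash and revoking a merkleRoot, here is a toy in-memory model in Python. It is not the smart contract and does not reflect its interface: the class ToyDocumentStore, its methods and the hash strings are hypothetical, and the model assumes the proof linking a targetHash to its merkleRoot has already been checked elsewhere.

```python
class ToyDocumentStore:
    """In-memory stand-in for a document store (hypothetical, for illustration)."""

    def __init__(self):
        self.issued = set()
        self.revoked = set()

    def issue(self, hash_value):
        self.issued.add(hash_value)

    def revoke(self, hash_value):
        self.revoked.add(hash_value)

    def is_valid(self, target_hash, merkle_root):
        """A document is valid if its hash or batch was issued and neither is revoked."""
        issued = target_hash in self.issued or merkle_root in self.issued
        revoked = target_hash in self.revoked or merkle_root in self.revoked
        return issued and not revoked


store = ToyDocumentStore()
store.issue("root-1")  # one issuance covers the whole batch

valid_before = store.is_valid("target-a", "root-1")

store.revoke("target-a")  # revokes a single document
a_after = store.is_valid("target-a", "root-1")
b_after = store.is_valid("target-b", "root-1")

store.revoke("root-1")  # revokes every document in the batch
b_final = store.is_valid("target-b", "root-1")
```

In this model a single issue call makes every document of the batch valid, while revocation can be as narrow as one document or as broad as the batch.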
Additional information
Data Obfuscation limitations
Empty objects
Considering the following object in data:
The following obfuscations would work:
- foo.bar only;
- foo.xyz only;
- foo (that would completely remove the object).
However, obfuscating both foo.bar AND foo.xyz would lead to an error. Indeed, obfuscation does not work when applied to all individual fields of an object, leaving the object empty:
While we could provide a way to make this work (and actually we used to), that would also introduce a new behavior: anyone could add empty objects into the document, and the document would remain valid. While we are not sure whether this could lead to potential vulnerabilities, we decided to not support it.
To avoid this problem, obfuscate the full object (foo in this case) when you need to obfuscate all the fields of an object.
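The rule can be illustrated with a small Python sketch. It is a hypothetical guard, not the library's code: the function obfuscate and the sample values are invented for the example, the point at which the real tooling reports the error may differ, and the sketch only removes fields without recording their hashes in privacy.obfuscatedData.

```python
import copy


def obfuscate(data, paths):
    """Remove dot-separated paths, rejecting any removal that empties an object."""
    result = copy.deepcopy(data)  # leave the caller's data untouched
    for path in paths:
        *parents, last = path.split(".")
        node = result
        for key in parents:
            node = node[key]
        del node[last]
        if parents and node == {}:
            raise ValueError(f"obfuscating {path!r} would leave an empty object")
    return result


# Hypothetical data object with the foo.bar and foo.xyz fields discussed above.
data = {"foo": {"bar": "qux", "xyz": "abc"}, "name": "demo"}

only_bar = obfuscate(data, ["foo.bar"])  # allowed: foo still holds xyz
whole_foo = obfuscate(data, ["foo"])     # allowed: foo is removed entirely

try:
    obfuscate(data, ["foo.bar", "foo.xyz"])
    rejected = False
except ValueError:
    rejected = True  # both fields removed would leave foo empty
```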