dock.io - Decentralized professional data exchange powered by Ethereum
In recent years centralized platforms have revolutionized the way individuals find jobs and
employment, buildup reputation, and establish professional networks. As a result, these platforms
retain ownership of massive amounts of professionals data and information that only exist within
these closed networks.
Applications are disincentivized from sharing information and data because of the competitive
marketplace and their reliance on monetization of this data for business purposes. Each platform
wants to hold an advantage over its competitors.
The dock.io protocol aims to solve this fundamental data hoarding problem. This will be
accomplished by implementing a protocol that actually encourages data exchange between
By utilizing the secure Ethereum mainchain and a token model, the dock.io protocol will allow
users to take control of their data and exchange it between applications. The dock.io token model
will simultaneously motivate applications to exchange information by making it beneficial for
both platforms involved in the transaction.
Both the users and the applications primary needs are met. Users get full control of their data and
applications receive compensation for data exchange.
High Level Overview and Vision
Anyone can use the dock.io protocol and easily port their data to and from participating
applications. Users are able to have automatically updated profile information, employment
history data, and any type of platform data in an integrated fashion. As a result, they will not be
bound to any single platform for their employment data, freelancing, or personal reviews.
Users will input signed and verified content and choose where it is shared, thus owning their own
data. They will also choose where to share data received from other platforms in their name. It is
up to the user to opt-into sharing any particular data format with any specific application of their
Applications: Data Providers and Consumers
Any application can be a data provider. Data providers are special on-chain non-user accounts
which can request to push signed data to the users’ profile. It is important to note that the
application is not publishing data directly to the blockchain, but sending it to each user
individually. Examples of such data would be CV updates, platform reviews and ratings,
freelance transactional data, Git commits, application specific data, etc.
For an application to be able to consume data from the user, they are required spend tokens to
access the data. The alignment of interests between user and application is achieved via the
dock.io token model.
Usage of Ethereum (Reasons)
The Ethereum main chain was chosen as the preferred chain for the dock.io protocol to run on
due to several benefits:
1. Network effect and strong ecosystem.
2. Multiple good scaling roadmaps.
3. Easy integration with third party smart contracts and applications.
4. ERC201 token standard and easy integration with wallets and exchanges.
Core dock.io Protocol Features
6.1 Data Exchange Model
dock.io uses the IPLD2 specification developed for IPFS to perform content addressable data
6.2 Encrypted Secure Versioning of Data
All data is by default encrypted and only decrypted by the parties exchanging the data. The user
selectively decides to perform push updates for any data format to a specific application by first
encrypting the content with the receivers public key. Encryption and versioning happens in the
1. The user creates the delta of the data. The application always wants to have the latest
version of the data. Delta’s make sense because it is possible to tie this data to a specific
time in the blockchain. Initial data sets without deltas would not offer that possibility,
unless every time a new dataset is created it references the old version. Delta’s might also
reduce total bandwidth requirements for applications and nodes.
2. The unencrypted delta is used to create a hash. This hash includes the previous version of
the hash as well as the most recent Ethereum block hash, the transaction, and the block
hash of the previous version hash.
3. The hash of the unencrypted delta is committed as a transaction to the dock.io Ethereum
Smart Contract special data format section of the user (e.g. Employment, Reviews, Git
commits, education, etc).
4. The user takes the delta and encrypts it with the application’s public key and includes the
IPFS addressable hash of the data in the application update list section of the contract. This
is part of the same transaction as before, it is needed to reduce the transaction requirement.
5. The application observes the dock.io Ethereum contract of the user and sees that a package
is addressed to it. The application requests the encrypted package over IPFS from the user
or an intermediary node.
Reasoning Behind These Steps
dock.io makes use of IPFS to be fully content addressable and as compatible as possible with
other Ethereum ecosystem applications and smart contracts. As of this writing, IPFS adoption
within Ethereum smart contracts is picking up and should become a standard soon.
The unencrypted delta’s of the data format updates are hashed, versioned, and anchored to the
Ethereum blockchain. This is done to provide data integrity and make it possible for any
application receiving data format updates from any user to know whether they truly received the
latest updates, and not a parallel fake history of the data. Through this method all applications can
be sure that they received the true data every other application also received, without exposing
unencrypted data on-chain.
By encrypting the updates to the different applications’ public keys, the user remains in full
control of who receives the push updates. The data itself is not exposed to the public.
Possibility of Public Data
Users can opt to publish specific unencrypted data formats to IPFS directly. That way anyone can
receive that data, verify its integrity, and verify it is the latest version, without having to make
any data requests to the user. We envision that users will opt to publish some core data that is not
sensitive in this fashion. Such data could include but is not limited to:
1. First name and last name
4. Licenses and Certifications
5. Public contact information such as work email
Users need to be aware that any data published in this fashion through the dock.io protocol to
IPFS is irreversibly on the internet and cannot be taken back.
Data Exchange Model
dock.io makes use of the open source standard for anchoring data to blockchains called
Chainpoint3. Chainpoint was originally developed for the Bitcoin blockchain, and only for
anchoring data to a single chain. dock.io uses a modified specification of Chainpoint to not just
anchor updates of data formats to a single blockchain and include it in a single block, but to
prove that the update was created at a specific block as well. This is achieved by making the
following Chainpoint blockchain receipt modifications (changes in gray):
Chainpoint data of encrypted data being shared with specific applications is structured exactly the
same. The sole difference is that the encrypted versioning chainpoint data has one more field:
It is important for the application receiving the encrypted data to know that it is receiving the
latest version being tracked publicly on-chain. Once the application decrypts its update with it’s
own key, it can compare the hash value of it’s decrypted update with the hash value of the
unencrypted update being referenced on-chain publicly.
By including the currentBlock hash we can establish the earliest update that was generated by the
user. Ethereum block hashes are unpredictable but provable, any user can know for certain that
the update was not shared publicly before the time of the block creation.
The prevHash is the hash of the Ethereum transaction which included the previous connected
update. This way any user can now prove that a certain update happened after another one, and is
not a fork of an older update.
The prevHash is the hash of the Ethereum transaction which included the previous connected
update. This way any user can now prove that a certain update happened after another one, and is
not a fork of an older update.
For more information on the Chainpoint protocol standard see the Chainpoint white paper.
All versioning of the data formats will happen client-side and off-chain. The actual versioning is
not resource intensive and can be performed on any device, including a mobile device operated
by the user. The first implementation will use simple GIT4 for versioning of data formats. Later
versions might include more compact versioning methods. The end-goal of the versioning
functionality is to build a versioned key-value store similar to Noms5.
Versioning, rather than publishing, of full content hashes is done to ensure correct versioning of
the same data format. Any application receiving the previous version can check whether the
current version truly is a continuation of the previous one. If non-delta based publishing of
content hashes was used, then the applications would not be able to check for that. A whole class
of applications will rely on this feature for multiple reasons.
On-chain data is solely chainpoint anchoring data which includes:
• Chainpoint Versioning
• Content Hashes
• Merkle Roots
• Merkle Proofs
• Git Hashes
• Ethereum dock.io Contract State
• Vouching Signatures
• Arbitrary UTC Timestamps
• Possibly External ID Verifiers (In the case of civic, only predetermined)
No user data is stored in encrypted or unencrypted form on-chain. No application can infer the
versioned on-chain data from the IPFS hash pointers. They won’t be able to make sense of it
unless the user creates a newly encrypted version specifically for that application. This is how
encryption of updates is used for application data access control.
The user can chose to publish unencrypted updates and data to IPFS directly. Applications can
then directly download those updates from IPFS without the prior permission from the user.
Off-chain data includes all data format information, including prior updates and versions, in both
encrypted and unencrypted form. As long as the user opts to share data selectively with more than
one application, the stored encrypted data should require more storage than the unencrypted data.
Performing this client-side versioning and encryption for every application requires both
additional storage as well as computational resources. Nonetheless, for the majority of users these
operations and storage requirements should be met even on mobile devices.
Depending on semantics, all IPFS addressable data is off-chain, but the content hashes through
which the data can be found are all published publicly, and in versioned form on-chain. Any
underlying data published to the IPFS network overall in this fashion, is just as accessible as onchain
data. Users can opt to make their data IPFS addressable and versioned on-chain, but not
share the underlying data on the IPFS network. This way the integrity of the data can be proven
even if the data itself was never published publicly.
On-Chain Data Attribution Verification
Applications receiving data from users will also know if any data format was originally pushed
from another application to the user’s profile. Applications can both sign the pushed content itself
as well as the version hash the user publishes to his/her smart contract profile. The user can
manually withhold content signatures, but then the application might not sign the version hash
Applications can have various levels of verification requirements. Loose verification
requirements might only require the data from the user. Mid-level requirements would require a
content signature by the other application. Strong verification requirements would include onchain
signing of the version hash the user pushed to the blockchain.
On-Chain Vouching and Signaling
Any account can sign on-chain data format version hashes of any other account. By doing so, one
signals to the rest of the network that the content of that version hash is meaningful and correct.
Use cases include, but are not limited to:
1. Employers vouching for the truthfulness of their employees’ work history.
2. Schools vouching for the truthfulness of their student’s credentials.
3. Students of a school vouching for the integrity of another student.
4. Identity services vouching for their citizens’ identity information.
5. Applications vouching for the truthfulness of an application related data format.
Vouching itself does not verify the authenticity of the underlying data. Instead, a secondary trust/
reputation layer needs to be used to assess how good a signal any individual signature is.
Users have no control over who can sign and publish signatures to the blockchain. Signing and
publishing of signatures of versions/data is fully at the discretion of the signing party.
For data compression purposes, only the hashes of the to be verified version/data will be signed
by any entity. This ensures signature output will remain both small enough and verifiable.
Data Types and Standard Formats
dock.io data formats are essentially the equivalent of microformats6 for Ethereum smart
contracts. The web microformats community already has a resume based microformat example
that is universally machine readable7. The dock.io proposed microformat model is also content
addressable via IPFS, versioned via GIT, and timestamped and anchored via the modified
Chainpoint approach. Data formats are not prescribed in the dock.io Ethereum smart contract.
Users can choose to create, version, share, and use whatever microformats the community settles
on. Data formats are only useful to users as long as a large number of applications accepts them.
Initial Format Types
dock.io will initially create a number of ready-made data formats. All data formats are opt-in and
voluntary for all users and applications. Theoretically, non-employment related data formats can
be used with the dock.io approach as well. Data format examples include, but are not limited to:
1. Employment history format (resume)
2. Social Graph (mutually signed contacts)
3. Licenses (relevant certifications/licenses)
4. Education (education history)
5. Application specific formats
All format data consists of “tags” and “data” fields. Tags can be nested. It is up to the format data
creator, such as the application, to determine whether the actual data is stored in JSON, XML,
CSV, or another format.
Format types cannot be updated by a central party. This means that if an application wants to
change something to the structure of a data format which it established previously, it cannot
arbitrarily tell clients to change the format. The clients need to opt-into updating their format as
We imagine that clients will do this automatically as the formats and the stored data is only as
useful as long as applications accept the format. We have refrained from including an auto-update
format method. We have seen that historically this has been used by closed source API and
software vendors as a way to keep 3rd party applications from relying on their format, while still
being able to claim that their data/service is openly available.
Versioning of Individual Tags Within Format Data
The smallest atomic unit within a data format that can be versioned is a “tag” and “data” pair.
The smallest any single data format can be is a single “tag” and “data” pair. There is no upper
bound to how large a set of these pairs can be. If a large number of pairs requires updating
frequently then we advise to version the individual pairs.
Rare Number of Updates
If the number of updates is rare, then we advise to only version the data formats overall and have
the version hashes be merkle roots. This way, the individual pairs are the merkle leafs. That
would make it possible to selectively share down to individual pairs, while still verifying to any
application that the pair is part of an up to date version hash on the Ethereum blockchain.
Experience Format Example
Multiple experience formats combined can constitute an employment history format.
Today, applications are naturally incentivized to hoard data on their users. User data is often nonaccessible
to the user or third party applications.
More malicious data strategies even include user data gathering, open ecosystem integration in
return for data from third party applications, and then shutting down third party application data
access. This is an example of Microsoft’s internal strategy of “embrace, extend, extinguish8”.
Linkedin has also utilized similar tactics.
Even if an application was not malicious in this manner, other applications would take advantage
of its data and outcompete them by not providing data in return. This is a common problem that
can be solved with token based modeling.
The token’s utility allows applications to exchange any data with each other, with the user
controlling sharing of their data. It makes long term trust in the protocol possible. Applications
are incentivized to keep exchanging data with one another long term.
FIAT-Denominated Fee Oracle
Token costs, as well as any exit fees, will be denominated in FIAT. It is also possible a future
stable cryptocurrency could be used such as Basecoin, Dai etc., should one reach market
maturity. This is important because users and applications will be able to predict their costs long
term. A side effect of this approach is that if the token price on the open market goes up,
proportionally less tokens will be bought by users and applications. If the price goes down, more
tokens will be bought by users and applications on the open market. Long term this should give a
lot of stability and predictability to the exchange rate of the token.
Tokens will be used to incentivize applications to share their data with users and other
applications. Users will have control over which applications can access and update their data. It
is important to note that tokens will not be used to incentivize users to share data with
It is imperative to motivate applications as opposed to users due to how data platforms currently
operate. Applications do not share data because their peers and competitors can use it against
them to create a competitive advantage. This is a major problem in today’s data landscape.
Once this is understood, it becomes clear that a token incentive model that encourages
applications to share data, not hoard it, is absolutely necessary. In order for a model like this to
work properly, the users must have full control of their own data.
Reasoning Behind Not Including Users in the Token Economy
If tokens were used to encourage users to share their data it could lead to a number of different
issues. Users could send out faulty data in an effort to maximize their profits. Users could also
practice deceitful tactics like faking accounts and playing the system as a means of income.
Users should not be burdened with worrying about the value of their data. If users were included
in the token economy they would have to consider microtransactions and payments involving
their data on an everyday basis. By incentivizing applications and not users, the dock.io protocol
will make the user experience as easy as possible. This action will also discourage malicious
Applications will provide compensation to other applications for accessing users data. Users,
however, will not pay applications for their own data. Instead, any time a user sends data that is
attributed to another application, a payment will be triggered within the dock.io protocol. The
first time this data has been accessed, the payment will be sent to the application that is attributed
to that data. Every time afterwards that the data is accessed by other applications, the fee paid by
the other applications is burned.
For example, imagine a user sends data first created by Application A to Application B. The
dock.io protocol verifies this transaction and triggers a payment to be sent from Application B to
Application A, which would be Application B’s payment to access the data. When Application C
accesses the same data, the tokens paid by Application C are burned or removed from the total
supply of tokens on the dock.io protocol.
Using this model, applications will indirectly pay other applications to access their data.
However, applications will not be able to control which other applications receive this data. Users
will remain in ultimate control of their own data.
The fee model for associated data exchange is not final and can be changed. The percentage of
payment for associated data is changeable through an onchain percentage mechanism.
If an application pays a flat rate fee to a user for their data and no third party application data is
included, 100% of the fee would be burned and therefore unusable.
There are a number of reasons why the burning of dock.io tokens makes sense:
1. There should be a cost for applications to acquire user data.
2. At the same time, users should not receive those tokens as incentive to avoid malicious
3. General token holders will benefit from the total supply of the outstanding tokens
decreasing due to the burning mechanism.
4. Any percentage of token burn will act as an anti-spam and anti-DDoS mechanism for the
Game Theoretic Effects
The dock.io protocol represents a two-sided market10. This means that one side, in this case the
users, does not have any monetary incentive to interact with the product or their own data.
The real value that users will receive from the system is ultimate control over their own data and
data portability. This is a major benefit for users. All current major data platforms do not offer
complete control. Since users won’t be motivated by monetary gains, they will be impartial
towards other applications and the quality of the data.
We believe this will lead to users choosing the best data providers and applications. Thus, the
overall quality of data should increase during the cycle of the protocol.
Possible Issues and Hurdles
On-Chain Scaling and Transaction Flooding
The dock.io protocol will need a relatively high transaction throughput for the smart contract to
operate with such a large number of users and applications.
Currently, Ethereum can not support the required number of transactions for such a global
system. Ethereum scaling efforts intend to solve this problem. Regardless, any type of global
state blockchain system will be prone to transaction floating issues.
We have seen similar issues in the past. One example is fake transactions being generated on the
Bitcoin network as an attack method.
Theoretically, the dock.io protocol could experience a denial of service attack. The difference,
however, is that the mainchain will require a number of tokens. Performing such an attack over a
long period of time would prove to be extremely costly.
Need for ETH for Gas by Users
The dock.io smart contract and protocol are reliant on the Ethereum mainchain. The Ethereum
smart contracts on the mainchain rely on Ether to pay the gas requirement for running smart
Both users and applications need Ethereum and dock.io tokens in order to use the dock.io
protocol. Applications need both in order to send and execute transactions. Users need both so
they can send version data of their latest user related data.
This complicates matters for users and applications, and makes using the dock.io protocol more
difficult. These issues can be mitigated through the use of background automation, hosted
wallets, and hosted services.
Transition to Standalone Chain
It is possible that the scaling efforts of the Ethereum mainchain do not produce adequate results.
It is also possible that transactions on the Ethereum mainchain become too expensive for the
average transaction moving forward. In this event, we reserve the right to create a new standalone
This chain would be dedicated solely to the dock.io protocol and smart contract. Procedures will
be put in place to ensure the transition is as smooth as possible.
The best method for such a transition would be a publicly known snapshot date. All applications
and clients of the users would be upgraded to a client that switches over to the mainchain at a
certain snapshot date.
Long-Con Game by Data Providers
Data providers can theoretically set up a large number of fake user accounts. These accounts
would then be filled with meaningless data.
Applications may unknowingly pay for this faulty data. This would allow these data providers to
steal tokens from other accounts. However, it would be quickly detected by other applications
that the data is false, rendering this strategy useless.
Token User Experience
End-users might not want to take on the cognitive load of buying a small amount of tokens,
setting up their wallet and data application, manually inputting data, setting up connections, and
approving applications. This is required for the user to be sovereign and have control over their
own data and tokens. Therefore, we believe that similar to Bitcoin mainstream adoption, the
majority of users will opt into performing all these tasks automatically through a hosted third
Centralized Exchange Rate Oracle Censorship
The centralized oracle providing token-FIAT life exchange rates to the dock.io smart contract is a
possible central failure point. The following scenarios are technically possible:
1. The proxied exchange rate received by centralized exchanges is bad, and therefore the
oracle relays bad data.
2. The oracle is down, making the dock.io smart contract rely on the last known exchange rate
for fee calculation.
3. The oracle is shut down by hackers or authorities.
4. The oracle is hacked and relays false exchange rate data.
Applications Can Share Decrypted Data
Applications can decrypt and share user related data with applications not approved by the user.
Once an application receives user related data, it is up to them what they do with it. The user can
stop sharing new data with the application sharing with non-permitted applications.
This is both a positive and a negative. It can be beneficial, allowing users the option to share
certain updates or basic data in a public manner. This means the version hash which they commit
to the dock.io Ethereum smart contract would be committed to IPFS itself by the user in
This allows anyone to discover the data on IPFS itself and make use of it without having to ask
permission, notify the user, or pay the user. There are many application use cases that show this
can be useful for the applications and the users.
However, this can also be a negative. Wrongly configured clients could possibly leak data this
way by publishing unencrypted data to IPFS.
Once a user publishes unencrypted data to the IPFS they can not return or delete the data. It is
forever in the public domain. This could create issues for certain user groups.
Powered by dock.io
Nick Macario - Co-Founder
Previously founder of branded.me
Nick is a multi-time founder and technology executive with an exit. His previous company was
branded.me, a professional networking site which grew to millions of users and was recognized
for many awards and publications.
Elina Cadouri - Co-Founder
Previously Co-founder Outsource.com
Elina is a multi-time founder and veteran marketplace operator. Her previous company was
Outsource.com, a freelance marketplace transacting tens of thousands of jobs and millions of
dollars in revenue.
Stenli Duka - CTO
Previously @ MotiveMetrics
Stenli is a seasoned full-stack engineer and has led multiple teams. His passion is machine
learning, natural language processing and complex algorithms. He previously spent time at
Motive Metrics as the Director of Engineering.
Evgeniy Zabolotniy - Lead Blockchain Engineer
Previously @ branded.me
Evgeniy brings over 10 years of experience as a systems architect, developer and security
specialist. Previously he worked with Nick at branded.me as part of the founding team, and has
been a core team member for over 3 years.
Todd Scheuring Head of Design
Previously @ Honeybook
Todd brings over 10 years of design and interactive development experience. Prior to Dock.io, he
was a senior UX designer for Honeybook and led design for the network and growth teams.
Jeffry Harrison - Director of Partnerships
Previously @ The Muse
Jeffrey brings years of industry experience in both sales and recruiting. Prior to he was a Team
Lead at The Muse where he helped grow the sales team from 3 to over 40, and was responsible
for some of the largest deals the company achieved.
Gabriel Moncarz - Data Scientist
Previously @ Sabre Corp
Gabriel brings over 10 years experience as a Data Scientist, Computer Engineer, Magister in
Quantitative Finance and Magister in Data Mining, Big Data and Knowledge Discovery.
Previously he was with Sabre Corp.
Fausto Woelflin - Senior Engineer
Previously @ Ampush
Fausto is a senior engineer who develops python-based microservices with Elasticsearch, Flask,
Cassandra and Docker. He previously was with Ampush.
Piotr Swies - Senior Engineer
Previously @ Hunted Hive
Piotr is a software engineer, architect and full-stack developer with over 10 years of experience,
skilled in designing and implementing distributed systems with microservices architecture. He
was previously with Hunted Hive.
Sergey Ermakovich Lead Frontend Developer
Previously @ branded.me
Sergey brings 10 years of experience, specializing in front-end web development, reactive
functional programming (RFP), and UI architecture. Sergey comes from branded.me and has
been with the team for over 3 years.
Samuel Hellawell - Senior Frontend Developer
Previously @ branded.me
Samuel is a senior engineer with a passion for developing quality applications, games and
websites. He’s started and sold his own company, and was part of the founding team at
branded.me prior to Remote.
Maciek BodekSenior - Frontend Developer
Previously @ Shortlist.co
Maciek is a frontend engineer with ten years of experience crafting web products and complex
interfaces ranging from portfolio microsites to agile SaaS startups.