Ethereum (ETH) - POW/POS - Ethash


  • admins

    Ethereum is a decentralized platform that runs smart contracts: applications that run exactly as programmed without any possibility of downtime, censorship, fraud or third party interference.

    Ethereum is how the Internet was supposed to work.

    Ethereum was crowdfunded during August 2014 by fans all around the world. It is developed by ETHDEV with contributions from great minds across the globe.

    Website:

    https://www.ethereum.org/

    http://wiki.ethdev.com

    Documentation:

    https://www.cryptocompare.com/mining/guides/how-to

    https://github.com/ethereum/go-ethereum/wiki/Minin

    https://ethereum.gitbooks.io/frontier-guide/conten

    https://github.com/ethereum/wiki/wiki

    https://blog.ethereum.org

    Stats:

    https://stats.ethdev.com/

    https://eth-status.org/

    Forums:

    https://forum.ethereum.org/

    https://www.reddit.com/r/ethereum

    Developers:

    http://ethdev.com/

    Wallets:

    https://github.com/ethereum/mist/releases/download

    Online wallet:

    https://ethereumwallet.com/

    The following full-node implementations of Ethereum are available:

    • Geth, written in Go
    • Eth, written in C++
    • EthereumJ, written in Java
    • pyethapp, written in Python
    • ethereumjs, written in JavaScript
    • ethereumH, written in Haskell

    Ethereum Blockchain As a Service On Azure:

    https://azure.microsoft.com/en-in/blog/ethereum-bl

    Block Explorer:

    https://live.ether.camp/

    https://etherscan.io/

    https://www.etherchain.org/

    http://ether.fund/explorer

    https://explorer.etherapps.info/

    https://tradeblock.com/ethereum/

    https://github.com/etherparty/explorer (Git Source)

    https://ethereumblocks.info/

    Exchanges:

    https://poloniex.com/exchange/btc_eth

    https://www.gatecoin.com/public/markets

    https://www.kraken.com/

    https://www.cryptsy.com/markets/view/ETH_BTC

    https://bittrex.com/Market/Index?MarketName=BTC-ET

    https://hitbtc.com/

    https://bleutrade.com/exchange/ETH/BTC

    https://coinsquare.io/

    https://metaexchange.info/markets/ETH/BTC

    https://alcurex.org/index.php/crypto/index

    https://yunbi.com/markets/ethcny

    https://www.cryptocompare.com/coins/eth/markets/BT

    Tools:

    http://ether.fund/tool/contract

    http://ether.fund/tool/etherface

    http://ether.fund/tool/terminal

    http://ether.fund/peers

    http://ether.fund/tool/converter

    http://ether.fund/tool/calculator

    http://ether.fund/tool/gas-fees

    http://ether.fund/tool/gas-price

    http://ether.fund/tool/blockcast

    Contracts:

    http://ether.fund/contracts/

    Cloud Mining:

    http://www.ethercloud.info/

    Mining Pools:

    http://ethereumpool.co/

    http://eth.nanopool.org/

    https://eth.suprnova.cc/

    http://ethpool.org/

    https://eurohash.net/#/

    http://www.talkether.org/

    https://www2.coinmine.pl/eth/index.php?page=statistics&action=blocks

    https://eth.pp.ua/stats/

    Social:

    https://twitter.com/ethereumproject

    https://plus.google.com/+EthereumOrgOfficial

    https://www.facebook.com/ethereumproject/

    IRC Chat:

    • bitly.com/irc_ethereum #ethereum: for general discussion
    • #ethereum-dev: for development specific questions and discussions
    • ##ethereum: for offtopic and banter
    • #ethereum-mining: for mining only conversations
    • #ethereum-markets: for discussions about markets


  • Analysis of Storage Corruption Bug

    This blog post provides an update on our findings following the discovery of the storage corruption bug last week. In summary, the bug was much less severe than we initially thought. The small number of affected contracts we found is either exploitable only by the owner, or the exploit can only cause a disruption in the user interface and not in the actual contract logic. All exploitable contracts/dapps we reviewed can be fixed without having to upgrade the contract itself. Of course, please still check your contracts to be safe.

    Following the discovery of the storage corruption bug in the Solidity compiler and the realization that it may have serious effects on already-deployed contracts that cannot be updated, we started analyzing how common the bug is and how exploitable contracts can be addressed.

    We focused on contracts with source code published on etherscan because important or popular smart contracts usually have their source code published there in order to gain trust from their users, who can then verify the compilation. Furthermore, if the source code is not available, it is also much harder for an attacker to find a suitable exploit. Finally, contracts that are privately used (and thus do not require publishing their source code) usually check that they are called from a certain address, and thus an attacker has no means to write to their storage.

    In order to automate the process of checking all contracts on etherscan, we created a modified version of the Solidity compiler that can automatically detect the conditions for triggering the bug. This technique has already reduced the number of potentially vulnerable contracts to 167. We then manually checked those contracts for potential corruption of storage that would make them vulnerable to attacks.

    It turns out that only ten contracts were vulnerable, so we were able to contact most of the contract owners/developers. Seven out of ten of those contracts are only exploitable by the owner in that they are allowed to change certain parameters outside their permitted range, or allowed to unlock a previously locked contract. One contract is exploitable by unprivileged users but has other major flaws in its design. The other two contracts found to be exploitable by unprivileged users either provided no advantages if exploited or only affected the user interface.

    Why are only so few contracts exploitable?

    First, let us define what we mean by “exploitable”:

    The storage corruption bug is exploitable if it can be used to modify a variable in storage in a way that would not be possible without the bug, and this modification has consequences for the behaviour and use of the smart contract. For example, we do not consider a contract exploitable in the following situations:

    • The same account would be able to overwrite the variable in the same state of the contract by regular means.
    • Overwriting can only happen at construction time (note that we did not check whether overwriting occurred at that time).
    • Overwriting is only triggered in unlikely situations where the contract logic was broken anyway (for example, a 32-bit counter that is incremented once per block overflows).
    • Variables can be overwritten that are unused in the smart contract and look non-critical, but may be part of the public interface.

    Why is this critical bug only exploitable in so few cases?

    It is a combination of the following factors, whose effects multiply, that dramatically reduces the probability of exploitability.

    1. Since small types only provide an advantage in very rare cases, they are seldom used.
    2. Small types must be adjacent to each other in storage – a single large type in between them prevents the bug from being triggered.
    3. State variables are often assigned one after the other, which removes the corruption at the second assignment.
    4. The combination of “address” and “bool” is most common among the cases that are left, but here, the address variable is often an “owner” that is assigned from msg.sender and thus not exploitable. Even if the owner can be changed, the flag can often still be set by the owner through other means.

    How to fix affected contracts

    A large majority of the exploitable contracts are only exploitable by the contract owner, administrator or developer, particularly through a single function that allows the owner to be changed. The exploit allows a further escalation of privileges for the owner. In order to prevent the owner from taking advantage of this exploit, a proxy contract can be installed between the owner and the affected contract. This proxy contract forwards calls from the owner, but disallows calling the exploitable functions. If calling the exploitable functions is still necessary, the proxy contract can prevent malicious data from being forwarded to the contract.

    If you have specific questions or concerns regarding your contracts, please contact us on gitter.



  • Ethereum-Whoa… Release Geth 1.5

    The Go Ethereum team is very proud to finally release Geth 1.5, which can almost be called a complete internal rewrite of the Go Ethereum (go-ethereum) codebase.

    We’ve packed a huge number of changes into this release, and simply listing them wouldn’t do them justice. Instead, we’ve decided to write them up in a more informal way, explaining not only what’s new, but also why it’s needed, and why it’s awesome!

    Go Ethereum website

    The go-ethereum project never really had a website. There was something auto-generated a long time ago by GitHub, but it couldn’t really be called a decent website as it didn’t contain valuable information, didn’t look particularly good, and there was nobody to properly maintain it. But at the time that was ok, as the hardcore developers cared more about the source repository and wiki pages than about a website.

    However, as Ethereum gains popularity and traction, we are now making efforts to make Geth, its code, and associated resources more accessible and streamlined for everyone involved, not just a handful of core developers. As a first step in this direction we’ve begun to put together a new website for go-ethereum. You can see it at: https://geth.ethereum.org.

    The web site still has a long way to go, but we’ve done our best to include information that is not available elsewhere, yet we feel is essential for anyone starting out with go-ethereum: a detailed installation guide for all platforms, and a downloads section gathering all our binaries from every build service we maintain. You can expect a detailed developer guide in the next few weeks, and a detailed user guide afterwards.


    Library access

    Go Ethereum, one of three original clients along with C++ Ethereum and Py Ethereum, evolved alongside the Ethereum networking and consensus protocol specification. This process entailed fast prototyping, frequent rewrites and binned features. The net effect was a codebase that worked well, but was difficult to embed into other projects due to its messy internals.

    In the Geth 1.4.x series we started untangling go-ethereum, but it took longer than anticipated to clean up most of the public API pathways. With Geth 1.5, we’ve finally arrived at the point where we can stand behind our programmatic APIs both as usable and as something we would like to support long term. The final pieces are still being polished, but we’re confident you’ll like the result a lot!

    Our main areas of focus were: a) simplified client side account management, b) remote clients via HTTP, IPC and WebSockets; c) contract interactions and binding generation, and d) in-process embedded nodes. With these four main use-cases covered, we’re confident most server side or mobile applications can go a long way.
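
    As a rough illustration of what these programmatic APIs enable, the sketch below attaches to a running node and reads an account balance via the go-ethereum ethclient package. It is a minimal sketch under stated assumptions: the endpoint and the account address are placeholders, and package layout details can differ between go-ethereum versions.

        package main

        import (
            "context"
            "fmt"
            "log"

            "github.com/ethereum/go-ethereum/common"
            "github.com/ethereum/go-ethereum/ethclient"
        )

        func main() {
            // Attach to a locally running node; the endpoint is a placeholder and an
            // IPC path such as ~/.ethereum/geth.ipc works just as well.
            client, err := ethclient.Dial("http://127.0.0.1:8545")
            if err != nil {
                log.Fatalf("failed to connect to node: %v", err)
            }

            // Query the latest balance of an arbitrary, hypothetical account.
            account := common.HexToAddress("0x0000000000000000000000000000000000000000")
            balance, err := client.BalanceAt(context.Background(), account, nil) // nil = latest block
            if err != nil {
                log.Fatalf("failed to fetch balance: %v", err)
            }
            fmt.Printf("balance of %s: %s wei\n", account.Hex(), balance.String())
        }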

    Check out the teaser slide presentation about our new APIs presented by @karalabe at Devcon2, our Ethereum developers conference in Shanghai, a few weeks ago.




    Mobile platforms

    With Geth 1.5 focusing on library reusability, it is only natural to see how far we can push the envelope. There has been ample exploration of running (or at least interfacing with) Ethereum from browsers; our current release focused on doing so from desktop/server processes. The only missing piece of the puzzle was mobile devices… until now.

    The 1.5 release of go-ethereum introduces our first experimental attempt at providing true Android and iOS library reusability of our codebase. This comes in the form of a native Java and ObjC wrapper around our code, bundled up officially as an Android archive and iOS XCode framework. The former is more mature, while the latter requires some API polishes due to the difficulty in automatically wrapping Go to ObjC/Swift code.

    We’re also providing native dependencies for both platforms in the form of Maven Central packages (or Sonatype for develop snapshots) for Android, and CocoaPod packages for iOS. Since this is the very first time we’re publishing to these package managers, a few hurdles may arise, so we’ll make a separate announcement when both are reliable to use. Until then, we recommend sticking to the downloadable library bundles.

    Experimental protocols

    The 1.5 release of Geth is intended as a foundation for the future direction and features we’d like to work on and stabilize in upcoming releases. In our opinion, the best way to push the desired new features forward is to ship them as experimental (solely opt-in) protocols so that anyone can play with them and provide feedback. In the light of this, we’ve merged in quite a few things we (and hopefully the community) had been looking forward to for quite some time.

    Discovery v5

    If you’ve played with joining the official testnet (Morden) or running a publicly reachable private testnet, you know it can sometimes take quite a long time to synchronize, as the node often seemingly just sits there doing nothing.

    One of the root causes for testnet sync issues is that the peer discovery protocol cannot differentiate between machines running different blockchains, or even different network protocols altogether. The only way to find suitable peers is to connect to as many peers as possible and keep the ones that make sense. This approach works for the mainnet, but for smaller protocols (testnet, light clients, swarm, whisper) it’s like looking for a needle in a haystack of advertised peers.

    Geth 1.5 contains a new version of the peer discovery protocol that extends the “shooting in the dark” approach with topic based peer-querying. In short, peers can actively search for other peers that have specifically advertised feature sets, protocols or configurations. This new discovery protocol should enable nodes to instantly find others of interest, even when there are only a handful among thousands of “boring” ones.

    Please note: the v5 discovery protocol is experimental, hence it is currently only enabled for light clients and light servers. This will allow us to gather valuable information and analyze its behavior/anomalies without influencing the main Ethereum P2P network in the slightest.

    Light client

    Blockchains are large beasts, there’s no denying it. Regardless of optimizations, there will always be devices that are too resource-constrained to play an active role in blockchain networks (e.g. mobile phones, IoT devices). Although unexpected, we’ve seen this effect happen during the DoS attack, which caused nodes running on HDDs to have trouble syncing.

    The only meaningful solution for running a blockchain on tiny embedded devices is for them to become light clients, where they do not bear the full burden of sustaining the network, but rather only bear the burden of their own operation. Not only is this beneficial for the small devices, but it also benefits the network as a whole as it removes slow links and thus makes the core network smaller, tighter and more performant.

    We’re proud to finally include an alpha version of a light client inside Geth 1.5. It can sync in minutes (or less) and consume only megabytes of disk space, but nonetheless fully interacts with the Ethereum blockchain and is even usable through the Mist browser (although there have been hiccups there).

    You can run Geth as a light client via the --light flag. If you are maintaining a full node, feeling a bit generous, and aren’t running a sensitive production system, consider enabling the light server protocol to help out small devices in the network via the --lightserv 25 --lightpeers 50 flags (the first sets the percentage of system resources allowed to be used by light clients, and the second sets the number of light clients to allow connecting).

    Swarm

    Along with the consensus protocol, the Ethereum vision also consists of two other pillars: real time dark messaging (Whisper) and decentralized file storage (Swarm). All three are needed to create truly decentralized, high availability applications. Whisper is more or less available as an experimental protocol, but Swarm always looked like a far away dream.

    With the arrival of 1.5, we’re very excited to include an initial proof-of-concept implementation of the Swarm protocol for developers to play with. It is included as a separate daemon process (and inherently executable binary), not embedded inside Geth. This allows users to run Swarm against any Ethereum client while also preventing any issues from interfering with the main node’s functionality.

    RPC subscriptions

    If you’ve written a more complex DApp against a Geth node (or any other Ethereum node for that matter), you may have noticed that polling the node for data on RPC can have adverse effects on performance. Not polling it, on the other hand, has adverse effects on user experience since the DApp is less sensitive to new events.

    The issue is that polling for changes is a bad idea since most of the time there’s no change, only the possibility of one. A better solution, instead of querying the node for changes every now and then, is to subscribe to certain events and let the node provide notification when there’s a change. Geth 1.5 enables this via a new RPC subscription mechanism. Any DApp (or external process) can subscribe to a variety of events and leave it to the node to notify when needed. Since this mechanism is not possible over plain HTTP (like it is over IPC), the 1.5 release also includes support for running the RPC API via WebSockets.
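
    As a hedged sketch of what the new subscription mechanism looks like from client code, the Go snippet below uses the ethclient wrapper to subscribe to new chain heads over WebSockets; the node is assumed to expose a WebSocket endpoint (the address below is a placeholder) and the API details may vary slightly between go-ethereum versions.

        package main

        import (
            "context"
            "fmt"
            "log"

            "github.com/ethereum/go-ethereum/core/types"
            "github.com/ethereum/go-ethereum/ethclient"
        )

        func main() {
            // Subscriptions need a full-duplex transport, so connect over WebSockets
            // (or IPC) rather than plain HTTP. The endpoint is an assumption.
            client, err := ethclient.Dial("ws://127.0.0.1:8546")
            if err != nil {
                log.Fatalf("failed to connect: %v", err)
            }

            heads := make(chan *types.Header)
            sub, err := client.SubscribeNewHead(context.Background(), heads)
            if err != nil {
                log.Fatalf("failed to subscribe: %v", err)
            }

            // The node now pushes every new chain head to us; no polling involved.
            for {
                select {
                case err := <-sub.Err():
                    log.Fatalf("subscription dropped: %v", err)
                case head := <-heads:
                    fmt.Printf("new block %v: %s\n", head.Number, head.Hash().Hex())
                }
            }
        }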

    JavaScript tracing

    During the DoS attacks in recent months, we spent an inordinate amount of time analyzing different transactions to better understand how they work. These efforts entailed trying to create various traces, looking at exactly what the EVM executes, and how that influences the underlying implementation.

    Although Geth has featured an EVM tracing API endpoint for quite some time now, it didn’t provide much granularity with regard to configurability. It ran the EVM bytecode, returned the executed opcodes, any errors that occurred, and optionally a diff of the stack, memory and storage modifications made by the transaction. This is useful, but expensive resource-wise both to create and to pass through the RPC layer.

    With the 1.5 release, we’re introducing a new mechanism for tracing transactions, a JavaScript map-reduce construct. Instead of the usual trace options available until now, you will be able to specify two JavaScript methods: a mapper invoked for every opcode with access to all trace data, and a reducer invoked at the end of the trace to specify the final data to return to the caller.

    The advantage of the JavaScript trace approach is that it’s executed inside the Go Ethereum node itself, so the tracer can access all information available for free without performance impact, and can collect only what it needs while discarding everything else. It is also a lot simpler to write custom trace code instead of having to parse some predefined output format.
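
    For illustration, here is a hedged Go sketch of driving such a trace over the debug RPC namespace: the tracer is a small JavaScript object whose step function acts as the mapper (counting opcodes) and whose result function acts as the reducer. The callback names, the fault handler, the endpoint and the transaction hash are assumptions to be checked against the Geth documentation for your version.

        package main

        import (
            "encoding/json"
            "fmt"
            "log"

            "github.com/ethereum/go-ethereum/rpc"
        )

        // A tiny JavaScript tracer: step (the mapper) tallies opcodes, result (the
        // reducer) returns the tally. Callback names are assumptions; some versions
        // also require a fault callback, included here for safety.
        const opCounter = `{
            counts: {},
            step: function(log) {
                var op = log.op.toString();
                this.counts[op] = (this.counts[op] || 0) + 1;
            },
            fault: function(log) {},
            result: function() { return this.counts; }
        }`

        func main() {
            client, err := rpc.Dial("http://127.0.0.1:8545") // endpoint is a placeholder
            if err != nil {
                log.Fatalf("failed to connect: %v", err)
            }

            var trace json.RawMessage
            // Hypothetical transaction hash; debug_traceTransaction must be exposed by the node.
            txHash := "0x0000000000000000000000000000000000000000000000000000000000000000"
            if err := client.Call(&trace, "debug_traceTransaction", txHash, map[string]string{"tracer": opCounter}); err != nil {
                log.Fatalf("trace failed: %v", err)
            }
            fmt.Println(string(trace))
        }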

    Vendored dependencies

    Until the 1.4.x release cycles of Geth, the go-ethereum codebase used the godep tool as its dependency manager because Go itself did not provide a viable alternative other than manually copying dependencies or relying on upstream repositories to not break over time.

    This situation was unfortunate due to a number of drawbacks: a) building go-ethereum required both a custom tool as well as knowing the quirks of said tool, b) dependency updates via godep were very painful due to them dirtying the local workspaces and not being able to work in temporary folders, and c) using go-ethereum as a library was extremely hard as dependencies weren’t an integral part of the Go workflow.

    With the Geth 1.5 release, we’ve switched over to the officially recommended way of vendoring dependencies (fully supported starting with Go 1.6), namely by placing all external dependencies into locations native to the Go compiler and toolchain (vendor), and switching to a different dependency management tool to more cleanly handle our requirements (called trash).

    From an outside perspective, the main benefit is no longer having to muck around with some random dependency management tool that we happen to use, whether building go-ethereum itself or using it as a library in other projects. Now you can stick to the plain old Go tools and everything will work out of the box!

    Build infrastructure

    From the beginning of the Ethereum project, all official clients depended on a build infrastructure that was built and maintained by @caktux based on Amazon EC2 instances, Ansible and a sizeable suite of Python scripts (called the Ethereum Buildbot).

    Initially, this infrastructure worked well when the original implementations all shipped a handful of major platform, architecture and deliverable bundles. However, as time passed and projects started to focus on smaller unique builds, the maintenance burden started to ramp up as the buildbot began to crumble. When the maintainer left the Ethereum project, it became clear that we needed to transition to new build flows, but creating them was a non-trivial effort.

    One of the major milestones of the Geth 1.5 release is the complete transition from the old build infrastructure to one that is fully self-contained within our repositories. We moved all builds on top of the various continuous integration services we rely on (Travis, AppVeyor, CircleCI), and implemented all the build code ourselves as an organic part of the go-ethereum sources.

    The end result is that we can now build everything the go-ethereum project needs without depending on particular service providers or particular code outside of the team’s control. This will ensure that go-ethereum won’t have strange missing packages or out-of-date package managers.

    Build artifacts

    Starting with Geth 1.5, we are distributing significantly more build artifacts than before. Our two major deliverables are archives containing Geth only, and bundles containing Geth and any other tools deemed useful for developers and/or users of the Ethereum platform. These artifacts are pre-compiled for every stable release as well as every single develop commit to a very wide variety of targets: Linux ( 386, amd64, arm-5, arm-6, arm-7 and arm64), macOS (amd64) and Windows (386, amd64).

    One of our feature updates is library bundles for using go-ethereum in mobile projects. On Android we’re providing official builds for .aar archives containing binaries for 386, amd64, arm-7 and arm64, covering all popular mobiles as well as local simulator builds. On iOS we’re providing official XCode Framework bundles containing binaries for amd64, arm-7 and arm64, covering all iPhone architectures as well as local simulator builds.

    Besides the standalone binary archives, we’re also distributing all of the above in the form of Homebrew bundles for macOS, Launchpad PPA packages for Ubuntu, NSIS installers for Windows (Chocolatey distribution still has some administrative hurdles to overcome), Maven Central dependencies for Android and CocoaPods dependencies for iOS!

    All of the artifacts mentioned above are available from the go-ethereum downloads page.

    Digital signatures

    For a long time our binary distributions were a bit chaotic, sometimes providing checksums, sometimes not, depending on who made the release packages and how much time we had to tie up loose ends. The lack of checksums often led to users asking how to verify bundles floating around the internet, and more seriously it resulted in a number of fake developer and project clones popping up that distributed malware.

    To sort this out once and for all, from Geth 1.5 onward, all our officially built archives will be digitally signed via a handful of OpenPGP keys. We will not rely on checksums any more to prove the authenticity of our distributed bundles, but will ask security-conscious users to verify any downloads via their attached PGP signatures. You can find the list of signing keys we use in our OpenPGP Signatures section.

    Repository branches

    A bit before the Frontier release last July, we switched to a source repository model where the master branch contained the latest stable code and develop contained the bleeding edge source code we were working on.

    This repository model however had a few drawbacks: a) people new to the project wanting to contribute always started hacking on master, only to realize later that their work was based on something old; b) every time a major release was made, master needed to be force-pushed, which looked pretty bad from a repository history perspective; c) developers trying to use the go-ethereum codebase in their own projects rarely realized there was a more advanced branch available.

    Beginning with Geth 1.5, we will no longer maintain a separate master branch for latest-stable and develop branch for latest-edge, rather we will switch to master as the default and development branch of the project, and each stable release generation will have its own indefinitely living branch (e.g. release/1.4, release/1.5). The release branches will allow people to depend on older generations (e.g. 1.4.x) without finding surprising git issues with history rewrites. And having master as the default development branch would allow developers to use the latest code.




  • Hard Fork No. 4: Spurious Dragon

    The Ethereum network will be undergoing a hard fork at block number 2,675,000, which will likely occur between 15:00 and 16:00 UTC on Tuesday, November 22, 2016. A countdown timer can be seen at https://fork.codetract.io/. The Morden test network will be undergoing a hard fork at block number 1,885,000.

    As a user, what do I need to do?

    Download the latest version of your Ethereum client:

    What happens if I do not update my client?

    If you are using an Ethereum client that is not updated for the upcoming hard fork, your client will sync to the pre-fork blockchain once the fork occurs. You will be stuck on an incompatible chain following the old rules and you will be unable to send ether or operate on the post-fork Ethereum network.

    Importantly, if your client is not updated, it also means that any transactions you make will still be susceptible to replay attacks.

    What if I am using a web or mobile Ethereum wallet like MyEtherWallet or Jaxx?

    Ethereum websites and mobile applications that allow you to store ether and/or make transactions are running their own Ethereum client infrastructure to facilitate their services. Generally, you do not need to do anything if you use a third party web based or mobile Ethereum wallet. However, you should still check with your web or mobile Ethereum wallet provider to see what actions they are taking to update for the hard fork and if they are asking their users to take other steps.

    In particular, you should ensure that transactions are generated with the new replay-protected EIP 155 scheme.

    What do I do if my Ethereum client is having trouble syncing to the blockchain?

    Make sure you have downloaded the latest version of your Ethereum client.

    Why are we proposing to hard fork the network?

    “Spurious Dragon” is the second hard fork of the two-round hard fork response to the DoS attacks on the Ethereum network in September and October. The previous hard fork (a.k.a “Tangerine Whistle”) addressed immediate network health issues due to the attacks. The upcoming hard fork addresses important but less pressing matters such as further tuning opcode pricing to prevent future attacks on the network, enabling “debloat” of the blockchain state, and adding replay attack protection.

    What changes are a part of this hard fork?

    The following Ethereum Improvement Proposals (EIPs) describe the protocol changes implemented in this hard fork.

    • EIP 155: Replay attack protection – prevents transactions from one Ethereum chain from being rebroadcast on an alternative chain. For example: If you send 150 test ether to someone from the Morden testnet, that same transaction cannot be replayed on the main Ethereum chain. Important note: EIP 155 is backwards compatible, so transactions generated with the “pre-Spurious-Dragon” format will still be accepted. However, to ensure you are protected against replay attacks, you will still need to use a wallet solution that implements EIP 155 (see the signing sketch after this list).
      Be aware that this backwards compatibility also means that transactions created from alternative Ethereum based blockchains that have not implemented EIP 155 (such as Ethereum Classic) can still be replayed on the main Ethereum chain.
    • EIP 160: EXP cost increase – adjusts the price of the `EXP` opcode so it balances the price of `EXP` with the computational complexity of the operation, essentially making it more difficult to slow down the network via computationally expensive contract operations.
    • EIP 161: State trie clearing – makes it possible to remove a large number of empty accounts that were put in the state at very low cost as a result of earlier DoS attacks. With this EIP, ‘empty’ accounts are removed from the state whenever ‘touched’ by another transaction. Removal of the empty accounts greatly reduces blockchain state size, which will provide client optimizations such as faster sync times. The actual removal process will begin after the fork by systematically performing `CALL` to the empty accounts that were created by the attacks.
    • EIP 170: Contract code size limit – changes the maximum code size that a contract on the blockchain can have. This update prevents an attack scenario where large pieces of account code can be accessed repeatedly at a fixed gas cost. The maximum size has been set to 24576 bytes, which is larger than any currently deployed contract.
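
    To make the replay-protection mechanics concrete, here is a hedged Go sketch of signing a transaction with the EIP 155 scheme using go-ethereum’s types package: the chain ID (1 for the main Ethereum network) is folded into the signature, so the same signed transaction is invalid on a chain with a different ID. The key, recipient, nonce and gas values are placeholders, and constructor signatures differ somewhat across go-ethereum versions.

        package main

        import (
            "fmt"
            "log"
            "math/big"

            "github.com/ethereum/go-ethereum/common"
            "github.com/ethereum/go-ethereum/core/types"
            "github.com/ethereum/go-ethereum/crypto"
        )

        func main() {
            // Throwaway key for illustration only.
            key, err := crypto.GenerateKey()
            if err != nil {
                log.Fatal(err)
            }

            // Placeholder transaction fields (nonce, recipient, value, gas limit, gas price).
            to := common.HexToAddress("0x0000000000000000000000000000000000000000")
            tx := types.NewTransaction(0, to, big.NewInt(1), 21000, big.NewInt(20000000000), nil)

            // EIP 155: the chain ID (1 = Ethereum mainnet) becomes part of the signature,
            // so the signed transaction cannot be replayed on a chain with another ID.
            signer := types.NewEIP155Signer(big.NewInt(1))
            signed, err := types.SignTx(tx, signer, key)
            if err != nil {
                log.Fatal(err)
            }
            fmt.Printf("signed tx hash: %s\n", signed.Hash().Hex())
        }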

    DISCLAIMER
    This is an emergent and evolving highly technical space. If you choose to implement the recommendations in this post and continue to participate, you should make sure you understand how it impacts you. You should understand that there are risks involved including but not limited to risks like unexpected bugs. By choosing to implement these recommendations, you alone assume the risks of the consequences.




  • From Morden to Ropsten


    Testing a fork

    The Spurious Dragon hardfork is scheduled for the coming week; block 2675000 is likely to occur Tuesday evening (CET). The block number for the testnet “Morden” was scheduled at block 1885000. Performing the fork in the test network prior to performing it in the main network was an important measure taken in the testing process to ensure a smooth rollover into the post-fork state.

    The Morden fork occurred on Nov-20-2016, 06:12:20 +UTC, at block 1885000 as planned. A bit later, at block 1885074, there was a consensus issue between Geth and Parity.

    Morden replay protection

    The Morden testnet has been running since the launch of the Ethereum blockchain (July 2015). At that time, concerns about replay-attacks between Morden and Mainnet were addressed by using a nonce-offset. All accounts on Morden used a starting nonce of 2^20 instead of 0, ensuring that any transaction valid on one chain would not be valid on the other.

    EIP 161 specifies new EVM rules regarding nonces. The implementation of those rules, in combination with Morden-specific nonce-rules, resulted in Geth and Parity creating incompatible blocks at block 1885074.

    Consequences for the Main network

    All issues found during the rollout of Spurious Dragon on the test network were Morden-specific. There are currently no known issues affecting the Mainnet.

    Starting the new “Ropsten” test network

    Before the current hard forks, there were already discussions about restarting the test network from a new genesis block in order to make full syncing simpler and less resource intensive. And due to the low difficulty of the testnet, the difficulty bomb was already causing noticeable increases in block times, which would continue to grow if unaddressed. So the time is now right to leave Morden behind and start a new test network.

    New clients will be released that use Ropsten instead of Morden as the default testnet.

    Developers who want to get started with Ropsten right away can download the genesis file here, and start a client with the Ropsten network id (3):

    • geth: geth --datadir /path/to/testnet/data init genesis.json; geth --datadir /path/to/testnet/data --networkid 3 console
    • parity: Download ropsten.json, then parity --chain path/to/ropsten.json


  • Security alert [11/24/2016]: Consensus bug in geth v1.4.19 and v1.5.2


    Security Alert

    Affected configurations: Geth

    Severity: High

    Summary: An issue has been identified with Geth’s journaling mechanism. This caused a network fork at block #2686351 (Nov-24-2016 14:12:07 UTC). The new Geth release 1.5.3 fixes the journaling issue and repairs the fork.

    Details: Geth was failing to revert empty account deletions when the transaction causing the deletions of empty accounts ended with an out-of-gas exception. An additional issue was found in Parity, where the Parity client incorrectly failed to revert empty account deletions in a more limited set of contexts involving out-of-gas calls to precompiled contracts; the new Geth behavior matches Parity’s, and empty accounts will cease to be a source of concern in general in about one week once the state clearing process finishes. The chain that was created from block #2686351 by the old Geth client, which both Parity and the new Geth release consider invalid, seems to have been mostly abandoned around block #2686516, meaning that ~165 blocks were mined on the now-abandoned chain. Transactions are broadcast across the network, so most transactions are likely present on both the old Geth chain and the current chain, although mining rewards and transaction fees on the old Geth chain are lost. No transactions or blocks on the chain that both clients will now accept will be reverted.

    The latest Geth release will update the blockchain from the point of the fork, even if it has synced past the point of the fork.

    Solution: Geth 1.5.3 was released. If you are using Geth, download the latest client here: https://github.com/ethereum/go-ethereum/releases/tag/v1.5.3. If you are using Mist, restart Mist and the auto-update feature will prompt you to update the Geth client that Mist uses to Geth 1.5.3. If you do not update, please be aware you will be on an invalid chain that is not supported.

    We continue to recommend that exchanges and other high-value users run multiple clients and automatically halt operations or otherwise enter safe mode if they go out of sync by more than ~10 blocks.

    Ethereum websites and mobile applications that allow you to store ether and/or make transactions are run by third party web based or mobile Ethereum providers (“Third Party Providers”). Third Party Providers run their own Ethereum client infrastructure to facilitate their services. Generally, you do not need to do anything if you use a Third Party Provider such as MetaMask, Jaxx, or MyEtherWallet. However, they may have instructions for you. You should check with your Ethereum Third Party Provider to see what actions, if any, they are recommending for their users.


    DISCLAIMER

    This is an emergent and evolving highly technical space. If you choose to participate, you should know there are many risks involved including but not limited to risks like unexpected bugs and other technical complications that could result in loss of ether and other consequences. In addition, if you do not update to Geth 1.5.3, you will be on an unsupported network. By choosing to use the Ethereum platform, you assume the risks of this emergent platform.


    Vitalik Buterin



  • Ethereum Research Update

    This week marks the completion of our fourth hard fork, Spurious Dragon, and the subsequent state clearing process, the final steps in the two-hard-fork solution to the recent Ethereum denial of service attacks that slowed down the network in September and October. Gas limits are in the process of being increased to 4 million as the network returns to normal, and will be increased further as additional optimizations to clients are finished to allow quicker reading of state data.

    In the midst of these events, we have seen great progress from the C++ and Go development teams, including improvements to Solidity tools and the release of the Geth light client, and the Parity, EthereumJ and other external development teams have continued pushing forward on their own with technologies such as Parity’s warp sync; many of these innovations have already made their way into the hands of the average user, and still others are soon to come. At the same time, however, a large amount of quiet progress has been taking place on the research side, and while that progress has in many cases been rather blue-sky in nature and low-level protocol improvements necessarily take a while to make it into the main Ethereum network, we expect that the results of the work will start to bear fruit very soon.


    Metropolis

    Metropolis is the next major planned hardfork for Ethereum. While Metropolis is not quite as ambitious as Serenity and will not include proof of stake, sharding or any other similarly large sweeping changes to how Ethereum works, it is expected to include a series of small improvements to the protocol, which are altogether much more substantial than Homestead. Major improvements include:

    • EIP 86 (account security abstraction) – move the logic for verifying signatures and nonces into contracts, allowing developers to experiment with new signature schemes, privacy-preserving technologies and modifications to parts of the protocol without requiring further hard forks or support at the protocol level. Also allows contracts to pay for gas.
    • EIP 96 (blockhash and state root changes) – simplifies the protocol and client implementations, and allows for upgrades to light client and fast-syncing protocols that make them much more secure.
    • Precompiled/native contracts for elliptic curve operations and big integer arithmetic, allowing for applications based on ring signatures or RSA cryptography to be implemented efficiently.
    • Various improvements to efficiency that allow faster transaction processing.


    Much of this work is part of a long-term plan to move the protocol toward what we call abstraction. Essentially, instead of having complex protocol rules governing contract creation, transaction validation, mining and various other aspects of the system’s behavior, we try to put as much of the Ethereum protocol’s logic as possible into the EVM itself, and have protocol logic simply be a set of contracts. This reduces client complexity, reduces the long-run risk of consensus failures, and makes hard forks easier and safer – potentially, a hard fork could be specified simply as a config file that changes the code of a few contracts. By reducing the number of “moving parts” at the bottom level of the protocol in this way, we can greatly reduce Ethereum’s attack surface, and open up more parts of the protocol to user experimentation: for example, instead of the protocol upgrading to a new signature scheme all at the same time, users are free to experiment and implement their own.


    Proof of Stake, Sharding and Cryptoeconomics

    Over the past year, research on proof of stake and sharding has been quietly moving forward. The consensus algorithm that we have been working on, Casper, has gone through several iterations and proof-of-concept releases, each of which taught us important things about the combination of economics and decentralized consensus. PoC release 2 came at the start of this year, although that approach has now been abandoned as it has become obvious that requiring every validator to send a message every block, or even every ten blocks, requires far too much overhead to be sustainable. The more traditional chain-based PoC3, as described in the Mauve Paper, has been more successful; although there are imperfections in how the incentives are structured, the flaws are much less serious in nature.


    Myself, Vlad and many volunteers from the Ethereum research team came together at the bootcamp at IC3 in July with university academics, Zcash developers and others to discuss proof of stake, sharding, privacy and other challenges, and substantial progress was made in bridging the gap between our approach to proof of stake and that of others who have been working on similar problems. A newer and simpler version of Casper began to solidify, and myself and Vlad continued on two separate paths: myself aiming to create a simple proof of stake protocol that would provide desirable properties with as few changes from proof of work as possible, and Vlad taking a “correct-by-construction” approach to rebuild consensus from the ground up. Both were presented at Devcon2 in Shanghai in September, and that’s where we were at two weeks ago.

    At the end of November, the research team (temporarily joined by Loi Luu, of validator’s dilemma fame), along with some of our long-time volunteers and friends, came together for two weeks for a research workshop in Singapore, aiming to bring our thoughts together on various issues to do with Casper, scalability, consensus incentives and state size control.



    A major topic of discussion was coming up with a rigorous and generalizable strategy for determining optimal incentives in consensus protocols – whether you’re creating a chain-based protocol, a scalable sharding protocol, or even an incentivized version of PBFT, can we come up with a generalized way to correctly assign the right rewards and penalties to all participants, using only verifiable evidence that could be put into a blockchain as input, and in a way that would have optimal game-theoretic properties? We had some ideas; one of them, when applied to proof of work as an experiment, immediately led to a new path toward solving selfish mining attacks, and has also proven extremely promising in addressing long-standing issues in proof of stake.

    A key goal of our approach to cryptoeconomics is ensuring as much incentive-compatibility as possible even under a model with majority collusions: even if an attacker controls 90% of the network, is there a way to make sure that, if the attacker deviates from the protocol in any harmful way, the attacker loses money? At least in some cases, such as short-range forks, the answer seems to be yes. In other cases, such as censorship, achieving this goal is much harder.

    A second goal is bounding “griefing factors” – that is, ensuring that there is no way for an attacker to cause other players to lose money without losing close to the same amount of money themselves. A third goal is ensuring that the protocol continues to work as well as possible under other kinds of extreme conditions: for example, what if 60% of the validator nodes drop offline simultaneously? Traditional consensus protocols such as PBFT, and proof of stake protocols inspired by such approaches, simply halt in this case; our goal with Casper is for the chain to continue, and even if the chain can’t provide all of the guarantees that it normally does under such conditions the protocol should still try to do as much as it can.

    One of the main beneficial results of the workshop was bridging the gap between my current “exponential ramp-up” approach to transaction/block finality in Casper, which rewards validators for making bets with increasing confidence and penalizes them if their bets are wrong, and Vlad’s “correct-by-construction” approach, which emphasizes penalizing validators only if they equivocate (ie. sign two incompatible messages). At the end of the workshop, we began to work together on strategies to combine the best of both approaches, and we have already started to use these insights to improve the Casper protocol.

    In the meantime, I have written some documents and FAQs that detail the current state of thinking regarding proof of stake, sharding and Casper to help bring anyone interested up to speed:

    https://github.com/ethereum/wiki/wiki/Proof-of-Stake-FAQ

    https://github.com/ethereum/wiki/wiki/Sharding-FAQ

    https://docs.google.com/document/d/1maFT3cpHvwn29gLvtY4WcQiI6kRbN_nbCf3JlgR3m_8 (Mauve Paper; now slightly out of date but will be updated soon)

    State size control

    Another important area of protocol design is state size control – that is, how do we reduce the amount of state information that full nodes need to keep track of? Right now, the state is about a gigabyte in size (the rest of the data that a geth or parity node currently stores is the transaction history; this data can theoretically be pruned once there is a robust light-client protocol for fetching it), and we have already seen how protocol usability degrades in several ways if it grows much larger; additionally, sharding becomes much more difficult, as sharded blockchains require nodes to be able to quickly download parts of the state as part of the process of serving as validators.

    Some proposals that have been raised have to do with deleting old non-contract accounts with not enough ether to send a transaction, and doing so safely so as to prevent replay attacks. Other proposals involve simply making it much more expensive to create new accounts or store data, and doing so in a way that is more decoupled from the way that we pay for other kinds of costs inside the EVM. Still other proposals include putting time limits on how long contracts can last, and charging more to create accounts or contracts with longer time limits (the time limits here would be generous; it would still be affordable to create a contract that lasts several years). There is currently an ongoing debate in the developer community about the best way to achieve the goal of keeping state size small, while at the same time keeping the core protocol maximally user- and developer-friendly.

    Miscellanea

    Other areas of low-level-protocol improvement on the horizon include:

    • Several “EVM 1.5” proposals that make the EVM more friendly to static analysis, facilitating compatibility with WASM
    • Integration of zero knowledge proofs, likely through either (i) an explicit ZKP opcode/native contract, or (ii) an opcode or native contract for the key computationally intensive ingredients in ZKPs, particularly elliptic curve pairing computations
    • Further degrees of abstraction and protocol simplification

    Expect more detailed documents and conversations on all of these topics in the months to come, especially as work on turning the Casper specification into a viable proof of concept release that could run a testnet continues to move forward.


    Vitalik Buterin

    https://ethereum.org























  • zkSNARKs in a nutshell


    The possibilities of zkSNARKs are impressive: you can verify the correctness of computations without having to execute them and you will not even learn what was executed – just that it was done correctly. Unfortunately, most explanations of zkSNARKs resort to hand-waving at some point and thus they remain something “magical”, suggesting that only the most enlightened actually understand how and why (and if?) they work. The reality is that zkSNARKs can be reduced to four simple techniques and this blog post aims to explain them. Anyone who can understand how the RSA cryptosystem works should also get a pretty good understanding of currently employed zkSNARKs. Let’s see if this post achieves that goal!


    pdf version


    As a very short summary, zkSNARKs, as currently implemented, have 4 main ingredients (don’t worry, we will explain all the terms in later sections):

    A) Encoding as a polynomial problem

    The program that is to be checked is compiled into a quadratic
    equation of polynomials: t(x) h(x) = w(x) v(x), where the equality holds
    if and only if the program is computed correctly. The prover wants to
    convince the verifier that this equality holds.

    B) Succinctness by random sampling

    The verifier chooses a secret evaluation point s to reduce the
    problem from multiplying polynomials and verifying polynomial function
    equality to simple multiplication and equality check on numbers:
    t(s)h(s) = w(s)v(s)

    This reduces both the proof size and the verification time tremendously.
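
    To see why evaluating at one random point is convincing, recall that two distinct polynomials of degree at most d can agree on at most d points, so a random s drawn from a huge field exposes any mismatch with overwhelming probability. The following self-contained Go sketch (with made-up toy polynomials and a toy prime modulus, not anything used in practice) compares t(s)·h(s) against w(s)·v(s) at a random point instead of comparing whole coefficient vectors.

        package main

        import (
            "crypto/rand"
            "fmt"
            "log"
            "math/big"
        )

        // eval evaluates a polynomial given by its coefficients (constant term first)
        // at the point x, modulo p, using Horner's rule.
        func eval(coeffs []int64, x, p *big.Int) *big.Int {
            res := big.NewInt(0)
            for i := len(coeffs) - 1; i >= 0; i-- {
                res.Mul(res, x)
                res.Add(res, big.NewInt(coeffs[i]))
                res.Mod(res, p)
            }
            return res
        }

        func main() {
            // Toy prime modulus; real systems use a ~254-bit pairing-friendly field.
            p := big.NewInt(1000003)

            // Made-up example polynomials with t(x)*h(x) = w(x)*v(x):
            // t = x+1, h = x+2, w = x^2+3x+2, v = 1.
            t := []int64{1, 1}
            h := []int64{2, 1}
            w := []int64{2, 3, 1}
            v := []int64{1}

            // The verifier picks a secret random evaluation point s.
            s, err := rand.Int(rand.Reader, p)
            if err != nil {
                log.Fatal(err)
            }

            lhs := new(big.Int).Mul(eval(t, s, p), eval(h, s, p))
            rhs := new(big.Int).Mul(eval(w, s, p), eval(v, s, p))
            lhs.Mod(lhs, p)
            rhs.Mod(rhs, p)

            // A single comparison of numbers stands in for full polynomial equality.
            fmt.Println("t(s)h(s) == w(s)v(s):", lhs.Cmp(rhs) == 0)
        }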

    C) Homomorphic encoding / encryption

    An encoding/encryption function E is used that has some homomorphic properties (but is not fully homomorphic, something that is not yet practical). This allows the prover to compute E(t(s)), E(h(s)), E(w(s)), E(v(s)) without knowing s; she only knows E(s) and some other helpful encrypted values.

    D) Zero Knowledge

    The prover permutes the values E(t(s)), E(h(s)), E(w(s)), E(v(s)) by
    multiplying with a number so that the verifier can still check their
    correct structure without knowing the actual encoded values.

    The very rough idea is that checking t(s)h(s) = w(s)v(s) is identical
    to checking t(s)h(s) k = w(s)v(s) k for a random secret number k (which
    is not zero), with the difference that if you are sent only the numbers
    (t(s)h(s) k) and (w(s)v(s) k), it is impossible to derive t(s)h(s) or
    w(s)v(s).

    This was the hand-waving part so that you can understand the essence of zkSNARKs, and now we get into the details.

    RSA and Zero-Knowledge Proofs

    Let us start with a quick reminder of how RSA works, leaving out some
    nit-picky details. Remember that we often work with numbers modulo some
    other number instead of full integers. The notation here is “a + b ≡ c
    (mod n)”, which means “(a + b) % n = c % n”. Note that the “(mod n)”
    part does not apply to the right hand side “c” but actually to the “≡”
    and all other “≡” in the same equation. This makes it quite hard to
    read, but I promise to use it sparingly. Now back to RSA:

    The prover comes up with the following numbers:

    p, q: two random secret primes
    n := p q
    d: random number such that 1 < d < n – 1
    e: a number such that d e ≡ 1 (mod (p-1)(q-1)).

    The public key is (e, n) and the private key is d. The primes p and q can be discarded but should not be revealed.

    The message m is encrypted via

    E(m) := m^e % n

    and c = E(m) is decrypted via

    D(c) := c^d % n.

    Because of the fact that c^d ≡ (m^e % n)^d ≡ m^(e·d) (mod n) and multiplication in the exponent of m behaves like multiplication in the group modulo (p-1)(q-1), we get m^(e·d) ≡ m (mod n). Furthermore, the security of RSA relies on the assumption that n cannot be factored efficiently and thus d cannot be computed from e (if we knew p and q, this would be easy).

    One of the remarkable features of RSA is that it is multiplicatively homomorphic. In general, two operations are homomorphic if you can exchange their order without affecting the result. In the case of homomorphic encryption, this is the property that you can perform computations on encrypted data. Fully homomorphic encryption, something that exists but is not practical yet, would allow evaluating arbitrary programs on encrypted data. Here, for RSA, we are only talking about group multiplication. More formally: E(x)·E(y) ≡ x^e·y^e ≡ (x·y)^e ≡ E(x·y) (mod n), or in words: the product of the encryption of two messages is equal to the encryption of the product of the messages.

    This homomorphicity already allows some kind of zero-knowledge proof of multiplication: The prover knows some secret numbers x and y and computes their product, but sends only the encrypted versions a = E(x), b = E(y) and c = E(x·y) to the verifier. The verifier now checks that a·b ≡ c (mod n), and the only thing the verifier learns is the encrypted version of the product and that the product was correctly computed; she neither knows the two factors nor the actual product. If you replace the product by addition, this already goes into the direction of a blockchain where the main operation is to add balances.
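
    Below is a small self-contained Go sketch of this multiplicative homomorphism, using toy primes far too small to be secure: it builds an RSA key, encrypts two secret factors and their product, and performs the verifier's check that E(x)·E(y) ≡ E(x·y) (mod n). All concrete numbers are made up for illustration.

        package main

        import (
            "fmt"
            "math/big"
        )

        func main() {
            // Toy primes; real RSA uses primes of 1024 bits or more.
            p := big.NewInt(61)
            q := big.NewInt(53)
            n := new(big.Int).Mul(p, q)
            phi := new(big.Int).Mul(new(big.Int).Sub(p, big.NewInt(1)), new(big.Int).Sub(q, big.NewInt(1)))

            // Public exponent e and private exponent d with e*d ≡ 1 (mod (p-1)(q-1)).
            e := big.NewInt(17)
            d := new(big.Int).ModInverse(e, phi)

            // E(m) = m^e % n
            encrypt := func(m *big.Int) *big.Int { return new(big.Int).Exp(m, e, n) }

            // The prover's secret factors and their product.
            x := big.NewInt(42)
            y := big.NewInt(55)
            xy := new(big.Int).Mul(x, y)

            a, b, c := encrypt(x), encrypt(y), encrypt(xy)

            // Verifier's check: E(x)*E(y) ≡ E(x*y) (mod n), learning neither x nor y.
            ab := new(big.Int).Mod(new(big.Int).Mul(a, b), n)
            fmt.Println("E(x)*E(y) == E(x*y) (mod n):", ab.Cmp(c) == 0)

            // Sanity check: decrypting with d recovers x.
            fmt.Println("D(E(x)) == x:", new(big.Int).Exp(a, d, n).Cmp(x) == 0)
        }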

    Interactive Verification

    Having touched a bit on the zero-knowledge aspect, let us now focus
    on the other main feature of zkSNARKs, the succinctness. As you will see
    later, the succinctness is the much more remarkable part of zkSNARKs,
    because the zero-knowledge part will be given “for free” due to a
    certain encoding that allows for a limited form of homomorphic encoding.

    SNARKs are short for succinct non-interactive arguments of knowledge. In this general setting of so-called interactive protocols, there is a prover and a verifier
    and the prover wants to convince the verifier about a statement (e.g.
    that f(x) = y) by exchanging messages. The generally desired properties
    are that no prover can convince the verifier about a wrong statement (soundness) and there is a certain strategy for the prover to convince the verifier about any true statement (completeness). The individual parts of the acronym have the following meaning:

    • Succinct: the sizes of the messages are tiny in comparison to the length of the actual computation.
    • Non-interactive: there is no or only little interaction. For zkSNARKs, there is usually a setup phase and after that a single message from the prover to the verifier. Furthermore, SNARKs often have the so-called “public verifier” property, meaning that anyone can verify without interacting anew, which is important for blockchains.
    • ARguments: the verifier is only protected against computationally limited provers. Provers with enough computational power can create proofs/arguments about wrong statements (note that with enough computational power, any public-key encryption can be broken). This is also called “computational soundness”, as opposed to “perfect soundness”.
    • of Knowledge: it is not possible for the prover to construct a proof/argument without knowing a certain so-called witness (for example the address she wants to spend from, the preimage of a hash function or the path to a certain Merkle-tree node).

    If you add the zero-knowledge prefix, you also
    require the property (roughly speaking) that during the interaction, the
    verifier learns nothing apart from the validity of the statement. The
    verifier especially does not learn the witness string – we will see later what that is exactly.



    As an example, let us consider the following transaction validation computation: f(σ1, σ2, s, r, v, ps, pr, v) = 1 if and only if σ1 and σ2 are the root hashes of account Merkle-trees (the pre- and the post-state), s and r are sender and receiver accounts and ps, pr are Merkle-tree proofs that testify that the balance of s is at least v in σ1 and they hash to σ2 instead of σ1 if v is moved from the balance of s to the balance of r.

    It is relatively easy to verify the computation of f if all inputs
    are known. Because of that, we can turn f into a zkSNARK where only σ1 and σ2 are publicly known and (s, r, v, ps, pr,
    v) is the witness string. The zero-knowledge property now causes the
    verifier to be able to check that the prover knows some witness that
    turns the root hash from σ1 to σ2 in a way that does not violate any requirement on correct transactions, but she has no idea who sent how much money to whom.

    The formal definition (still leaving out some details) of zero-knowledge is that there is a simulator that, having also produced the setup string but not knowing the secret witness, can interact with the verifier, yet an outside observer is not able to distinguish this interaction from the interaction with the real prover.

    NP and Complexity-Theoretic Reductions

    In order to see which problems and computations zkSNARKs can be used
    for, we have to define some notions from complexity theory. If you do
    not care about what a “witness” is, what you will not know
    after “reading” a zero-knowledge proof or why it is fine to have
    zkSNARKs only for a specific problem about polynomials, you can skip
    this section.

    P and NP

    First, let us restrict ourselves to functions that only output 0 or 1 and call such functions problems.
    Because you can query each bit of a longer result individually, this is
    not a real restriction, but it makes the theory a lot easier. Now we
    want to measure how “complicated” it is to solve a given problem
    (compute the function). For a specific machine implementation M of a
    mathematical function f, we can always count the number of steps it
    takes to compute f on a specific input x – this is called the runtime
    of M on x. What exactly a “step” is, is not too important in this
    context. Since the program usually takes longer for larger inputs, this
    runtime is always measured in the size or length (in number of bits) of
    the input. This is where the notion of e.g. an “n^2 algorithm” comes from – it is an algorithm that takes at most n^2 steps on inputs of size n. The notions “algorithm” and “program” are largely equivalent here.

    Programs whose runtime is at most n^k for some k are also called “polynomial-time programs”.

    Two of the main classes of problems in complexity theory are P and NP:

    P is the class of problems L that have polynomial-time programs.

    Even though the exponent k can be quite large for some problems, P is
    considered the class of “feasible” problems and indeed, for
    non-artificial problems, k is usually not larger than 4. Verifying a
    bitcoin transaction is a problem in P, as is evaluating a polynomial
    (and restricting the value to 0 or 1). Roughly speaking, if you only
    have to compute some value and not “search” for something, the problem
    is almost always in P. If you have to search for something, you mostly
    end up in a class called NP.
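
    To make the distinction concrete, here is a small Python sketch (my own illustration, not from the original post): evaluating a polynomial at a given point is a typical “compute” task, while deciding whether it has a zero over {0, 1}-valued variables is a typical “search” task that, done naively, requires trying exponentially many assignments.

    # Illustrative only: "computing" vs. "searching".
    from itertools import product

    def evaluate(poly, assignment):
        # poly is a list of (coefficient, exponent-tuple) terms, e.g.
        # 4*x*y - 2*y + 1  ->  [(4, (1, 1)), (-2, (0, 1)), (1, (0, 0))]
        total = 0
        for coeff, exps in poly:
            term = coeff
            for var, e in zip(assignment, exps):
                term *= var ** e
            total += term
        return total

    def has_zero_on_01(poly, num_vars):
        # Exhaustive search over all 2^n assignments - exponential in n.
        return any(evaluate(poly, a) == 0 for a in product((0, 1), repeat=num_vars))

    poly = [(1, (1, 1)), (-1, (1, 0))]          # x*y - x
    print(evaluate(poly, (1, 1)))               # 0: evaluation is cheap
    print(has_zero_on_01(poly, 2))              # True: found only by searching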

    The Class NP

    There are zkSNARKs for all problems in the class NP and actually, the
    practical zkSNARKs that exist today can be applied to all problems in
    NP in a generic fashion. It is unknown whether there are zkSNARKs for
    any problem outside of NP.

    All problems in NP always have a certain structure, stemming from the definition of NP:

    NP is the class of problems L that have a polynomial-time program V
    that can be used to verify a fact given a polynomially-sized so-called
    witness for that fact. More formally:

    L(x) = 1 if and only if there is some polynomially-sized string w (called the witness) such that V(x, w) = 1

    As an example for a problem in NP, let us consider the problem of
    boolean formula satisfiability (SAT). For that, we define a boolean
    formula using an inductive definition:

    • any variable x_1, x_2, x_3, … is a boolean formula (we also use any other character to denote a variable)
    • if f is a boolean formula, then ¬f is a boolean formula (negation)
    • if f and g are boolean formulas, then (f ∧ g) and (f ∨ g) are boolean formulas (conjunction / and, disjunction / or).

    The string “((x_1 ∧ x_2) ∧ ¬x_2)” would be a boolean formula.

    A boolean formula is satisfiable if there is a way to assign
    truth values to the variables so that the formula evaluates to true
    (where ¬true is false, ¬false is true, true ∧ false is false and so on,
    the regular rules). The satisfiability problem SAT is the set of all
    satisfiable boolean formulas.

    SAT(f) := 1 if f is a satisfiable boolean formula and 0 otherwise

    The example above, “((x_1 ∧ x_2) ∧ ¬x_2)”,
    is not satisfiable and thus does not lie in SAT. The witness for a
    given formula is its satisfying assignment and verifying that a variable
    assignment is satisfying is a task that can be solved in polynomial
    time.
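
    As a small illustration of why SAT lies in NP, here is a sketch of that polynomial-time check: given a formula and a candidate satisfying assignment (the witness), it simply evaluates the formula. The nested-tuple encoding of formulas is my own choice for this sketch, not something from the original post.

    # Minimal sketch: boolean formulas as nested tuples and a polynomial-time
    # check that a given assignment (the witness) satisfies them.
    # ('var', i), ('not', f), ('and', f, g), ('or', f, g) is the assumed encoding.

    def holds(formula, assignment):
        tag = formula[0]
        if tag == 'var':
            return assignment[formula[1]]
        if tag == 'not':
            return not holds(formula[1], assignment)
        if tag == 'and':
            return holds(formula[1], assignment) and holds(formula[2], assignment)
        if tag == 'or':
            return holds(formula[1], assignment) or holds(formula[2], assignment)
        raise ValueError('unknown node')

    # ((x1 ∧ x2) ∧ ¬x2) from the text: no assignment makes it true.
    f = ('and', ('and', ('var', 0), ('var', 1)), ('not', ('var', 1)))
    print(holds(f, {0: True, 1: True}))   # False - this witness does not work

    Finding a satisfying assignment, in contrast, is the hard part – the check above only verifies a witness that is handed to it.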

    P = NP?

    If you restrict the definition of NP to witness strings of length
    zero, you capture the same problems as those in P. Because of that,
    every problem in P also lies in NP. One of the main tasks in complexity
    theory research is showing that those two classes are actually different
    – that there is a problem in NP that does not lie in P. It might seem
    obvious that this is the case, but if you can prove it formally, you can
    win US$ 1 million.
    Oh and just as a side note, if you can prove the converse, that P and
    NP are equal, apart from also winning that amount, there is a big chance
    that cryptocurrencies will cease to exist from one day to the next. The
    reason is that it will be much easier to find a solution to a proof of
    work puzzle, a collision in a hash function or the private key
    corresponding to an address. Those are all problems in NP and since you
    just proved that P = NP, there must be a polynomial-time program for
    them. But this article is not to scare you, most researchers believe
    that P and NP are not equal.

    NP-Completeness

    Let us get back to SAT. The interesting property of this seemingly
    simple problem is that it does not only lie in NP, it is also
    NP-complete. The word “complete” here is the same complete as in
    “Turing-complete”. It means that it is one of the hardest problems in
    NP, but more importantly — and that is the definition of NP-complete —
    an input to any problem in NP can be transformed to an equivalent input
    for SAT in the following sense:

    For any NP-problem L there is a so-called reduction function f, which is computable in polynomial time such that:

    L(x) = SAT(f(x))

    Such a reduction function can be seen as a compiler: It takes source
    code written in some programming language and transforms it into an
    equivalent program in another programming language, typically a
    machine language, which has the same semantic behaviour. Since SAT is
    NP-complete, such a reduction exists for any possible problem in NP,
    including the problem of checking whether e.g. a bitcoin transaction is
    valid given an appropriate block hash. There is a reduction function
    that translates a transaction into a boolean formula, such that the
    formula is satisfiable if and only if the transaction is valid.

    Reduction Example

    In order to see such a reduction, let us consider the problem of
    evaluating polynomials. First, let us define a polynomial (similar to a
    boolean formula) as an expression consisting of integer constants,
    variables, addition, subtraction, multiplication and (correctly
    balanced) parentheses. Now the problem we want to consider is

    PolyZero(f) := 1 if f is a polynomial which has a zero where its variables are taken from the set {0, 1}

    We will now construct a reduction from SAT to PolyZero and thus show
    that PolyZero is also NP-complete (checking that it lies in NP is left
    as an exercise).

    It suffices to define the reduction function r on the structural
    elements of a boolean formula. The idea is that for any boolean formula
    f, the value r(f) is a polynomial with the same number of variables and
    f(a_1, …, a_k) is true if and only if r(f)(a_1, …, a_k)
    is zero, where true corresponds to 1 and false corresponds to 0, and
    r(f) only assumes the value 0 or 1 on variables from {0, 1}:

    • r(x_i) := (1 – x_i)
    • r(¬f) := (1 – r(f))
    • r((f ∧ g)) := (1 – (1 – r(f))(1 – r(g)))
    • r((f ∨ g)) := r(f) r(g)

    One might have assumed that r((f ∧ g)) would be defined as r(f) +
    r(g), but that will take the value of the polynomial out of the {0, 1}
    set.

    Using r, the formula ((x ∧ y) ∨ ¬x) is translated to (1 – (1 – (1 – x))(1 – (1 – y)))(1 – (1 – x)).

    Note that each of the replacement rules for r satisfies the goal stated above and thus r correctly performs the reduction:

    SAT(f) = PolyZero(r(f)) or f is satisfiable if and only if r(f) has a zero in {0, 1}
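
    To make the reduction tangible, here is a sketch that implements r on the same nested-tuple formula encoding as above and uses sympy (the library choice is mine) to expand the resulting polynomial:

    # Sketch of the reduction r from SAT to PolyZero.
    import sympy as sp

    def r(formula, variables):
        tag = formula[0]
        if tag == 'var':
            return 1 - variables[formula[1]]
        if tag == 'not':
            return 1 - r(formula[1], variables)
        if tag == 'and':
            return 1 - (1 - r(formula[1], variables)) * (1 - r(formula[2], variables))
        if tag == 'or':
            return r(formula[1], variables) * r(formula[2], variables)
        raise ValueError('unknown node')

    x, y = sp.symbols('x y')
    # ((x ∧ y) ∨ ¬x) from the text
    f = ('or', ('and', ('var', 0), ('var', 1)), ('not', ('var', 0)))
    p = sp.expand(r(f, [x, y]))
    print(p)                                  # the polynomial r(f)
    print(p.subs({x: 1, y: 1}))               # 0, because x = y = true satisfies f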

    Witness Preservation

    From this example, you can see that the reduction function only
    defines how to translate the input, but when you look at it more closely
    (or read the proof that it performs a valid reduction), you also see a
    way to transform a valid witness together with the input. In our
    example, we only defined how to translate the formula to a polynomial,
    but with the proof we explained how to transform the witness, the
    satisfying assignment. This simultaneous transformation of the witness
    is not required for a reduction, but it is usually also done. This is
    quite important for zkSNARKs, because the only task for the prover
    is to convince the verifier that such a witness exists, without
    revealing information about the witness.

    Quadratic Span Programs

    In the previous section, we saw how computational problems inside NP
    can be reduced to each other and especially that there are NP-complete
    problems that are basically only reformulations of all other problems in
    NP – including transaction validation problems. This makes it easy for
    us to find a generic zkSNARK for all problems in NP: We just choose a
    suitable NP-complete problem. So if we want to show how to validate
    transactions with zkSNARKs, it is sufficient to show how to do it for a
    certain problem that is NP-complete and perhaps much easier to work with
    theoretically.

    This and the following section are based on the paper GGPR12
    (the linked technical report has much more information than the journal
    paper), where the authors found that the problem called Quadratic Span
    Programs (QSP) is particularly well suited for zkSNARKs. A Quadratic
    Span Program consists of a set of polynomials and the task is to find a
    linear combination of those that is a multiple of another given
    polynomial. Furthermore, the individual bits of the input string
    restrict the polynomials you are allowed to use. In detail (the general
    QSPs are a bit more relaxed, but we already define the strong version because that will be used later):

    A QSP over a field F for inputs of length n consists of

    • a set of polynomials v_0, …, v_m, w_0, …, w_m over this field F,
    • a polynomial t over F (the target polynomial),
    • an injective function f: {(i, j) | 1 ≤ i ≤ n, j ∈ {0, 1}} → {1, …, m}

    The task here is, roughly, to multiply the polynomials by factors and add them so that the sum (which is called a linear combination)
    is a multiple of t. For each binary input string u, the function f
    restricts the polynomials that can be used or, more specifically, their
    factors in the linear combination. More formally:

    An input u is accepted (verified) by the QSP if and only if there are tuples a = (a_1, …, a_m), b = (b_1, …, b_m) from the field F such that

    • a_k = b_k = 1 if k = f(i, u[i]) for some i (u[i] is the i-th bit of u),
    • a_k = b_k = 0 if k = f(i, 1 – u[i]) for some i, and
    • the target polynomial t divides v_a w_b, where v_a = v_0 + a_1 v_1 + … + a_m v_m and w_b = w_0 + b_1 w_1 + … + b_m w_m.

    Note that there is still some freedom in choosing the tuples a and b
    if 2n is smaller than m. This means QSP only makes sense for inputs up
    to a certain size – this problem is removed by using non-uniform
    complexity, a topic we will not dive into now, let us just note that it
    works well for cryptography where inputs are generally small.

    As an analogy to satisfiability of boolean formulas, you can see the factors a_1, …, a_m, b_1, …, b_m
    as the assignments to the variables, or in general, the NP witness. To
    see that QSP lies in NP, note that all the verifier has to do (once she
    knows the factors) is checking that the polynomial t divides v_a w_b, which is a polynomial-time problem.
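
    The following sketch spells out that polynomial-time check with sympy (the toy polynomials are invented purely for illustration; a real instance would come from the reduction discussed next):

    # Toy check of the QSP acceptance condition.
    import sympy as sp

    x = sp.symbols('x')

    def accepts(t, v, w, a, b):
        # v, w are lists [v0, ..., vm] / [w0, ..., wm]; a, b are the factor tuples.
        v_a = v[0] + sum(ai * vi for ai, vi in zip(a, v[1:]))
        w_b = w[0] + sum(bi * wi for bi, wi in zip(b, w[1:]))
        _, remainder = sp.div(sp.expand(v_a * w_b), t, x)
        return remainder == 0

    t = x**2 - 1                              # target polynomial (x-1)(x+1)
    v = [sp.Integer(0), x - 1, x + 1]         # v0, v1, v2
    w = [sp.Integer(0), x + 1, x - 1]         # w0, w1, w2
    print(accepts(t, v, w, (1, 0), (1, 0)))   # True: (x-1)(x+1) is a multiple of t
    print(accepts(t, v, w, (1, 0), (0, 1)))   # False: (x-1)(x-1) is not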

    We will not talk about the reduction from generic computations or
    circuits to QSP here, as it does not contribute to the understanding of
    the general concept, so you have to believe me that QSP is NP-complete
    (or rather complete for some non-uniform analogue like NP/poly). In
    practice, the reduction is the actual “engineering” part – it has to be
    done in a clever way such that the resulting QSP will be as small as
    possible and also has some other nice features.

    One thing about QSPs that we can already see is how to verify them
    much more efficiently: The verification task consists of checking
    whether one polynomial divides another polynomial. This can be
    facilitated by the prover in providing another polynomial h such that t h
    = v_a w_b, which turns the task into checking a polynomial identity or, put differently, into checking that t h – v_a w_b
    = 0, i.e. checking that a certain polynomial is the zero polynomial.
    This looks rather easy, but the polynomials we will use later are quite
    large (the degree is roughly 100 times the number of gates in the
    original circuit) so that multiplying two polynomials is not an easy
    task.

    So instead of actually computing v_a, w_b and
    their product, the verifier chooses a secret random point s (this point
    is part of the “toxic waste” of zCash), computes the numbers t(s), v_k(s) and w_k(s) for all k and, from them, v_a(s) and w_b(s), and only checks that t(s) h(s) = v_a(s) w_b(s).
    So a bunch of polynomial additions, multiplications with a scalar
    and a polynomial product are simplified to field multiplications and
    additions.

    Checking a polynomial identity only at a single point instead of at
    all points of course reduces the security, but the only way the prover
    can cheat in case t h – v_a w_b is not the zero
    polynomial is if she manages to hit a zero of that polynomial, but since
    she does not know s and the number of zeros is tiny (the degree of the
    polynomials) when compared to the possibilities for s (the number of
    field elements), this is very safe in practice.
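
    A rough sketch of this single-point check, over a toy prime field (real systems use cryptographically large fields and, as described below, evaluate at an encrypted point):

    # Checking the polynomial identity t*h = va*wb only at a random point s.
    import random

    P = 2**31 - 1                              # a small prime field, for illustration

    def ev(coeffs, s):
        # Evaluate a polynomial given by its coefficient list [c0, c1, ...] at s.
        result = 0
        for c in reversed(coeffs):
            result = (result * s + c) % P
        return result

    t  = [P - 1, 0, 1]                         # x^2 - 1
    h  = [1, 1]                                # x + 1
    va = [P - 1, 1]                            # x - 1
    wb = [1, 2, 1]                             # x^2 + 2x + 1, so t*h == va*wb

    s = random.randrange(P)                    # the secret evaluation point
    print(ev(t, s) * ev(h, s) % P == ev(va, s) * ev(wb, s) % P)      # True

    # A cheating prover with a wrong h only passes if s happens to hit a zero of
    # t*h - va*wb, which is vanishingly unlikely for large fields.
    bad_h = [2, 1]
    print(ev(t, s) * ev(bad_h, s) % P == ev(va, s) * ev(wb, s) % P)  # almost surely False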

    The zkSNARK in Detail

    We now describe the zkSNARK for QSP in detail. It starts with a setup
    phase that has to be performed for every single QSP. In zCash, the
    circuit (the transaction verifier) is fixed, and thus the polynomials
    for the QSP are fixed which allows the setup to be performed only once
    and re-used for all transactions, which only vary the input u. For the
    setup, which generates the common reference string (CRS), the
    verifier chooses a random and secret field element s and encrypts the
    values of the polynomials at that point. The verifier uses some specific
    encryption E and publishes E(v_k(s)) and E(w_k(s))
    in the CRS. The CRS also contains several other values which make the
    verification more efficient and also add the zero-knowledge property.
    The encryption E used there has a certain homomorphic property, which
    allows the prover to compute E(v(s)) without actually knowing v_k(s).

    How to Evaluate a Polynomial Succinctly and with Zero-Knowledge

    Let us first look at a simpler case, namely just the encrypted
    evaluation of a polynomial at a secret point, and not the full QSP
    problem.

    For this, we fix a group (an elliptic curve is usually chosen here) and a generator g. Remember that a group element is called a generator if there is a number n (the group order) such that the list g^0, g^1, g^2, …, g^(n-1) contains all elements of the group. The encryption is simply E(x) := g^x. Now the verifier chooses a secret field element s and publishes (as part of the CRS)

    E(s^0), E(s^1), …, E(s^d) – d is the maximum degree of all polynomials

    After that, s can be (and has to be) forgotten. This is exactly what
    zCash calls toxic waste, because if someone can recover this and the
    other secret values chosen later, they can arbitrarily spoof proofs by
    finding zeros in the polynomials.

    Using these values, the prover can compute E(f(s)) for arbitrary
    polynomials f without knowing s: Assume our polynomial is f(x) = 4x^2 + 2x + 4 and we want to compute E(f(s)), then we get E(f(s)) = E(4s^2 + 2s + 4) = g^(4s^2 + 2s + 4) = E(s^2)^4 E(s^1)^2 E(s^0)^4, which can be computed from the published CRS without knowing s.
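
    The same computation, sketched in Python with the multiplicative group modulo a small prime standing in for the elliptic curve (insecure and purely didactic – the concrete numbers are mine, only the polynomial comes from the example above):

    # Toy walk-through of the f(x) = 4x^2 + 2x + 4 example.
    P = 101                                    # tiny prime, purely for illustration
    g = 2                                      # base element playing the role of the generator
    d = 2                                      # maximum degree we support

    def E(x):
        return pow(g, x, P)

    # --- setup (done once, then s is thrown away as "toxic waste") ---
    s = 17                                     # the secret point
    crs = [E(s**i) for i in range(d + 1)]      # E(s^0), E(s^1), E(s^2)

    # --- prover: compute E(f(s)) from the CRS only ---
    coeffs = [4, 2, 4]                         # constant, linear, quadratic coefficient
    encrypted_eval = 1
    for c, power in zip(coeffs, crs):
        encrypted_eval = encrypted_eval * pow(power, c, P) % P

    # Sanity check against direct evaluation (only possible because we kept s here).
    assert encrypted_eval == E(4 * s**2 + 2 * s + 4)
    print(encrypted_eval)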

    The only problem here is that, because s was destroyed, the verifier
    cannot check that the prover evaluated the polynomial correctly. For
    that, we also choose another secret field element, α, and publish the
    following “shifted” values:

    E(αs^0), E(αs^1), …, E(αs^d)

    As with s, the value α is also destroyed after the setup phase and
    neither known to the prover nor the verifier. Using these encrypted
    values, the prover can similarly compute E(α f(s)), in our example this
    is E(4αs^2 + 2αs + 4α) = E(αs^2)^4 E(αs^1)^2 E(αs^0)^4.
    So the prover publishes A := E(f(s)) and B := E(α f(s)) and the
    verifier has to check that these values match. She does this by using
    another main ingredient: A so-called pairing function e. The elliptic curve and the pairing function have to be chosen together, so that the following property holds for all x, y:

    e(g^x, g^y) = e(g, g)^(xy)

    Using this pairing function, the verifier checks that e(A, g^α) = e(B, g) — note that g^α is known to the verifier because it is part of the CRS as E(αs^0). In order to see that this check is valid if the prover does not cheat, let us look at the following equalities:

    e(A, g^α) = e(g^f(s), g^α) = e(g, g)^(α f(s))

    e(B, g) = e(g^(α f(s)), g) = e(g, g)^(α f(s))
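
    For readers who want to try the pairing check itself, here is a hedged sketch using the py_ecc library (the bn128 implementation used in the Ethereum ecosystem); that library choice, its module layout and the argument order pairing(G2-point, G1-point) are assumptions on my part, and the secret values are of course made up:

    # Sketch of the check e(A, g^alpha) = e(B, g) on bn128 via py_ecc (assumed API).
    from py_ecc import bn128

    q = bn128.curve_order
    alpha = 1234567                            # setup secret (would be destroyed)
    f_s = 42                                   # pretend this is f(s) for some secret s

    A = bn128.multiply(bn128.G1, f_s)          # A = E(f(s)) = g^f(s) in G1
    B = bn128.multiply(bn128.G1, alpha * f_s % q)   # B = E(alpha * f(s))
    g2_alpha = bn128.multiply(bn128.G2, alpha)      # g^alpha, published in G2

    # Verifier: e(A, g^alpha) should equal e(B, g).
    left = bn128.pairing(g2_alpha, A)
    right = bn128.pairing(bn128.G2, B)
    print(left == right)                       # True for an honest prover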

    The more important part, though, is the question whether the prover
    can somehow come up with values A, B that fulfill the check e(A, g^α)
    = e(B, g) but are not E(f(s)) and E(α f(s)), respectively. The answer
    to this question is “we hope not”. Seriously, this is called the
    “d-power knowledge of exponent assumption” and it is unknown whether a
    cheating prover can do such a thing or not. This assumption is an
    extension of similar assumptions that are made for proving the security
    of other public-key encryption schemes and which are similarly unknown
    to be true or not.

    Actually, the above protocol does not really allow the verifier to check that the prover evaluated the polynomial f(x) = 4x^2 + 2x + 4; the verifier can only check that the prover evaluated some
    polynomial at the point s. The zkSNARK for QSP will contain another
    value that allows the verifier to check that the prover did indeed
    evaluate the correct polynomial.

    What this example does show is that the verifier does not need to
    evaluate the full polynomial to confirm this, it suffices to evaluate
    the pairing function. In the next step, we will add the zero-knowledge
    part so that the verifier cannot reconstruct anything about f(s), not
    even E(f(s)) – the encrypted value.

    For that, the prover picks a random δ and instead of A := E(f(s)) and
    B := E(α f(s)), she sends over A’ := E(δ + f(s)) and B’ := E(α (δ +
    f(s))). If we assume that the encryption cannot be broken, the
    zero-knowledge property is quite obvious. We now have to check two
    things: 1. the prover can actually compute these values and 2. the check
    by the verifier is still true.

    For 1., note that A’ = E(δ + f(s)) = g^(δ + f(s)) = g^δ g^f(s) = E(δ) E(f(s)) = E(δ) A and similarly, B’ = E(α (δ + f(s))) = E(αδ + α f(s)) = g^(αδ + α f(s)) = g^(αδ) g^(α f(s)) = E(α)^δ E(α f(s)) = E(α)^δ B.

    For 2., note that the only thing the verifier checks is that the
    values A and B she receives satisfy the equation A = E(a) and B = E(α a)
    for some value a, which is obviously the case for a = δ + f(s) as it is
    the case for a = f(s).
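
    Continuing the toy mod-p sketch from above, the blinding step and the two facts just checked look like this (again insecure and purely illustrative):

    # Blinding A and B with a random delta, as derived in the text.
    P = 101
    g = 2

    def E(x):
        return pow(g, x, P)

    s, alpha = 17, 23                          # setup secrets ("toxic waste")
    f_s = 4 * s**2 + 2 * s + 4                 # f(s) for f(x) = 4x^2 + 2x + 4

    A, B = E(f_s), E(alpha * f_s)
    delta = 7                                  # prover's random blinding value

    A_blinded = E(delta) * A % P               # = E(delta + f(s))
    B_blinded = pow(E(alpha), delta, P) * B % P   # = E(alpha * (delta + f(s)))

    assert A_blinded == E(delta + f_s)
    assert B_blinded == E(alpha * (delta + f_s))
    print(A_blinded, B_blinded)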

    Ok, so we now know a bit about how the prover can compute the
    encrypted value of a polynomial at an encrypted secret point without the
    verifier learning anything about that value. Let us now apply that to
    the QSP problem.

    A SNARK for the QSP Problem

    Remember that in the QSP we are given polynomials v_0, …, v_m, w_0, …, w_m, a target polynomial t (of degree at most d) and a binary input string u. The prover finds a_1, …, a_m, b_1, …, b_m (that are somewhat restricted depending on u) and a polynomial h such that

    t h = (v_0 + a_1 v_1 + … + a_m v_m) (w_0 + b_1 w_1 + … + b_m w_m).

    In the previous section, we already explained how the common
    reference string (CRS) is set up. We choose secret numbers s and α and
    publish

    E(s^0), E(s^1), …, E(s^d) and E(αs^0), E(αs^1), …, E(αs^d)

    Because we do not have a single polynomial, but sets of polynomials
    that are fixed for the problem, we also publish the evaluated
    polynomials right away:

    • E(t(s)), E(α t(s)),
    • E(v_0(s)), …, E(v_m(s)), E(α v_0(s)), …, E(α v_m(s)),
    • E(w_0(s)), …, E(w_m(s)), E(α w_0(s)), …, E(α w_m(s)),

    and we need further secret numbers β_v, β_w, γ (they will be used to verify that those polynomials were evaluated and not some arbitrary polynomials) and publish

    • E(γ), E(β_v γ), E(β_w γ),
    • E(β_v v_1(s)), …, E(β_v v_m(s)),
    • E(β_w w_1(s)), …, E(β_w w_m(s)),
    • E(β_v t(s)), E(β_w t(s))

    This is the full common reference string. In practical
    implementations, some elements of the CRS are not needed, but that would
    complicate the presentation.

    Now what does the prover do? She uses the reduction explained above to find the polynomial h and the values a_1, …, a_m, b_1, …, b_m. Here it is important to use a witness-preserving reduction (see above) because only then, the values a_1, …, a_m, b_1, …, b_m
    can be computed together with the reduction and would be very hard to
    find otherwise. In order to describe what the prover sends to the
    verifier as proof, we have to go back to the definition of the QSP.

    There was an injective function f: {(i, j) | 1 ≤ i ≤ n, j ∈ {0, 1}} → {1, …, m} which restricts the values of a_1, …, a_m, b_1, …, b_m.
    Since m is relatively large, there are numbers which do not appear in
    the output of f for any input. These indices are not restricted, so let
    us call them I_free and define v_free(x) = Σ_k a_k v_k(x), where k ranges over all indices in I_free. For w(x) = b_1 w_1(x) + … + b_m w_m(x), the proof now consists of

    • V_free := E(v_free(s)), W := E(w(s)), H := E(h(s)),
    • V’_free := E(α v_free(s)), W’ := E(α w(s)), H’ := E(α h(s)),
    • Y := E(β_v v_free(s) + β_w w(s))

    where the last part is used to check that the correct polynomials
    were used (this is the part we did not cover yet in the other example).
    Note that all these encrypted values can be generated by the prover
    knowing only the CRS.
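
    As a small illustration of that last point, here is how an encrypted linear combination such as E(v_free(s)) can be assembled from the published E(v_k(s)) values alone, again in the toy mod-p group (made-up numbers, no elliptic curve, no security):

    # Assembling E(a1*v1(s) + ... + am*vm(s)) from the published E(vk(s)) values.
    P = 101
    g = 2

    def E(x):
        return pow(g, x, P)

    # Published by the setup (the prover never sees these plain values):
    vk_at_s = [11, 29, 3]                      # pretend values v1(s), v2(s), v3(s)
    crs_v = [E(v) for v in vk_at_s]

    # Prover's witness factors a1, a2, a3:
    a = [1, 0, 2]

    encrypted_combination = 1
    for ak, evk in zip(a, crs_v):
        encrypted_combination = encrypted_combination * pow(evk, ak, P) % P

    assert encrypted_combination == E(sum(ak * vk for ak, vk in zip(a, vk_at_s)))
    print(encrypted_combination)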

    The task of the verifier is now the following:

    Since the values of a_k, where k is not a “free” index, can
    be computed directly from the input u (which is also known to the
    verifier, this is what is to be verified), the verifier can compute the
    missing part of the full sum for v:

    E(v_in(s)) = E(Σ_k a_k v_k(s)), where k ranges over all indices not in I_free.

    With that, the verifier now confirms the following equalities using the pairing function e (don’t be scared):

    • e(V’_free, g) = e(V_free, g^α), e(W’, E(1)) = e(W, E(α)), e(H’, E(1)) = e(H, E(α))
    • e(E(γ), Y) = e(E(β_v γ), V_free) e(E(β_w γ), W)
    • e(E(v_0(s)) E(v_in(s)) V_free, E(w_0(s)) W) = e(H, E(t(s)))

    To grasp the general concept here, you have to understand that the
    pairing function allows us to do some limited computation on encrypted
    values: We can do arbitrary additions but just a single multiplication.
    The addition comes from the fact that the encryption itself is already
    additively homomorphic and the single multiplication is realized by the
    two arguments the pairing function has. So e(W’, E(1)) = e(W, E(α))
    basically multiplies W’ by 1 in the encrypted space and compares that to
    W multiplied by α in the encrypted space. If you look up the values W
    and W’ are supposed to have – E(w(s)) and E(α w(s)) – this checks out if
    the prover supplied a correct proof.

    If you remember from the section about evaluating polynomials at
    secret points, these first three checks basically verify that the prover
    did evaluate some polynomial built up from the parts in the CRS. The
    second item is used to verify that the prover used the correct
    polynomials v and w and not just some arbitrary ones. The idea behind it is
    that the prover has no way to compute the encrypted combination E(β_v v_free(s) + β_w w(s)) other than from the exact values of E(v_free(s)) and E(w(s)). The reason is that the value β_v is not part of the CRS in isolation, but only in combination with the values v_k(s), and β_w is only known in combination with the polynomials w_k(s). The only way to “mix” them is via the equally encrypted γ.

    Assuming the prover provided a correct proof, let us check that the
    equality works out. The left and right hand sides are, respectively

    e(E(γ), Y) = e(E(γ), E(β_v v_free(s) + β_w w(s))) = e(g, g)^(γ (β_v v_free(s) + β_w w(s)))
    e(E(β_v γ), V_free) e(E(β_w γ), W) = e(E(β_v γ), E(v_free(s))) e(E(β_w γ), E(w(s))) = e(g, g)^((β_v γ) v_free(s)) e(g, g)^((β_w γ) w(s)) = e(g, g)^(γ (β_v v_free(s) + β_w w(s)))

    The third item essentially checks that (v_0(s) + a_1 v_1(s) + … + a_m v_m(s)) (w_0(s) + b_1 w_1(s) + … + b_m w_m(s))
    = h(s) t(s), the main condition for the QSP problem. Note that
    multiplication on the encrypted values translates to addition on the
    unencrypted values because E(x) E(y) = g^x g^y = g^(x+y) = E(x + y).

    Adding Zero-Knowledge

    As I said in the beginning, the remarkable feature about zkSNARKs is
    rather the succinctness than the zero-knowledge part. We will see now
    how to add zero-knowledge and the next section will touch a bit more
    on the succinctness.

    The idea is that the prover “shifts” some values by a random secret
    amount and balances the shift on the other side of the equation. The
    prover chooses random δ_free, δ_w and performs the following replacements in the proof:

    • v_free(s) is replaced by v_free(s) + δ_free t(s)
    • w(s) is replaced by w(s) + δ_w t(s).

    By these replacements, the values V_free and W, which
    contain an encoding of the witness factors, basically become
    indistinguishable from randomness and thus it is impossible to extract
    the witness. Most of the equality checks are “immune” to the
    modifications; the only value we still have to correct is H or h(s). We
    have to ensure that

    (v_0(s) + a_1 v_1(s) + … + a_m v_m(s)) (w_0(s) + b_1 w_1(s) + … + b_m w_m(s)) = h(s) t(s), or in other words
    (v_0(s) + v_in(s) + v_free(s)) (w_0(s) + w(s)) = h(s) t(s)

    still holds. With the modifications, we get

    (v_0(s) + v_in(s) + v_free(s) + δ_free t(s)) (w_0(s) + w(s) + δ_w t(s))

    and by expanding the product, we see that replacing h(s) by

    h(s) + δ_free (w_0(s) + w(s)) + δ_w (v_0(s) + v_in(s) + v_free(s)) + (δ_free δ_w) t(s)

    will do the trick.
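
    If you want to double-check this correction term, a short symbolic computation (sympy, my choice of tool) confirms it, writing V for v_0(s) + v_in(s) + v_free(s) and W for w_0(s) + w(s):

    # Symbolic sanity check of the adjusted h, assuming the original identity V*W = h*t.
    import sympy as sp

    V, W, h, t, d_free, d_w = sp.symbols('V W h t d_free d_w')

    blinded_product = (V + d_free * t) * (W + d_w * t)
    h_adjusted = h + d_free * W + d_w * V + d_free * d_w * t

    difference = sp.expand(blinded_product - t * h_adjusted)
    # The difference reduces to V*W - h*t, which is zero by the original identity.
    print(sp.simplify(difference - (V * W - h * t)))    # 0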

    Tradeoff between Input and Witness Size

    As you have seen in the preceding sections, the proof consists only
    of 7 elements of a group (typically an elliptic curve). Furthermore, the
    work the verifier has to do is checking some equalities involving
    pairing functions and computing E(v_in(s)), a task that is
    linear in the input size. Remarkably, neither the size of the witness
    string nor the computational effort required to verify the QSP (without
    SNARKs) play any role in verification. This means that SNARK-verifying
    extremely complex problems and very simple problems all take the same
    effort. The main reason for that is because we only check the polynomial
    identity for a single point, and not the full polynomial. Polynomials
    can get more and more complex, but a point is always a point. The only
    parameters that influence the verification effort are the level of
    security (i.e. the size of the group) and the maximum size of the
    inputs.

    It is possible to reduce the second parameter, the input size, by shifting some of it into the witness:

    Instead of verifying the function f(u, w), where u is the input and w is the witness, we take a hash function h and verify

    f'(H, (u, w)) := f(u, w) ∧ h(u) = H.

    This means we replace the input u by a hash of the input h(u) (which
    is supposed to be much shorter) and verify that there is some value x
    that hashes to H (and thus is very likely equal to u) in addition to
    checking f(x, w). This basically moves the original input u into the
    witness string and thus increases the witness size but decreases the
    input size to a constant.
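
    A minimal sketch of this trick (the function names and the toy relation are mine):

    # The public input shrinks to a fixed-size hash H; u moves into the witness.
    import hashlib

    def f(u, w):
        # Stand-in relation between public input u and witness w.
        return u in w

    def f_prime(H, u, w):
        # Same relation, but the public input is only the hash H.
        return f(u, w) and hashlib.sha256(u).digest() == H

    u = b"a potentially very long public input"
    w = b"a potentially very long public input plus the original witness"
    H = hashlib.sha256(u).digest()             # constant-size public input
    print(f_prime(H, u, w))                    # True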

    This is remarkable, because it allows us to verify arbitrarily complex statements in constant time.

    How is this Relevant to Ethereum

    Since verifying arbitrary computations is at the core of the Ethereum
    blockchain, zkSNARKs are of course very relevant to Ethereum. With
    zkSNARKs, it becomes possible to not only perform secret arbitrary
    computations that are verifiable by anyone, but also to do this
    efficiently.

    Although Ethereum uses a Turing-complete virtual machine, it is
    currently not yet possible to implement a zkSNARK verifier in Ethereum.
    The verifier tasks might seem simple conceptually, but a pairing
    function is actually very hard to compute and thus it would use more gas
    than is currently available in a single block. Elliptic curve
    multiplication is already relatively complex and pairings take that to
    another level.

    Existing zkSNARK systems like zCash use the same problem / circuit /
    computation for every task. In the case of zCash, it is the transaction
    verifier. On Ethereum, zkSNARKs would not be limited to a single
    computational problem, but instead, everyone could set up a zkSNARK
    system for their specialized computational problem without having to
    launch a new blockchain. Every new zkSNARK system that is added to
    Ethereum requires a new secret trusted setup phase (some parts can be
    re-used, but not all), i.e. a new CRS has to be generated. It is also
    possible to do things like adding a zkSNARK system for a “generic
    virtual machine”. This would not require a new setup for a new use-case
    in much the same way as you do not need to bootstrap a new blockchain
    for a new smart contract on Ethereum.

    Getting zkSNARKs to Ethereum

    There are multiple ways to enable zkSNARKs for Ethereum. All of them
    reduce the actual costs for the pairing functions and elliptic curve
    operations (the other required operations are already cheap enough) and
    thus also allow the gas costs for these operations to be reduced:

    • improve the (guaranteed) performance of the EVM
    • improve the performance of the EVM only for certain pairing functions and elliptic curve multiplications

    The first option is of course the one that pays off better in the
    long run, but is harder to achieve. We are currently working on adding
    features and restrictions to the EVM which would allow better
    just-in-time compilation and also interpretation without too many
    required changes in the existing implementations. The other possibility
    is to swap out the EVM completely and use something like eWASM.

    The second option can be realized by forcing all Ethereum clients to
    implement a certain pairing function and multiplication on a certain
    elliptic curve as a so-called precompiled contract. The benefit is that
    this is probably much easier and faster to achieve. On the other hand,
    the drawback is that we are fixed on a certain pairing function and a
    certain elliptic curve. Any new client for Ethereum would have to
    re-implement these precompiled contracts. Furthermore, if there are
    advancements and someone finds better zkSNARKs, better pairing functions
    or better elliptic curves, or if a flaw is found in the elliptic curve,
    pairing function or zkSNARK, we would have to add new precompiled
    contracts.



  • zkSNARKs in a nutshell


    The possibilities of zkSNARKs are impressive, you can verify the correctness of computations without having to execute them and you will not even learn what was executed – just that it was done correctly. Unfortunately, most explanations of zkSNARKs resort to hand-waving at some point and thus they remain something “magical”, suggesting that only the most enlightened actually understand how and why (and if?) they work. The reality is that zkSNARKs can be reduced to four simple techniques and this blog post aims to explain them. Anyone who can understand how the RSA cryptosystem works, should also get a pretty good understanding of currently employed zkSNARKs. Let’s see if it will achieve its goal!

    pdf version


    As a very short summary, zkSNARKs as currently implemented, have 4 main ingredients (don’t worry, we will explain all the terms in later sections):

    A) Encoding as a polynomial problem

    The program that is to be checked is compiled into a quadratic equation of polynomials: t(x) h(x) = w(x) v(x), where the equality holds if and only if the program is computed correctly. The prover wants to convince the verifier that this equality holds.

    B) Succinctness by random sampling

    The verifier chooses a secret evaluation point s to reduce the problem from multiplying polynomials and verifying polynomial function equality to simple multiplication and equality check on numbers: t(s)h(s) = w(s)v(s)

    This reduces both the proof size and the verification time tremendously.

    C) Homomorphic encoding / encryption

    An encoding/encryption function E is used that has some homomorphic properties (but is not fully homomorphic, something that is not yet practical). This allows the prover to compute E(t(s)), E(h(s)), E(w(s)), E(v(s)) without knowing s, she only knows E(s) and some other helpful encrypted values.

    D) Zero Knowledge

    The prover permutes the values E(t(s)), E(h(s)), E(w(s)), E(v(s)) by multiplying with a number so that the verifier can still check their correct structure without knowing the actual encoded values.

    The very rough idea is that checking t(s)h(s) = w(s)v(s) is identical to checking t(s)h(s) k = w(s)v(s) k for a random secret number k (which is not zero), with the difference that if you are sent only the numbers (t(s)h(s) k) and (w(s)v(s) k), it is impossible to derive t(s)h(s) or w(s)v(s).

    This was the hand-waving part so that you can understand the essence of zkSNARKs, and now we get into the details.

    RSA and Zero-Knowledge Proofs

    Let us start with a quick reminder of how RSA works, leaving out some nit-picky details. Remember that we often work with numbers modulo some other number instead of full integers. The notation here is “a + b ≡ c (mod n)”, which means “(a + b) % n = c % n”. Note that the “(mod n)” part does not apply to the right hand side “c” but actually to the “≡” and all other “≡” in the same equation. This makes it quite hard to read, but I promise to use it sparingly. Now back to RSA:

    The prover comes up with the following numbers:

    • p, q: two random secret primes
    • n := p q
    • d: random number such that 1 < d < n – 1
    • e: a number such that d e ≡ 1 (mod (p-1)(q-1)).

    The public key is (e, n) and the private key is d. The primes p and q can be discarded but should not be revealed.

    The message m is encrypted via

    • E(m) := me % n

    and c = E(m) is decrypted via

    • D(c) := cd % n.

    Because of the fact that cd ≡ (me % n)d ≡ med (mod n) and multiplication in the exponent of m behaves like multiplication in the group modulo (p-1)(q-1), we get med ≡ m (mod n). Furthermore, the security of RSA relies on the assumption that n cannot be factored efficiently and thus d cannot be computed from e (if we knew p and q, this would be easy).

    One of the remarkable feature of RSA is that it is multiplicatively homomorphic. In general, two operations are homomorphic if you can exchange their order without affecting the result. In the case of homomorphic encryption, this is the property that you can perform computations on encrypted data. Fully homomorphic encryption, something that exists, but is not practical yet, would allow to evaluate arbitrary programs on encrypted data. Here, for RSA, we are only talking about group multiplication. More formally: E(x) E(y) ≡ xeye ≡ (xy)e ≡ E(x y) (mod n), or in words: The product of the encryption of two messages is equal to the encryption of the product of the messages.

    This homomorphicity already allows some kind of zero-knowledge proof of multiplication: The prover knows some secret numbers x and y and computes their product, but sends only the encrypted versions a = E(x), b = E(y) and c = E(x y) to the verifier. The verifier now checks that (a b) % n ≡ c % n and the only thing the verifier learns is the encrypted version of the product and that the product was correctly computed, but she neither knows the two factors nor the actual product. If you replace the product by addition, this already goes into the direction of a blockchain where the main operation is to add balances.

    Interactive Verification

    Having touched a bit on the zero-knowledge aspect, let us now focus on the other main feature of zkSNARKs, the succinctness. As you will see later, the succinctness is the much more remarkable part of zkSNARKs, because the zero-knowledge part will be given “for free” due to a certain encoding that allows for a limited form of homomorphic encoding.

    SNARKs are short for succinct non-interactive arguments of knowledge. In this general setting of so-called interactive protocols, there is a prover and a verifier and the prover wants to convince the verifier about a statement (e.g. that f(x) = y) by exchanging messages. The generally desired properties are that no prover can convince the verifier about a wrong statement (soundness) and there is a certain strategy for the prover to convince the verifier about any true statement (completeness). The individual parts of the acronym have the following meaning:

    • Succinct: the sizes of the messages are tiny in comparison to the length of the actual computation
    • Non-interactive: there is no or only little interaction. For zkSNARKs, there is usually a setup phase and after that a single message from the prover to the verifier. Furthermore, SNARKs often have the so-called “public verifier” property meaning that anyone can verify without interacting anew, which is important for blockchains.
    • ARguments: the verifier is only protected against computationally limited provers. Provers with enough computational power can create proofs/arguments about wrong statements (Note that with enough computational power, any public-key encryption can be broken). This is also called “computational soundness”, as opposed to “perfect soundness”.
    • of Knowledge: it is not possible for the prover to construct a proof/argument without knowing a certain so-called witness (for example the address she wants to spend from, the preimage of a hash function or the path to a certain Merkle-tree node).

    If you add the zero-knowledge prefix, you also require the property (roughly speaking) that during the interaction, the verifier learns nothing apart from the validity of the statement. The verifier especially does not learn the witness string – we will see later what that is exactly.

    As an example, let us consider the following transaction validation computation: f(σ1, σ2, s, r, v, ps, pr, v) = 1 if and only if σ1 and σ2 are the root hashes of account Merkle-trees (the pre- and the post-state), s and r are sender and receiver accounts and ps, pr are Merkle-tree proofs that testify that the balance of s is at least v in σ1 and they hash to σ2 instead of σ1 if v is moved from the balance of s to the balance of r.

    It is relatively easy to verify the computation of f if all inputs are known. Because of that, we can turn f into a zkSNARK where only σ1 and σ2 are publicly known and (s, r, v, ps, pr, v) is the witness string. The zero-knowledge property now causes the verifier to be able to check that the prover knows some witness that turns the root hash from σ1 to σ2 in a way that does not violate any requirement on correct transactions, but she has no idea who sent how much money to whom.

    The formal definition (still leaving out some details) of zero-knowledge is that there is a simulator that, having also produced the setup string, but does not know the secret witness, can interact with the verifier — but an outside observer is not able to distinguish this interaction from the interaction with the real prover.

    NP and Complexity-Theoretic Reductions

    In order to see which problems and computations zkSNARKs can be used for, we have to define some notions from complexity theory. If you do not care about what a “witness” is, what you will not know after “reading” a zero-knowledge proof or why it is fine to have zkSNARKs only for a specific problem about polynomials, you can skip this section.

    P and NP

    First, let us restrict ourselves to functions that only output 0 or 1 and call such functions problems. Because you can query each bit of a longer result individually, this is not a real restriction, but it makes the theory a lot easier. Now we want to measure how “complicated” it is to solve a given problem (compute the function). For a specific machine implementation M of a mathematical function f, we can always count the number of steps it takes to compute f on a specific input x – this is called the runtime of M on x. What exactly a “step” is, is not too important in this context. Since the program usually takes longer for larger inputs, this runtime is always measured in the size or length (in number of bits) of the input. This is where the notion of e.g. an “n2 algorithm” comes from – it is an algorithm that takes at most n2 steps on inputs of size n. The notions “algorithm” and “program” are largely equivalent here.

    Programs whose runtime is at most nk for some k are also called “polynomial-time programs”.

    Two of the main classes of problems in complexity theory are P and NP:

    • P is the class of problems L that have polynomial-time programs.

    Even though the exponent k can be quite large for some problems, P is considered the class of “feasible” problems and indeed, for non-artificial problems, k is usually not larger than 4. Verifying a bitcoin transaction is a problem in P, as is evaluating a polynomial (and restricting the value to 0 or 1). Roughly speaking, if you only have to compute some value and not “search” for something, the problem is almost always in P. If you have to search for something, you mostly end up in a class called NP.

    The Class NP

    There are zkSNARKs for all problems in the class NP and actually, the practical zkSNARKs that exist today can be applied to all problems in NP in a generic fashion. It is unknown whether there are zkSNARKs for any problem outside of NP.

    All problems in NP always have a certain structure, stemming from the definition of NP:

    • NP is the class of problems L that have a polynomial-time program V that can be used to verify a fact given a polynomially-sized so-called witness for that fact. More formally:
      L(x) = 1 if and only if there is some polynomially-sized string w (called the witness) such that V(x, w) = 1

    As an example for a problem in NP, let us consider the problem of boolean formula satisfiability (SAT). For that, we define a boolean formula using an inductive definition:

    • any variable x1, x2, x3,… is a boolean formula (we also use any other character to denote a variable
    • if f is a boolean formula, then ¬f is a boolean formula (negation)
    • if f and g are boolean formulas, then (f ∧ g) and (f ∨ g) are boolean formulas (conjunction / and, disjunction / or).

    The string “((x1∧ x2) ∧ ¬x2)” would be a boolean formula.

    A boolean formula is satisfiable if there is a way to assign truth values to the variables so that the formula evaluates to true (where ¬true is false, ¬false is true, true ∧ false is false and so on, the regular rules). The satisfiability problem SAT is the set of all satisfiable boolean formulas.

    • SAT(f) := 1 if f is a satisfiable boolean formula and 0 otherwise

    The example above, “((x1∧ x2) ∧ ¬x2)”, is not satisfiable and thus does not lie in SAT. The witness for a given formula is its satisfying assignment and verifying that a variable assignment is satisfying is a task that can be solved in polynomial time.

    P = NP?

    If you restrict the definition of NP to witness strings of length zero, you capture the same problems as those in P. Because of that, every problem in P also lies in NP. One of the main tasks in complexity theory research is showing that those two classes are actually different – that there is a problem in NP that does not lie in P. It might seem obvious that this is the case, but if you can prove it formally, you can win US$ 1 million. Oh and just as a side note, if you can prove the converse, that P and NP are equal, apart from also winning that amount, there is a big chance that cryptocurrencies will cease to exist from one day to the next. The reason is that it will be much easier to find a solution to a proof of work puzzle, a collision in a hash function or the private key corresponding to an address. Those are all problems in NP and since you just proved that P = NP, there must be a polynomial-time program for them. But this article is not to scare you, most researchers believe that P and NP are not equal.

    NP-Completeness

    Let us get back to SAT. The interesting property of this seemingly simple problem is that it does not only lie in NP, it is also NP-complete. The word “complete” here is the same complete as in “Turing-complete”. It means that it is one of the hardest problems in NP, but more importantly — and that is the definition of NP-complete — an input to any problem in NP can be transformed to an equivalent input for SAT in the following sense:

    For any NP-problem L there is a so-called reduction function f, which is computable in polynomial time such that:

    • L(x) = SAT(f(x))

    Such a reduction function can be seen as a compiler: It takes source code written in some programming language and transforms in into an equivalent program in another programming language, which typically is a machine language, which has the some semantic behaviour. Since SAT is NP-complete, such a reduction exists for any possible problem in NP, including the problem of checking whether e.g. a bitcoin transaction is valid given an appropriate block hash. There is a reduction function that translates a transaction into a boolean formula, such that the formula is satisfiable if and only if the transaction is valid.

    Reduction Example

    In order to see such a reduction, let us consider the problem of evaluating polynomials. First, let us define a polynomial (similar to a boolean formula) as an expression consisting of integer constants, variables, addition, subtraction, multiplication and (correctly balanced) parentheses. Now the problem we want to consider is

    • PolyZero(f) := 1 if f is a polynomial which has a zero where its variables are taken from the set {0, 1}

    We will now construct a reduction from SAT to PolyZero and thus show that PolyZero is also NP-complete (checking that it lies in NP is left as an exercise).

    It suffices to define the reduction function r on the structural elements of a boolean formula. The idea is that for any boolean formula f, the value r(f) is a polynomial with the same number of variables and f(a1,..,ak) is true if and only if r(f)(a1,..,ak) is zero, where true corresponds to 1 and false corresponds to 0, and r(f) only assumes the value 0 or 1 on variables from {0, 1}:

    • r(xi) := (1 – xi)
    • r(¬f) := (1 – r(f))
    • r((f ∧ g)) := (1 – (1 – r(f))(1 – r(g)))
    • r((f ∨ g)) := r(f)r(g)

    One might have assumed that r((f ∧ g)) would be defined as r(f) + r(g), but that will take the value of the polynomial out of the {0, 1} set.

    Using r, the formula ((x ∧ y) ∨¬x) is translated to (1 – (1 – (1 – x))(1 – (1 – y))(1 – (1 – x)),

    Note that each of the replacement rules for r satisfies the goal stated above and thus r correctly performs the reduction:

    • SAT(f) = PolyZero(r(f)) or f is satisfiable if and only if r(f) has a zero in {0, 1}

    Witness Preservation

    From this example, you can see that the reduction function only defines how to translate the input, but when you look at it more closely (or read the proof that it performs a valid reduction), you also see a way to transform a valid witness together with the input. In our example, we only defined how to translate the formula to a polynomial, but with the proof we explained how to transform the witness, the satisfying assignment. This simultaneous transformation of the witness is not required for a transaction, but it is usually also done. This is quite important for zkSNARKs, because the the only task for the prover is to convince the verifier that such a witness exists, without revealing information about the witness.

    Quadratic Span Programs

    In the previous section, we saw how computational problems inside NP can be reduced to each other and especially that there are NP-complete problems that are basically only reformulations of all other problems in NP – including transaction validation problems. This makes it easy for us to find a generic zkSNARK for all problems in NP: We just choose a suitable NP-complete problem. So if we want to show how to validate transactions with zkSNARKs, it is sufficient to show how to do it for a certain problem that is NP-complete and perhaps much easier to work with theoretically.

    This and the following section is based on the paper GGPR12 (the linked technical report has much more information than the journal paper), where the authors found that the problem called Quadratic Span Programs (QSP) is particularly well suited for zkSNARKs. A Quadratic Span Program consists of a set of polynomials and the task is to find a linear combination of those that is a multiple of another given polynomial. Furthermore, the individual bits of the input string restrict the polynomials you are allowed to use. In detail (the general QSPs are a bit more relaxed, but we already define the strong version because that will be used later):

    A QSP over a field F for inputs of length n consists of

    • a set of polynomials v0,…,vm, w0,…,wm over this field F,
    • a polynomial t over F (the target polynomial),
    • an injective function f: {(i, j) | 1 ≤ i ≤ n, j ∈ {0, 1}} → {1, …, m}

    The task here is roughly, to multiply the polynomials by factors and add them so that the sum (which is called a linear combination) is a multiple of t. For each binary input string u, the function f restricts the polynomials that can be used, or more specific, their factors in the linear combinations. For formally:

    An input u is accepted (verified) by the QSP if and only if there are tuples a = (a1,…,am), b = (b1,…,bm) from the field F such that

    • ak,bk = 1 if k = f(i, u[i]) for some i, (u[i] is the ith bit of u)
    • ak,bk = 0 if k = f(i, 1 – u[i]) for some i and
    • the target polynomial t divides va wb where va = v0 + a1 v0 + … + amvm, wb = w0 + b1 w0 + … + bmwm.

    Note that there is still some freedom in choosing the tuples a and b if 2n is smaller than m. This means QSP only makes sense for inputs up to a certain size – this problem is removed by using non-uniform complexity, a topic we will not dive into now, let us just note that it works well for cryptography where inputs are generally small.

    As an analogy to satisfiability of boolean formulas, you can see the factors a1,…,am, b1,…,bm as the assignments to the variables, or in general, the NP witness. To see that QSP lies in NP, note that all the verifier has to do (once she knows the factors) is checking that the polynomial t divides va wb, which is a polynomial-time problem.

    We will not talk about the reduction from generic computations or circuits to QSP here, as it does not contribute to the understanding of the general concept, so you have to believe me that QSP is NP-complete (or rather complete for some non-uniform analogue like NP/poly). In practice, the reduction is the actual “engineering” part – it has to be done in a clever way such that the resulting QSP will be as small as possible and also has some other nice features.

    One thing about QSPs that we can already see is how to verify them much more efficiently: The verification task consists of checking whether one polynomial divides another polynomial. This can be facilitated by the prover in providing another polynomial h such that t h = va wb which turns the task into checking a polynomial identity or put differently, into checking that t h – va wb = 0, i.e. checking that a certain polynomial is the zero polynomial. This looks rather easy, but the polynomials we will use later are quite large (the degree is roughly 100 times the number of gates in the original circuit) so that multiplying two polynomials is not an easy task.

    So instead of actually computing va, wb and their product, the verifier chooses a secret random point s (this point is part of the “toxic waste” of zCash), computes the numbers t(s), vk(s) and wk(s) for all k and from them, va(s) and wb(s) and only checks that t(s) h(s) = va(s) wb (s). So a bunch of polynomial additions, multiplications with a scalar and a polynomial product is simplified to field multiplications and additions.

    Checking a polynomial identity only at a single point instead of at all points of course reduces the security, but the only way the prover can cheat in case t h – va wb is not the zero polynomial is if she manages to hit a zero of that polynomial, but since she does not know s and the number of zeros is tiny (the degree of the polynomials) when compared to the possibilities for s (the number of field elements), this is very safe in practice.

    The zkSNARK in Detail

    We now describe the zkSNARK for QSP in detail. It starts with a setup phase that has to be performed for every single QSP. In zCash, the circuit (the transaction verifier) is fixed, and thus the polynomials for the QSP are fixed which allows the setup to be performed only once and re-used for all transactions, which only vary the input u. For the setup, which generates the common reference string (CRS), the verifier chooses a random and secret field element s and encrypts the values of the polynomials at that point. The verifier uses some specific encryption E and publishes E(vk(s)) and E(wk(s)) in the CRS. The CRS also contains several other values which makes the verification more efficient and also adds the zero-knowledge property. The encryption E used there has a certain homomorphic property, which allows the prover to compute E(v(s)) without actually knowing vk(s).

    How to Evaluate a Polynomial Succinctly and with Zero-Knowledge

    Let us first look at a simpler case, namely just the encrypted evaluation of a polynomial at a secret point, and not the full QSP problem.

    For this, we fix a group (an elliptic curve is usually chosen here) and a generator g. Remember that a group element is called generator if there is a number n (the group order) such that the list g0, g1, g2, …, gn-1 contains all elements in the group. The encryption is simply E(x) := gx. Now the verifier chooses a secret field element s and publishes (as part of the CRS)

    • E(s0), E(s1), …, E(sd) – d is the maximum degree of all polynomials

    After that, s can be (and has to be) forgotten. This is exactly what zCash calls toxic waste, because if someone can recover this and the other secret values chosen later, they can arbitrarily spoof proofs by finding zeros in the polynomials.

    Using these values, the prover can compute E(f(s)) for arbitrary polynomials f without knowing s: Assume our polynomial is f(x) = 4x2 + 2x + 4 and we want to compute E(f(s)), then we get E(f(s)) = E(4s2 + 2s + 4) = g4s^2 + 2s + 4 = E(s2)4 E(s1)2 E(s0)4, which can be computed from the published CRS without knowing s.

    The only problem here is that, because s was destroyed, the verifier cannot check that the prover evaluated the polynomial correctly. For that, we also choose another secret field element, α, and publish the following “shifted” values:

    • E(αs0), E(αs1), …, E(αsd)

    As with s, the value α is also destroyed after the setup phase and neither known to the prover nor the verifier. Using these encrypted values, the prover can similarly compute E(α f(s)), in our example this is E(4αs2 + 2αs + 4α) = E(αs2)4 E(αs1)2 E(αs0)4. So the prover publishes A := E(f(s)) and B := E(α f(s))) and the verifier has to check that these values match. She does this by using another main ingredient: A so-called pairing function e. The elliptic curve and the pairing function have to be chosen together, so that the following property holds for all x, y:

    • e(g^x, g^y) = e(g, g)^{xy}

    Using this pairing function, the verifier checks that e(A, g^α) = e(B, g) — note that g^α is known to the verifier because it is part of the CRS as E(αs^0). In order to see that this check is valid if the prover does not cheat, let us look at the following equalities:

    e(A, g^α) = e(g^{f(s)}, g^α) = e(g, g)^{α f(s)}

    e(B, g) = e(g^{α f(s)}, g) = e(g, g)^{α f(s)}

    The more important part, though, is the question whether the prover can somehow come up with values A, B that fulfill the check e(A, g^α) = e(B, g) but are not E(f(s)) and E(α f(s)), respectively. The answer to this question is “we hope not”. Seriously, this is called the “d-power knowledge of exponent assumption” and it is unknown whether a cheating prover can do such a thing or not. This assumption is an extension of similar assumptions that are made for proving the security of other public-key encryption schemes and which are similarly unknown to be true or not.

    Actually, the above protocol does not really allow the verifier to check that the prover evaluated the polynomial f(x) = 4x^2 + 2x + 4; the verifier can only check that the prover evaluated some polynomial at the point s. The zkSNARK for QSP will contain another value that allows the verifier to check that the prover did indeed evaluate the correct polynomial.

    What this example does show is that the verifier does not need to evaluate the full polynomial to confirm this; it suffices to evaluate the pairing function. In the next step, we will add the zero-knowledge part so that the verifier cannot reconstruct anything about f(s), not even E(f(s)) – the encrypted value.

    For that, the prover picks a random δ and, instead of A := E(f(s)) and B := E(α f(s)), she sends over A’ := E(δ + f(s)) and B’ := E(α (δ + f(s))). If we assume that the encryption cannot be broken, the zero-knowledge property is quite obvious. We now have to check two things: 1. the prover can actually compute these values and 2. the check by the verifier is still true.

    For 1., note that A’ = E(δ + f(s)) = g^{δ + f(s)} = g^δ g^{f(s)} = E(δ) E(f(s)) = E(δ) A and similarly, B’ = E(α (δ + f(s))) = E(αδ + α f(s)) = g^{αδ + α f(s)} = g^{αδ} g^{α f(s)} = E(α)^δ E(α f(s)) = E(α)^δ B.

    For 2., note that the only thing the verifier checks is that the values A and B she receives satisfy the equation A = E(a) and B = E(α a) for some value a, which is obviously the case for a = δ + f(s) just as it is the case for a = f(s).

    Ok, so we now know a bit about how the prover can compute the encrypted value of a polynomial at an encrypted secret point without the verifier learning anything about that value. Let us now apply that to the QSP problem.

    A SNARK for the QSP Problem

    Remember that in the QSP we are given polynomials v_0,…,v_m, w_0,…,w_m, a target polynomial t (of degree at most d) and a binary input string u. The prover finds a_1,…,a_m, b_1,…,b_m (that are somewhat restricted depending on u) and a polynomial h such that

    • t h = (v_0 + a_1 v_1 + … + a_m v_m) (w_0 + b_1 w_1 + … + b_m w_m).

    In the previous section, we already explained how the common reference string (CRS) is set up. We choose secret numbers s and α and publish

    • E(s^0), E(s^1), …, E(s^d) and E(αs^0), E(αs^1), …, E(αs^d)

    Because we do not have a single polynomial, but sets of polynomials that are fixed for the problem, we also publish the evaluated polynomials right away:

    • E(t(s)), E(α t(s)),
    • E(v_0(s)), …, E(v_m(s)), E(α v_0(s)), …, E(α v_m(s)),
    • E(w_0(s)), …, E(w_m(s)), E(α w_0(s)), …, E(α w_m(s)),

    and we need further secret numbers β_v, β_w, γ (they will be used to verify that those polynomials were evaluated and not some arbitrary ones) and publish

    • E(γ), E(β_v γ), E(β_w γ),
    • E(β_v v_1(s)), …, E(β_v v_m(s))
    • E(β_w w_1(s)), …, E(β_w w_m(s))
    • E(β_v t(s)), E(β_w t(s))

    This is the full common reference string. In practical implementations, some elements of the CRS are not needed, but going into that would complicate the presentation.

    Now what does the prover do? She uses the reduction explained above to find the polynomial h and the values a_1,…,a_m, b_1,…,b_m. Here it is important to use a witness-preserving reduction (see above), because only then can the values a_1,…,a_m, b_1,…,b_m be computed together with the reduction; they would be very hard to find otherwise. In order to describe what the prover sends to the verifier as proof, we have to go back to the definition of the QSP.

    There was an injective function f: {(i, j) | 1 ≤ i ≤ n, j ∈ {0, 1}} → {1, …, m} which restricts the values of a_1,…,a_m, b_1,…,b_m. Since m is relatively large, there are indices which do not appear in the output of f for any input. These indices are not restricted, so let us call them I_free and define v_free(x) = Σ_k a_k v_k(x), where k ranges over all indices in I_free. With w(x) = b_1 w_1(x) + … + b_m w_m(x), the proof now consists of

    • V_free := E(v_free(s)), W := E(w(s)), H := E(h(s)),
    • V’_free := E(α v_free(s)), W’ := E(α w(s)), H’ := E(α h(s)),
    • Y := E(β_v v_free(s) + β_w w(s))

    where the last part is used to check that the correct polynomials were used (this is the part we did not cover yet in the other example). Note that all these encrypted values can be generated by the prover knowing only the CRS.

    The task of the verifier is now the following:

    Since the values a_k, where k is not a “free” index, can be computed directly from the input u (which is also known to the verifier; this is what is to be verified), the verifier can compute the missing part of the full sum for v:

    • E(v_in(s)) = E(Σ_k a_k v_k(s)), where k ranges over all indices not in I_free.

    With that, the verifier now confirms the following equalities using the pairing function e (don’t be scared):

    1. e(V’_free, g) = e(V_free, g^α), e(W’, E(1)) = e(W, E(α)), e(H’, E(1)) = e(H, E(α))
    2. e(E(γ), Y) = e(E(β_v γ), V_free) e(E(β_w γ), W)
    3. e(E(v_0(s)) E(v_in(s)) V_free, E(w_0(s)) W) = e(H, E(t(s)))

    To grasp the general concept here, you have to understand that the pairing function allows us to do some limited computation on encrypted values: we can do arbitrary additions but just a single multiplication. The addition comes from the fact that the encryption itself is already additively homomorphic, and the single multiplication is realized by the two arguments the pairing function has. So e(W’, E(1)) = e(W, E(α)) basically multiplies W’ by 1 in the encrypted space and compares that to W multiplied by α in the encrypted space. If you look up the values W and W’ are supposed to have – E(w(s)) and E(α w(s)) – this checks out if the prover supplied a correct proof.

    If you remember the section about evaluating polynomials at secret points, these first three checks basically verify that the prover did evaluate some polynomial built up from the parts in the CRS. The second item is used to verify that the prover used the correct polynomials v and w and not just some arbitrary ones. The idea behind it is that the prover has no way to compute the encrypted combination E(β_v v_free(s) + β_w w(s)) other than from the exact values of E(v_free(s)) and E(w(s)). The reason is that the value β_v is not part of the CRS in isolation, but only in combination with the values v_k(s), and β_w is only known in combination with the polynomials w_k(s). The only way to “mix” them is via the equally encrypted γ.

    Assuming the prover provided a correct proof, let us check that the equality works out. The left and right hand sides are, respectively

    • e(E(γ), Y) = e(E(γ), E(β_v v_free(s) + β_w w(s))) = e(g, g)^{γ(β_v v_free(s) + β_w w(s))}
    • e(E(β_v γ), V_free) e(E(β_w γ), W) = e(E(β_v γ), E(v_free(s))) e(E(β_w γ), E(w(s))) = e(g, g)^{(β_v γ) v_free(s)} e(g, g)^{(β_w γ) w(s)} = e(g, g)^{γ(β_v v_free(s) + β_w w(s))}

    The third item essentially checks that (v_0(s) + a_1 v_1(s) + … + a_m v_m(s)) (w_0(s) + b_1 w_1(s) + … + b_m w_m(s)) = h(s) t(s), the main condition for the QSP problem. Note that multiplication on the encrypted values translates to addition on the unencrypted values because E(x) E(y) = g^x g^y = g^{x+y} = E(x + y).
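    To see why the third check enforces exactly this condition, here is the exponent algebra for an honest prover (a sketch; V and W below are shorthands introduced here, not notation used in the post):

```latex
% Write V(s) = v_0(s) + v_in(s) + v_free(s) and W(s) = w_0(s) + w(s).
\begin{align*}
e\bigl(E(v_0(s))\,E(v_{in}(s))\,V_{free},\; E(w_0(s))\,W\bigr)
  &= e\bigl(g^{V(s)},\, g^{W(s)}\bigr) = e(g,g)^{V(s)\,W(s)} \\
e\bigl(H,\, E(t(s))\bigr)
  &= e\bigl(g^{h(s)},\, g^{t(s)}\bigr) = e(g,g)^{h(s)\,t(s)}
\end{align*}
% The two sides agree exactly when V(s) W(s) = h(s) t(s), i.e. when the QSP equation
% t h = (v_0 + \sum_k a_k v_k)(w_0 + \sum_k b_k w_k) holds at the point s.
```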

    Adding Zero-Knowledge

    As I said in the beginning, the remarkable feature of zkSNARKs is the succinctness rather than the zero-knowledge part. We will now see how to add zero-knowledge, and the next section will touch a bit more on succinctness.

    The idea is that the prover “shifts” some values by a random secret amount and balances the shift on the other side of the equation. The prover chooses random δ_free, δ_w and performs the following replacements in the proof:

    • v_free(s) is replaced by v_free(s) + δ_free t(s)
    • w(s) is replaced by w(s) + δ_w t(s).

    By these replacements, the values V_free and W, which contain an encoding of the witness factors, basically become indistinguishable from randomness and thus it is impossible to extract the witness. Most of the equality checks are “immune” to the modifications; the only value we still have to correct is H, or h(s). We have to ensure that

    • (v_0(s) + a_1 v_1(s) + … + a_m v_m(s)) (w_0(s) + b_1 w_1(s) + … + b_m w_m(s)) = h(s) t(s), or in other words
    • (v_0(s) + v_in(s) + v_free(s)) (w_0(s) + w(s)) = h(s) t(s)

    still holds. With the modifications, we get

    • (v_0(s) + v_in(s) + v_free(s) + δ_free t(s)) (w_0(s) + w(s) + δ_w t(s))

    and by expanding the product, we see that replacing h(s) by

    • h(s) + δ_free (w_0(s) + w(s)) + δ_w (v_0(s) + v_in(s) + v_free(s)) + (δ_free δ_w) t(s)

    will do the trick.
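    For completeness, here is the expansion behind that claim (a sketch, using V, W, t, h as shorthands for the corresponding values at s, with V W = h t before the shift):

```latex
\begin{align*}
(V + \delta_{free}\,t)(W + \delta_w\,t)
  &= V W + \delta_{free}\,t\,W + \delta_w\,t\,V + \delta_{free}\,\delta_w\,t^2 \\
  &= t\,\bigl(h + \delta_{free}\,W + \delta_w\,V + \delta_{free}\,\delta_w\,t\bigr)
\end{align*}
% So the shifted product is still divisible by t, and the new quotient is exactly
% h + \delta_{free} W + \delta_w V + \delta_{free} \delta_w t, the replacement for h(s) above.
```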

    Tradeoff between Input and Witness Size

    As you have seen in the preceding sections, the proof consists of only 7 elements of a group (typically an elliptic curve). Furthermore, the work the verifier has to do is checking some equalities involving pairing functions and computing E(v_in(s)), a task that is linear in the input size. Remarkably, neither the size of the witness string nor the computational effort required to verify the QSP (without SNARKs) play any role in verification. This means that SNARK-verifying extremely complex problems and very simple problems both take the same effort. The main reason for that is that we only check the polynomial identity at a single point, and not over the full polynomial. Polynomials can get more and more complex, but a point is always a point. The only parameters that influence the verification effort are the level of security (i.e. the size of the group) and the maximum size of the inputs.

    It is possible to reduce the second parameter, the input size, by shifting some of it into the witness:

    Instead of verifying the function f(u, w), where u is the input and w is the witness, we take a hash function h and verify

    • f'(H, (u, w)) := f(u, w) ∧ h(u) = H.

    This means we replace the input u by a hash of the input h(u) (which is supposed to be much shorter) and verify that there is some value x that hashes to H (and thus is very likely equal to u) in addition to checking f(x, w). This basically moves the original input u into the witness string and thus increases the witness size but decreases the input size to a constant.
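    A minimal sketch of this input-shrinking trick (the names f, u, w below are placeholders, and sha256 merely stands in for whatever hash function the system would actually use):

```python
import hashlib

def f(u: bytes, w: bytes) -> bool:
    """Hypothetical original relation between input u and witness w."""
    return len(w) >= len(u)  # stand-in check

def h(u: bytes) -> bytes:
    return hashlib.sha256(u).digest()

def f_prime(H: bytes, witness: tuple) -> bool:
    """f'(H, (u, w)) := f(u, w) AND h(u) = H; the verifier's input is now just the short hash H."""
    u, w = witness
    return f(u, w) and h(u) == H

u, w = b"some long public input", b"some even longer witness string"
assert f_prime(h(u), (u, w))  # u has moved into the witness; the public input shrank to h(u)
```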

    This is remarkable, because it allows us to verify arbitrarily complex statements in constant time.

    How is this Relevant to Ethereum

    Since verifying arbitrary computations is at the core of the Ethereum blockchain, zkSNARKs are of course very relevant to Ethereum. With zkSNARKs, it becomes possible to not only perform secret arbitrary computations that are verifiable by anyone, but also to do this efficiently.

    Although Ethereum uses a Turing-complete virtual machine, it is currently not yet possible to implement a zkSNARK verifier in Ethereum. The verifier’s task might seem conceptually simple, but a pairing function is actually very hard to compute and thus it would use more gas than is currently available in a single block. Elliptic curve multiplication is already relatively complex, and pairings take that to another level.

    Existing zkSNARK systems like zCash use the same problem / circuit / computation for every task. In the case of zCash, it is the transaction verifier. On Ethereum, zkSNARKs would not be limited to a single computational problem, but instead, everyone could set up a zkSNARK system for their specialized computational problem without having to launch a new blockchain. Every new zkSNARK system that is added to Ethereum requires a new secret trusted setup phase (some parts can be re-used, but not all), i.e. a new CRS has to be generated. It is also possible to do things like adding a zkSNARK system for a “generic virtual machine”. This would not require a new setup for a new use-case in much the same way as you do not need to bootstrap a new blockchain for a new smart contract on Ethereum.

    Getting zkSNARKs to Ethereum

    There are multiple ways to enable zkSNARKs for Ethereum. All of them reduce the actual costs of the pairing functions and elliptic curve operations (the other required operations are already cheap enough) and thus also allow the gas costs for these operations to be reduced.

    1. improve the (guaranteed) performance of the EVM
    2. improve the performance of the EVM only for certain pairing functions and elliptic curve multiplications

    The first option is of course the one that pays off better in the long run, but is harder to achieve. We are currently working on adding features and restrictions to the EVM which would allow better just-in-time compilation and also interpretation without too many required changes in the existing implementations. The other possibility is to swap out the EVM completely and use something like eWASM.

    The second option can be realized by forcing all Ethereum clients to implement a certain pairing function and multiplication on a certain elliptic curve as a so-called precompiled contract. The benefit is that this is probably much easier and faster to achieve. On the other hand, the drawback is that we are then tied to a specific pairing function and a specific elliptic curve. Any new client for Ethereum would have to re-implement these precompiled contracts. Furthermore, if there are advancements and someone finds better zkSNARKs, better pairing functions or better elliptic curves, or if a flaw is found in the elliptic curve, pairing function or zkSNARK, we would have to add new precompiled contracts.



  • The History of Casper — Chapter 1


    Vitalik suggested last week that I share my basic research and design philosophy in a blog post; I agreed, but complained that it was still changing. My friend Jon West told me that everyone would really appreciate it if I told everyone about my Casper research; I mostly agreed. Then someone on reddit told me to focus on Ethereum.


    So here’s the Casper tech story, given as a chronological history of the evolution of the key technology, ideas and language that are involved in “Casper research”. Many of our favorite blockchain personalities are part of the story. This is my attempt to recount everything in an accessible, sequential way so that you can see where we are now (and where we’re going) with our research efforts (so don’t argue until the end of the story!). I’m going to try to release a chapter per day until it’s complete. Also note that this is my personal point of view, understanding what little I could manage through the process of working on proof-of-stake. Vitalik and Greg Meredith’s accounts will vary, for example, as they each have their own view of Casper research.


    Preface: How I started doing research at Ethereum

    March 2013 – April 2014

    I immediately got hooked on the blockchain technology story when Bitcoin first (really) caught my attention in March of 2013. This was during the “Cyprus crisis” run-up in the price of Bitcoin. I learned about cryptographic hashes, digital signatures and public key cryptography. I also learned about Bitcoin mining, and the incentives that miners have to protect the network. I was interested in computer science and security for the first time in my life. It was great.

    Set against a narrative of dystopian libertarian economics, it was underground developers (like Amir Taaki) versus central bankers in an epic global battle to save the world from the fractional reserve banking system. The blockchain revolution was better than fiction.

    I consumed content on reddit, listened to Let’s Talk Bitcoin and a lot of Peter Todd content. I lost money on BTC-e (once because I took advice from the trollbox). I argued with my friends Ethan Buchman and Zach Ramsay about technology. We learned about MasterCoin and the possibility of building systems on top of Bitcoin, taking advantage of its Proof-of-Work network effect. When I first heard about proof-of-stake (PoS) in the 2013 alt-coin scene (thanks PPCoin!), I thought it sounded like heretical voodoo magic. Replacing miners with coins seemed like an inherently strange thing to try to do. I ended up deciding that the long-range attack problem was fatal, and any solutions were going to involve developer checkpoints of one form or another (an opinion I learned from Peter Todd). Being a Bitcoiner in 2013 was one of the most intellectually stimulating experiences of my life.

    In January or February 2014, I read about Ethereum for the first time. I watched Vitalik’s youtube videos, and I met him in person at the Toronto Decentral Bitcoin Meetups. He obviously knew way more of the tech story than I did, so I became hooked in, this time on Ethereum. Ethereum was the promise of decentralization made accessible to me, someone without much background. It was general purpose smart contracts that could do anything, disrupt any centralized system. It could be and do so many things that it wasn’t always clear to me what role ethereum would actually play in the blockchain ecosystem. The blockchain tech story (as I see it) took an exciting turn with Ethereum, and I got to be closer to the action 🙂

    Having been invited by Russel Verbeeten at one of these meetups, Ethan and I went to the hackathon prior to the 2014 Bitcoin Expo in Toronto. (Vitalik taught me how to use Merkle trees at this event.) I was thinking about properly incentivizing and decentralizing the peer review system for a couple of weeks, having recently had a paper rejected from an academic journal. Ethan and I tried putting this kind of system together at the hackathon. Ethan did most of the hard work using pyethereum, while I very slowly put together the first GUI I ever made. We came in second place at the hackathon (after Amir’s “Dark Market”, which became Open Bazaar). We got to meet the whole Ethereum team at the Expo, and we got ourselves invited to the public Skype channels! Charles Hoskinson offered us jobs: it was then, in April 2014, that we started volunteering for Ethereum. We even got @ethereum.org email addresses.

    So I got into the blockchain space because I got hooked on the Bitcoin tech story, and then on the Ethereum tech story. I then got hooked on the proof-of-stake tech story, which I now know to be very compelling. I’m going to share it, being as faithful as possible to the timeline and manner in which the parts of the picture have been coming together, in an effort to help bring everyone up to speed on our efforts. It may take a few chapters, but story time ain’t over ’til it’s over.

    Chapter 1: Slasher + Security Deposits: The move from naive proof-of-stake to modern proof-of-stake.


    May 2014 – September 12, 2014

    When Vitalik first expressed interest in PoS to me in May 2014, first over Skype and then at a Bitcoin conference in Vienna, I was skeptical. Then he told me about slasher, which I think he had come up with in January 2014. Slasher was the idea that you could lose your block reward if you sign blocks at the same height on two forks. This gave Vitalik the ability to directly tackle (and arguably solve) the nothing-at-stake problem. (For the uninitiated, the “nothing-at-stake” problem refers to the fact that the PoS miners’ best strategy is to mine on all forks, because signatures are very cheap to produce.) It also opened up our imaginations to a new space of interactive protocols for disincentivizing bad behaviour.

    Still, I did not feel very satisfied with proof-of-stake at this time (despite Vitalik telling me a couple of times that he thinks “proof-of-stake is the future”) because I was really in love with proof-of-work. So during the summer I mostly worked on proof-of-work problems (ASIC-hard PoW, security sharing between PoW chains via “Proofs-of-Proof-of-Work”, neither to completion). But I did suggest the use of security deposits to a couple of contract developers on a couple of different occasions. This planted the seed for insights made on the fateful post-Ethereum-meetup night of September 11th 2014 (kudos to Stephan Tual for organizing + getting me to that event!).

    Ethan Buchman and I stayed up late talking about proof-of-stake at the “hacker” instead of the “party” section of Amir Taaki’s squat in London. I connected the dots and internalized the power of security deposits for proof-of-stake. This was the night that I became convinced that PoS would work, and that making it work would be a huge amount of fun. It was also the first time I experienced the surprising size of the PoS design space, through long arguments about attacks and possible protocol responses.

    Since the early morning of September 12th, 2014 I have firmly advocated (to everyone who would listen) that blockchains move to PoS because it would be more secure. Amir Taaki was unimpressed by my enthusiasm for proof-of-stake. At least Ethan and I were having the best time.

    The use of security deposits always significantly leveraged slasher’s effectiveness. Instead of forgoing some profit X, a provably faulty node would lose a security deposit (imagined to be on the order of size X/r) on which the block reward X was to be paid as interest (at rate r). You place a deposit to play, and if you play nice you make a small return on your deposit, but if you play mean you lose your deposit. It feels economically ideal, and it’s so programmable.
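    To put rough numbers on that X/r relationship (these figures are purely illustrative, not from the post):

```latex
% The block reward X is paid as interest at rate r on the deposit D, so D = X / r.
% Example with made-up numbers: X = 1 coin, r = 2\% \Rightarrow D = 50 coins, i.e.
% a provably faulty node forfeits a 50-coin deposit rather than merely forgoing a 1-coin reward.
D = \frac{X}{r}
```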

    Adding deposits to slasher meant that the nothing at stake problem was officially solved.

    At least, I had made up my mind that it was solved, to the point where we could no longer understand why anyone would want to build a proof-of-stake system without security deposits, for fear of nothing-at-stake problems.

    Also on September 12th, 2014 I met Pink Penguin for the first time, due to an introduction from Stephan Tual. I breathlessly recounted my PoS insights made the night before. And after I respectfully declined a job from Eris Industries (now Monax) that week, Pink Penguin began sponsoring this research! (Thanks <3!!)

    At this point in the story I was unaware of the other, multiple independent discoveries of the use of security deposits in proof-of-stake systems made by Jae Kwon, Dominic Williams, and Nick Williamson.

    Stay tuned… the next chapter is about the central role that ideas from game theory played in setting the design goals that led to Casper!

    -Vlad Zamfir







  • The History of Casper – Chapter 2


    This chapter describes the game theory and economic security modelling we were doing in the Fall of 2014. It recounts how the “bribing attacker model” led our research directly to a radical solution to the long range attack problem.

    Chapter 2: The Bribing Attacker, Economic Security, and the Long Range Attack Problem

    Vitalik and I had each been reasoning about incentives as part of our research before we ever met, so the proposition that “getting the incentives right” was crucial in proof-of-stake was never a matter of debate. We were never willing to take “half of the coins are honest” as a security assumption. (It’s in bold because it’s important.) We knew that we needed some kind of “incentive compatibility” between bonded node incentives and protocol security guarantees.

    It was always our view that the protocol could be viewed as a game that could easily result in “bad outcomes” if the protocol’s incentives encouraged that behaviour. We regarded this as a potential security problem. Security deposits gave us a clear way to punish bad behaviour: slashing conditions, which are basically programs that decide whether to destroy the deposit.

    We had long observed that Bitcoin was more secure when the price of bitcoin was higher, and less secure when it was lower. We also now knew that slasher with security deposits was more economically efficient than slasher on block rewards alone. It was clear to us that economic security existed, and we made it a high priority.

    The Bribing Attacker

    I’m not sure how much background Vitalik had in game theory (though it was clear he had more than I did). My own game theory knowledge at the start of the story was even more minimal than it is at the end. But I knew how to recognize and calculate Nash Equilibriums. If you haven’t learned about Nash Equilibriums yet, this next paragraph is for you.

    A Nash Equilibrium is a strategy profile (the players’ strategy choices) with a corresponding payoff (giving $ETH or taking $ETH away) where no player individually has an incentive to deviate. “Incentive to deviate” means “they get more $ETH if they somehow change what they’re doing”. If you remember that, and every time you hear “Nash Equilibrium” you think “no points for individual strategy changes”, you’ll have it.

    Some time in late summer of 2014, I first ran into “the bribing attacker model” when I made an offhand response to an economic security question Vitalik asked me on a Skype call (“I can just bribe them to do it”). I don’t know where I got the idea. Vitalik then asked me again about this maybe a week or two later, putting me on the spot to develop it further.

    By bribing game participants you can modify a game’s payoffs, and through this operation change its Nash Equilibriums. Here’s how this might look:



    The bribe attack changes the Nash Equilibrium of the Prisoner’s Dilemma game from (Up, Left) to (Down,Right). The bribing attacker in this example has a cost of 6 if (Down, Right) is played.
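    Since the payoff-matrix figure itself is not reproduced above, here is a small self-contained sketch of the same idea with hypothetical numbers (chosen so that the bribe paid at (Down, Right) totals 6, as in the caption):

```python
# Enumerate pure-strategy Nash equilibria of a 2x2 game, before and after a bribe.
def nash_equilibria(payoffs):
    """Profiles where neither player gains by unilaterally deviating."""
    rows, cols = ["Up", "Down"], ["Left", "Right"]
    eqs = []
    for r in rows:
        for c in cols:
            row_pay, col_pay = payoffs[(r, c)]
            best_row = all(row_pay >= payoffs[(r2, c)][0] for r2 in rows)
            best_col = all(col_pay >= payoffs[(r, c2)][1] for c2 in cols)
            if best_row and best_col:
                eqs.append((r, c))
    return eqs

# A prisoner's-dilemma-like base game; (Up, Left) is the unique equilibrium.
base = {
    ("Up", "Left"): (1, 1), ("Up", "Right"): (5, 0),
    ("Down", "Left"): (0, 5), ("Down", "Right"): (3, 3),
}
print(nash_equilibria(base))    # [('Up', 'Left')]

# The briber credibly promises 3 extra to the row player for playing Down and
# 3 extra to the column player for playing Right.
bribed = {
    (r, c): (rp + (3 if r == "Down" else 0), cp + (3 if c == "Right" else 0))
    for (r, c), (rp, cp) in base.items()
}
print(nash_equilibria(bribed))  # [('Down', 'Right')]; if that profile is played,
                                # the briber pays 3 + 3 = 6.
```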

    The bribing attacker was our first useful model of economic security.

    Before the bribing attack, we usually thought about economic attacks as hostile takeovers by foreign, extra-protocol purchasers of tokens or mining power. A pile of external capital would have to come into the system to attack the blockchain. With the bribe attack, the question became “what is the price of bribing the currently existing nodes to get the desired outcome?”.

    We hoped that bribing attackers of our yet-to-be-defined proof-of-stake protocol would have to spend a lot of money to compensate for lost deposits.

    Debate about “reasonableness” aside, this was our first step in learning to reason about economic security. It was fun and simple to use a bribing attacker. You just see how much you have to pay the players to do what the attacker wants. And we were already confident that we would be able to make sure that an attacker has to pay security-deposit-sized bribes to revert the chain in an attempted double-spend. We knew we could recognize “double-signing”. So we were pretty sure that this would give proof-of-stake a quantifiable economic security advantage over a proof-of-work protocol facing a bribing attacker.

    The Bribing Economics of the Long Range Attack

    Vitalik and I applied the bribing attacker to our proof-of-stake research. We found that PoS protocols without security deposits could be trivially defeated with small bribes. You simply pay coin holders to move their coins to new addresses and give you the keys to their now-empty addresses. (I’m not sure who originally thought of this idea.) Our insistence on using the briber model easily ruled out all of the proof-of-stake protocols we knew about. I liked that. (At the time we had not yet heard of Jae Kwon’s Tendermint, of Dominic Williams’ now-defunct Pebble, or of Nick Williamson’s Credits.)

    This bribe attack also posed a challenge to security-deposit based proof-of-stake: The moment after a security deposit was returned to its original owner, the bribing adversary could buy the keys to their bonded stakeholder address at minimal cost.

    This attack is identical to the long range attack: acquiring old keys to take control of the blockchain. It meant that the attacker could create “false histories” at will, but only if they started at a height from which all deposits had expired.

    Before working on setting the incentives for our proof-of-stake protocol, therefore, we needed to address the long-range attack problem. If we didn’t address the long range attack problem, then it would be impossible for clients to reliably learn who really had the security deposits.

    We did know that developer checkpoints could be used to deal with the long-range attack problem. We thought this was clearly way too centralized.

    In the weeks following my conversion to proof-of-stake, while I was staying at Stephan Tual’s house outside of London, I discovered that there was a natural rule for client reasoning about security deposits. Signed commitments are only meaningful if the sender currently has a deposit. That is to say, after the deposit is withdrawn, the signatures from these nodes are no longer meaningful. Why would I trust you after you withdraw your deposit?

    The bribing attack model demanded it. It would cost the bribing attacker almost nothing to break the commitments after the deposit is withdrawn.

    This meant that a client would hold a list of bonded nodes, and stop blocks at the door if they were not signed by one of these nodes. Ignoring consensus messages from nodes who don’t currently have security deposits circumvents the long-range attack problem. Instead of authenticating the current state based on the history starting from the genesis block, we authenticate it based on a list of who currently has deposits.

    This is radically different from proof-of-work.

    In PoW, a block is valid if it is chained to the genesis block, and if the block hash meets the difficulty requirement for its chain. In this security deposit-based model, a block is valid if it was created by a stakeholder with a currently existing deposit. This meant that you would need to have current information in order to authenticate the blockchain. This subjectivity has caused a lot of people a lot of concern, but it is necessary for security-deposit based proof-of-stake to be secure against the bribing attacker.

    This realization made it very clear to me that the proof-of-work security model and the proof-of-stake security model are fundamentally not compatible. I therefore abandoned any serious use of “hybrid” PoW/PoS solutions. Trying to authenticate a proof-of-stake blockchain from genesis now seemed very obviously wrong.

    Beyond changing the authentication model, however, we did need to provide a way to manage these lists of security deposits. We had to use signatures from bonded nodes to manage changes to the list of bonded nodes, and we had to do it after the bonded nodes come to consensus on these changes. Otherwise, clients would have different lists of bonded validators, and they would therefore be unable to agree on the state of Ethereum.

    Bond time needed to be made long, so that clients have time to learn about the new, incoming set of bonded stakeholders. As long as clients were online enough, they could keep up to date. I thought we would use twitter to share the bonded node list, or at least a hash, so that new and hibernating clients could get synchronized after their user enters a hash into the UI.

    If you have the wrong validator list you can get man-in-the-middled. But it’s really not that bad. The argument was (and still is!) that you only need to be able to trust an external source for this information once. After that once, you will be able to update your list yourself – at least, if you are able to be online regularly enough to avoid the “long range” of withdrawn deposits.

    I know that it might take some getting used to. But we can only rely on fresh security deposits. Vitalik was a bit uncomfortable with this argument at first, trying to hold onto the ability to authenticate from genesis, but eventually was convinced by the necessity of this kind of subjectivity in proof of stake protocols. Vitalik independently came up with his weak subjectivity scoring rule, which seemed to me like a perfectly reasonable alternative to my idea at the time, which was basically “have all the deposits sign every Nth block to update the bonded node list”.

    With the nails in the nothing-at-stake and long-range attack coffins completely hammered in, we were ready to start choosing our slashing conditions.

    The next chapter will document what we learned from our first struggles to define a consensus protocol by specifying slashing conditions. I’ll also tell you about what we learned from talking with fine people from our space about our research. The game theory and economic modelling story presented here will continue developing in Chapter 4.


    NOTE: The views expressed here are solely my own personal views and do not represent those of the Ethereum Foundation. I am solely responsible for what I’ve written and am not acting as a spokesperson for the Foundation.









  • Swarm alpha public pilot and the basics of Swarm


    With the long awaited geth 1.5 (“let there bee light”) release, Swarm made it into the official go-ethereum release as an experimental feature. The current version of the code is POC 0.2 RC5 — “embrace your daemons” (roadmap), which is the refactored and cleaner version of the codebase that was running on the Swarm toynet in the past months.

    The current release ships with the swarm command that launches a standalone Swarm daemon as a separate process, using your favourite IPC-compliant ethereum client if needed. Bandwidth accounting (using the Swarm Accounting Protocol = SWAP) is responsible for smooth operation and speedy content delivery by incentivising nodes to contribute their bandwidth and relay data. The SWAP system is functional but it is switched off by default. Storage incentives (punitive insurance) to protect the availability of rarely-accessed content are planned to be operational in POC 0.4. So currently, by default, the client uses the blockchain only for domain name resolution.

    With this blog post we are happy to announce the launch of our shiny new Swarm testnet connected to the Ropsten ethereum testchain. The Ethereum Foundation is contributing a 35-strong (will be up to 105) Swarm cluster running on the Azure cloud. It is hosting the Swarm homepage.

    We consider this testnet the first public pilot, and the community is welcome to join the network, contribute resources, and help us find issues, identify pain points and give feedback on usability. Instructions can be found in the Swarm guide. We encourage those who can afford to run persistent nodes (nodes that stay online) to get in touch. We have already received promises for 100TB deployments.

    Note that the testnet offers no guarantees! Data may be lost or become unavailable. Indeed, guarantees of persistence cannot be made at least until the storage insurance incentive layer is implemented (scheduled for POC 0.4).

    We envision shaping this project with more and more community involvement, so we are inviting those interested to join our public discussion rooms on gitter. We would like to lay the groundwork for this dialogue with a series of blog posts about the technology and ideology behind Swarm in particular and about Web3 in general. The first post in this series will introduce the ingredients and operation of Swarm as currently functional.

    What is Swarm after all?

    Swarm is a distributed storage platform and content distribution service; a native base layer service of the ethereum Web3 stack. The objective is a peer-to-peer storage and serving solution that has zero downtime, is DDOS-resistant, fault-tolerant and censorship-resistant as well as self-sustaining due to a built-in incentive system. The incentive layer uses peer-to-peer accounting for bandwidth, deposit-based storage incentives and allows trading resources for payment. Swarm is designed to deeply integrate with the devp2p multiprotocol network layer of Ethereum as well as with the Ethereum blockchain for domain name resolution, service payments and content availability insurance. Nodes on the current testnet use the Ropsten testchain for domain name resolution only, with incentivisation switched off. The primary objective of Swarm is to provide decentralised and redundant storage of Ethereum’s public record, in particular storing and distributing dapp code and data as well as blockchain data.

    There are two major features that set Swarm apart from other decentralised distributed storage solutions. While existing services (Bittorrent, Zeronet, IPFS) allow you to register and share the content you host on your server, Swarm provides the hosting itself as a decentralised cloud storage service. There is a genuine sense that you can just ‘upload and disappear’: you upload your content to the swarm and retrieve it later, all potentially without a hard disk. Swarm aspires to be the generic storage and delivery service that, when ready, caters to use-cases ranging from serving low-latency real-time interactive web applications to acting as guaranteed persistent storage for rarely used content.

    The other major feature is the incentive system. The beauty of decentralised consensus of computation and state is that it allows programmable rulesets for communities, networks, and decentralised services that solve their coordination problems by implementing transparent self-enforcing incentives. Such incentive systems model individual participants as agents following their rational self-interest, yet the network’s emergent behaviour is massively more beneficial to the participants than without coordination.

    Not long after Vitalik’s whitepaper, the Ethereum dev core realised that a generalised blockchain is a crucial missing piece of the puzzle needed, alongside existing peer-to-peer technologies, to run a fully decentralised internet. The idea of having separate protocols (shh for Whisper, bzz for Swarm, eth for the blockchain) was introduced in May 2014 by Gavin and Vitalik, who imagined the Ethereum ecosystem within the grand crypto 2.0 vision of the third web. The Swarm project is a prime example of a system where incentivisation will allow participants to efficiently pool their storage and bandwidth resources in order to provide global content services to all participants. We could say that the smart contracts of the incentives implement the hive mind of the swarm.

    A thorough synthesis of our research into these issues led to the publication of the first two orange papers. Incentives are also explained in the devcon2 talk about the Swarm incentive system. More details to come in future posts.

    How does Swarm work?

    Swarm is a network, a service and a protocol (rules). A Swarm network is a network of nodes running a wire protocol called bzz using the ethereum devp2p/rlpx network stack as the underlay transport. The Swarm protocol (bzz) defines a mode of interaction. At its core, Swarm implements a distributed content-addressed chunk store. Chunks are arbitrary data blobs with a fixed maximum size (currently 4KB). Content addressing means that the address of any chunk is deterministically derived from its content. The addressing scheme falls back on a hash function which takes a chunk as input and returns a 32-byte long key as output. A hash function is irreversible, collision free and uniformly distributed (indeed this is what makes bitcoin, and in general proof-of-work, work).

    This hash of a chunk is the address that clients can use to retrieve the chunk (the hash’s preimage). Irreversible and collision-free addressing immediately provides integrity protection: no matter the context of how a client knows about an address, it can tell if the chunk is damaged or has been tampered with just by hashing it.

    Swarm’s main offering as a distributed chunkstore is that you can upload content to it. The nodes constituting the Swarm all dedicate resources (diskspace, memory, bandwidth and CPU) to store and serve chunks. But what determines who is keeping a chunk? Swarm nodes have an address (the hash of the address of their bzz-account) in the same keyspace as the chunks themselves. Let’s call this address space the overlay network. If we upload a chunk to the Swarm, the protocol determines that it will eventually end up being stored at nodes that are closest to the chunk’s address (according to a well-defined distance measure on the overlay address space). The process by which chunks get to their address is called syncing and is part of the protocol. Nodes that later want to retrieve the content can find it again by forwarding a query to nodes that are close to the content’s address. Indeed, when a node needs a chunk, it simply posts a request to the Swarm with the address of the content, and the Swarm will forward the requests until the data is found (or the request times out). In this regard, Swarm is similar to a traditional distributed hash table (DHT) but with two important (and under-researched) features.

    Swarm uses a set of TCP/IP connections in which each node has a set of (semi-)permanent peers. All wire protocol messages between nodes are relayed from node to node hopping on active peer connections. Swarm nodes actively manage their peer connections to maintain a particular set of connections, which enables syncing and content-retrieval by key-based routing. Thus, a chunk-to-be-stored or a content-retrieval-request message can always be efficiently routed along these peer connections to the nodes that are nearest to the content’s address. This flavour of the routing scheme is called forwarding Kademlia.

    Combined with the SWAP incentive system, a node’s rational self-interest dictates opportunistic caching behaviour: the node caches all relayed chunks locally so it can be the one to serve them the next time they are requested. As a consequence of this behaviour, popular content ends up being replicated more redundantly across the network, essentially decreasing the latency of retrievals – we say that Swarm is ‘auto-scaling’ as a distribution network. Furthermore, this caching behaviour unburdens the original custodians from potential DDOS attacks. SWAP incentivises nodes to cache all content they encounter, until their storage space has been filled up. In fact, caching incoming chunks of average expected utility is always a good strategy even if you need to expunge older chunks. The best predictor of demand for a chunk is the rate of requests in the past. Thus it is rational to remove chunks requested the longest time ago. So content that falls out of fashion, goes out of date, or never was popular to begin with, will be garbage collected and removed unless protected by insurance. The upshot is that nodes will end up fully utilizing their dedicated resources to the benefit of users. Such organic auto-scaling makes Swarm a kind of maximum-utilisation elastic cloud.

    Documents and the Swarm hash

    Now that we’ve explained how Swarm functions as a distributed chunk store (a fixed-size preimage archive), you may wonder: where do chunks come from and why do I care?

    On the API layer Swarm provides a chunker. The chunker takes any kind of readable source, such as a file or a video camera capture device, and chops it into fixed-size chunks. These so-called data chunks or leaf chunks are hashed and then synced with peers. The hashes of the data chunks are then packaged into chunks themselves (called intermediate chunks) and the process is repeated. Currently 128 hashes make up a new chunk. As a result the data is represented by a merkle tree, and it is the root hash of the tree that acts as the address you use to retrieve the uploaded file.

    When you retrieve this ‘file’, you look up the root hash and download its preimage. If the preimage is an intermediate chunk, it is interpreted as a series of hashes addressing chunks on a lower level. Eventually the process reaches the data level and the content can be served. An important property of a merklised chunk tree is that it provides integrity protection (what you seek is what you get) even on partial reads. For example, this means that you can skip back and forth in a large movie file and still be certain that the data has not been tampered with. Advantages of using smaller units (4KB chunk size) include parallelisation of content fetching and less wasted traffic in case of network failures.
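    The following is a much-simplified sketch of that chunking scheme (Swarm’s real chunker differs in detail: it uses its own hash construction, length prefixes and streaming; sha256 here is only a stand-in):

```python
import hashlib

CHUNK_SIZE = 4096   # 4KB data chunks
BRANCHES = 128      # hashes packed into one intermediate chunk

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def swarm_root(data: bytes) -> bytes:
    # Level 0: hash the fixed-size data (leaf) chunks.
    hashes = [h(data[i:i + CHUNK_SIZE]) for i in range(0, len(data), CHUNK_SIZE)] or [h(b"")]
    # Higher levels: pack up to 128 child hashes into an intermediate chunk and hash it,
    # repeating until a single root hash remains.
    while len(hashes) > 1:
        hashes = [h(b"".join(hashes[i:i + BRANCHES]))
                  for i in range(0, len(hashes), BRANCHES)]
    return hashes[0]  # the root hash acts as the retrieval address

print(swarm_root(b"hello swarm" * 10_000).hex())
```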

    Manifests and URLs

    On top of the chunk merkle trees, Swarm provides a crucial third layer of organising content: manifest files. A manifest is a json array of manifest entries. An entry minimally specifies a path, a content type and a hash pointing to the actual content. Manifests allow you to create a virtual site hosted on Swarm, which provides url-based addressing by always assuming that the host part of the url points to a manifest, and the path is matched against the paths of manifest entries. Manifest entries can point to other manifests, so they can be recursively embedded, which allows manifests to be coded as a compacted trie efficiently scaling to huge datasets (i.e., Wikipedia or YouTube). Manifests can also be thought of as sitemaps or routing tables that map url strings to content. Since at each step of the way we either have merklised structures or content addresses, manifests provide integrity protection for an entire site.

    Manifests can be read and directly traversed using the bzzr url scheme. This use is demonstrated by the Swarm Explorer, an example Swarm dapp that displays manifest entries as if they were files on a disk organised in directories. Manifests can easily be interpreted as directory trees, so a directory and a virtual host can be seen as the same. A simple decentralised dropbox implementation can be based on this feature. The Swarm Explorer is up on swarm: you can use it to browse any virtual site by putting a manifest’s address hash in the url; this link will show the explorer browsing its own source code.

    Hash-based addressing is immutable, which means there is no way you can overwrite or change the content of a document under a fixed address. However, since chunks are synced to other nodes, Swarm is immutable in the stronger sense that if something is uploaded to Swarm, it cannot be unseen, unpublished, revoked or removed. For this reason alone, be extra careful with what you share. You can, however, change a site by creating a new manifest that contains new entries or drops old ones. This operation is cheap since it does not require moving any of the actual content referenced. The photo album is another Swarm dapp that demonstrates how this is done; the source is on github. If you want your updates to show continuity or need an anchor to display the latest version of your content, you need name-based mutable addresses. This is where the blockchain, the Ethereum Name Service and domain names come in. A more complete way to track changes is to use version control, like git or mango, a git using Swarm (or IPFS) as its backend.

    Ethereum Name Service

    In order to authorise changes or publish updates, we need domain names. For a proper domain name service you need the blockchain and some governance. Swarm uses the Ethereum Name Service (ENS) to resolve domain names to Swarm hashes. Tools are provided to interact with the ENS to acquire and manage domains. The ENS is crucial as it is the bridge between the blockchain and Swarm.

    If you use the Swarm proxy for browsing, the client assumes that the domain (the part after bzz:/ up to the first slash) resolves to a content hash via ENS. Thanks to the proxy and the standard url scheme handler interface, Mist integration should be blissfully easy for Mist’s official debut with Metropolis.

    Our roadmap is ambitious: Swarm 0.3 comes with an extensive rewrite of the network layer and the syncing protocol, obfuscation and double masking for plausible deniability, kademlia routed p2p messaging, improved bandwidth accounting and extended manifests with http header support and metadata. Swarm 0.4 is planned to ship client-side redundancy with erasure coding, scan and repair with proof of custody, encryption support, adaptive transmission channels for multicast streams and the long-awaited storage insurance and litigation.

    In future posts, we will discuss obfuscation and plausible deniability, proof of custody and storage insurance, internode messaging and the network testing and simulation framework, and more. Watch this space, bzz…



  • Security alert [12/19/2016]: Ethereum.org Forums Database Compromised


    On December 16, we were made aware that someone had recently gained unauthorized access to a database from forum.ethereum.org. We immediately launched a thorough investigation to determine the origin, nature, and scope of this incident. Here is what we know:

    • The information that was recently accessed is a database backup from April 2016 and contained information about 16.5k forum users.
    • The leaked information includes
      • Messages, both public and private
      • IP-addresses
      • Username and email addresses
      • Profile information
      • Hashed passwords
        • ~13k bcrypt hashes (salted)
        • ~1.5k WordPress-hashes (salted)
        • ~2k accounts without passwords (used federated login)
    • The attacker self-disclosed that they are the same person/persons who recently hacked Bo Shen.
    • The attacker used social engineering to gain access to a mobile phone number that allowed them to gain access to other accounts, one of which had access to an old database backup from the forum.

    We are taking the following steps:

    • Forum users whose information may have been compromised by the leak will be receiving an email with additional information.
    • We have closed the unauthorized access points involved in the leak.
    • We are enforcing stricter security guidelines internally such as removing the recovery phone numbers from accounts and using encryption for sensitive data.
    • We are providing the email addresses that we believe were leaked to https://haveibeenpwned.com, a service that helps communicate with affected users.
    • We are resetting all forum passwords, effective immediately.

    If you were affected by the attack we recommend you do the following:

    • Ensure that your passwords are not reused between services. If you have reused your forum.ethereum.org password elsewhere, change it in those places.

    Additionally, we recommend this excellent blog post by Kraken that provides useful information about how to protect against these types of attacks. We deeply regret that this incident occurred and are working diligently, internally as well as with external partners, to address the incident. Questions can be directed to [email protected].


    By Hudson Jameson







  • Ethermine Pool Remote Monitor Android Smartphone App for Ethereum Mining Rigs






  • Coinitrage.com Arbitrage Firm now accepts ETH!



  • December Roundup

    Posted by Vitalik Buterin

    December marks a month of continued progress in the Ethereum ecosystem. Research on proof of stake and sharding continues after the research team’s workshop in Singapore in November (https://blog.ethereum.org/2016/12/04/ethereum-research-update/), the light client slowly keeps getting better (https://www.reddit.com/r/ethereum/comments/5kyxvd/i_just_tried_geth_light_today_and_am_jumping_for/), Whisper and Swarm keep moving forward, and discussions on protocol economics and community governance continue to mature.

    First, privacy technologies on Ethereum, and particularly zk-SNARKs (or “zero knowledge proofs”), have been rapidly moving forward:

    • A blog post, “zk-SNARKs in a Nutshell” (https://blog.ethereum.org/2016/12/05/zksnarks-in-a-nutshell/), by Christian Reitwiessner
    • A blog post explaining quadratic arithmetic programs (https://blog.ethereum.org/2016/12/05/zksnarks-in-a-nutshell/), from myself
    • An implementation of elliptic curve pairings (http://github.com/ethereum/research/blob/master/zksnark/), perhaps the most complex on-chain technical ingredient in zk-SNARK verification, from myself
    • Experimental work in integrating a zk-SNARK precompile in C++ (https://github.com/ethereum/cpp-ethereum/blob/snark/libdevcrypto/LibSnark.cpp) from Christian

    Vlad Zamfir has taken it upon himself to explain the history behind Casper, from his point of view:

    • Chapter 1: https://medium.com/@Vlad_Zamfir/the-history-of-casper-part-1-59233819c9a9
    • Chapter 2: https://medium.com/@Vlad_Zamfir/the-history-of-casper-chapter-2-8e09b9d3b780
    • Chapter 3: https://medium.com/@Vlad_Zamfir/the-history-of-casper-chapter-3-70fefb1182fc
    • Chapter 4: https://medium.com/@Vlad_Zamfir/the-history-of-casper-chapter-4-3855638b5f0e
    • Chapter 5: https://medium.com/@Vlad_Zamfir/the-history-of-casper-chapter-5-8652959cef58

    On proof of stake from myself: A Proof of Stake Design Philosophy (https://medium.com/@VitalikButerin/a-proof-of-stake-design-philosophy-506585978d51). And while we’re at it, the Proof of Stake FAQ (https://github.com/ethereum/wiki/wiki/Proof-of-Stake-FAQ) and Sharding FAQ (https://github.com/ethereum/wiki/wiki/Sharding-FAQ) continue to exist and continue to be worked on.

    Vlad has also taken it upon himself to rail against the evils of “economic abstraction” (i.e. the goal of trying to create token-agnostic public economic consensus protocols): Round 1 (https://medium.com/@Vlad_Zamfir/against-economic-abstraction-e27f4cbba5a7), Round 2 (https://medium.com/@Vlad_Zamfir/against-economic-abstraction-round-2-21f5c4e77d54), but not technical abstraction! (https://medium.com/@Vlad_Zamfir/against-economic-abstraction-mitigating-possible-collateral-damage-ce8b939e808f)

    Various discussions were had on monetary policy: a community-created EIP (186, https://github.com/ethereum/EIPs/issues/186) proposed to decrease ETH issuance by ~3x before PoS, and issuance in Casper was discussed in the Reddit thread for one of Vlad’s posts.

    Speaking of EIPs: Greg Colvin suggested modifications for adding further static analysis capability (184, https://github.com/ethereum/EIPs/issues/184) as part of the move toward “EVM 1.5”. The Ethereum Name Service (launched on the Ropsten testnet in late November) saw an EIP opened (181, https://github.com/ethereum/EIPs/issues/181) to support reverse resolution of Ethereum addresses.

    The data storage-focused “sister protocol” Swarm continues to move forward: the public pilot was released (https://blog.ethereum.org/2016/12/15/swarm-alpha-public-pilot-basics-swarm/) by Viktor Tron, and a reddit thread (https://www.reddit.com/r/ethereum/comments/5kl5lm/swarm_is_awesome/) collected community feedback.

    And from the core client development standpoint: Geth 1.5.5 was released, combining small but important fixes to various “bugs and annoyances” (https://github.com/ethereum/go-ethereum/releases/tag/v1.5.5). Jan Xie is continuing work on pyethereum to see how well it can pass through all of the denial-of-service blocks in September and October; not always, though the good news is that there seem to be no quadratic memory issues that stop the client outright. Another grab bag of small but important security fixes and stability improvements came with Mist 0.8.8 (https://github.com/ethereum/mist/releases/tag/v0.8.8), after an audit from Cure53.

    We wish the community a happy new year and look forward to more progress in January!

    Vitalik Buterin

    https://ethereum.org


  • Introduction of the Light Client for DApp developers

    The first version of the Light Ethereum Subprotocol (LES/1) and its implementation in Geth are still in an experimental stage, but they are expected to reach a more mature state in a few months where the basic functions will perform reliably. The light client has been designed to function more or less the same as a full client, but the “lightness” has some inherent limitations that DApp developers should understand and consider when designing their applications.

    In most cases a properly designed application can work even without knowing what kind of client it is connected to, but we are looking into adding an API extension for communicating different client capabilities in order to provide a future proof interface. While minor details of LES are still being worked out, I believe it is time to clarify the most important differences between full and light clients from the application developer perspective.

    Current limitations

    Searching and filtering contract events

    With a light client, it is not possible to search or filter events of a certain address without specifying a log topic (allEvents filter). Looking for topics without specifying an address is also not possible and will not return any results. You can only search or watch for specific (address, topic) pairs, which is the intended use of logs anyway. If the number of possible different events a contract can generate is large, the right approach is to build a hierarchy of possible events and generate log entries both for the specific event type and for more general categories.
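
    For illustration, here is a minimal sketch of the supported pattern, assuming the third-party web3j Java library and a node exposing its RPC endpoint locally (neither is part of this post): the log filter always pairs a contract address with a topic, and restricts the search to a modest block range. The contract address is hypothetical; the topic is the well-known hash of the ERC20 Transfer event signature.

    import java.math.BigInteger;

    import org.web3j.protocol.Web3j;
    import org.web3j.protocol.core.DefaultBlockParameter;
    import org.web3j.protocol.core.DefaultBlockParameterName;
    import org.web3j.protocol.core.methods.request.EthFilter;
    import org.web3j.protocol.core.methods.response.EthLog;
    import org.web3j.protocol.http.HttpService;

    public class LightLogSearch {
        public static void main(String[] args) throws Exception {
            // Connect to the local node's RPC endpoint (a light Geth node serves the same API).
            Web3j web3 = Web3j.build(new HttpService("http://127.0.0.1:8545"));

            String contract = "0x0123456789abcdef0123456789abcdef01234567"; // hypothetical contract address
            String topic = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"; // Transfer(address,address,uint256)

            // Keep the searched range modest on a light client (see the performance notes below).
            BigInteger head = web3.ethBlockNumber().send().getBlockNumber();
            EthFilter filter = new EthFilter(
                    DefaultBlockParameter.valueOf(head.subtract(BigInteger.valueOf(5000))),
                    DefaultBlockParameterName.LATEST,
                    contract);
            filter.addSingleTopic(topic); // address AND topic together: the supported (address, topic) pattern

            EthLog logs = web3.ethGetLogs(filter).send();
            System.out.println("Matching log entries: " + logs.getLogs().size());
        }
    }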

    Pending transactions

    Light clients do not receive pending transactions from the main Ethereum network. The only pending transactions a light client knows about are the ones that have been created and sent from that client. When a light client sends a transaction, it starts downloading entire blocks until it finds the sent transaction in one of the blocks, then removes it from the pending transaction set.

    Finding a transaction by hash

    Currently you can only find locally created transactions by hash. These transactions and their inclusion blocks are stored in the database and can be found by hash later. Finding other transactions is a bit trickier. It is possible (though not implemented as of yet) to download them from a server and verify the transaction is actually included in the block if the server found it. Unfortunately, if the server says that the transaction does not exist, it is not possible for the client to verify the validity of this answer. It is possible to ask multiple servers in case the first one did not know about it, but the client can never be absolutely sure about the non-existence of a given transaction. For most applications this might not be an issue but it is something one should keep in mind if something important may depend on the existence of a transaction. A coordinated attack to fool a light client into believing that no transaction exists with a given hash would probably be difficult to execute but not entirely impossible.

    Performance considerations

    Request latency

    The only thing a light client always has in its database is the last few thousand block headers. This means that retrieving anything else requires the client to send requests and get answers from light servers. The light client tries to optimize request distribution and collects statistical data of each server’s usual response times in order to reduce latency. Latency is the key performance parameter of a light client. It is usually in the 100-200ms order of magnitude, and it applies to every state/contract storage read, block and receipt set retrieval. If many requests are made sequentially to perform an operation, it may result in a slow response time for the user. Running API functions in parallel whenever possible can greatly improve performance.
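
    As a rough sketch of the "run API functions in parallel" advice, the snippet below issues two independent reads concurrently so their round trips overlap. The fetchBalance and fetchStorage helpers are hypothetical stand-ins for whatever RPC wrapper your application uses; each one represents a single light-client request/answer cycle.

    import java.util.concurrent.CompletableFuture;

    public final class ParallelReads {
        // Hypothetical wrappers: each call costs one request/answer round trip (~100-200 ms).
        static String fetchBalance(String address) {
            return "0"; // placeholder for an eth_getBalance call
        }

        static String fetchStorage(String contract, String slot) {
            return "0"; // placeholder for an eth_getStorageAt call
        }

        public static void main(String[] args) {
            // Independent reads issued concurrently: total latency is roughly
            // max(latencyA, latencyB) instead of their sum.
            CompletableFuture<String> balance =
                    CompletableFuture.supplyAsync(() -> fetchBalance("0x0123456789abcdef0123456789abcdef01234567"));
            CompletableFuture<String> storage =
                    CompletableFuture.supplyAsync(() -> fetchStorage("0x0123456789abcdef0123456789abcdef01234567", "0x0"));

            CompletableFuture.allOf(balance, storage).join();
            System.out.println(balance.join() + " / " + storage.join());
        }
    }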

    Searching for events in a long history of blocks

    Full clients employ a so-called “MIP-mapped” bloom filter to find events quickly in a long list of blocks so that it is reasonably cheap to search for certain events in the entire block history. Unfortunately, using a MIP-mapped filter is not easy to do with a light client, as searches are only performed in individual headers, which is a lot slower. Searching a few days’ worth of block history usually returns after an acceptable amount of time, but at the moment you should not search for anything in the entire history because it will take an extremely long time.

    Memory, disk and bandwidth requirements

    Here is the good news: a light client does not need a big database since it can retrieve anything on demand. With garbage collection enabled (which is scheduled to be implemented), the database will function more like a cache, and a light client will be able to run with as little as 10 MB of storage space. Note that the current Geth implementation uses around 200 MB of memory, which can probably be reduced further. Bandwidth requirements are also lower when the client is not used heavily. Bandwidth usage is usually well under 1 MB/hour when running idle, with an additional 2-3 KB for an average state/storage request.

    Future improvements

    Reducing overall latency by remote execution

    Sometimes it is unnecessary to pass data back and forth multiple times between the client and the server in order to evaluate a function. It would be possible to execute functions on the server side, then collect all the Merkle proofs proving every piece of state data the function accessed and return all the proofs at once, so that the client can re-run the code and verify the proofs. This method can be used both for read-only functions of contracts and for any application-specific code that operates on the blockchain/state as an input.

    Verifying complex calculations indirectly

    One of the main limitations we are working to improve is the slow search speed of log histories. Many of the limitations mentioned above, including the difficulty of obtaining MIP-mapped bloom filters, follow the same pattern: the server (which is a full node) can easily calculate a certain piece of information, which can be shared with the light clients. But the light clients currently have no practical way of checking the validity of that information, since verifying the entire calculation directly would require so much processing power and bandwidth that it would make using a light client pointless.

    Fortunately there is a safe and trustless solution to the general task of indirectly validating remote calculations based on an input dataset that both parties assume to be available, even if the receiving party only has the hash of the actual data. This is exactly the case in our scenario, where the Ethereum blockchain itself can be used as an input for such a verified calculation. This means it is possible for light clients to have capabilities close to those of full nodes, because they can ask a light server to remotely evaluate an operation for them that they would not otherwise be able to perform themselves. The details of this feature are still being worked out and are outside the scope of this document, but the general idea of the verification method is explained by Dr. Christian Reitwiessner in his Devcon 2 talk.

    Complex applications accessing huge amounts of contract storage can also benefit from this approach by evaluating accessor functions entirely on the server side and not having to download proofs and re-evaluate the functions. Theoretically it would also be possible to use indirect verification for filtering events that light clients could not watch for otherwise. However, in most cases generating proper logs is still simpler and more efficient.



    Zsolt Felföldi



  • Mobile: Account management

    By Péter Szilágyi

    To provide Ethereum integration for your mobile applications, the very first thing you should be interested in doing is account management.

    Although all current leading Ethereum implementations provide account management built in, it is ill-advised to keep accounts in any location that is shared between multiple applications and/or multiple people. Just as you do not entrust your ISP (which is, after all, your gateway to the internet) with your login credentials, you should not entrust an Ethereum node (your gateway to the Ethereum network) with your credentials either.

    The proper way to handle user accounts in your mobile applications is to do client-side account management, with everything self-contained within your own application. This way you can grant access permissions to the sensitive data that are as fine-grained (or as coarse) as deemed necessary, without relying on any third-party application's functionality and/or vulnerabilities.

    To support this, go-ethereum provides a simple, yet thorough accounts library that gives you all the tools to do properly secured account management via encrypted keystores and passphrase protected accounts. You can leverage all the security of the go-ethereum crypto implementation while at the same time running everything in your own application.

    Encrypted keystores

    Although handling your users' accounts locally on their own mobile device does provide certain security guarantees, access keys to Ethereum accounts should never lie around in clear-text form. As such, we provide an encrypted keystore that gives you the proper security guarantees without requiring a thorough understanding of the associated cryptographic primitives on your part.

    The important thing to know when using the encrypted keystore is that the cryptographic primitives used within can operate either in standard or light mode. The former provides a higher level of security at the cost of increased computational burden and resource consumption:

    • standard needs 256MB memory and 1 second processing on a modern CPU to access a key
    • light needs 4MB memory and 100 millisecond processing on a modern CPU to access a key

    As such, light is more suitable for mobile applications, but you should be aware of the trade-offs nonetheless.

    For those interested in the cryptographic and/or implementation details, the keystore uses the secp256k1 elliptic curve as defined in the Standards for Efficient Cryptography, implemented by the libsecp256k1 library, and keys are wrapped in the Web3 Secret Storage format.


    Keystores on Android (Java)

    The encrypted keystore on Android is implemented by the AccountManager class from the org.ethereum.geth package. The configuration constants (for the standard or light security modes described above) are located in the Geth abstract class, similarly from the org.ethereum.geth package. Hence to do client side account management on Android, you'll need to import two classes into your Java code:

    import org.ethereum.geth.AccountManager;
    import org.ethereum.geth.Geth;

    Afterwards you can create a new encrypted account manager via:

    AccountManager am = new AccountManager("/path/to/keystore", Geth.LightScryptN, Geth.LightScryptP);

    The path to the keystore folder needs to be a location that is writable by the local mobile application but non-readable for other installed applications (for security reasons obviously), so we'd recommend placing it inside your app's data directory. If you are creating the AccountManager from within a class extending an Android object, you will most probably have access to the Context.getFilesDir() method via this.getFilesDir(), so you could set the keystore path to this.getFilesDir() + "/keystore".

    The last two arguments of the AccountManager constructor are the crypto parameters defining how resource-intensive the keystore encryption should be. You can choose between Geth.StandardScryptN, Geth.StandardScryptP, Geth.LightScryptN, Geth.LightScryptP or specify your own numbers (please make sure you understand the underlying cryptography for this). We recommend using the light version.
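
    As a small sketch tying the presets together, the helper below picks between the standard and light parameters at runtime. The lowMemoryDevice flag is a hypothetical result of your own device capability check; the constants and the constructor are the ones shown above.

    import android.content.Context;

    import org.ethereum.geth.AccountManager;
    import org.ethereum.geth.Geth;

    public final class Keystores {
        // Open (or create) the encrypted keystore inside the app's private data directory.
        public static AccountManager open(Context ctx, boolean lowMemoryDevice) {
            String keystore = ctx.getFilesDir() + "/keystore";
            return lowMemoryDevice
                    ? new AccountManager(keystore, Geth.LightScryptN, Geth.LightScryptP)
                    : new AccountManager(keystore, Geth.StandardScryptN, Geth.StandardScryptP);
        }
    }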

    Keystores on iOS (Swift 3)

    The encrypted keystore on iOS is implemented by the GethAccountManager class from the Geth framework. The configuration constants (for the standard or light security modes described above) are located in the same namespace as global variables. Hence to do client side account management on iOS, you'll need to import the framework into your Swift code:

    import Geth

    Afterwards you can create a new encrypted account manager via:

    let am = GethNewAccountManager("/path/to/keystore", GethLightScryptN, GethLightScryptP);

    The path to the keystore folder needs to be a location that is writable by the local mobile application but non-readable for other installed applications (for security reasons obviously), so we'd recommend placing it inside your app's document directory. You should be able to retrieve the document directory via let datadir = NSSearchPathForDirectoriesInDomains(.documentDirectory, .userDomainMask, true)[0], so you could set the keystore path to datadir + "/keystore".

    The last two arguments of the GethNewAccountManager factory method are the crypto parameters defining how resource-intensive the keystore encryption should be. You can choose between GethStandardScryptN, GethStandardScryptP, GethLightScryptN, GethLightScryptP or specify your own numbers (please make sure you understand the underlying cryptography for this). We recommend using the light version.

    Account lifecycle

    Having created an encrypted keystore for your Ethereum accounts, you can use this account manager for the entire account lifecycle requirements of your mobile application. This includes the basic functionality of creating new accounts and deleting existing ones; as well as the more advanced functionality of updating access credentials, exporting existing accounts, and importing them on another device.

    Although the keystore defines the encryption strength it uses to store your accounts, there is no global master password that can grant access to all of them. Rather, each account is maintained individually and stored on disk in its own encrypted form, ensuring a much cleaner and stricter separation of credentials.

    This individuality, however, means that any operation requiring access to an account will need to provide the necessary authentication credentials for that particular account in the form of a passphrase:

    • When creating a new account, the caller must supply a passphrase to encrypt the account with. This passphrase will be required for any subsequent access, the lack of which will forever forfeit using the newly created account.
    • When deleting an existing account, the caller must supply a passphrase to verify ownership of the account. This isn't cryptographically necessary; rather, it is a protective measure against accidental loss of accounts.
    • When updating an existing account, the caller must supply both current and new passphrases. After completing the operation, the account will not be accessible via the old passphrase any more.
    • When exporting an existing account, the caller must supply both the current passphrase to decrypt the account, as well as an export passphrase to re-encrypt it with before returning the key-file to the user. This is required to allow moving accounts between devices without sharing original credentials.
    • When importing a new account, the caller must supply both the encryption passphrase of the key-file being imported, as well as a new passphrase with which to store the account. This is required to allow storing accounts with credentials different from those used for moving them around.

    Please note, there are no recovery mechanisms for lost passphrases. The cryptographic properties of the encrypted keystore (if using the provided parameters) guarantee that account credentials cannot be brute-forced in any meaningful time.

    Accounts on Android (Java)

    An Ethereum account on Android is implemented by the Account class from the org.ethereum.geth package. Assuming we already have an instance of an AccountManager called am from the previous section, we can easily execute all of the described lifecycle operations with a handful of function calls.

    // Create a new account with the specified encryption passphrase.
    Account newAcc = am.newAccount("Creation password");
    // Export the newly created account with a different passphrase. The returned
    // data from this method invocation is a JSON encoded, encrypted key-file.
    byte[] jsonAcc = am.exportKey(newAcc, "Creation password", "Export password");
    // Update the passphrase on the account created above inside the local keystore.
    am.updateAccount(newAcc, "Creation password", "Update password");
    // Delete the account updated above from the local keystore.
    am.deleteAccount(newAcc, "Update password");
    // Import back the account we've exported (and then deleted) above with yet
    // again a fresh passphrase.
    Account impAcc = am.importKey(jsonAcc, "Export password", "Import password");

    Although instances of Account can be used to access various information about specific Ethereum accounts, they do not contain any sensitive data (such as passphrases or private keys); rather, they act solely as identifiers for client code and the keystore.
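
    As a brief illustration of that identifier role, the helper below only reads public metadata from an account. getAddress() is also used in the signing examples later in this post; getHex() is assumed here to be the hex accessor on the returned Address object.

    import org.ethereum.geth.Account;

    public final class AccountIds {
        // Accounts expose identifying data only; no key material is ever returned.
        public static String describe(Account account) {
            return "Account at " + account.getAddress().getHex();
        }
    }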

    Accounts on iOS (Swift 3)

    An Ethereum account on iOS is implemented by the GethAccount class from the Geth framework. Assuming we already have an instance of a GethAccountManager called am from the previous section, we can easily execute all of the described lifecycle operations with a handful of function calls.

    // Create a new account with the specified encryption passphrase.
    let newAcc = try! am?.newAccount("Creation password")
    // Export the newly created account with a different passphrase. The returned
    // data from this method invocation is a JSON encoded, encrypted key-file.
    let jsonKey = try! am?.exportKey(newAcc!, passphrase: "Creation password", newPassphrase: "Export password")
    // Update the passphrase on the account created above inside the local keystore.
    try! am?.update(newAcc, passphrase: "Creation password", newPassphrase: "Update password")
    // Delete the account updated above from the local keystore.
    try! am?.delete(newAcc, passphrase: "Update password")
    // Import back the account we've exported (and then deleted) above with yet
    // again a fresh passphrase.
    let impAcc  = try! am?.importKey(jsonKey, passphrase: "Export password", newPassphrase: "Import password")

    Although instances of GethAccount can be used to access various information about specific Ethereum accounts, they do not contain any sensitive data (such as passphrases or private keys); rather, they act solely as identifiers for client code and the keystore.

    Signing authorization

    As mentioned above, account objects do not hold the sensitive private keys of the associated Ethereum accounts, but are merely placeholders to identify the cryptographic keys with. All operations that require authorization (e.g. transaction signing) are performed by the account manager after granting it access to the private keys.

    There are a few different ways one can authorize the account manager to execute signing operations, each having its advantages and drawbacks. Since the different methods have wildly different security guarantees, it is essential to be clear on how each works:

    • Single authorization: The simplest way to sign a transaction via the account manager is to provide the passphrase of the account every time something needs to be signed, which will ephemerally decrypt the private key, execute the signing operation and immediately throw away the decrypted key. The drawbacks are that the passphrase needs to be queried from the user every time, which can become annoying if done frequently; or the application needs to keep the passphrase in memory, which can have security consequences if not done properly; and depending on the keystore's configured strength, constantly decrypting keys can result in non-negligible resource requirements.
    • Multiple authorizations: A more complex way of signing transactions via the account manager is to unlock the account via its passphrase once, and allow the account manager to cache the decrypted private key, enabling all subsequent signing requests to complete without the passphrase. The lifetime of the cached private key may be managed manually (by explicitly locking the account back up) or automatically (by providing a timeout during unlock). This mechanism is useful for scenarios where the user may need to sign many transactions or the application would need to do so without requiring user input. The crucial aspect to remember is that anyone with access to the account manager can sign transactions while a particular account is unlocked (e.g. device left unattended; application running untrusted code).

    Note, creating transactions is out of scope here, so the remainder of this section will assume we already have a transaction hash to sign, and will focus only on creating a cryptographic signature authorizing it. Creating an actual transaction and injecting the authorization signature into it will be covered later.

    Signing on Android (Java)

    Assuming we already have an instance of an AccountManager called am from the previous sections, we can create a new account to sign transactions with via its previously demonstrated newAccount method; and to avoid going into transaction creation for now, we can hard-code a random Hash to sign instead.

    // Create a new account to sign transactions with
    Account signer = am.newAccount("Signer password");
    Hash    txHash = new Hash("0x0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef");

    With the boilerplate out of the way, we can now sign a transaction using the authorization methods described above:

    // Sign a transaction with a single authorization
    byte[] signature = am.signPassphrase(signer, "Signer password", txHash.getBytes());
    // Sign a transaction with multiple manually cancelled authorizations
    am.unlock(signer, "Signer password");
    signature = am.sign(signer.getAddress(), txHash.getBytes());
    am.lock(signer.getAddress());
    // Sign a transaction with multiple automatically cancelled authorizations
    am.timedUnlock(signer, "Signer password", 1000000000); // 1 second in nanoseconds
    signature = am.sign(signer.getAddress(), txHash.getBytes());

    You may wonder why signPassphrase takes an Account as the signer, whereas sign takes only an Address. The reason is that an Account object may also contain a custom key-path, allowing signPassphrase to sign using accounts outside of the keystore; however sign relies on accounts already unlocked within the keystore, so it cannot specify custom paths.

    Signing on iOS (Swift 3)

    Assuming we already have an instance of a GethAccountManager called am from the previous sections, we can create a new account to sign transactions with via its previously demonstrated newAccount method; and to avoid going into transaction creation for now, we can hard-code a random Hash to sign instead.

    // Create a new account to sign transactions with
    let signer = try! am?.newAccount("Signer password")
    var error: NSError?
    let txHash = GethNewHashFromHex("0x0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef", &error)

    Note, although Swift usually rewrites NSError returns to throws, this particular instance seems to have been missed for some reason (possibly because it is a constructor). It will be fixed in a later version of the iOS bindings when the appropriate fixes are implemented upstream in the gomobile project.

    With the boilerplate out of the way, we can now sign a transaction using the authorization methods described above:

    // Sign a transaction with a single authorization
    var signature = try! am?.signPassphrase(signer, passphrase: "Signer password", hash: txHash?.getBytes())
    // Sign a transaction with multiple manually cancelled authorizations
    try! am?.unlock(signer, passphrase: "Signer password")
    signature = try! am?.sign(signer?.getAddress(), hash: txHash?.getBytes())
    try! am?.lock(signer?.getAddress())
    // Sign a transaction with multiple automatically cancelled authorizations
    try! am?.timedUnlock(signer, passphrase: "Signer password", timeout: 1000000000) // 1 second in nanoseconds
    signature = try! am?.sign(signer?.getAddress(), hash: txHash?.getBytes())

    You may wonder why signPassphrase takes a GethAccount as the signer, whereas sign takes only a GethAddress. The reason is that a GethAccount object may also contain a custom key-path, allowing signPassphrase to sign using accounts outside of the keystore; however sign relies on accounts already unlocked within the keystore, so it cannot specify custom paths.



  • EDCON (European Ethereum Development Conference)  17-18 Feb 2017.

    Q: What is EDCON?

    A: The EDCON (European Ethereum Development Conference) is a conference that will take place in Paris at the Ecole Supérieure de Commerce de Paris - Europe (ESCP) on 17-18 February 2017. The conference will cover Ethereum technology and research (PoS, Scalability, Privacy) as well as Ethereum-based applications.

    Q: Who is organizing EDCON?

    A: EDCON is organized by LinkTime with the support and cooperation of developers from the Ethereum Foundation and the wider Ethereum community, as well as ADETIF, the Ecole Supérieure de Commerce de Paris - Europe (ESCP), Asseth (the French Ethereum nonprofit) and La ChainTech (the French blockchain nonprofit).

    Q: What topics will be discussed at EDCON?

    A: Talks will cover:

    • Ethereum base-layer technology
    • Ethereum research (PoS, scalability)
    • Privacy (e.g. zero knowledge proofs)
    • Formal verification
    • Ethereum consortium chain development
    • Applications being built on Ethereum
    • Growing the Ethereum community

    Q: How do I buy a ticket?

    A: There are two types of tickets: standard and VIP.

    • Standard Ticket: admits you to the main conference, the morning community event and the Ethereum community activities in the sub-conference.
    • VIP Ticket: limited to 100 tickets. Admits you to the main conference, the cocktail party and the Ethereum community activities in the sub-conference. VIP holders can build deeper connections with core developers, investors and entrepreneurs; highly recommended for those who want to communicate more with core participants.

    Tickets for the conference are available here: https://edcon.io/register.html

    Speakers



    Vitalik Buterin

    Founder of Ethereum

    Ming Chan

    Ethereum Foundation

    Executive Director

    Nick Johnson

    Ethereum Core Developer

    Gavin Wood

    Ethcore/Parity founder

    Martin Becze

    Ethereum Core Developer

    Vlad Zamfir

    Ethereum Core Researcher

    Joseph Chow

    Ethereum Developer-ConsenSys


    Jeff Coleman

    Founder of Ledger Labs

    Ethereum State Channels Researcher

    Christoph Jentzsch

    Slock It CTO

    Ethereum Security Researcher

    Alex Beregszaszi

    Ethereum Core Developer

    Loi Luu

    Founder of SmartPool

    Heiko Hees

    Brainbot/Raiden CEO

    Rick Dudley

    Ethereum Mechanism Designer

    Omise Lab DAPP Architect

    Thomas Bertani

    Oraclize CEO

    Jae Kwon

    Founder of Tendermint & Cosmos

    Nicolas Bacca

    Ledger CTO

    Yoichi Hirai

    Formal Verification Engineer at Ethereum DEV

    Quentin de Beauchesne

    Ledgys.net Co-founder

    Griff Green

    Founder of Giveth


    For more details: https://edcon.io/

