\section{Introduction}
\label{sec:introduction}
On any public blockchain, the cost of creating a new wallet is virtually zero, enabling the same entity to manage several pseudonymous addresses. The pseudonymity underpinning blockchains like Bitcoin \citep{nakamoto2008bitcoin} and Ethereum \citep{buterin2013ethereum} breeds a sense of privacy. This often leads to misuse \citep{christin2013traveling}, such as money laundering through a large number of addresses \citep{moser2013inquiry}, or unfair voting power distributed among multiple addresses owned by the same user. Thus, many investigations need to identify addresses linked to the same entity. This is predominantly done through heuristics. Every transaction an address makes on a blockchain is recorded and public, revealing information about the underlying entity. As such, with graph analysis tools, one can cluster addresses that, with reasonable confidence, share the same owner.
Such deanonymization tools have been widely explored for Bitcoin \citep{haslhofer2016bitcoin}, leveraging heuristics that target the unspent transaction output (UTXO) model. However, these heuristics have limited application to more recent blockchain implementations like Ethereum, which forgo the UTXO model for an account-based (sometimes called balance-based) model.
Ethereum, in particular, has an account-based protocol that implicitly encourages an entity to reuse a handful of addresses.
As a result, it poses greater challenges to user privacy than UTXO-based blockchains do.
In response to this shortcoming, several coin mixing protocols, such as M\"{o}bius \citep{meiklejohn2018mobius}, MixEth \citep{seres2019mixeth}, and Tornado Cash \citep{pertsev2019tornado}, have been proposed to obfuscate transaction tracing; the last of these is deployed in practice.
% As this has limited application to more recent blockchains like Ethereum, new heuristics \citep{victor2020address,beres2021blockchain} have surfaced focusing on graph analysis of transactions between addresses.
Still, new heuristics have surfaced \citep{victor2020address,beres2021blockchain} that deanonymize Ethereum users. These heuristics largely exist in academic silos, and have been neither combined nor demonstrated in a public application.
\paragraph{Our contributions.} We develop a web application that combines several state-of-the-art heuristics to measure the anonymity of Ethereum addresses.
To the best of our knowledge, this is the first deployment of these algorithms at scale.
In doing so, we create a rich depiction of user behavior and privacy.
We also propose a set of new heuristics targeted at Tornado Cash, highlighting that careless user behavior can still reveal a user's identity despite the use of a mixer. A Python implementation is open-sourced at \url{https://github.com/TutelaLabs/tutela-app} and the tool is available at \url{https://www.tutela.xyz}.
\paragraph{Paper organization.} The rest of this paper is organized as follows. We provide pertinent preliminaries in Section~\ref{sec:preliminaries}. In Section~\ref{sec:tutela}, we give an overview of Tutela, the anonymity tool we developed. In Section~\ref{sec:data}, we describe our data processing methods and the datasets used. In Section~\ref{sec:eth}, we describe two heuristics that allow us to cluster Ethereum addresses that are likely owned by the same entity. In Section~\ref{sec:tornado}, we assess the privacy guarantees of Tornado Cash by applying five novel heuristics. In Section~\ref{sec:analysis}, we provide a quantitative analysis. Finally, we conclude with a discussion and directions for future work in Section~\ref{sec:discussion}.