With the rise of account abstraction and EIP-4337, there is an increasing demand for more convenient verification methods such as social login. In this blog post, we review several methods for validating ownership of smart contract accounts and propose a method for using oAuth 2.0 as a self-custodial authentication method.
The aim is to provide a secure method for linking a user’s web2 identity to their on-chain account, reducing the barriers to entry for newcomers to the Ethereum ecosystem. A system for validating ownership should solve the following problems:
There are currently three main methods to link a user's web2 identity to their on-chain account:
This article will review each of them, and then propose a fourth option: Multi-sig recovery using oAuth 2.0.
Many current solutions rely on MPC techniques through Lit Protocol and Web3Auth. These solutions use a network of nodes that verify the IdP assertion and co-sign using a threshold signature scheme. Although viable, these approaches introduce a lot of complexity. The details can be read in the above links, but given the context of account abstraction we’ll focus on other solutions that can leverage more of the EVM. After all, MPC is effectively an off-chain multi-sig.
Directly applying a multi-sig approach to oAuth validation could look like this:
In this solution, two out of three signatures are required, with the oAuth server, user's device, and a backup signer being the three signers. However, this approach assumes that the signers are set up so that the user is in control of their device and the remaining signers cannot be coerced into signing. Also, a transaction requires either the oAuth or backup signer to be online. This works, but is still a little clunky.
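As a rough illustration of the 2-of-3 rule described above, the check reduces to counting approvals from known signers. This is a minimal sketch only: the class name `MultiSigPolicy` and the signer labels are hypothetical, and a real contract account would verify cryptographic signatures rather than compare labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultiSigPolicy:
    """Hypothetical m-of-n policy; signers are identified by plain labels here."""
    signers: frozenset
    threshold: int

    def is_authorized(self, approvals: set) -> bool:
        # Only approvals from registered signers count toward the threshold.
        valid = self.signers & approvals
        return len(valid) >= self.threshold


# The three signers named in the text, with two approvals required.
policy = MultiSigPolicy(
    signers=frozenset({"oauth_server", "user_device", "backup_signer"}),
    threshold=2,
)
```

With this policy the user's device alone is never enough, which is why either the oAuth signer or the backup signer has to be online for every transaction.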
ERC-4337 allows smart contract accounts to set their validation logic directly. What if we could just use the assertion from the identity provider (IdP) directly?
This method verifies the IdP assertion directly, removing the need for multi-sig. When a user authenticates with oAuth, the IdP returns an ID token in the form of a JSON Web Token (JWT). The JWT signature can then be verified on-chain by the contract account, and with a valid ID token, it is assumed that the user has authenticated with the IdP and therefore owns the contract account. The contract account could call an independent singleton JWT signature contract that validates the public key of major oAuth providers.
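To make the pieces concrete, the sketch below splits an ID token into the parts a verifier would need: the header (which names the signing algorithm and key ID), the claims, the bytes that were signed, and the signature itself. It deliberately stops short of signature verification, which in the scheme above would happen on-chain against the IdP's public key. The token is a hypothetical example with a placeholder signature, and the function names are illustrative.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Base64url-decode, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def split_id_token(token: str):
    """Return (header, claims, signing_input, signature) without verifying anything."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    claims = json.loads(b64url_decode(payload_b64))
    # The IdP signs the first two segments exactly as transmitted.
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    signature = b64url_decode(signature_b64)
    return header, claims, signing_input, signature


# Hypothetical token; the signature is a placeholder, not a real IdP signature.
sample_token = ".".join([
    b64url_encode(json.dumps({"alg": "RS256", "kid": "key-1", "typ": "JWT"}).encode()),
    b64url_encode(json.dumps({"iss": "https://idp.example", "sub": "user-123",
                              "aud": "client-abc", "exp": 1700000000}).encode()),
    b64url_encode(b"placeholder-signature"),
])
header, claims, signing_input, signature = split_id_token(sample_token)
```

The `kid` in the header tells the verifier which of the IdP's public keys to check `signature` against, over `signing_input`.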
Although this seems like an elegant solution, there are a few concerns that make it impractical.
The public keys of most big IdPs are not static and rotate frequently. Another concern, therefore, is how to update the public keys stored on-chain so that users are not locked out after a key rotation.
Replacing the key would require deploying a new singleton contract on every rotation in order to meet the storage access rules of the ERC-4337 canonical mempool. Alternatively, we could append new keys to the existing contract, but the key lookup this introduces would require the use of an alternative mempool.
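The "append a new key" option can be pictured as a small registry keyed by the JWT's key ID. The sketch below is an off-chain analogy only, with hypothetical names; it says nothing about how such a lookup would be laid out in contract storage, which is exactly where the mempool rules come into play.

```python
from typing import Dict, Optional


class IdpKeyRegistry:
    """Hypothetical append-only registry of IdP public keys, indexed by key ID."""

    def __init__(self) -> None:
        self._keys: Dict[str, bytes] = {}

    def add_key(self, kid: str, public_key: bytes) -> None:
        # Existing entries are never overwritten, so older tokens stay verifiable.
        if kid in self._keys:
            raise ValueError(f"key id already registered: {kid}")
        self._keys[kid] = public_key

    def lookup(self, kid: str) -> Optional[bytes]:
        # This per-token lookup is the step the text says needs an alternative mempool.
        return self._keys.get(kid)


registry = IdpKeyRegistry()
registry.add_key("key-1", b"public-key-before-rotation")
registry.add_key("key-2", b"public-key-after-rotation")
```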
At face value, verifying directly with ID tokens would require a user to log in every time they need to make a transaction. This would be clunky UX, though perhaps less clunky than the traditional multi-sig approach. However, we can get around this with session keys. The JWT will usually have an expiry field, so we can generate a session key from the user’s browser or device that is only valid until the expiry time.
If for whatever reason the user loses the session key (e.g. they got a new device or cleared their browser data), all they need to do is re-authenticate with the IdP.
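A minimal sketch of the session key idea, assuming the session simply inherits the `exp` claim of the ID token. The names `SessionKey`, `session_from_claims`, and `is_session_valid` are hypothetical, and keys are represented as plain strings rather than real key material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionKey:
    public_key: str   # key generated in the user's browser or device
    valid_until: int  # Unix time in seconds, copied from the JWT "exp" claim


def session_from_claims(claims: dict, device_public_key: str) -> SessionKey:
    """Bind a freshly generated device key to the lifetime of the ID token."""
    return SessionKey(public_key=device_public_key, valid_until=int(claims["exp"]))


def is_session_valid(session: SessionKey, signer_public_key: str, now: int) -> bool:
    """Accept a transaction only from the paired key and only before expiry."""
    return signer_public_key == session.public_key and now < session.valid_until


session = session_from_claims({"sub": "user-123", "exp": 1700000000}, "device-key-1")
```

Once the session expires or the key is lost, a new login produces a new ID token, and a new session key can be bound to it in the same way.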
In the simplest scenario, this mechanism requires trusting the IdP and the oAuth server. For less sophisticated users, trusting an IdP like Google or Facebook isn’t too far-fetched given the wide use of social logins in web2.
We can however reduce this centralisation vector through other means. The OpenZeppelin team has already conducted an experiment with IdP verification and addressed this point:
"I can also imagine protections such as access only being allowed after no activity for a period of time e.g. 3 months."
In practice, the bigger trust assumption is in the oAuth server handling the flow. The JWT has an audience field, and that must be equal to the client ID of the oAuth server. By ignoring the audience field we can have a trustless setup where the oAuth server doesn’t matter. But this would open the system up to potential attack vectors, which were also addressed in the OpenZeppelin experiment:
"The nonce is never exposed to the user as they Sign In With Google in the application, meaning that they could be inadvertently getting Google to sign a different Ethereum account if the app is malicious. This could even be done by any other application in which the user signs in with Google, which uses the JWT behind the scenes to gain access to the user’s Ethereum Identity."
In other words, if the contract account does not pin a trusted oAuth server by checking the audience field, it becomes vulnerable to any server that presents a legitimately signed JWT obtained through an unrelated login.
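The checks discussed above can be sketched as a claim validation step that runs alongside signature verification. This is an illustrative outline with hypothetical names; in particular, treating the nonce as a value that binds the token to a specific Ethereum account is an assumption drawn from the quoted experiment, not a documented requirement.

```python
def validate_claims(claims: dict, *, issuer: str, audience: str,
                    nonce: str, now: int) -> list:
    """Return a list of problems; an empty list means the claims look acceptable."""
    errors = []
    if claims.get("iss") != issuer:
        errors.append("unexpected issuer")
    # The "aud" claim may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        errors.append("unexpected audience")
    # Assumption: the nonce commits to the account the user means to control.
    if claims.get("nonce") != nonce:
        errors.append("nonce mismatch")
    if now >= int(claims.get("exp", 0)):
        errors.append("token expired")
    return errors


good_claims = {"iss": "https://idp.example", "aud": "client-abc",
               "nonce": "0xAccountAddress", "exp": 1700000000}
expected = dict(issuer="https://idp.example", audience="client-abc",
                nonce="0xAccountAddress", now=1699999999)
```

A token issued to a different application fails the audience check even though its signature is genuine, which is the attack the audience field exists to stop.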
This method also assumes the IdP supports OIDC. The oAuth 2.0 framework is made for authorization and not authentication. OIDC was created as a thin identity layer built on top of oAuth 2.0 to support authentication. A platform that does not support OIDC will not be able to send the JWT ID token that the contract validates against.
Not all platforms support OIDC, but most of the big services like Google, Facebook, Apple, Microsoft, and Twitter do support it. Without OIDC support another middleware dependency (like Auth0) would need to be added.
Verifying directly with the IdP is possible, but may have too many practical restrictions. However, pairing the login with a temporary key is an interesting concept. Instead, we could use the multi-sig approach only as a means of pairing a new key, although in this case it’s not necessarily a session key.
This method uses a minimum 2/2 multi-sig to assign a new key which is generated by the user’s browser or device. After this point the key can be used for all transactions. If, for instance, the browser data is cleared, then the same authentication process can be used to associate a new key.
An added advantage is that, from the contract’s point of view, this is effectively social recovery. If the user wants to stop using oAuth, they can simply remove these signers and add other recovery setups.
Note that the second signer must also have a way to authenticate the user that cannot be faked by the oAuth signer. Otherwise this effectively means the oAuth signer is a custodian with extra steps.
The simplest solution is for the second signer to verify that the ID token from the oAuth process is signed by the IdP. This is the only way that avoids resorting to additional authentication methods.
Also note that in this case the oAuth signer cannot maliciously create a valid ID token on its own, so the second signer can at least trust that the user has successfully signed in to the IdP through the oAuth server.
This is how the MPC nodes in Web3Auth validate user identity. But in this case we don’t need to solve all the complexities around threshold cryptography since we have the EVM. Neither the oAuth signer nor the second signer can trick the other into signing. The user must use the oAuth server to authenticate with the IdP and get a valid ID token which both entities can verify.
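The pairing flow can be sketched as follows, assuming each signer independently decides whether to approve based on the ID token it is shown. All names are hypothetical, and `idp_signature_is_valid` is a stand-in for real verification of the IdP's signature. The oAuth signer is modelled in the worst case, approving everything, to show that it still cannot pair a key on its own.

```python
from typing import Callable, Dict, Set

Verifier = Callable[[str], bool]


class PairingAccount:
    """Hypothetical account that pairs a device key once enough signers approve."""

    def __init__(self, signers: Dict[str, Verifier], threshold: int) -> None:
        self.signers = signers
        self.threshold = threshold
        self.device_keys: Set[str] = set()

    def pair_key(self, id_token: str, new_key: str) -> bool:
        # Each signer checks the ID token independently before approving.
        approvals = sum(1 for verify in self.signers.values() if verify(id_token))
        if approvals < self.threshold:
            return False
        self.device_keys.add(new_key)
        return True


def idp_signature_is_valid(id_token: str) -> bool:
    # Stand-in for verifying the IdP's signature on the ID token.
    return id_token == "token-signed-by-idp"


account = PairingAccount(
    signers={
        "oauth_signer": lambda token: True,  # worst case: approves anything
        "second_signer": idp_signature_is_valid,
    },
    threshold=2,
)
```

Because the second signer only approves tokens that carry the IdP's signature, a compromised oAuth signer presenting a forged token gets one approval out of the two required.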
The only trust assumption is that the IdP itself does not act maliciously. But this is true for any social login.
We can also increase redundancy to 2/3 if we include another independent party. For example, the signer list could look like:
In practice, an EVM multi-sig that is used only to pair a device key seems like the best approach and a viable alternative to MPC. Although verifying directly with an IdP may seem the simplest in terms of architecture, it may not be feasible given key rotations and the required trust assumptions. Instead, we can combine concepts from both the simple EVM multi-sig and direct IdP verification to arrive at something more practical.