In general, a hackathon is a social coding event that brings together computer programmers such as software developers, analysts, designers, and user interface specialists, along with industry process experts and professionals, to identify issues and create software solutions within a specific period of time.
Everybody is welcome to apply and be part of the Open Data Community (ODC) hackathon. We are looking for hackers, coders, and makers, but also for innovators, engineers, creative people, designers, data geeks, and startups. Specifically, data analysts and data scientists, software developers, and DevRel contributors are all well suited for the different bounties and contests of this hackathon.
Yes, we have prepared a dataset for your use that includes information about recent grant rounds with Unicef and with Fantom.
We have tied together all our dataset uploads here: https://market.oceanprotocol.com/profile/0x6fd78613E08FCB92890e65eA14450750aCAFF7b5
In the Anti-Sybil Dashboard bounty, are we expected to come up with new LEGOs or work with the existing ones?
You can work with existing ones! To get started, you might want to go with mock ones of course.
Yes, absolutely. It just has to be really good!
Where can I find an example dashboard screen that I may find useful in developing legos for the OpenData Community Hackathon?
For a fancy version, you can see the docs from TrustaLabs. They just launched a commercial Sybil scoring project and are one of our sponsors here as well.
More importantly, program managers want to know "How is my round doing?" and "Is it getting attacked?" Here are some of the Lego ideas: https://docs.google.com/spreadsheets/d/1sMgm3cg3pfMvRbmrteknpu44qsyuQlvn3vizgwnOgOU/edit#gid=2020378185
https://v2-docs.zksync.io/dev/fundamentals/zkSync.html#prerequisites
zkSync is a layer 2 (L2) solution for transferring Ether and ERC20 tokens. The L2 protocol positions itself as a scaling and privacy engine for Ethereum. The project is built on a zero-knowledge (ZK) rollup architecture with the idea of “unlimited” Ethereum scaling.
Ethereum scaling depends on addressing inherent drawbacks of Ethereum: slow transactions and high gas fees due to limited throughput (i.e., the number of transactions that can happen in any given period of time).
To achieve this “unlimited” scaling, zkSync performs computation off-chain and stores most data off-chain. Because all transactions are proven with validity proofs on the Ethereum mainchain, users enjoy the same level of security as on Ethereum. This makes it possible to batch many transactions together and send them to the L1 (i.e., Ethereum). Currently, a batch is guaranteed to be able to successfully process a maximum of 50 transactions.
There are many layer 2 solutions to Ethereum’s congestion problem. One family of these solutions is rollups: ZK rollups (e.g., zkSync) and Optimistic rollups. A rollup is basically an Ethereum extension designed to increase scalability. The extension rolls up many transactions into one batch and sends them all to Ethereum in one action. In other words, a rollup block is a summary of changes reflecting all transactions in a single batch.
Collect Transactions > Generate Proofs > Send to L1
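The three-step flow above can be sketched as a toy pipeline. This is only an illustration under stated assumptions: `fake_proof` is a hypothetical stand-in that merely hashes the batch (a real ZK rollup generates a validity proof with a proving system, which is far more involved), the transaction fields are invented, and the batch size of 50 is simply the figure quoted above.

```python
import hashlib
import json
from typing import Iterator

BATCH_SIZE = 50  # figure quoted in the text; treat it as an assumption


def collect_batches(txs: list[dict], size: int = BATCH_SIZE) -> Iterator[list[dict]]:
    """Step 1: group pending transactions into fixed-size batches."""
    for start in range(0, len(txs), size):
        yield txs[start:start + size]


def fake_proof(batch: list[dict]) -> str:
    """Step 2 (placeholder): NOT a validity proof, only a SHA-256 digest of the batch."""
    encoded = json.dumps(batch, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def roll_up(txs: list[dict]) -> list[dict]:
    """Step 3: build one summary record per batch, as would be sent to L1."""
    return [
        {"tx_count": len(batch), "commitment": fake_proof(batch)}
        for batch in collect_batches(txs)
    ]


# Hypothetical transactions, made up for the example.
pending = [{"from": f"0x{i:040x}", "to": "0x" + "ab" * 20, "value": i} for i in range(120)]
summaries = roll_up(pending)
print([s["tx_count"] for s in summaries])  # [50, 50, 20]
```

The point of the sketch is the shape of the data: 120 individual transactions become three summary records, which is what keeps the L1 footprint small.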
A Sybil attack is a kind of security threat on an online system where one person tries to take over the network by creating multiple accounts, nodes or computers. This can be as simple as one person creating multiple social media accounts, but in the world of cryptocurrencies, a more relevant example is where somebody runs multiple nodes on a blockchain network.
Attackers may be able to out-vote the honest nodes on the network if they create enough fake identities (or Sybil identities). They can then refuse to receive or transmit blocks, effectively blocking other users from a network.
In really large-scale Sybil attacks, where the attackers manage to control the majority of the network computing power or hash rate, they can carry out a 51% attack. In such cases, they may change the ordering of transactions, and prevent transactions from being confirmed. They may even reverse transactions that they made while in control, which can lead to double spending.
Over the years, computer scientists have dedicated a lot of time and research to figure out how to detect and prevent Sybil attacks, with varying degrees of effectiveness. For now, there’s no guaranteed defense.
A good question to ask is: In what ways can zkSync (ZK Rollups) help reduce Sybil attacks, or help increase the risks of a network having a Sybil attack?
For further reading:
- https://go.gitcoin.co/blog/a-community-based-roadmap-for-sybil-detection-across-web-3
- https://www.youtube.com/watch?v=-EKhIBUQjcA
- https://www.youtube.com/watch?v=_VolZn0y-FM
- https://vitalik.ca/general/2021/01/05/rollup.html
Ocean Protocol allows for the decentralized sharing of data and algorithms.
DataBuilder Hackathon participants are strongly encouraged to use the Ocean Protocol. It is free, other than minor gas fees, and preferred by the judges as an easy way to show your commitment to resisting centralization as the data layer.
Ocean provides a free and easy way to decentralize access control to algorithms and/or deploy ERC725Y as soulbound tokens for preventing misuse of DAO proposal mechanisms.
Ocean Protocol has authored a step-by-step guide for DataBuilder Hackathon participants here:
https://github.com/oceanprotocol/data-challenges/blob/main/DataBuilders%20Hackathon.md
Ocean Protocol experts will also be available on the DataBuilder hackathon Discord within the OpenData Community.
Ocean Protocol also has extensive documentation here: https://docs.oceanprotocol.com/
NOTE: If you have any issues getting your data from Ocean (for example, your download won't start), try downloading again. If the problem persists, we recommend using the Firefox browser and being a bit more patient, as the download may take some time.
The ODC team has detailed how best to contribute to the hackathon here: https://github.com/OpenDataforWeb3/Resources/blob/main/CONTRIBUTING.md
Pocket Network is an open-source decentralized RPC network with a contributor-friendly ecosystem governed by a high-performance DAO. It allows Web3 developers to remove the most important layer of centralization in their projects: connection to the blockchain. Instead of using a centralized service, Pocket's decentralized network of tens of thousands of nodes is run by the community. DataBuilder Hackathon participants can use the Pocket Network to access blockchain data free from the centralization of many providers of blockchain nodes and data.
DataBuilder Hackathon participants are strongly encouraged to use Pocket Network.
All hackathon participants have access to the Pocket Network decentralized RPC endpoints. You can go to the Pocket Portal and sign up for a free account, and you'll be able to access any one of dozens of blockchains for up to 250,000 relays per day at no charge: https://www.portal.pokt.network/
Users will also have access to Pocket's robust RPC API, with connections available to Ethereum and all other major blockchains. More information about the Pocket Network APIs is available here: https://docs.pokt.network/api-docs/ In order to query against these endpoints, you may choose to use TrueBlocks or another solution to create an index based on wallet ID.
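As a rough sketch of what querying an Ethereum RPC endpoint looks like, the snippet below builds a standard Ethereum JSON-RPC request using only the Python standard library. The `RPC_URL` value is a hypothetical placeholder, not a real Pocket endpoint: copy the actual URL from your own portal account. The network call itself is left commented out so the example runs offline.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the endpoint URL from your portal account.
RPC_URL = "https://<your-endpoint-from-the-pocket-portal>"


def build_payload(method: str, params: list, request_id: int = 1) -> bytes:
    """Encode a JSON-RPC 2.0 request body."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    ).encode()


def call_rpc(url: str, method: str, params: list) -> dict:
    """POST a JSON-RPC request and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=build_payload(method, params),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read())


def parse_quantity(hex_value: str) -> int:
    """Ethereum JSON-RPC encodes integers as 0x-prefixed hex strings."""
    return int(hex_value, 16)


# Example (needs network access and a real endpoint):
# latest_block = parse_quantity(call_rpc(RPC_URL, "eth_blockNumber", [])["result"])
print(parse_quantity("0x10d4f"))  # 68943
```

Each call to `call_rpc` would count as one relay against the daily allowance, so it can be worth caching results locally when iterating on an analysis.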
Pocket team members will be on-hand in the OpenData Community Hackathon Discord channel throughout the event to answer any questions you might have.
On-chain data is the foundation of crypto, the ultimate source of truth. However, on-chain data is difficult to work with natively, without any transformations, in part because it is not indexed by account. As a result, many data analysts have historically relied on centralized sources of information, such as Etherscan or various commercial solutions, because of their ease of use. However, these solutions are typically neither open source nor decentralized, and they are outside the control of the data analyst, which introduces the risk of lock-in or capture and limits the opportunities for customization and control.
TrueBlocks is an open-source project, funded in part by the Gitcoin community and the Ethereum Foundation, that is dedicated to providing an index that improves with usage and is thoroughly decentralized.
DataBuilder Hackathon participants should strongly consider using TrueBlocks to provide a local index that can be used to query an RPC from any source, including the decentralized Pocket Network.
To get started, participants can run a local copy of the index with Docker: https://github.com/TrueBlocks/trueblocks-docker
TrueBlocks engineers will be available on the DataBuilder Hackathon OpenData Community Discord.
The TrueBlocks documentation, which explains much more about how to build and contribute to the shared “unchained” index as well as how to use TrueBlocks and the index for data analysis, is available here: https://trueblocks.io/docs/
A Sybil attack refers to a type of attack in which an attacker creates multiple fake identities, or “Sybils”, to gain an unfair advantage within a decentralized network. The term is a reference to the book Sybil, which was also adapted into two television movies; its main character, Sybil Dorsett, exhibited multiple personalities.
In the context of systems/organizations, Sybil attacks happen when actions or rewards are calculated based on “one human, one identity”.
Fraudsters or “Sybils” create multiple identities in an attempt to gain control (votes) or to profit (distributions of funds or rewards).
Fighting this kind of practice enables organizations to have democratic practices without the physical presence of their participants, for example, online voting in DAOs.
For more in-depth discussion you can check:
- 33 - Sybil Resistance with Bryan Ford (at 14:00 they start to discuss the possibilities in a world where the Sybil attack problem has already been solved)
Quadratic funding is a mechanism used by Gitcoin to fund public goods. Contributions to a grant are matched in a quadratic way by a matching fund.
Watch this 60-second video by Gitcoin to better understand, then come back here:
Now that you understand the mechanism, it is easy to see why fraudsters create multiple accounts. The idea is simple: make very small contributions to their own project from many different accounts in an attempt to maximize ROI, because the quadratic funding algorithm prioritizes distributing the matching pool to projects that received donations from the largest number of community members.
Similarly, the quadratic mechanism can also be applied to voting (quadratic voting) where each voter votes with some tokens, then fraudsters are incentivized to vote for a proposal from many addresses rather than a single address.
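A small worked example may make the incentive concrete. The sketch below uses the textbook quadratic funding formula, in which a project's unscaled match is the square of the sum of the square roots of its contributions, minus the amount actually donated. It is an assumption that this matches any particular Gitcoin round, since production rounds may add caps and other adjustments; the function names are made up for the illustration.

```python
from math import sqrt


def qf_weight(contributions: list[float]) -> float:
    """Unscaled match: (sum of square roots)^2 minus the amount actually donated."""
    return sum(sqrt(c) for c in contributions) ** 2 - sum(contributions)


def allocate(pool: float, projects: dict[str, list[float]]) -> dict[str, float]:
    """Split a matching pool in proportion to each project's QF weight."""
    weights = {name: qf_weight(c) for name, c in projects.items()}
    total = sum(weights.values())
    return {name: pool * w / total if total else 0.0 for name, w in weights.items()}


honest = [100.0]      # one donor giving 100
sybil = [1.0] * 100   # the same 100 split across 100 fake accounts

print(qf_weight(honest))  # 0.0
print(qf_weight(sybil))   # 9900.0
```

The same 100 units of money earn no matching weight from one account but a weight of 9900 when spread over 100 accounts, which is exactly the gap a Sybil attacker tries to exploit.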
The price (or cost) of forgery is, as the name says, the cost paid to forge an identity in a system. There are many discussions about the economic games involved in Sybil defense. Most systems are designed to impose a high cost of forgery in an attempt to disincentivize Sybil attacks by making them expensive and unprofitable. Some argue that this can stop Sybil attacks motivated by profit, but that if the motivation is something else, such as ideology, the cost may not be a barrier.
In episode 36 of the Green Pill podcast, Kevin Owocki and Petr Porobov, founder of Upala, discuss the topic and how Upala is trying to understand the real price of forgery using incentives.
- Gitcoin Fraud Defense and Detection Forum
- Green Pill Podcast - Season 2 is dedicated to regenerative society; some episodes focus on digital identities and Sybil fighting.
- BlockScience Blog - Articles with analyses of past Gitcoin Grants Rounds and other complex system analyses.
- Prof. Bryan Ford Papers - Prof. Bryan Ford has published studies on decentralized systems, identity, and blockchain. Special highlight: Identity and Personhood in Digital Democracy: Evaluating Inclusion, Equality, Security, and Privacy in Pseudonym Parties and Other Proofs of Personhood
LEGO is a concept used in the context of software development and refers to small, modular, and reusable pieces of code that can be combined to create new applications. Different Legos are used to check various user attributes and behaviors, such as username similarity, shared IP addresses, donor and grant profiles, and on-chain interactions, to determine the likelihood of Sybil behavior. New Legos can be developed based on any analysis that can be shown to indicate Sybil behavior and can be implemented as an algorithm. To be composable, Legos need to be tightly scoped, open and accessible, permissionless, and modular; they should have few dependencies and open governance, take well-defined inputs, and provide known outputs. ERC-20/721 token standards and Ethereum smart contracts are two examples of assembleable Legos. By giving the community the resources for innovative new solutions, the objective is to introduce freedom and agency to the funding of public goods. The Gitcoin Passport is one example of a Lego.
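One way to picture "well-defined inputs and known outputs" is to treat every Lego as a function from an account record to a score between 0 and 1, so that Legos can be combined freely. The sketch below is purely illustrative: the `WeightedLego` type, the account fields (`tx_count`, `accounts_on_same_ip`), the thresholds, and the weighted-average rule are all assumptions, not an interface defined by the hackathon.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: a Lego is any function mapping one account record to a score in [0, 1].
Lego = Callable[[dict], float]


@dataclass(frozen=True)
class WeightedLego:
    name: str
    score: Lego
    weight: float = 1.0


def combine(legos: list[WeightedLego], account: dict) -> float:
    """Weighted average of the individual Lego scores (one of many possible rules)."""
    total_weight = sum(lego.weight for lego in legos)
    return sum(lego.weight * lego.score(account) for lego in legos) / total_weight


def new_wallet(account: dict) -> float:
    """Hypothetical Lego: wallets with very little history score 1.0."""
    return 1.0 if account["tx_count"] < 5 else 0.0


def shared_ip(account: dict) -> float:
    """Hypothetical Lego: score grows with the number of accounts seen on the same IP."""
    return min(account["accounts_on_same_ip"] / 10, 1.0)


pipeline = [WeightedLego("new_wallet", new_wallet), WeightedLego("shared_ip", shared_ip, 3.0)]
print(combine(pipeline, {"tx_count": 2, "accounts_on_same_ip": 5}))  # 0.625
```

Because each Lego only depends on the shared input shape, a round owner could add, remove, or reweight Legos without touching the others.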
An anti-Sybil tool called the Gitcoin Passport uses stamps to act as a means of identification. For the retroactive squelching of Sybils, a trained machine learning pipeline is employed, but it is centralized and operated by a small group of specialists. By creating a standardized framework for modeling and model auditing, users will be able to train their own models, turning this into a set of assembleable Sybil defensive legos. To counter Sybil assaults, additional tools can be created, such as rules for resolving conflicts and governance legos. The decision to use Gitcoin models or train one's own models rests with the user.
Levenshtein distance: This method calculates the edit distance between two strings of text, such as usernames, to determine their similarity. If a username is very similar to others, it may indicate that the account is auto-generated or a Sybil account.
Shared IP: This method checks the IP addresses of users to see if they are shared by many other users. If many addresses are originating from the same IP, it could be a marker for a Sybil attacker.
SAD (Social Attribute DNA) model: This method analyzes the history of a Gitcoin account to give a Sybil-likelihood score. The model takes into account factors such as the frequency and pattern of the user's activities on the platform.
DonorDNA: This method analyzes the profile of past donations made by a donor to determine if it is similar to other groups of users. If the profile is similar, it may indicate that the donor is part of a Sybil ring.
GrantDNA: This method represents each grant as a set of binary data, which is then compared to flagged grants to see if they have similar donor profiles. If a grant has a donor profile that is similar to a flagged grant, it may indicate that the grant is being manipulated.
Onchain Intersectionality: This method checks the number of on-chain credentials a user has, such as Ethereum addresses, wallet IDs, and grant/round nonce. The more credentials a user has, the higher the likelihood that the user is a Sybil attacker.
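As an example of how one of the methods above could look in code, here is a minimal sketch of the Levenshtein distance Lego. The dynamic-programming routine is the standard algorithm; the `similar_pairs` helper and its default threshold of 2 are assumptions chosen for illustration and would need tuning against real data.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similar_pairs(usernames: list[str], max_distance: int = 2) -> list[tuple[str, str]]:
    """Return every pair of usernames within `max_distance` edits of each other."""
    return [
        (u, v) for u, v in combinations(usernames, 2)
        if levenshtein(u, v) <= max_distance
    ]


print(levenshtein("kitten", "sitting"))                # 3
print(similar_pairs(["alice01", "alice02", "bob"]))    # [('alice01', 'alice02')]
```

Comparing every pair is quadratic in the number of usernames, so a real Lego would probably bucket names first (for example by prefix) before computing distances.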
There are many possibilities for more Legos - any analysis that can be shown to be indicative of Sybil behaviour and implementable as an algorithm could be turned into a Lego. Some might be relatively complex analyses of on-chain data, like detecting when a user has rapidly swapped funds back and forth in order to seem more active. Sequences of rapid transactions between wallets, especially when they are ultimately returned to where they started or at least stay within a small group of addresses, could be a Sybil behaviour. Other on-chain indicators might be whether a wallet was initially funded from a contract, as this might be a way for Sybil attackers to automate their donations from many externally-owned accounts.
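The "funds swapped back and forth" idea can be sketched as a simple pass over a list of transfers. The version below only detects direct two-party round trips (A sends to B, then B sends to A within a time window); longer loops through a small group of addresses would need a graph-based approach. The tuple layout and the one-hour default window are assumptions for the example.

```python
from collections import defaultdict


def round_trips(transfers: list[tuple[str, str, int]], window: int = 3600) -> set[frozenset]:
    """Find address pairs where A->B is followed by B->A within `window` seconds.

    Each transfer is (sender, receiver, unix_timestamp).
    """
    sent_at = defaultdict(list)  # (sender, receiver) -> list of timestamps
    for sender, receiver, ts in transfers:
        sent_at[(sender, receiver)].append(ts)

    flagged = set()
    for (sender, receiver), times in sent_at.items():
        # .get avoids inserting new keys into the defaultdict while iterating over it
        returns = sent_at.get((receiver, sender), [])
        if any(0 < t_back - t_out <= window for t_out in times for t_back in returns):
            flagged.add(frozenset((sender, receiver)))
    return flagged


# Hypothetical transfers, made up for the example.
transfers = [("A", "B", 0), ("B", "A", 600), ("A", "C", 100), ("C", "D", 200)]
print(round_trips(transfers))  # {frozenset({'A', 'B'})}
```

A round trip on its own is not proof of Sybil behaviour, so a result like this is better treated as one signal among several than as a verdict.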
Other simple Legos could be checks for users that hold certain POAPs or NFTs. Some POAPs and NFTs are easy to farm; others require a significant investment of time and/or capital. Below are some highlighted ideas:
- Farmer Boolean: (uses on-chain data to determine whether a user has >X ERC-20 tokens and an average transaction value <Y ETH)
- Onchain History Boolean: (has a user engaged in certain web3 activities in a specific timeframe? Activities and timeframe can be customized by the round owner)
- Money-Mixer: (does a user interact with mixers, e.g. Tornado Cash?)
- On-Trend / Off-Trend: (is the donation profile of a user similar to a grant’s target community?)
- Flagged Activity on Etherscan: (is an address closely associated with addresses flagged as phishing/spam on etherscan?)
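To show how small one of these Legos can be, here is a sketch of the Farmer Boolean from the list above. The thresholds X and Y are left as parameters because the text does not fix them, and reading "X ERC-20 tokens" as the number of distinct tokens held is an assumption; the input values would come from whatever on-chain index you use.

```python
def farmer_boolean(
    token_count: int,
    tx_values_eth: list[float],
    min_tokens: int,
    max_avg_value_eth: float,
) -> bool:
    """True if the wallet holds more than `min_tokens` ERC-20 tokens (X)
    and its average transaction value is below `max_avg_value_eth` (Y)."""
    if not tx_values_eth:
        return False  # no transaction history, so the average is undefined
    average = sum(tx_values_eth) / len(tx_values_eth)
    return token_count > min_tokens and average < max_avg_value_eth


# Hypothetical wallet: many tokens, tiny transactions.
print(farmer_boolean(40, [0.001, 0.002, 0.003], min_tokens=25, max_avg_value_eth=0.01))  # True
# Hypothetical wallet: few tokens, larger transactions.
print(farmer_boolean(3, [0.5, 1.2], min_tokens=25, max_avg_value_eth=0.01))  # False
```

Keeping the thresholds as arguments lets each round owner tune the check without changing the Lego itself.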