March 26, 2024, 4:44 a.m. | Gabriele Morello, Mojtaba Eshghie, Sofia Bobadilla, Martin Monperrus

cs.LG updates on arXiv.org

arXiv:2403.16861v1 Announce Type: cross
Abstract: The DISL dataset features a collection of 514,506 unique Solidity files that have been deployed to Ethereum mainnet. It caters to the need for a large and diverse dataset of real-world smart contracts. DISL serves as a resource for developing machine learning systems and for benchmarking software engineering tools designed for smart contracts. By aggregating every verified smart contract from Etherscan up to January 15, 2024, DISL surpasses existing datasets in size and recency.

