Evaluation of LLMs in Detecting Vulnerabilities in Smart Contracts
DOI: https://doi.org/10.22481/recic.v8i1.18263
Keywords: smart contracts, LLMs, blockchain, security, vulnerabilities
Abstract
Smart contract auditing is essential to ensure the security of decentralized applications, preventing critical failures in immutable environments such as blockchains. This study aims to evaluate the effectiveness of Large Language Models (LLMs) in the automated detection of vulnerabilities in smart contracts. To this end, 40 contracts from different categories were analyzed by four LLMs — GPT-4o, DeepSeek-R1, Llama-3.3, and Gemini 2.0 Flash — using a unified prompt to extract and classify vulnerabilities by severity. Performance was measured using precision, recall, F1-score, and mean absolute error, comparing the detected vulnerabilities with reference audits. The best-performing model achieved 10.36% precision and 22.48% recall, indicating that LLMs still require improvements for reliable autonomous use.
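The abstract's scoring step, matching model-reported findings to a reference audit and computing precision, recall, F1 and mean absolute error over severity, as an added worked example. This is not code from the paper; the finding fields, the ordinal severity scale, the one-to-one matching rule and the "Vault" contract findings are constructed here for illustration.

```python
"""Score LLM-reported smart-contract findings against a reference audit.

Added example (not the paper's code). Finding fields, the severity scale
and the sample findings below are constructed for illustration.
"""
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Ordinal severity scale assumed for the mean-absolute-error computation.
SEVERITY = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass(frozen=True)
class Finding:
    contract: str   # contract name
    location: str   # function the issue lives in
    category: str   # normalised vulnerability class, e.g. "reentrancy"
    severity: str   # one of SEVERITY's keys

    def key(self) -> Tuple[str, str, str]:
        return (self.contract, self.location, self.category)


def match_findings(predicted: List[Finding], reference: List[Finding]):
    """Greedy one-to-one matching on (contract, location, category).

    Each reference finding can be claimed at most once, so a model that
    reports the same issue twice earns one true positive and one false
    positive rather than two hits.
    """
    unclaimed = list(reference)
    matched: List[Tuple[Finding, Finding]] = []
    false_positives: List[Finding] = []
    for p in predicted:
        hit = next((r for r in unclaimed if r.key() == p.key()), None)
        if hit is None:
            false_positives.append(p)
        else:
            unclaimed.remove(hit)
            matched.append((p, hit))
    return matched, false_positives, unclaimed  # unclaimed == false negatives


def score(predicted: List[Finding], reference: List[Finding]) -> Dict[str, float]:
    """Precision, recall, F1 over findings; MAE over severities of matched pairs."""
    matched, fp, fn = match_findings(predicted, reference)
    tp = len(matched)
    precision = tp / (tp + len(fp)) if tp + len(fp) else 0.0
    recall = tp / (tp + len(fn)) if tp + len(fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # Severity error is only defined where the model found the right issue.
    mae = (sum(abs(SEVERITY[p.severity] - SEVERITY[r.severity]) for p, r in matched) / tp
           if tp else 0.0)
    return {"tp": tp, "fp": len(fp), "fn": len(fn),
            "precision": precision, "recall": recall, "f1": f1,
            "severity_mae": mae}


# Constructed reference audit for a hypothetical "Vault" contract.
REFERENCE_AUDIT = [
    Finding("Vault", "withdraw", "reentrancy", "high"),
    Finding("Vault", "setOwner", "access-control", "critical"),
    Finding("Vault", "sweep", "unchecked-call-return", "medium"),
]

# Constructed model output for the same contract: two real hits (one with
# the severity over-rated), one invented issue, one duplicate, one miss.
LLM_FINDINGS = [
    Finding("Vault", "withdraw", "reentrancy", "critical"),
    Finding("Vault", "setOwner", "access-control", "critical"),
    Finding("Vault", "deposit", "integer-overflow", "medium"),
    Finding("Vault", "withdraw", "reentrancy", "high"),   # duplicate report
]


def example_scores() -> Dict[str, float]:
    return score(LLM_FINDINGS, REFERENCE_AUDIT)


if __name__ == "__main__":
    for name, value in example_scores().items():
        print(f"{name:13s} {value:.3f}" if isinstance(value, float) else f"{name:13s} {value}")
```

The match key relies on a normalised category label, so in practice the model's free-text issue names have to be mapped onto the audit's taxonomy before scoring; precision figures of the order reported in the abstract are sensitive both to that mapping and to the one-to-one rule that turns duplicate or overly granular reports into false positives.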
Copyright (c) 2026 Journal of Computer Science

This work is licensed under a Creative Commons Attribution 4.0 International License.