Machine learning techniques applied to the classification of judicial decisions

Dimmy Magalhães; Aurora Pozo; Sidnei Machado

doi:10.19092/reed.v9.573

v. 9 (2022): Revista de Estudos Empíricos em Direito

DOI: https://doi.org/10.19092/reed.v9.573
Submitted: January 27, 2021
Published: January 12, 2023

Abstract

Analyzing court cases is an expensive task that demands considerable time from judges and their clerks, whether to reach decisions or to classify cases according to prevailing case law. Much of this work is repetitive, however, and extracting the semantics of the corpus can support it. This research aims to develop a methodology that automatically generates classifications of legal documents using natural language processing techniques. First, we collected 430,000 rulings issued by Brazilian labor courts between 2006 and 2018. We then propose using word-representation techniques to represent the data. Next, we apply clustering techniques to group semantically similar judicial decisions. The resulting groups are then used to create artificial labels for each document. Finally, we use classification techniques to produce models capable of capturing the semantics of judicial text. The results are promising in capturing the semantic context of legal texts, and the methodology can therefore be used to support the Brazilian judicial decision-making process.
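
The four stages named in the abstract (represent, cluster, label, classify) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy documents are invented, normalized bag-of-words counts stand in for learned word representations, plain k-means with fixed starting points stands in for whichever clustering method the study used, and a 1-nearest-neighbour rule stands in for its classifiers. All function and variable names are hypothetical.

```python
import math
from collections import Counter

def vectorize(text, vocab):
    """L2-normalized bag-of-words vector (a stand-in for learned embeddings)."""
    counts = Counter(text.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid division by zero
    return [x / norm for x in vec]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, candidates):
    """Index of the candidate vector closest to vec."""
    return min(range(len(candidates)), key=lambda i: sq_dist(vec, candidates[i]))

def kmeans(vectors, init_indices, iterations=10):
    """Plain k-means; centroids start at the given documents for determinism."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def classify(text, vocab, train_vectors, train_labels):
    """1-nearest-neighbour classifier trained on the artificial labels."""
    return train_labels[nearest(vectorize(text, vocab), train_vectors)]

# Invented toy corpus; the study used 430,000 labor-court rulings.
docs = [
    "overtime pay claim unpaid overtime hours",
    "unpaid overtime hours worked on weekends",
    "overtime hours claim denied",
    "wrongful dismissal severance pay owed",
    "dismissal without cause severance owed",
    "severance payment after dismissal",
]

vocab = sorted({word for doc in docs for word in doc.lower().split()})
vectors = [vectorize(doc, vocab) for doc in docs]           # 1. represent
pseudo_labels = kmeans(vectors, init_indices=(0, 3))        # 2. cluster, 3. label
prediction = classify("claim for unpaid overtime",          # 4. classify
                      vocab, vectors, pseudo_labels)
print(pseudo_labels, prediction)
```

The point of the design is that no human annotation enters the loop: the cluster index assigned in step 2 becomes the training label in step 4, so the classifier learns to reproduce the grouping on unseen text.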

Keywords

machine learning
classification
clustering
law

How to Cite

Magalhães, D., Pozo, A., & Machado, S. (2023). Técnicas de aprendizado de máquinas aplicadas à classificação de decisões judiciais. Revista De Estudos Empíricos Em Direito, 9. https://doi.org/10.19092/reed.v9.573

This work is licensed under a Creative Commons Attribution 4.0 International License.