Automated Classification of Crime Narratives Using Machine Learning and Language Models in Official Statistics
by Agloni, Ignacio; Berhó, Nicolás; Lehmann, Klaus; Diaz, Oswaldo; Preuss, Javiera; Villaseñor, Elio; Pimentel, Alejandro
in Algorithms / automated coding / Automation / Bank fraud / Burglary / Chile / Classification / Computational linguistics / Crime / Crime analysis / Criminal statistics / Deep learning / Dictionaries / Economic activity / Households / Language / language model / Language processing / Machine learning / Methods / Natural language interfaces / NLP / Robbery / Spain / Statistics / Technology application / Vandalism / Victimization / Workloads
2025
Journal Article
Overview
This paper presents the implementation of a language model–based strategy for the automatic coding of crime narratives in the production of official statistics. To address the high workload and inconsistencies associated with manual coding, we developed and evaluated three models: an XGBoost classifier with bag-of-words and word-embedding features, an LSTM network using pretrained Spanish word embeddings, and a fine-tuned BERT language model (BETO). The deep learning models outperformed the traditional baseline, with BETO achieving the highest accuracy. The new ENUSC (Encuesta Nacional Urbana de Seguridad Ciudadana) workflow integrates the selected model into an API for automated classification, incorporating a certainty threshold to distinguish between cases suitable for automation and those requiring expert review. This hybrid strategy led to a 68.4% reduction in manual review workload while preserving high-quality standards. This study represents the first documented application of deep learning for the automated classification of victimization narratives in official statistics, demonstrating its feasibility and impact in a real-world production environment. Our results demonstrate that deep learning can significantly improve the efficiency and consistency of crime statistics coding, offering a scalable solution for other national statistical offices.
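The certainty threshold described in the overview can be illustrated with a minimal sketch. This is not the ENUSC implementation: the function names, the 0.9 threshold, and the example probabilities are all hypothetical, and the sketch assumes the classifier returns one probability per crime category for each narrative.

```python
from typing import List, Sequence, Tuple


def route_predictions(
    probabilities: Sequence[Sequence[float]],
    threshold: float = 0.9,  # hypothetical value, not taken from the paper
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Split narratives into automated codes and cases for expert review.

    Each row of `probabilities` holds the class probabilities for one
    narrative. A narrative is coded automatically when its highest class
    probability reaches the threshold; otherwise it is sent to review.
    """
    automated: List[Tuple[int, int]] = []  # (narrative index, predicted class)
    review: List[int] = []                 # narrative indices for manual coding
    for idx, probs in enumerate(probabilities):
        best_class = max(range(len(probs)), key=probs.__getitem__)
        if probs[best_class] >= threshold:
            automated.append((idx, best_class))
        else:
            review.append(idx)
    return automated, review


def manual_workload_reduction(n_automated: int, n_total: int) -> float:
    """Share of narratives that no longer need manual review."""
    return n_automated / n_total


# Made-up probabilities for three narratives and three crime categories.
example = [
    [0.95, 0.03, 0.02],  # confident: coded automatically
    [0.50, 0.40, 0.10],  # uncertain: sent to expert review
    [0.05, 0.05, 0.90],  # exactly at the threshold: coded automatically
]
automated, review = route_predictions(example, threshold=0.9)
reduction = manual_workload_reduction(len(automated), len(example))
```

Raising the threshold sends more narratives to expert review and lowers the workload reduction; lowering it automates more cases at the cost of accepting less certain predictions.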