Factors Associated with Domestic Violence in Peru (2019-2021): an Approach from Data Science
Keywords: domestic violence, risk factors, multinomial regression, random forests
Abstract
The main objective of this study is to identify the factors associated with domestic violence. Methodologically, it is an explanatory, applied, non-experimental study with a longitudinal design covering 2019, 2020 and 2021. The sample consisted of 295,000 reports of domestic violence from Line 100, published by the Ministry of Women and available on the National Open Data Platform. The workflow followed the CRISP-DM methodology and combined two complementary models: multinomial regression and random forests. The former was used to identify the risk factors associated with domestic violence, and the latter to weight their importance. The results show a statistically significant association between domestic violence and the following variables: age (victim and offender), offender-victim relationship, city (victim), education level (victim and offender), victim's risk level, number of children, educational gap, gender (offender), ethnicity (victim), employment status (offender) and frequency of aggressions. In addition, violence against women was distributed homogeneously across ages, whereas violence against men was more frequent at young and old ages.
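As an illustration of how the two models can complement each other, the following minimal sketch fits a multinomial logistic regression and a random forest to the same data. It is not the study's code. The data are synthetic, the feature names (`victim_age`, `n_children`, `noise`) and the three outcome classes are hypothetical, and the use of scikit-learn with impurity-based feature importance as the weighting step is an assumption, since the abstract does not specify the implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 900

# Hypothetical predictors on synthetic data (NOT the Line 100 records).
feature_names = ["victim_age", "n_children", "noise"]
X = np.column_stack([
    rng.integers(18, 80, size=n),  # victim_age
    rng.integers(0, 6, size=n),    # n_children
    rng.normal(size=n),            # noise, unrelated to the outcome
]).astype(float)

# Synthetic three-class outcome driven mostly by age, slightly by children.
score = (0.08 * (X[:, 0] - 49)
         + 0.3 * (X[:, 1] - 2.5)
         + rng.normal(scale=0.5, size=n))
y = np.digitize(score, bins=[-1.0, 1.0])  # classes 0, 1, 2

# Step 1: multinomial logistic regression on standardized predictors.
# The sign and size of each coefficient suggest which factors are
# associated with each outcome class.
X_std = StandardScaler().fit_transform(X)
logit = LogisticRegression(max_iter=1000).fit(X_std, y)

# Step 2: random forest; impurity-based importances give a relative
# weight to each predictor (assumed weighting scheme).
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

for name, coefs, imp in zip(feature_names, logit.coef_.T,
                            forest.feature_importances_):
    print(f"{name}: coefficients per class = {np.round(coefs, 2)}, "
          f"importance = {imp:.3f}")
```

In this sketch the regression coefficients indicate the direction of each association per class, while the forest importances rank the predictors by their contribution to classification. Real categorical variables such as offender-victim relationship or city would need to be encoded (for example, one-hot) before either model is fitted.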
