Forecasting Defendant Failure to Appear and Case Disposition Time

This is my capstone project. My teammates and I forecasted the defendant failure-to-appear rate and time to disposition for a county in Washington State (the data, code, and report cannot be published for privacy reasons).

Problem

Before a case is presented in court, a hearing is scheduled to establish whether the case is legitimate and whether the defendant is held guilty. There is an ongoing problem of defendants failing to appear at this hearing, which leads to various consequences (e.g., a longer time to disposition).

Solution and Results

  • Developed a statistical model (logistic regression) in Python as a benchmark for machine learning models (decision tree, SVM, KNN, and neural network) to predict defendants' hearing appearance patterns and time to disposition, reaching an accuracy of 68%.
  • Processed the county's data from six datasets in Python: performed feature engineering, used a correlation matrix to identify multicollinearity, and visualized the exploratory data analysis (EDA), improving model performance by 38%.
  • Selected variables for the decision tree, SVM, KNN, and neural network based on the baseline logistic regression model, then evaluated and compared metrics using confusion matrices, increasing recall by 10%.
  • Presented results, wrote reports, and proposed recommendations to the local criminal justice system for minimizing the defendant failure-to-appear rate and time to disposition.
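
Since the project code cannot be shared, the following is only a minimal sketch of how a baseline and a candidate model might be compared through a confusion matrix. The labels and predictions are made up for illustration (1 = failed to appear, 0 = appeared), and the helper names `confusion_counts`, `accuracy`, and `recall` are hypothetical, not taken from the project.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["tp" if actual == positive else "fp"] += 1
        else:
            counts["fn" if actual == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def recall(tp, fp, fn, tn):
    """Share of actual positives (failures to appear) that were caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up data for illustration only: 1 = failed to appear, 0 = appeared.
y_true = [1, 0, 0, 1, 1, 0, 0, 1, 0, 0]
predictions = {
    "baseline": [0, 0, 0, 1, 0, 0, 1, 1, 0, 0],   # stand-in for the benchmark model
    "candidate": [1, 0, 1, 1, 0, 0, 1, 1, 0, 0],  # stand-in for an ML model
}

# Compute the same metrics for every model so they can be compared side by side.
results = {}
for name, y_pred in predictions.items():
    cm = confusion_counts(y_true, y_pred)
    results[name] = {
        "confusion": cm,
        "accuracy": accuracy(*cm),
        "recall": recall(*cm),
    }

for name, metrics in results.items():
    print(name, metrics)
```

In this toy data the two sets of predictions have the same accuracy but different recall, which is one reason to read recall off the confusion matrix instead of relying on accuracy alone when the outcome of interest is a missed hearing.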