A Comparative Analysis of Artificial Neural Networks, Classification Trees, and Multivariate Linear Regression for Predicting Retail Employee Tenure and Turnover
David E. Ostberg
ABSTRACT
The purpose of the study was to demonstrate empirically the value of applying neural network modeling to the prediction of several employee job performance criteria: employee tenure, eligibility for rehire, and voluntary/involuntary termination classification. Analogous to the biological nerve cells of brain tissue, neural network processors are able to learn and remember features and relationships within their environment and subsequently apply this "knowledge" to new data (Scarborough, 1995). These pattern-recognition qualities make artificial neural networks (ANNs) well suited to social science problems involving classification or prediction of one or more outputs from a larger number of inputs that may be flawed, incomplete, or of different types (Garson, 1998), or where the relationships among the data are unknown, complex, or non-linear (Somers, 1999).
Archival predictor-criterion data from 4,299 hourly employees with a national sporting goods retailer consisted of scale scores from a self-report personality assessment as well as various biographical data collected via an in-store, electronic job application kiosk. Criterion data were collected from client payroll data feeds.
As expected, the multi-layer perceptron (MLP) neural networks significantly outperformed the multiple regression models in predicting voluntary/involuntary termination status and eligibility for rehire when the classification thresholds on the regression models were set at .50, as is common in selection research. Surprisingly, the MLP and regression models performed identically in predicting employee tenure, and the radial basis function neural networks did not perform quite as well as the MLPs in predicting the dichotomous outcomes. In predicting all three job performance outcomes, the classification trees consistently had the poorest performance. In follow-up analyses where the classification thresholds were set to mimic the observed distributions of the dichotomous dependent variables (DVs), no significant differences were found between the regression and neural models. Overall, there were no discernible biases among the models by age, race, or gender as indicated by predicted mean scores. Findings suggest that neural modeling techniques offer a viable alternative to traditional predictive approaches and may lend insight into variable relationships that may be overlooked with conventional analyses. However, this study suggests that the different models may vary in usefulness for different types of data.
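The two thresholding schemes contrasted above can be illustrated with a minimal sketch. It is not the study's procedure: the scores and outcomes are made-up values, and the function names `classify_fixed` and `base_rate_threshold` are hypothetical. It only shows how a fixed .50 cutoff and a cutoff chosen to reproduce the observed proportion of positive cases can classify the same continuous predictions differently.

```python
def classify_fixed(scores, threshold=0.50):
    """Label a case 1 when its predicted score meets the cutoff, else 0."""
    return [1 if s >= threshold else 0 for s in scores]


def base_rate_threshold(scores, observed):
    """Return a cutoff that makes the share of predicted 1s approximate
    the observed share of 1s (assumption: ties in scores are rare)."""
    base_rate = sum(observed) / len(observed)
    n_pos = round(base_rate * len(scores))
    if n_pos == 0:
        return float("inf")  # no case should be labeled 1
    ranked = sorted(scores, reverse=True)
    return ranked[n_pos - 1]  # score of the n_pos-th highest case


# Illustrative (fabricated) regression outputs and observed 0/1 outcomes.
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.55, 0.60]
observed = [0, 0, 1, 0, 1, 1, 1, 1]

fixed = classify_fixed(scores)                  # cutoff at .50
cutoff = base_rate_threshold(scores, observed)  # cutoff matched to base rate
matched = classify_fixed(scores, cutoff)

print("fixed .50 cutoff:  ", fixed)
print("base-rate cutoff:  ", cutoff)
print("base-rate matched: ", matched)
```

In this toy data the fixed cutoff labels only two cases as positive although five positives were observed, whereas the matched cutoff labels five. When the outcome distribution is far from an even split, the choice of cutoff alone can change the apparent accuracy of a regression-based classifier.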
July 8th, 2005
DISSERTATION COMMITTEE
Donald Truxillo, Chair
Talya Bauer
George G. Lendaris
Robert Sinclair
James A. Paulson
Peter Collier, Graduate Studies Representative