Researchers from the Universidad Politécnica de Madrid's Facultad de Informática have developed a fuzzy neural network that uses a numerical and categorical imputation method to reconstruct incomplete datasets. This network achieves substantially better results than the imputation methods now in use in opinion polling.
The research results have been published in the journal Neural Computing & Applications.
Missing data is a widespread problem in surveys across many domains. Inaccuracy in political opinion polling can largely be attributed to how unanswered questions are processed. Imputation is the most commonly used technique for handling missing information: imputation methods estimate the missing values and add them to the sample to produce a more complete dataset. They can infer both numerical and categorical data.
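To make the idea of imputation concrete, here is a minimal sketch of the simplest baseline approach, not the researchers' method: numerical gaps are filled with the column mean and categorical gaps with the most frequent answer. The function name `impute_simple`, the field names and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import mean


def impute_simple(rows):
    """Fill None values: column mean for numbers, most frequent value for categories.

    Illustrative baseline only; assumes every column has at least one observed value.
    """
    columns = list(rows[0].keys())
    fill = {}
    for col in columns:
        observed = [r[col] for r in rows if r[col] is not None]
        if all(isinstance(v, (int, float)) for v in observed):
            fill[col] = mean(observed)  # numerical imputation
        else:
            fill[col] = Counter(observed).most_common(1)[0][0]  # categorical imputation
    return [
        {c: (r[c] if r[c] is not None else fill[c]) for c in columns}
        for r in rows
    ]


# Hypothetical survey with one missing age and one missing voting intention.
survey = [
    {"age": 30, "vote": "A"},
    {"age": 50, "vote": "B"},
    {"age": None, "vote": "A"},
    {"age": 40, "vote": None},
]
completed = impute_simple(survey)
```

A baseline like this ignores the relationships between answers, which is the kind of limitation that more elaborate imputation methods try to overcome.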
The researchers Jesús Cardeñosa, of the Artificial Intelligence Department at the Facultad de Informática, and Pilar Rey del Castillo, of the Institute of Fiscal Studies at the Spanish Ministry of Economics, have modified Gabrys and Bargiela's numerical imputation method so that it can also infer categorical data.
With this system, imputation can, for example, determine with close to 90% reliability the voting intention of a person who has not answered all the opinion poll questions. This is a substantial improvement on current methods. Other potential applications for this neural network include medical diagnosis and surveys that use categorical variables.
The first step of this new imputation method is to define the distances between categories using fuzzy logic. Then, with the support of a neural network that learns from each case, the method determines where each category is located within the different dataset spaces. Finally, it extends the network architecture to all the data and processes the missing data.
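The article does not describe the network in detail, so the following is only a loose illustration of the first idea, graded distances between categories, using a much simpler stand-in technique: nearest-neighbour (donor) imputation. It is not the researchers' fuzzy neural network. The distance table, the field names (`ideology`, `age`, `vote`) and the respondents are hypothetical.

```python
# Hypothetical graded distances between answer categories
# (0 = identical, 1 = as far apart as possible).
CATEGORY_DISTANCE = {
    frozenset(["left", "centre"]): 0.5,
    frozenset(["centre", "right"]): 0.5,
    frozenset(["left", "right"]): 1.0,
}


def category_distance(a, b):
    """Graded distance between two categories instead of a plain equal/unequal test."""
    if a == b:
        return 0.0
    return CATEGORY_DISTANCE[frozenset([a, b])]


def record_distance(x, y, age_range):
    """Combine the categorical distance with a numerical distance scaled to [0, 1]."""
    d_ideology = category_distance(x["ideology"], y["ideology"])
    d_age = abs(x["age"] - y["age"]) / age_range
    return d_ideology + d_age


def impute_vote(target, complete_rows):
    """Copy the voting intention of the closest fully answered questionnaire."""
    ages = [r["age"] for r in complete_rows]
    age_range = (max(ages) - min(ages)) or 1  # avoid division by zero
    donor = min(complete_rows, key=lambda r: record_distance(target, r, age_range))
    return donor["vote"]


# Hypothetical respondents who answered every question.
respondents = [
    {"ideology": "left", "age": 25, "vote": "Party A"},
    {"ideology": "centre", "age": 45, "vote": "Party B"},
    {"ideology": "right", "age": 65, "vote": "Party C"},
]
# A respondent who did not state a voting intention.
incomplete = {"ideology": "centre", "age": 50, "vote": None}
imputed = impute_vote(incomplete, respondents)
```

The point of the graded table is that "left" and "centre" count as closer than "left" and "right", so partially similar answers still inform the imputed value.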
The above story is based on materials provided by Facultad de Informática de la Universidad Politécnica de Madrid.