A new £1 million initiative to help academics in their struggle against the data deluge will be launched on 21 March at Manchester Town Hall.
The National Centre for Text Mining (NaCTeM) is a collaboration between the Universities of Manchester, Liverpool and Salford. Funding is provided by the Joint Information Systems Committee (JISC), the Biotechnology and Biological Sciences Research Council (BBSRC) and the Engineering and Physical Sciences Research Council (EPSRC).
Search engines return thousands of documents, but the difficulty for the user is finding those that are most relevant to their own needs. Most search engines have little grasp of the meaning a word takes on from the context of its sentence. By using natural language processing, text mining can discover this meaning and focus on the specific needs of the user.
Detailed abstracts can then be compared and contrasted using data mining to discover patterns and associations that the human eye is likely to miss. This has proved to be particularly useful in the fields of drug discovery and predictive toxicology.
Initially focusing on providing a service for the fields of biological and biomedical science, the Centre will also serve the broader needs of the academic community through the provision of text mining tools, advice and ongoing research.
The Centre will forge strong contacts with the business and government sectors to achieve long-term sustainability for the service.
Presenters at the launch will include Dr Anne Trefethen (Deputy Director, e-Science Core Programme), Professor Margaret King (University of Geneva), Professor Ray Larson (University of California, Berkeley), Professor Regan Moore (San Diego Supercomputer Center) and Professor Jun'ichi Tsujii (University of Tokyo). All are leaders in the field of informatics and computing.
Professor John Keane, from the University's School of Informatics and Co-Director of the National Centre for Text Mining, commented: "The potential of text mining is virtually endless. In the future, databases could be populated with accurate, valid, exhaustive, rapidly updated data where users find what they want all the time; where drug discovery costs and development time are slashed and animal experimentation is reduced through early identification of unpromising paths."