Keywords: Proteins; Neural Networks; Knowledge Discovery; Secondary Structure Prediction
There have been several attempts over the last 20 years to develop tools for predicting membrane-spanning regions, but the prediction problem is made topologically more complex by the presence of several transmembrane domains in many proteins, and current tools are far from achieving 95% reliability. Although neural networks have long been regarded as classification and regression systems whose inner workings are very difficult to interpret, it is now becoming apparent that algorithms can be designed to extract understandable representations from trained networks, which could make them a powerful tool for biological data mining. This research studied the construction of novel neural network architectures and algorithms, the representation of amino acids to the networks through appropriate encodings, and the relationship between the structure and function of transmembrane proteins.
This work seeks to develop the use of artificial neural networks for analysing primary protein sequences for the presence of membrane-spanning regions (MSRs) and to attempt classification of those sequences according to functional and/or structural properties. The expected benefits include an increased understanding of how to create and train optimal neural networks for membrane protein datasets, which will be useful in both academia and industry. In addition, novel neural network architectures will be generated, leading to a better understanding of these machine-learning techniques.
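As an illustration of what presenting a primary sequence to a neural network can involve, the following minimal sketch encodes each residue as a 20-element one-hot vector and gathers a fixed-width window around every position. It is not the encoding used in this work. The alphabet ordering, the window width, the zero-padding at the sequence ends and the function names `one_hot` and `encode_windows` are all assumptions made for the example.

```python
# Hypothetical sketch: one-hot, sliding-window encoding of a protein sequence.
# The residue ordering and padding scheme are assumptions, not the authors' method.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(residue):
    """Return a 20-element 0/1 list; unknown or padding symbols give all zeros."""
    vec = [0] * len(AMINO_ACIDS)
    i = INDEX.get(residue.upper())
    if i is not None:
        vec[i] = 1
    return vec


def encode_windows(sequence, window=5):
    """Return one flat input vector per residue, built from a centred window.

    Each vector has window * 20 elements. Positions that fall outside the
    sequence are padded with all-zero vectors.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    padded = "-" * half + sequence + "-" * half
    vectors = []
    for centre in range(len(sequence)):
        flat = []
        for residue in padded[centre:centre + window]:
            flat.extend(one_hot(residue))
        vectors.append(flat)
    return vectors


if __name__ == "__main__":
    inputs = encode_windows("MKTLLV", window=5)
    print(len(inputs), "windows of length", len(inputs[0]))
```

Each resulting vector could serve as the input for one residue, with the network output indicating whether that residue lies in a membrane-spanning region. Other encodings, for example ones based on physicochemical properties such as hydrophobicity, would slot into the same windowing scheme by replacing `one_hot`.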