Research Group 'Psychiatric Genetics', Head: Prof. Dr. Hans H. Stassen
Department of Psychiatry, Psychotherapy and Psychosomatics
Psychiatric Hospital, University of Zurich

Molecular-Genetic Neural Network Analysis

Modeling Gene Products

Standard association methods aim to connect genotype with phenotype directly, which greatly simplifies the underlying biology. In fact, genes code for proteins or RNA ("gene products"), which may interact in a variety of ways and influence the phenotype only after a cascade of intermediate steps. Molecular-genetic neural networks (NNs) generalize standard regression analysis in a natural way by (1) implementing multistage gene products through one or more intermediate layers, and (2) allowing for linear or nonlinear interactions between genes and between gene products. The advantage of NNs is that knowledge about the cascade of intermediate steps that ultimately leads from genotype to phenotype can be incomplete or even absent ("hidden layers").

Fitting the NN Model

During optimization, the algorithm systematically improved genotype-phenotype correlations by iteratively adding or removing genomic loci and fitting the NN model to the set of 1,042 observations under the constraint of reproducibility with k-fold cross-validation (k = 10). Using a single layer for gene products, we set the number of gene products equal to the number of genomic loci included in the NN model, while a one-dimensional phenotype was chosen to reflect the IgM level as derived from the multidimensional genotype. The convergence criterion was set to c = 0.03, with a maximum of 70,000 iterations and an initial learning rate of l = 0.012 that was gradually modified during iteration whenever gradient descent got "stuck" without achieving convergence. Averaged across the k solutions and applied to the 1,042 probes, weight matrices and classifiers yielded an overall performance for each optimization step [Figure]. The optimization stopped when a plateau was reached at 77.3% correctly classified subjects out of the entire sample.
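The model structure described above can be illustrated with a minimal sketch. It is not the group's implementation: the tanh and sigmoid activations, the squared-error loss, the 0/1/2 genotype coding, the toy data, and the rule of halving the learning rate when the error stops falling are all assumptions made for illustration. Only the layer sizes (one gene-product unit per locus, one-dimensional output) and the values c = 0.03, l = 0.012 and 70,000 iterations come from the text.

```python
import math
import random


def init_network(n_loci, rng):
    """One hidden 'gene product' layer with as many units as loci.
    Each weight row carries a trailing bias term."""
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_loci + 1)]
                for _ in range(n_loci)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_loci + 1)]
    return w_hidden, w_out


def forward(net, x):
    """Return hidden activations and the one-dimensional output in (0, 1)."""
    w_hidden, w_out = net
    h = [math.tanh(row[-1] + sum(w * xi for w, xi in zip(row, x)))
         for row in w_hidden]
    z = w_out[-1] + sum(w * hj for w, hj in zip(w_out, h))
    return h, 1.0 / (1.0 + math.exp(-z))


def mean_squared_error(net, X, y):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in zip(X, y)) / len(X)


def train(net, X, y, lr=0.012, tol=0.03, max_iter=70000):
    """Batch gradient descent, updating the weights in place.
    Assumption: the learning rate is halved whenever the error fails to fall."""
    w_hidden, w_out = net
    prev_err = float("inf")
    for _ in range(max_iter):
        err = mean_squared_error(net, X, y)
        if err < tol:          # convergence criterion
            break
        if err >= prev_err:    # "stuck": shrink the step size
            lr *= 0.5
        prev_err = err
        g_hidden = [[0.0] * len(row) for row in w_hidden]
        g_out = [0.0] * len(w_out)
        for x, t in zip(X, y):
            h, p = forward(net, x)
            d_out = (p - t) * p * (1.0 - p)   # error signal at the output
            for j, hj in enumerate(h):
                g_out[j] += d_out * hj
                d_h = d_out * w_out[j] * (1.0 - hj * hj)  # back through tanh
                for i, xi in enumerate(x):
                    g_hidden[j][i] += d_h * xi
                g_hidden[j][-1] += d_h
            g_out[-1] += d_out
        for j, row in enumerate(w_hidden):
            for i in range(len(row)):
                row[i] -= lr * g_hidden[j][i]
        for j in range(len(w_out)):
            w_out[j] -= lr * g_out[j]
    return mean_squared_error(net, X, y)


# Toy data (hypothetical): two loci coded 0/1/2, phenotype driven by locus 0.
rng = random.Random(0)
X = [[a, b] for a in (0, 1, 2) for b in (0, 1, 2)]
y = [1.0 if a >= 1 else 0.0 for a, b in X]
net = init_network(2, rng)
error_before = mean_squared_error(net, X, y)
error_after = train(net, X, y, max_iter=500)
```

In a cross-validated setting, `train` would be called once per fold on the training part of the data, and the resulting weight matrices would then be combined.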
Prediction of IgM Levels by Genotype

The table below gives reclassification rates, sensitivity and specificity of NN-based predictors as derived during k-fold cross-validation, prior to averaging weight matrices and classifiers. Such predictors tend to be overoptimistic, in particular if the population under investigation includes subgroups. Averaging weight matrices and classifiers therefore compensates for "local" data characteristics and yields better performance when new, "unknown" probes are to be classified.
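As a small illustration of the quantities named above, the following sketch computes the rate of correctly classified subjects, sensitivity and specificity from binary labels, and averages a set of weight matrices element by element. The function names and the 0/1 label coding are assumptions for illustration, and both classes are assumed to be present in the data.

```python
def classification_metrics(y_true, y_pred):
    """Correct-classification rate, sensitivity and specificity.
    Assumes labels coded 1 (e.g. high IgM) and 0, both present in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "correct": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # true positives among actual positives
        "specificity": tn / (tn + fp),   # true negatives among actual negatives
    }


def average_matrices(matrices):
    """Element-wise mean of k equally shaped weight matrices (lists of rows)."""
    k = len(matrices)
    first = matrices[0]
    return [[sum(m[i][j] for m in matrices) / k for j in range(len(first[i]))]
            for i in range(len(first))]


metrics = classification_metrics([1, 1, 1, 0, 0, 0, 0, 1],
                                 [1, 1, 0, 0, 0, 1, 0, 1])
mean_weights = average_matrices([[[1.0, 2.0], [3.0, 4.0]],
                                 [[3.0, 4.0], [5.0, 6.0]]])
```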
Figure: Iterative optimization of the starting configuration by systematically adding/removing genomic loci while fitting the NN model to the set of 1,042 observations under the constraint of reproducibility with 10-fold cross-validation. The red circles designate the percentage of correctly classified subjects for each optimization step, with optimization steps plotted along the x-axis (disproportionately large decreases in performance indicate the removal of loci of larger weight).
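The add/remove search over loci can be sketched as a simple hill climb. This is an illustrative simplification and not the procedure used for the figure: it only accepts steps that improve the score, whereas the figure also shows steps with decreasing performance. The `score` callable is hypothetical; in the setting described here it would stand for the cross-validated percentage of correctly classified subjects.

```python
def stepwise_select(start, candidates, score, max_steps=100):
    """Greedy add/remove search over loci; stops when no single move improves
    the score (plateau). Returns the selected loci and the score history."""
    current = set(start)
    best = score(current)
    history = [best]
    for _ in range(max_steps):
        # Every configuration reachable by adding or removing one locus.
        moves = [current | {c} for c in candidates if c not in current]
        if len(current) > 1:
            moves += [current - {c} for c in sorted(current)]
        if not moves:
            break
        top = max(moves, key=score)
        top_score = score(top)
        if top_score <= best:   # plateau reached
            break
        current, best = top, top_score
        history.append(best)
    return current, history


# Toy score (hypothetical): loci 1 and 3 are informative, others cost a little.
informative = {1, 3}


def toy_score(loci):
    return len(loci & informative) - 0.1 * len(loci - informative)


selected, history = stepwise_select({0}, [0, 1, 2, 3, 4], toy_score)
```

Starting from the uninformative locus 0, the search adds the two informative loci and then drops locus 0 again.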
