<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="2.0">
<channel>
<title>IEEE/ACM Transactions on Computational Biology and Bioinformatics</title>
<link>http://www.computer.org/tcbb</link>
<description>The IEEE/ACM Transactions on Computational Biology and Bioinformatics is a new quarterly that will publish archival research results related to the algorithmic, mathematical, statistical, and computational methods that are central in bioinformatics and computational biology; the development and testing of effective computer programs in bioinformatics; the development and optimization of biological databases; and important biological results that are obtained from the use of these methods, programs, and databases.</description>
	<language>en-us</language>
	<pubDate>Tue, 21 May 2013 10:00:17 GMT</pubDate>
	<image>
		<url>http://csdl.computer.org/common/images/logos/tcbb.gif</url>
		<title>IEEE Computer Society</title>
		<description>List of recently published journal articles</description>
		<link>http://www.computer.org/tcbb</link>
	</image>
  <item>
     <title>PrePrint: Multivariate Hypergeometric Similarity Measure</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.28</link>
     <description>We propose a similarity measure based on the multivariate hypergeometric distribution for the pairwise comparison of images and data vectors. The formulation and performance of the proposed measure are compared with other similarity measures using synthetic data. A method of piecewise approximation is also implemented to facilitate application of the proposed measure to large samples. Example applications of the proposed similarity measure are presented using mass spectrometry imaging (MSI) data and gene expression microarray data. Results from synthetic and biological data indicate that the proposed measure is capable of providing meaningful discrimination between samples, and that it can be a useful tool for identifying potentially related samples in large-scale biological datasets.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.28</guid>
  </item>
  <item>
     <title>PrePrint: Profile-Based LC-MS Data Alignment &#x2014; A Bayesian Approach</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.25</link>
     <description>A Bayesian alignment model (BAM) is proposed for alignment of liquid chromatography-mass spectrometry (LC-MS) data. BAM belongs to the category of profile-based approaches, which are composed of two major components: a prototype function and a set of mapping functions. Appropriate estimation of these functions is crucial for good alignment results. BAM uses Markov chain Monte Carlo (MCMC) methods to draw inference on the model parameters and improves on existing MCMC-based alignment methods through 1) the implementation of an efficient MCMC sampler and 2) an adaptive selection of knots. A block Metropolis-Hastings algorithm that mitigates the problem of the MCMC sampler getting stuck at local modes of the posterior distribution is used for the update of the mapping function coefficients. In addition, a stochastic search variable selection (SSVS) methodology is used to determine the number and positions of knots. We applied BAM to a simulated data set, an LC-MS proteomic data set and two LC-MS metabolomic data sets, and compared its performance with the Bayesian hierarchical curve registration (BHCR) model, the dynamic time warping (DTW) model, and the continuous profile model (CPM). The advantage of applying appropriate profile-based retention time correction prior to performing a feature-based approach is also demonstrated through the metabolomic data sets.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.25</guid>
  </item>
  <item>
     <title>PrePrint: Unrooted Tree Reconciliation: A Unified Approach</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.22</link>
     <description>Tree comparison functions are widely used in phylogenetics for comparing trees. Unrooted trees can be compared with rooted trees by identifying all rootings that minimize a given cost function between two rooted trees. The plateau property is satisfied by the provided function if all optimal rootings form a subtree, or plateau, in the unrooted tree, from which the rootings along every path towards a leaf have monotonically increasing costs. This property is sufficient for the linear-time identification of all optimal rootings and rooting costs. However, the plateau property has only been proven for a few rooted comparison functions, requiring individual proofs for each function without benefitting from inherent structural features of such functions. Here, we introduce the consistency condition that is sufficient for a function to satisfy the plateau property. For consistent functions, we introduce general linear-time solutions that identify optimal rootings and their costs. Further, we identify novel relationships between consistent functions in terms of plateaus. In particular, the plateau of the well-studied duplication-loss function is part of a plateau of every other consistent function. We introduce a novel approach for identifying consistent cost functions by defining a formal language of boolean costs. Finally, we demonstrate the performance of our general algorithms in practice using empirical and simulation studies.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.22</guid>
  </item>
  <item>
     <title>PrePrint: Parameter Estimation of Biological Phenomena: An Unscented Kalman Filter Approach</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.19</link>
     <description>Recent advances in high-throughput technologies for biological data acquisition have spurred a broad interest in the construction of mathematical models for biological phenomena. The development of such mathematical models relies on the estimation of unknown parameters of the system using the time-course profiles of different metabolites in the system. One of the main challenges in the parameter estimation of biological phenomena is the fact that the number of unknown parameters is much larger than the number of metabolites in the system. Moreover, the available metabolite measurements are corrupted by noise. In this paper, a new parameter estimation algorithm is developed based on the stochastic estimation framework for nonlinear systems, namely the unscented Kalman filter (UKF). A new iterative unscented Kalman filtering algorithm with covariance resetting is developed in which the UKF algorithm is applied iteratively to the available noisy time profiles of the metabolites. The proposed estimation algorithm is applied to noisy time-course data synthetically produced from a generic branched pathway as well as a real time-course profile for the Cad system of E. coli. The simulation results demonstrate the effectiveness of the proposed scheme.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.19</guid>
  </item>
  <item>
     <title>PrePrint: Protein Chain Pair Simplification Under the Discrete Fr&#x00E9;chet Distance</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.17</link>
     <description>For protein structure alignment and comparison, a lot of work has been done using RMSD as the distance measure, which has drawbacks under certain circumstances. Thus, the discrete Fr&#x00E9;chet distance was recently applied to the problem of protein (backbone) structure alignment and comparison with promising results. Visualization is also important since protein chain backbones can have up to 500 to 600 &#x03B1;-carbon atoms, which constitute the vertices in the comparison. Even with an excellent alignment, the similarity can be difficult to visualize. Thus, the chain pair simplification problem (CPS-3F) was proposed in 2008 to simultaneously simplify both chains with respect to each other under the discrete Fr&#x00E9;chet distance. The complexity of CPS-3F is unknown, so heuristic methods have been developed. Here, we define a variation of CPS-3F, the constrained CPS-3F problem (CPS-3F+), and prove it is polynomially solvable by presenting a dynamic programming solution, which we prove is a factor-2 approximation for CPS-3F. We then compare the CPS-3F+ solutions with previous empirical results, and further demonstrate the benefits of the simplified comparisons. CPS based on the Hausdorff distance (CPS-2H) is NP-complete, and we prove that CPS-2H+ is also NP-complete. Finally, we discuss future work and implications along with a software library implementation, named FPACT (The Fr&#x00E9;chet-based Protein Alignment &amp; Comparison Toolkit).</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.17</guid>
  </item>
  <item>
     <title>PrePrint: Biological Sequence Analysis with Multivariate String Kernels</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.15</link>
     <description>String kernel-based machine learning methods have yielded great success in practical tasks of structured/sequential data analysis. They often exhibit state-of-the-art performance on many practical tasks of sequence analysis such as biological sequence classification, remote homology detection, or protein superfamily and fold prediction. However, typical string kernel methods rely on analysis of discrete one-dimensional (1D) string data (e.g., DNA or amino acid sequences). In this work we address the multi-class biological sequence classification problems using multivariate representations in the form of sequences of feature vectors (as in biological sequence profiles, or sequences of individual amino acid physico-chemical descriptors) and a class of multivariate string kernels that exploit these representations. On a number of protein sequence classification tasks, the proposed multivariate representations and kernels show significant 15-20% improvements compared to existing state-of-the-art sequence classification methods.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.15</guid>
  </item>
  <item>
     <title>PrePrint: Proximity Measures for Clustering Gene Expression Microarray Data: A Validation Methodology and a Comparative Analysis</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.9</link>
     <description>Cluster analysis is usually the first step adopted to unveil information from gene expression microarray data. Besides selecting a clustering algorithm, choosing an appropriate proximity measure (similarity or distance) is of great importance to achieve satisfactory clustering results. Nevertheless, to date, there are no comprehensive guidelines concerning how to choose proximity measures for clustering microarray data. Pearson is the most used proximity measure, whereas characteristics of other ones remain unexplored. In this paper we investigate the choice of proximity measures for the clustering of microarray data by evaluating the performance of 16 proximity measures in 52 datasets from time-course and cancer experiments. Our results support that measures rarely employed in the gene expression literature can provide better results than commonly employed ones, such as Pearson, Spearman, and Euclidean distance. Given that different measures stood out for time-course and cancer data evaluations, their choice should be specific to each scenario. To evaluate measures on time-course data we preprocessed and compiled 17 datasets from the microarray literature in a benchmark along with a new methodology, called Intrinsic Biological Separation Ability (IBSA). Both can be employed in future research to assess the effectiveness of new measures for gene time-course data.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.9</guid>
  </item>
  <item>
     <title>PrePrint: Mining Featured Patterns of MiRNA Interaction based on Sequence and Structure Similarity</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.5</link>
     <description>MicroRNA (miRNA) is an endogenous small non-coding RNA that plays an important role in gene expression through the post-transcriptional gene regulation pathways. Many published works focus on predicting miRNA targets and exploring gene regulatory networks of miRNA families. We suggest, however, that the interactions between miRNAs have not been sufficiently studied. This paper presents a framework to identify relationships between miRNAs using joint entropy, to investigate the regulatory features of miRNAs. Both the sequence and secondary structure are taken into consideration to make our method more relevant from the biological viewpoint. Further, joint entropy is applied to identify correlated miRNAs, which are more desirable from the perspective of the gene regulatory network. A dataset including Drosophila melanogaster and Anopheles gambiae is used in the experiment. The results demonstrate that our approach is able to identify known miRNA interactions and uncover novel patterns in miRNA regulatory networks.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.5</guid>
  </item>
  <item>
     <title>PrePrint: Mining Quasi-Bicliques from HIV-1--Human Protein Interaction Network: A Multiobjective Biclustering Approach</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.139</link>
     <description>In this work, we model the problem of mining quasi-bicliques from a weighted viral-host protein-protein interaction network as a biclustering problem for identifying strong interaction modules. In this regard, a multiobjective genetic algorithm based biclustering technique is proposed that simultaneously optimizes three objective functions to obtain dense biclusters having high mean interaction strengths. The performance of the proposed technique has been compared with that of other existing biclustering methods on an artificial data set. Subsequently, the proposed biclustering method is applied to the records of biologically validated and predicted interactions between a set of HIV-1 proteins and a set of human proteins to identify strong interaction modules. For this, the entire interaction information is realized as a bipartite graph. We have further investigated the biological significance of the obtained biclusters. The human proteins involved in the strong interaction module have been found to share common biological properties and they are identified as the gateways of viral infection leading to various diseases. These human proteins can be potential drug targets for developing anti-HIV drugs.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.139</guid>
  </item>
  <item>
     <title>PrePrint: Multiscale Modelling and Analysis of Planar Cell Polarity in the Drosophila Wing</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.101</link>
     <description>Modelling across multiple scales is a current challenge in Systems Biology, especially when applied to multicellular organisms. In this paper we present an approach to model at different spatial scales, using the new concept of hierarchically coloured Petri Nets (HCPN). We apply HCPN to model a tissue comprising multiple cells hexagonally packed in a honeycomb formation in order to describe the phenomenon of Planar Cell Polarity (PCP) signalling in Drosophila wing. We have constructed a family of related models, permitting different hypotheses to be explored regarding the mechanisms underlying PCP. In addition our models include the effect of well-studied genetic mutations. We have applied a set of analytical techniques including clustering and model checking over time series of primary and secondary data. Our models support the interpretation of biological observations reported in the literature.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.101</guid>
  </item>
  <item>
     <title>IEEE/ACM Transactions on Computational Biology and Bioinformatics</title>
     <link>http://opac.ieeecomputersociety.org/opac?year=&amp;volume=&amp;issue=&amp;acronym=tcbb</link>
     <description>IEEE/ACM Transactions on Computational Biology and Bioinformatics</description>
     <guid isPermaLink="true">http://www.computer.org/portal/site/tcbb/</guid>
  </item>
  <item>
     <title>PrePrint: Improved Multiple Sequence Alignments Using Coupled Pattern Mining</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.36</link>
     <description>We present ARMiCoRe, a novel approach to a classical bioinformatics problem, viz. multiple sequence alignment (MSA) of gene and protein sequences. Aligning multiple biological sequences is a key step in elucidating evolutionary relationships, annotating newly sequenced segments, and understanding the relationship between biological sequences and functions. Classical MSA algorithms are designed to primarily capture conservations in sequences whereas couplings, or correlated mutations, are well known as an additional important aspect of sequence evolution. (Two sequence positions are coupled when mutations in one are accompanied by compensatory mutations in another.) As a result, it is not uncommon for practitioners to hand-tweak a conservation-based alignment to better expose couplings. ARMiCoRe introduces a distinctly pattern-mining approach to improving MSAs: using frequent episode mining as a foundational basis, we define the notion of a coupled pattern and demonstrate how the discovery and tiling of coupled patterns using a max-flow approach can yield MSAs that are significantly better than conservation-based alignments. Although we were motivated to improve MSAs for the sake of better exposing couplings, we demonstrate that our MSAs are also improvements in terms of traditional metrics of assessment. We demonstrate the effectiveness of ARMiCoRe on a large collection of datasets.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.36</guid>
  </item>
  <item>
     <title>PrePrint: Computer-Aided Biophysical Modeling: A Quantitative Approach to Complex Biological Systems</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.35</link>
     <description>When dealing with the biophysics of tumors, analytical and numerical modeling tools have long been regarded as potentially useful but practically immature tools. Further developments could not just overturn this predicament, but lead to completely new perspectives in biology. Here we give an account of our own computational tool and how we have put it to good use, and we discuss a paradigmatic example to outline a path to making cell biology more quantitative and predictive.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.35</guid>
  </item>
  <item>
     <title>PrePrint: Stochastic Model Simulation Using Kronecker Product Analysis and Zassenhaus Formula Approximation</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.34</link>
     <description>Probabilistic models are regularly applied in genetic regulatory network modeling to capture the stochastic behavior observed in the generation of biological entities such as mRNAs or proteins. Several approaches including the Stochastic Master Equation (SME) and the Probabilistic Boolean Network (PBN) have been proposed to model the stochastic behavior in genetic regulatory networks. It is generally accepted that SME is a fundamental model that can describe the system being investigated in fine detail, but the application of this model is computationally very expensive. On the other hand, PBN captures only the coarse-scale stochastic properties of the system. We propose a new approximation of the SME model that is able to capture the finer details of the modeled system including bi-stabilities and oscillatory behavior, and yet has a significantly lower computational complexity. We represent the system using tensors and apply the Zassenhaus formula to approximate the exponential of a sum of matrices as a product of matrices. Simulation results of the new method on four different biological benchmark systems illustrate performance comparable to detailed SME models but with considerably lower computational complexity. The results also demonstrate the reduced complexity of the new approach as compared to the commonly used Stochastic Simulation Algorithm for equivalent accuracy.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.34</guid>
  </item>
  <item>
     <title>PrePrint: Novel Multi-Sample Scheme for Inferring Phylogenetic Markers from Whole Genome Tumor Profiles</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.33</link>
     <description>Computational cancer phylogenetics seeks to enumerate the temporal sequence of aberrations in tumor evolution, thereby delineating the evolution of possible tumor progression pathways, molecular subtypes and mechanisms of action. We previously developed a pipeline for constructing phylogenies describing evolution between major recurring cell types computationally inferred from whole-genome tumor profiles. The accuracy and detail of the phylogenies, however, depend on the identification of accurate, high-resolution molecular markers of progression, i.e., reproducible regions of aberration that robustly differentiate different subtypes and stages of progression. Here we present a novel hidden Markov model (HMM) scheme for the problem of inferring such phylogenetically significant markers through joint segmentation and calling of multi-sample tumor data. Our method classifies sets of genome-wide DNA copy number measurements into a partitioning of samples into normal (diploid) or amplified at each probe. It differs from other similar HMM methods in its design specifically for the needs of tumor phylogenetics, by seeking to identify robust markers of progression conserved across a set of copy number profiles. We show an analysis of our method in comparison to other methods on both synthetic and real tumor data, which confirms its effectiveness for tumor phylogeny inference and suggests avenues for future advances.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.33</guid>
  </item>
  <item>
     <title>PrePrint: How Many Clusters: A Validation Index for Arbitrary Shaped Clusters</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.32</link>
     <description>Clustering validation indexes are intended to assess the goodness of clustering results. Many methods used to estimate the number of clusters rely on a validation index as a key element to find the correct answer. This paper presents a new validation index based on graph concepts, which has been designed to find arbitrary shaped clusters by exploiting the spatial layout of the patterns and their clustering label. This new clustering index is combined with a solid statistical detection framework, the Gap Statistic. The resulting method is able to find the right number of arbitrary shaped clusters in diverse situations, as we show with examples where this information is available. A comparison with several relevant validation methods is carried out using artificial and gene expression datasets. The results are very encouraging, showing that the underlying structure in the data can be more accurately detected with the new clustering index. Our gene expression data results also indicate that this new index is stable under perturbation of the input data.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.32</guid>
  </item>
  <item>
     <title>PrePrint: Systematic Analysis of the Mechanisms of Virus-Triggered Type I IFN Signaling Pathways Through Mathematical Modeling</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.31</link>
     <description>Based on biological experimental data, we developed a mathematical model of the virus-triggered signaling pathways that lead to induction of type I IFNs and systematically analyzed the mechanisms of the cellular antiviral innate immune responses, including the negative feedback regulation of ISG56 and the positive feedback regulation of IFNs. We found that the time between 5 and 48 hours after viral infection is vital for the control and/or elimination of the virus from the host cells and demonstrated that the ISG56-induced inhibition of MITA activation is stronger than the ISG56-induced inhibition of TBK1 activation. The global parameter sensitivity analysis suggests that the positive feedback regulation of IFNs is very important in the innate antiviral system. Furthermore, the robustness of the innate immune signaling network was demonstrated using a new robustness index. These results can help us understand the mechanisms of the virus-induced innate immune response at a system level and provide instruction for further biological experiments.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.31</guid>
  </item>
  <item>
     <title>PrePrint: Non-Negative Least Squares Methods for the Classification of High Dimensional Biological Data</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.30</link>
     <description>Microarray data can be used to detect diseases and predict responses to therapies through classification models. However, the high dimensionality and low sample size of such data result in many computational problems such as reduced prediction accuracy and slow classification speed. In this paper, we propose a novel family of non-negative-least-squares classifiers for high dimensional microarray gene expression and comparative genomic hybridization data. Our approaches are based on combining the advantages of using local learning, transductive learning and ensemble learning, for better prediction performance. To study the performances of our methods, we performed computational experiments on seventeen well-known data sets with diverse characteristics. We have also performed statistical comparisons with many classification techniques including the well-performing SVM approach and two related but recent methods proposed in the literature. Experimental results show that our approaches are faster and generally achieve better prediction performance than the compared methods.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.30</guid>
  </item>
  <item>
     <title>PrePrint: Probabilistic Search and Energy Guidance for Biased Decoy Sampling in Ab-Initio Protein Structure Prediction</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.29</link>
     <description>Adequate sampling of the conformational space is a central challenge in ab-initio protein structure prediction. In the absence of a template structure, a conformational search procedure guided by an energy function explores the conformational space, gathering an ensemble of low-energy decoy conformations. If the sampling is inadequate, the native structure may be missed altogether. Even if reproduced, a subsequent stage that selects a subset of decoys for further structural detail and energetic refinement may discard near-native decoys if they are high-energy or insufficiently represented in the ensemble. Sampling should produce a decoy ensemble that facilitates the subsequent selection of near-native decoys. In this paper, we investigate a robotics-inspired framework that allows directly measuring the role of energy in guiding sampling. Testing demonstrates that a soft energy bias steers sampling towards a diverse decoy ensemble less prone to exploiting energetic artifacts and thus more likely to facilitate retainment of near-native conformations by selection techniques. We employ two different energy functions, the Associative Memory Hamiltonian with Water (AMW) and Rosetta. Results show that enhanced sampling provides a rigorous testing of energy functions and exposes different deficiencies in them, thus promising to guide development of more accurate representations and energy functions.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.29</guid>
  </item>
  <item>
     <title>PrePrint: Evaluation of Breast Cancer Susceptibility Using Improved Genetic Algorithms in Generating Genotype SNP Barcodes</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.27</link>
     <description>In this study, a genetic algorithm (GA) is developed to detect the association of genotype frequencies of cancer cases and non-cancer cases based on statistical analysis. An improved genetic algorithm (IGA) is proposed to improve the reliability of the GA method for high-dimensional SNP-SNP interactions. The strategy offers the top five results to the random population process, in which they guide the GA toward a significant search course. IGA increases the likelihood of quickly detecting the maximum ratio difference between cancer cases and non-cancer cases. The joint effect of 23 SNP combinations of six steroid hormone metabolism and signaling-related genes involved in breast carcinogenesis pathways was systematically evaluated, with IGA successfully detecting significant ratio differences between breast cancer cases and non-cancer cases. The possible breast cancer risks were subsequently analyzed by odds-ratio (OR) and risk-ratio (RR) analysis. The estimated OR of the best SNP barcode is significantly higher than 1 (between 1.15 and 7.01) for specific combinations of two to 13 SNPs. Analysis results support that IGA provides higher ratio difference values than GA between breast cancer cases and non-cancer cases over 3-SNP to 13-SNP interactions. A more specific SNP-SNP interaction profile for the risk of breast cancer is also provided.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.27</guid>
  </item>
  <item>
     <title>PrePrint: Gene Regulation Networks in Early-Phase of Duchenne Muscular Dystrophy</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.24</link>
     <description>The aim of this study was to analyze previously published gene expression data of skeletal muscle biopsies of Duchenne muscular dystrophy (DMD) patients and controls (Gene Expression Omnibus database, accession #GSE6011) using systems biology approaches. We applied an unsupervised method to discriminate patient and control populations, based on principal component analysis, using the gene expressions as units and patients as variables. The genes having the highest absolute scores in the discrimination between the groups were then analyzed in terms of gene expression networks, on the basis of their mutual correlation in the two groups. The correlation network structures suggest two different modes of gene regulation in the two groups, reminiscent of important aspects of DMD pathogenesis.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.24</guid>
  </item>
  <item>
     <title>PrePrint: On the Increase in Network Robustness and Decrease in Network Response Ability During the Aging Process: A Systems Biology Approach via Microarray Data</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.23</link>
     <description>Aging, an extremely complex and system-level process, has attracted much attention in medical research, especially since chronic diseases are quite prevalent in the aged population. These may be the result of both gene mutations that lead to intrinsic perturbations and environmental changes that may stimulate signaling in the body. Therefore, the network robustness to tolerate intrinsic perturbations and the network response ability to respond to external stimuli of a gene network during the aging process may provide insight into the systematic changes of aging. We first propose novel methods to estimate network robustness and measure network response ability of gene regulatory networks from their corresponding microarray data in the aging process. Then we find that an aging-related gene network is more robust to intrinsic perturbations in the elderly than in the young, and therefore is less responsive to external stimuli. Finally, response abilities of individual genes, especially FOXOs, NF-kB and p53, are significantly different in the young versus the aged subjects. These observations are consistent with experimental findings in the aged population, e.g. the elevated incidence of tumorigenesis and declining resistance to oxidative stress. The proposed method can also be used for exploring and analyzing dynamical properties of other biological processes via corresponding microarray data to provide useful information on clinical strategy and drug target selection.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.23</guid>
  </item>
  <item>
     <title>PrePrint: Reconstruction of Signaling Network from Protein Interactions Based on Function Annotations</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.20</link>
     <description>The directionality of protein interactions is a prerequisite for forming various signaling networks, and the construction of signaling networks is a critical issue in discovering the mechanisms of life processes. In this paper, we propose a novel method to infer the directionality in protein-protein interaction networks and, furthermore, to construct signaling networks. Based on the functional annotations of proteins, we propose a novel parameter, GODS, and establish the prediction model. This method shows high sensitivity and specificity in predicting the directionality of protein interactions, as evaluated by 5-fold cross-validation. By taking the threshold value of GODS as 2, we achieved an accuracy of 95.56% and a coverage of 74.69% in the human test set. Also, this method was successfully applied to reconstruct the classical signaling pathways in human. This study provides not only an effective method to unravel unknown signaling pathways, but also a deeper understanding of signaling networks from the aspect of protein function.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.20</guid>
  </item>
  <item>
     <title>PrePrint: FNphasing: A Novel Fast Heuristic Algorithm for Haplotype Phasing Based on Flow Network Model</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.18</link>
     <description>An enormous amount of sequence data has been generated with the development of new DNA sequencing technologies, which presents great challenges for computational biology problems such as haplotype phasing. Although arduous efforts have been made to address this problem, the current methods still cannot efficiently deal with the incoming flood of large-scale data. In this paper, we propose a flow network model to tackle the haplotype phasing problem, and explain some classical haplotype phasing rules based on this model. By incorporating the heuristic knowledge obtained from these classical rules, we design an algorithm, FNphasing, based on the flow network model. Theoretically, the time complexity of our algorithm is O(n&#x00B2;m+m&#x00B2;), which is better than that of 2SNP, one of the most efficient algorithms currently. After testing the performance of FNphasing with several simulated data sets, the experimental results show that when applied on large-scale data sets, our algorithm is significantly faster than the state-of-the-art Beagle algorithm. FNphasing also achieves an equal or superior accuracy compared with other approaches.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.18</guid>
  </item>
  <item>
     <title>PrePrint: Text Categorization of Biomedical Data Sets Using Graph Kernels and a Controlled Vocabulary</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.16</link>
     <description>Recently, graph representations of text have been showing improved performance over conventional bag-of-words representations in text categorization applications. In this paper we present a graph-based representation for biomedical articles and use graph kernels to classify those articles into high level categories. In our representation, common biomedical concepts and semantic relationships are identified with the help of an existing ontology and are used to build a rich graph structure that provides a consistent feature set and preserves additional semantic information that could improve a classifier's performance. We attempt to classify the graphs using both a set-based graph kernel that is capable of dealing with the disconnected nature of the graphs and a simple linear kernel. Finally, we report the results comparing the classification performance of the kernel classifiers to common text-based classifiers.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.16</guid>
  </item>
  <item>
     <title>PrePrint: GENESHIFT: a Non-Parametric Approach for Integrating Microarray Gene Expression Data Based on the Inner Product as a Distance Measure Between the Distributions of Genes</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.12</link>
     <description>The potential of microarray-gene-expression data is only partially explored due to the limited number of samples in individual studies. This limitation can be surmounted by integrating datasets originating from independent experiments, which are designed to study the same biological problem. However this process is hindered by batch-effects which are study-dependent and result in random data distortion. Our contribution is two-fold: first we propose GENESHIFT, a non-parametric batch effect removal method based on two key elements from statistics: empirical density estimation and the inner-product as a distance measure between two probability density functions; second we introduce a new validation metric based on the observation that samples from two independent studies drawn from a same population should exhibit similar probability density functions. We compared GENESHIFT with four state-of-the-art methods: Batch-Mean-Centering, COMBAT, Distance-Weighted-Discrimination and Cross-Platform-Normalization. Several validation indices providing complementary information about the efficiency of batch effect removal methods have been employed for validation. The results show that none of the methods clearly outperforms the others. Moreover, most of the methods used here perform very well with respect to some validation indices while performing poorly with respect to others. GENESHIFT exhibits robust performances and its average rank is the highest among the average ranks of all methods used for comparison.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.12</guid>
  </item>
  <item>
     <title>PrePrint: Gelsius: A Literature-Based Workflow for Determining Quantitative Associations Between Genes and Biological Processes</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.11</link>
     <description>An effective knowledge extraction and quantification methodology from biomedical literature would allow the researcher to organize and analyze the results of high throughput experiments on microarrays and next generation sequencing technologies. Despite the large amount of raw information available on the Web, a tool able to extract a measure of the correlation between a list of genes and biological processes is not yet available. In this paper we present Gelsius, a workflow that incorporates biomedical literature to quantify the correlation between genes and terms describing biological processes. To achieve this target, we build different modules focusing on query expansion and document canonicalization. In this way we improve the measurement of correlation, which is performed using a latent semantic analysis approach. To the best of our knowledge, this is the first complete tool able to extract a measure of genes-biological processes correlation from literature. We demonstrate the effectiveness of the proposed workflow on six biological processes and a set of genes, by showing that correlation results for known relationships are in accordance with definitions of gene functions provided by NCI Thesaurus. On the other hand, the tool is able to propose new candidate relationships for later experimental validation. The tool is available at the following web site: http://bioeda1.polito.it:8080/medSearchServlet/</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.11</guid>
  </item>
  <item>
     <title>PrePrint: Normalized feature vectors: A Novel Alignment-Free Sequence Comparison Method Based on Numbers of Adjacent Amino Acids</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.10</link>
     <description>Based on all kinds of adjacent amino acids (AAA), we map each protein primary sequence into a 400 by (L-1) matrix M. In addition, we derive a normalized 400-tuple mathematical descriptor D, which is extracted from the primary protein sequence via singular value decomposition (SVD) of the matrix. The obtained 400-D normalized feature vectors (NFV) further facilitate our quantitative analysis of protein sequences. Using the normalized representation of the primary protein sequences, we analyze the similarity of different sequences on two datasets: a) ND5 sequences from nine species; b) Transferrin sequences of 24 vertebrates. We also compare the results of this study with those from other related works. These two experiments illustrate that our proposed NFV-AAA approach performs well in sequence similarity analysis.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.10</guid>
  </item>
  <item>
     <title>PrePrint: 2D meets 4G: G-Quadruplexes in RNA Secondary Structure Prediction</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.7</link>
     <description>G-quadruplexes are abundant locally stable structural elements in nucleic acids. The combinatorial theory of RNA structures and the dynamic programming algorithms for RNA secondary structure prediction are extended here to incorporate G-quadruplexes using a simple but plausible energy model. With preliminary energy parameters we find that the overwhelming majority of putative quadruplex-forming sequences in the human genome are likely to fold into canonical secondary structures instead. Stable G-quadruplexes are strongly enriched, however, in the 5' UTR of protein coding mRNAs.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.7</guid>
  </item>
  <item>
     <title>PrePrint: Pareto Optimal Pairwise Sequence Alignment</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.2</link>
     <description>Sequence alignment using evolutionary profiles is a commonly employed tool when investigating a protein. Many profile-profile scoring functions have been developed for use in such alignments, but there has not yet been a comprehensive study of Pareto optimal pairwise alignments for combining multiple such functions. We show that the problem of generating Pareto optimal pairwise alignments has an optimal substructure property, and develop an efficient algorithm for generating Pareto optimal frontiers of pairwise alignments. All possible sets of two, three and four profile scoring functions are used from a pool of eleven functions and applied to 588 pairs of proteins in the ce_ref dataset. The performance of the best objective combinations on ce_ref is also evaluated on an independent set of 913 protein pairs extracted from the BAliBASE RV11 dataset. Our dynamic-programming-based heuristic approach produces approximated Pareto optimal frontiers of pairwise alignments which contain comparable alignments to those on the exact frontier, but on average in less than 1/58th the time in the case of four objectives. Our results show that the Pareto frontiers contain alignments whose quality is better than that of the alignments obtained by single objectives. However, the task of identifying a single high-quality alignment among those in the Pareto frontier remains challenging.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.2</guid>
  </item>
  <item>
     <title>PrePrint: A Transcript Perspective on Evolution</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.145</link>
     <description>Alternative splicing is now recognized as a major mechanism for transcriptome and proteome diversity in higher eukaryotes. Yet, its evolution is poorly understood. Most studies focus on the evolution of exons and introns at the gene level, while only few consider the evolution of transcripts. In this paper, we present a framework for transcript phylogenies where ancestral transcripts evolve along the gene tree by gains, losses, and mutation. We demonstrate the usefulness of our method on a set of 805 genes and two different topics. First, we improve a method for transcriptome reconstruction from ESTs (ASPic), then we study the evolution of function in transcripts. The use of transcript phylogenies allows us to double the specificity of ASPic, whereas results on the functional study reveal that conserved transcripts are more likely to share protein domains than functional sites. These studies validate our framework for the study of evolution in large collections of organisms from the perspective of transcripts; for this purpose, we developed and provide a new tool, TrEvoR.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.145</guid>
  </item>
  <item>
     <title>PrePrint: RANGI: A Fast List-Colored Graph Motif Finding Algorithm</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.167</link>
     <description>Given a multiset of colors as the query and a list-colored graph, i.e. an undirected graph with a set of colors assigned to each of its vertices, in the NP-hard list-colored graph motif problem the goal is to find the largest connected subgraph such that one can select a color from the set of colors assigned to each of its vertices to obtain a subset of the query. This problem was introduced to find functional motifs in biological networks. We present a branch-and-bound algorithm named RANGI for finding and enumerating list-colored graph motifs. As our experimental results show, RANGI's pruning methods and heuristics make it quite fast in practice compared to the algorithms presented in the literature. We also present a parallel version of RANGI that achieves acceptable scalability.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.167</guid>
  </item>
  <item>
     <title>PrePrint: Efficient Algorithms for Knowledge-Enhanced Supertree and Supermatrix Phylogenetic Problems</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.162</link>
     <description>Phylogenetic inference is a computationally difficult problem, and constructing high quality phylogenetic trees that can build upon existing phylogenetic knowledge and synthesize insights from new data remains a major challenge. We introduce knowledge-enhanced phylogenetic problems for both supertree and supermatrix phylogenetic analyses. These problems seek an optimal phylogenetic tree that can only be assembled from a user-supplied set of, possibly incompatible, phylogenetic relationships. We describe exact polynomial time algorithms for the knowledge-enhanced versions of the NP-hard Robinson Foulds, gene duplication, duplication and loss, and deep coalescence supertree problems. Further, we demonstrate that our algorithms can rapidly improve upon results of local search heuristics for these problems. Finally, we introduce a knowledge-enhanced search heuristic that can be applied to discrete character data sets using the maximum parsimony (MP) phylogenetic problem. Although this approach is not guaranteed to find exact solutions, we show that it also can improve upon parsimony solutions from commonly used MP heuristics.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.162</guid>
  </item>
  <item>
     <title>PrePrint: Extending the Algebraic Formalism for Genome Rearrangements to Include Linear Chromosomes</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.161</link>
     <description>Algebraic rearrangement theory, as introduced by Meidanis and Dias, focuses on representing the order in which genes appear in chromosomes, and applies to circular chromosomes only. By shifting our attention to genome adjacencies, we introduce the adjacency algebraic theory, extending the original algebraic theory to linear chromosomes in a very natural way, and also allowing the original algebraic distance formula to be used in the general multichromosomal case, with both linear and circular chromosomes. The resulting distance, which we call algebraic distance here, is very similar to, but not quite the same as, DCJ distance. We present linear time algorithms to compute it and to sort genomes. We show how to compute the rearrangement distance from the adjacency graph, for an easier comparison with other rearrangement distances. A thorough discussion of the relationship between the chromosomal and adjacency representations is also given, and we show how all classic rearrangement operations can be modeled using the algebraic theory.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.161</guid>
  </item>
  <item>
     <title>PrePrint: Reconstruction of Transcriptional Regulatory Networks by Stability-based Network Component Analysis</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.146</link>
     <description>Reliable inference of transcription regulatory networks is a challenging task in computational biology. Network component analysis (NCA) has become a powerful scheme to uncover regulatory networks behind complex biological processes. However, the performance of NCA is impaired by the high rate of false connections in binding information. In this paper, we integrate stability analysis with NCA to form a novel scheme, namely stability-based NCA (sNCA), for regulatory network identification. The method mainly addresses the inconsistency between gene expression data and binding motif information. Small perturbations are introduced to the prior regulatory network, and the distance among multiple estimated transcription factor (TF) activities is computed to reflect the stability of each TF's binding network. For target gene identification, multivariate regression and the t-statistic are used to calculate the significance of each TF-gene connection. Simulation studies are conducted and the experimental results show that sNCA can achieve an improved and robust performance in TF identification as compared to NCA. The approach for target gene identification is also demonstrated to be suitable for identifying true connections between TFs and their target genes. Furthermore, we have successfully applied sNCA to breast cancer data to uncover the role of TFs in regulating endocrine resistance in breast cancer.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.146</guid>
  </item>
  <item>
     <title>PrePrint: An Integer Programming Formulation of the Parsimonious Loss of Heterozygosity Problem</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.138</link>
     <description>A Loss of Heterozygosity (LOH) event occurs when, by the laws of Mendelian inheritance, an individual should be heterozygous at a given site but, due to a deletion polymorphism, is not. Deletions play an important role in human disease and their detection could provide fundamental insights for the development of new diagnostics and treatments. In this article we investigate the Parsimonious Loss of Heterozygosity Problem (PLOHP), i.e., the problem of partitioning suspected polymorphisms from a set of individuals into a minimum number of deletion areas. Specifically, we generalize the work of Halldorsson et al. by providing a more general formulation of the PLOHP and by showing how one can incorporate different recombination rates and prior knowledge about the locations of deletions. Moreover, we show that the PLOHP can be formulated as a specific version of the clique partition problem in a particular class of graphs called undirected catch-point interval graphs and we prove its general NP-hardness. Finally, we provide a state-of-the-art integer programming formulation and strengthening valid inequalities to exactly solve real instances of the PLOHP containing up to 9000 individuals and 3000 SNPs. Our results give perspectives on the mathematics of the PLOHP and suggest new directions for the development of future efficient exact solution approaches.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.138</guid>
  </item>
  <item>
     <title>PrePrint: Curvature Analysis of Cardiac Excitation Wavefronts</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.125</link>
     <description>We present the Spiral Classification Algorithm (SCA), a fast and accurate algorithm for classifying electrical spiral waves and their associated breakup in cardiac tissues. The classification performed by SCA is an essential component of the detection and analysis of various cardiac arrhythmic disorders, including ventricular tachycardia and fibrillation. Given a digitized frame of a propagating wave, SCA constructs a highly accurate representation of the front and the back of the wave, piecewise interpolates this representation with cubic splines, and subjects the result to an accurate curvature analysis. This analysis is more comprehensive than methods based on spiral-tip tracking, as it considers the entire wave front and back. To increase the smoothness of the resulting symbolic representation, especially at join points, SCA uses weighted overlapping of adjacent segments. SCA has been applied to a number of representative types of spiral waves, and, for each type, a distinct curvature evolution in time (signature) has been identified. Distinct signatures have also been identified for spiral breakup. These results represent a significant first step in automatically determining parameter ranges for which a computational cardiac-cell network accurately reproduces a particular kind of cardiac arrhythmia, such as ventricular fibrillation.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.125</guid>
  </item>
  <item>
     <title>PrePrint: Computational Reconstruction of Transcriptional Relationship from ChIP-Chip Data</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.102</link>
     <description>Eukaryotic gene transcription is a complex process, which requires the orchestrated recruitment of a large number of proteins, such as sequence-specific DNA binding factors, chromatin remodelers and modifiers, and general transcription machinery, to regulatory regions. Previous works have shown that these regulatory proteins favor specific organizational themes along promoters. Details about how they cooperatively regulate the transcriptional process, however, remain unclear. We developed a method to reconstruct a Bayesian network model representing functional relationships among various transcriptional components. The positive/negative influence between these components was measured from protein binding and nucleosome occupancy data and embedded into the model. Application to S. cerevisiae ChIP-Chip data showed that the proposed method can recover confirmed relationships, such as Isw1-Pol II, TFIIH-Pol II, TFIIB-TBP, Pol II-H3K36Me3, H3K4Me3-H3K14Ac, etc. Moreover, it can distinguish co-locating components from functionally related ones. Novel relationships, e.g., ones between Mediator and chromatin remodeling complexes (CRCs), and the combinatorial regulation of Pol II recruitment and activity by CRCs and general transcription factors (GTFs), were also suggested. Conclusion: Protein binding events during transcription positively influence each other. Among contributing components, GTFs and CRCs play pivotal roles in transcriptional regulation. These findings provide insights into the regulatory mechanism.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.102</guid>
  </item>
  <item>
     <title>PrePrint: Rough-Fuzzy Clustering for Grouping Functionally Similar Genes from Microarray Data</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.103</link>
     <description>Gene expression data clustering is one of the important tasks of functional genomics as it provides a powerful tool for studying functional relationships of genes in a biological process. Identifying co-expressed groups of genes represents the basic challenge in the gene clustering problem. In this regard, a gene clustering algorithm, termed rough-fuzzy c-means, is proposed, judiciously integrating the merits of rough sets and fuzzy sets. While the concept of lower and upper approximations of rough sets deals with uncertainty, vagueness, and incompleteness in cluster definition, the integration of probabilistic and possibilistic memberships of fuzzy sets enables efficient handling of overlapping partitions in a noisy environment. The concept of possibilistic lower bound and probabilistic boundary of a cluster, introduced in rough-fuzzy c-means, enables efficient selection of gene clusters. An efficient method is proposed to select initial prototypes of different gene clusters, which enables the proposed c-means algorithm to converge to an optimum or near-optimum solution and helps to discover co-expressed gene clusters. The effectiveness of the algorithm, along with a comparison with other algorithms, is demonstrated both qualitatively and quantitatively on fourteen yeast microarray data sets.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.103</guid>
  </item>
  <item>
     <title>PrePrint: The Propagation Approach for Computing Biochemical Reaction Networks</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.91</link>
     <description>We introduce propagation models, a formalism able to express several kinds of equations that describe the behavior of biochemical reaction networks. Furthermore, we introduce the propagation abstract data type, which separates concerns regarding different numerical algorithms for the transient analysis of biochemical reaction networks from concerns regarding their implementation, thus allowing for portable and efficient solutions. The state of a propagation abstract data type is given by a vector that assigns mass values to a set of nodes, and its next operator propagates mass values through this set of nodes. We propose an approximate implementation of the next operator, based on threshold abstraction, which propagates only "significant" mass values and thus achieves a compromise between efficiency and accuracy. Finally, we give three use cases for propagation models: the chemical master equation, the reaction rate equation, and a hybrid method that combines these two equations. These three applications use propagation models in order to propagate probabilities and/or expected values and variances of the model's variables.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.91</guid>
  </item>
  <item>
     <title>PrePrint: Matching Split Distance for Unrooted Binary Phylogenetic Trees</title>
     <link>http://doi.ieeecomputersociety.org/10.1109/TCBB.2011.38</link>
     <description>The reconstruction of evolutionary trees is one of the primary objectives in phylogenetics. Such a tree represents the historical evolutionary relationship between different species or organisms. Tree comparisons are used for multiple purposes, from unveiling the history of species to deciphering evolutionary associations among organisms and geographical areas. In this paper we propose a new method of defining distances between unrooted binary phylogenetic trees that is especially applicable to relatively large phylogenetic trees. Next, we investigate in detail the properties of one example of these metrics, called the Matching Split distance.</description>
     <guid isPermaLink="true">http://doi.ieeecomputersociety.org/10.1109/TCBB.2011.38</guid>
  </item>
   </channel>
</rss>