BITS Meetings' Virtual Library:
Abstracts from Italian Bioinformatics Meetings from 1999 to 2013


766 abstracts overall from 11 distinct proceedings





1. Bertoni A, Folgieri R, Ruffino F, Valentini G
Assessment of clusters reliability for high dimensional genomic data
Meeting: BITS 2005 - Year: 2005
Topic: Computer algorithms and applications

Abstract: Discovering new subclasses of pathologies and identifying expression signatures related to specific phenotypes are challenging problems in gene expression data analysis. To pursue these objectives, we need to estimate the “natural” number of clusters in the data and the stability of the clusters discovered. To this end, new approaches based on random subspaces and bootstrap methods have recently been proposed.
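
A minimal Python sketch of the bootstrap-stability idea described above (illustrative only: k-means and the adjusted Rand index stand in for the authors' actual clustering algorithm and agreement measure):

    # Estimate cluster stability by comparing clusterings of bootstrap
    # resamples against a clustering of the full expression matrix X.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import adjusted_rand_score

    def stability(X, k, n_boot=50, seed=0):
        rng = np.random.default_rng(seed)
        base = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores = []
        for _ in range(n_boot):
            idx = rng.integers(0, len(X), size=len(X))       # bootstrap resample
            boot = KMeans(n_clusters=k, n_init=10).fit_predict(X[idx])
            scores.append(adjusted_rand_score(base[idx], boot))
        return float(np.mean(scores))                        # higher = more stable

    # A "natural" number of clusters can then be chosen as the k that
    # maximizes the mean stability, e.g.:
    # best_k = max(range(2, 10), key=lambda k: stability(X, k))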

2. Bortolussi L, Fabris F, Policriti A
Bundled Suffix Trees
Meeting: BITS 2005 - Year: 2005
Topic: Computer algorithms and applications

Abstract: A Suffix Tree (ST) is a now classical data structure, computable in linear time, which represents the most algorithmically appropriate way to store a string in order to face problems like the Exact String Matching Problem (ESM) or the Longest Common Exact Substring Problem (LCES). Although very efficient in solving these problems, the ST data structure suffers from an important drawback when dealing with an Approximate String Matching Problem (ASM) or with the harder Longest Common Approximate Substring Problem (LCAS), since only exact matching can be used in visiting an ST. In the approximate cases, a suitable notion of distance (most frequently the Hamming or Levenshtein distance) must come into play. However, in the literature there is no universally accepted data structure capable of dealing with approximate searches simply by performing algorithmic manipulations similar to those on STs. This makes it necessary, when using STs in an approximate context, to take errors into account through unnatural and complicated strategies, inevitably leading to cumbersome algorithms.
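
For concreteness, a toy Python sketch of suffix-indexed exact matching: a suffix array stands in here for the suffix tree (simpler to code, same exact-search capability), and, exactly as discussed above, it answers only exact ESM queries:

    from bisect import bisect_left, bisect_right

    def suffix_array(s):
        # O(n^2 log n) toy construction; linear-time algorithms exist.
        return sorted(range(len(s)), key=lambda i: s[i:])

    def find_occurrences(s, sa, pattern):
        # Binary search over the sorted suffixes, truncated to |pattern|.
        prefixes = [s[i:i + len(pattern)] for i in sa]
        lo = bisect_left(prefixes, pattern)
        hi = bisect_right(prefixes, pattern)
        return sorted(sa[lo:hi])               # start positions of exact matches

    text = "banana"
    sa = suffix_array(text)
    print(find_occurrences(text, sa, "ana"))   # [1, 3]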

3. Fariselli P, Martelli PL, Casadio R
The posterior-Viterbi: a new decoding algorithm for hidden Markov models
Meeting: BITS 2005 - Year: 2005
Topic: Computer algorithms and applications

Abstract: Hidden Markov models (HMMs) are powerful machine learning tools successfully applied to problems in computational molecular biology. In a predictive task, the HMM is endowed with a decoding algorithm in order to assign the most probable state path, and in turn the class labeling, to an unknown sequence. The Viterbi and the posterior decoding algorithms are the most common. The former is very efficient when one path dominates, while the latter, even though it does not guarantee to preserve the automaton grammar, is more effective when several concurring paths have similar probabilities. A third good alternative is 1-best, which has been shown to perform as well as or better than Viterbi.
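
A toy Python sketch of the two standard decoders contrasted above: Viterbi (single best path) and posterior decoding (per-position argmax, which may indeed violate the automaton grammar). The posterior-Viterbi itself, a Viterbi-style search over the posterior probabilities, is not reproduced here.

    import numpy as np

    def viterbi(pi, A, B, obs):
        # pi: initial probs, A: transition matrix, B: emission matrix.
        T, n = len(obs), len(pi)
        V = np.zeros((T, n)); ptr = np.zeros((T, n), dtype=int)
        V[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            scores = V[t - 1][:, None] * A     # scores[i, j]: best path via i -> j
            ptr[t] = scores.argmax(axis=0)
            V[t] = scores.max(axis=0) * B[:, obs[t]]
        path = [int(V[-1].argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(ptr[t, path[-1]]))
        return path[::-1]

    def posterior_decode(pi, A, B, obs):
        # Forward-backward, then per-position argmax of the posterior.
        T, n = len(obs), len(pi)
        fwd = np.zeros((T, n)); bwd = np.zeros((T, n))
        fwd[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            fwd[t] = (fwd[t - 1] @ A) * B[:, obs[t]]
        bwd[-1] = 1.0
        for t in range(T - 2, -1, -1):
            bwd[t] = A @ (B[:, obs[t + 1]] * bwd[t + 1])
        return list((fwd * bwd).argmax(axis=1))

    # Toy 2-state HMM over a binary alphabet:
    pi = np.array([0.6, 0.4])
    A = np.array([[0.9, 0.1], [0.2, 0.8]])
    B = np.array([[0.8, 0.2], [0.3, 0.7]])
    obs = [0, 0, 1, 1, 1, 0]
    print(viterbi(pi, A, B, obs), posterior_decode(pi, A, B, obs))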

4. Menchetti S, Costa F, Frasconi P
Weighted decomposition kernels for protein subcellular localization
Meeting: BITS 2005 - Year: 2005
Topic: Computer algorithms and applications

Abstract: Knowledge about the subcellular localization of proteins can provide important information about their function. A reliable automatic classification method for predicting subcellular localization from sequence is therefore a valuable tool for shedding light on protein function and may help solve genome-scale problems such as the identification of pharmaceutical targets.

5. Pavesi G, Stefani M, Mauri G, Pesole G
An algorithm for finding regulatory sequences of homologous genes
Meeting: BITS 2005 - Year: 2005
Topic: Computer algorithms and applications

Abstract: One of the greatest challenges in modern molecular biology is the identification and characterization of the functional elements regulating gene expression. Two of the most important such elements are transcription factors (TFs) and the genomic sites where they can bind (TFBSs). TF-DNA interactions, which are responsible for the modulation of gene transcription, are at the basis of many critical cellular processes, and their malfunction is often involved in the onset of genetic diseases. TFBSs are located either near the transcription start site of a gene (usually within 500-1000 bp) or at a very large distance from it (often several kilobases), either upstream or downstream. When the regulation of a single gene is investigated, the idea is to increase the signal/noise ratio by comparing its flanking regions (upstream and/or downstream) with homologous genome regions of the same or other organisms at different evolutionary distances. Those parts of the regions that are more conserved across the different species are more likely to have been preserved by evolution for their function, and thus could be (or contain) TFBSs. Most of the methods introduced so far first build a global alignment of the sequences (some pairwise, some multiple) and report the most conserved parts of the alignment (with or without further processing, for example by looking for known TFBS instances in them). While this approach can produce good results, since a highly conserved region can be a good candidate for regulatory activity, some experiments have shown that real TFBSs are often misaligned and fall outside the “best regions” of the alignment (which, in any case, becomes computationally problematic for long regions, especially in the case of multiple comparisons). In this work we present an algorithm that requires neither a global alignment of the sequences nor the support of matrices or instances of known TFBSs in order to detect potential regulatory motifs.
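
A minimal Python sketch of the alignment-free idea: enumerate k-mers that occur, up to d mismatches, in all the homologous regions, with no global alignment step (a toy enumeration, not the authors' algorithm):

    def kmers(seq, k):
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    def within(a, b, d):
        return sum(x != y for x, y in zip(a, b)) <= d    # Hamming distance <= d

    def shared_motifs(regions, k=8, d=1):
        # Seed candidates from the first region; keep those found
        # (approximately) in every region.
        return {c for c in kmers(regions[0], k)
                if all(any(within(c, w, d) for w in kmers(r, k)) for r in regions)}

    regions = ["TTGACGTCAGGA", "CCGACGTCAGTT", "AAGACGACAGCC"]
    print(shared_motifs(regions))   # {'GACGTCAG'} (1 mismatch in the third region)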

6. Sciortino M, Mantaci S, Restivo A, Rosone G
A new sequence distance measure based on the Burrows-Wheeler Transform
Meeting: BITS 2005 - Year: 2005
Topic: Computer algorithms and applications

Abstract: Recent developments in genome sequencing have given a new direction to bioinformatics research. In particular, the possibility of sequencing whole genomes has raised the question of discovering common features between biological sequences from different species, reflecting common evolutionary and functional mechanisms. This has led researchers to look for a distance measure on sequences able to capture these common mechanisms. Most of the traditional methods for comparing biological sequences were based on sequence alignment. However, sequence alignment considers only local mutations of the genome; it is therefore not suitable for measuring events such as segment rearrangements, which involve longer genomic sequences. For this reason, several alignment-free distance measures have recently been introduced (see [VinAl] for a survey), most of them based on concepts from information theory and data compression (cf. [OS, LV, CV, EMS, BCL]). Such measures are better suited to the problem of whole-genome phylogeny. The intuitive idea is that the more similar two sequences are, the more effective their joint compression is compared with their independent compression. We introduce a new alignment-free method for comparing sequences that, unlike the others, is combinatorial in nature and makes use of neither a compressor nor any information-theoretic notion.
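
Two toy Python fragments around this abstract: the joint-compression intuition behind the earlier compression-based measures cited above, and the Burrows-Wheeler Transform on which the authors' combinatorial, compressor-free measure is instead built:

    import zlib

    def ncd(x, y):
        # Normalized compression distance: ~0 for similar, ~1 for unrelated.
        c = lambda s: len(zlib.compress(s.encode()))
        return (c(x + y) - min(c(x), c(y))) / max(c(x), c(y))

    def bwt(s):
        # Burrows-Wheeler Transform via sorted rotations ('$' as sentinel).
        s += "$"
        return "".join(r[-1] for r in sorted(s[i:] + s[:i] for i in range(len(s))))

    print(bwt("banana"))             # annb$aa
    a, b, c_ = "ACGT" * 200, "ACGT" * 190 + "TT", "TGCATTAC" * 100
    print(ncd(a, b), ncd(a, c_))     # expect the near-copy pair to score much lower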

7. Leo P, Marinelli C, Pappadà G, Scioscia G, Zanchetta L
BioWBI: an Integrated Tool for building and executing Bioinformatic Analysis Workflows
Meeting: BITS 2004 - Year: 2004
Topic: Computer algorithms and applications

Abstract: Building integrated bioinformatics platforms is one of the most challenging tasks the bioinformatics community has been dealing with in recent years [1-2]. This task raises a number of specific problems connected to data integration and the integration of specialized tools and algorithms. The solution described in this paper addresses this challenge. It is characterized by two original assumptions: 1) a quite sharp division between the data realm of a bioinformatics analysis and its components in terms of algorithms and processes, and 2) the conception of a rigorous algebra that allows researchers to formalize their analyses as workflows of atomic processes. As a result of this approach, two bioinformatics web tools, BioWBI and WEE, have been designed and prototyped by our group to provide researchers with a virtual collaborative workspace in which they can define their data sources and graphically draw, as well as execute, analysis workflows. These tools constitute the basic components of a much more general bioinformatics e-workplace.
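
A minimal Python sketch of the workflow-algebra idea: atomic analysis steps as functions, composed with sequential and parallel combinators (hypothetical steps for illustration; not BioWBI/WEE's actual algebra):

    def seq(*steps):
        # Sequential composition: feed each step's output to the next.
        def run(data):
            for step in steps:
                data = step(data)
            return data
        return run

    def par(*steps):
        # Parallel composition: apply all steps to the same input.
        return lambda data: tuple(step(data) for step in steps)

    # Hypothetical atomic steps of a tiny sequence-analysis workflow:
    fetch = lambda acc: ">seq " + acc + "\nACGTACGT"    # stand-in data source
    clean = lambda fasta: fasta.split("\n")[1]
    gc = lambda s: (s.count("G") + s.count("C")) / len(s)
    length = len

    workflow = seq(fetch, clean, par(gc, length))
    print(workflow("X123"))                             # (0.5, 8)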

8. Merelli E, Romano P, Scortichini L
A Workflow Service for Biomedical Application
Meeting: BITS 2004 - Year: 2004
Topic: Computer algorithms and applications

Abstract: The proposed work has been developed in the context of the O2I (Oncology over Internet) project. O2I aims to develop and prototype an integrated platform to support biomedical and clinical research in retrieving both structured and textual information from the Internet and integrating it in a standard format. Usually, biomedical researchers interact step by step with the Web to query, select and integrate information; in their daily work, bioscientists would benefit from a powerful tool able to execute queries consisting of several interrelated activities. In this scenario, the biomedical research process can be formulated as a workflow of activities whose execution must be supported by a suitable middleware. We propose a workflow service agent to support bioscientists in creating their own workflows and in monitoring their execution. In particular, in the O2I context, we are experimenting with BioAgent, an agent-based middleware developed at Camerino University; the middleware can be configured by plugging in agent services to support tool/service integration for a specific domain.

9. Mishra B, Policriti A
Systems Biology, Automata, and Languages
Meeting: BITS 2004 - Year: 2004
Topic: Computer algorithms and applications

Abstract: The central theme of our work is related to the problem of formulating a “unitary step” that defines how a complex biological system makes a transition from one “state” or one “control mode” to another, as well as the conditions under which such transitions are enabled. This is because we recognize that automata (either discrete or hybrid, i.e., capable of modeling a mixed discrete/continuous behaviour), based on the formulation of these unitary steps, can elegantly model biological control mechanisms, allow us to reason about such mechanisms in a modal logic system with modes constructed over a next-time operator, and can become the foundational framework for the emerging field of systems biology. These models can lead to more rigorous algorithmic analysis of large amounts of biological data, produced as (numerical) traces of in vivo, in vitro and in silico experiments, currently a central activity for many biologists and biochemists. Since modeling biological systems requires a careful consideration of both qualitative and quantitative aspects, our automata-based tools can effectively assist working biologists in making predictions, generating falsifiable hypotheses and designing well-focused experiments, activities in which the time dimension and a properly designed query language cannot be left out of consideration. Thus, ultimately, the aim of our work is to elucidate the role played by automata in modeling biological systems and to investigate the potential of such tools when combined with more “classical” approaches used in the past to devise models and experiments in biology. Our discussion here is based primarily on our experience with a novel system that we introduced recently (called XS-systems) and used to implement algorithms and software tools (Simpathica). These conceptual tools have been integrated with prototype implementations and are currently undergoing a growing set of enhancements and optimizations.
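
A toy Python sketch of the “unitary step” idea: a discrete automaton whose transitions between control modes fire only when their guard over the system state is enabled (a hypothetical two-mode gene switch, not XS-systems or Simpathica themselves):

    def step(mode, x, transitions):
        for src, guard, dst in transitions:
            if src == mode and guard(x):
                return dst                     # enabled transition fires
        return mode                            # otherwise stay in the mode

    # Toy switch: expression level x grows in mode 'on', decays in 'off'.
    transitions = [
        ("on", lambda x: x >= 1.0, "off"),     # saturation disables transcription
        ("off", lambda x: x <= 0.1, "on"),     # depletion re-enables it
    ]

    mode, x, trace = "on", 0.1, []
    for _ in range(12):
        x = x * 1.5 if mode == "on" else x * 0.5    # within-mode dynamics
        mode = step(mode, x, transitions)           # the unitary step
        trace.append((mode, round(x, 3)))
    print(trace)                                    # oscillates between the modes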

10. Bultrini E, Pizzi E
Linguistic analysis of promoter regions in eukaryotic genomes
Meeting: BITS 2004 - Year: 2004
Topic: Computer algorithms and applications

Abstract: Promoter recognition is one of the most difficult tasks in annotating eukaryotic genomes. Binding sites for transcription factors are very short sequences (5-15 bp) and not very well preserved in sequence. In addition, other signals can be associated with a regulatory region. For instance, in vertebrates some classes of promoters are associated with compositionally characterised regions (CpG islands), and there is also evidence that the molecular conformation of human promoters is involved in transcription activity [1, 2]. Following a previous investigation [3, 4], in the present work we propose a new procedure, based on well-established statistical methods, to extract a set of oligonucleotides specifically characterising intron sequences. Partitioning genomic sequences according to their accordance with the extracted “introns’ vocabulary” reveals that intergenic DNA appears as a patchwork of different elements. The majority of them adopt the “introns’ vocabulary”, whereas some others (a small percentage) do not. We hypothesise that the identified linguistic property is a sort of “background noise” of a genome; in this perspective, regions that play a functional and/or structural role probably have to emerge from the background by adopting specific compositional properties. The analysis of promoter sequences for the four examined genomes (C. elegans, D. melanogaster, M. musculus, H. sapiens) appears to confirm our hypothesis, as regions immediately surrounding the transcription start site deviate from the introns’ vocabulary usage. Furthermore, analyses of the C+G composition, bendability propensity and torsional rigidity of promoter sequences are presented.
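
A toy Python sketch of the vocabulary-accordance idea: rank oligonucleotides by frequency in intron sequences, then score windows of genomic DNA by how much of their k-mer content falls within that vocabulary (simple counting stands in for the authors' statistical extraction procedure):

    from collections import Counter

    def introns_vocabulary(introns, k=6, top=500):
        counts = Counter(s[i:i + k] for s in introns
                         for i in range(len(s) - k + 1))
        return {w for w, _ in counts.most_common(top)}

    def accordance(window, vocab, k=6):
        words = [window[i:i + k] for i in range(len(window) - k + 1)]
        return sum(w in vocab for w in words) / len(words)

    # Windows around a transcription start site would be expected to score
    # lower (deviating from the introns' vocabulary) than bulk intergenic DNA.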



BITS Meetings' Virtual Library
driven by Librarian 1.3 in a PHP, MySQL™ and Apache environment.

For information, email paolo.dm.romano@gmail.com.