BITS Meetings' Virtual Library:
Abstracts from Italian Bioinformatics Meetings from 1999 to 2013


766 abstracts overall from 11 distinct proceedings





1. Guigò R
Finding genes by comparing genomes: the case of selenoproteins
Meeting: BITS 2004 - Year: 2004
Topic: Comparative genomics

Abstract: Although the genome sequence and gene content are available for an increasing number of organisms, eukaryotic selenoproteins remain poorly characterized. In these proteins, selenium (Se) is incorporated in the form of selenocysteine (Sec), the 21st amino acid. Selenocysteine is cotranslationally inserted in response to UGA codons (a stop signal in the canonical genetic code). The alternative decoding is mediated by a stem-loop structure in the 3'UTR of selenoprotein mRNAs (the SECIS element). Selenium is implicated in male infertility, cancer and heart diseases, viral expression and ageing. In addition, most selenoproteins have homologues in which Sec is replaced by cysteine (Cys). Genome biologists rely on the high-quality annotation of genomes to bridge the gap from the sequence to the biology of the organism. However, for selenoproteins, which mediate the biological functions of selenium, the dual role of the UGA codon confounds both the automatic annotation pipelines and the human curators. As a consequence, selenoproteins are misannotated in the majority of genome projects. Furthermore, finding novel selenoprotein families in newly released genome sequences remains a difficult task. In the last few years, we have contributed to the exhaustive description of the eukaryotic selenoproteome (the set of eukaryotic selenoproteins) through the development of a number of ad hoc computational tools. Our approach is based on the capacity to predict SECIS elements, standard genes and genes with an in-frame UGA codon in one or multiple genomes. Comparative analysis plays an essential role because 1) SECIS sequences are conserved between close species (e.g. human-mouse); and 2) sequence conservation across a UGA codon between genomes at a greater phylogenetic distance strongly suggests a coding function (e.g. human-fugu). Our analysis of the fly, human and fugu genomes has resulted in 8 novel selenoprotein families.
As a result, 19 distinct selenoprotein families have been described in eukaryotes to date. Most of these families are widely (but not uniformly) distributed across eukaryotes, either as true selenoproteins or Cys-homologues. The recent completion of the Tetraodon nigroviridis and Fugu rubripes genomes has allowed us to investigate the eukaryotic selenoproteome in a restricted and largely unexplored window within the vertebrate phylogeny. Our investigation has resulted in the identification of a novel selenoprotein family, currently under study, which appears to be restricted to actinopterygians among vertebrates. The correct annotation of selenoproteins is thus providing insight into the evolution of the usage of Sec. Our data indicate a discrete evolutionary distribution of selenoproteins in eukaryotes and suggest that, contrary to the prevalent thinking of an increase in the number of selenoproteins from less to more complex genomes, Sec-containing proteins scatter all along the complexity scale. We believe that the particular distribution of each family is mediated by an ongoing process of Sec/Cys interconversion, in which contingent events could play a role as important as functional constraints. The characterization of eukaryotic selenoproteins illustrates some of the most important challenges involved in completing the gene annotation of genomes. Notable among them are the increasing number of exceptions to our standard theory of the eukaryotic gene and the need to sequence genomes at different evolutionary distances to achieve such a complete annotation.
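
The two signals named in this abstract, an in-frame UGA codon and sequence conservation across it, can be illustrated with a minimal Python sketch. It is not the authors' tool set: the function names, the assumption of gapless pre-aligned input sequences, and the use of plain nucleotide identity as a conservation measure are all hypothetical choices made for illustration, and SECIS prediction is not covered.

```python
# Illustrative sketch only; names and scoring are assumptions, not the authors' tools.

def inframe_uga_positions(mrna, start=0):
    """Return 0-based offsets of in-frame UGA codons, reading codon by codon
    from `start` until a UAA/UAG stop codon or the end of the sequence."""
    mrna = mrna.upper().replace("T", "U")
    hits = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA":
            hits.append(i)  # candidate Sec codon, or a genuine stop
        elif codon in ("UAA", "UAG"):
            break  # unambiguous stop: end of the reading frame
    return hits


def flank_identity(seq_a, seq_b, pos, flank=9):
    """Fraction of identical nucleotides upstream and downstream of the codon
    at `pos`, for two sequences assumed to be aligned without gaps."""
    def identity(x, y):
        n = min(len(x), len(y))
        return sum(a == b for a, b in zip(x, y)) / n if n else 0.0

    lo = max(0, pos - flank)
    upstream = identity(seq_a[lo:pos], seq_b[lo:pos])
    downstream = identity(seq_a[pos + 3:pos + 3 + flank],
                          seq_b[pos + 3:pos + 3 + flank])
    return upstream, downstream
```

Under these assumptions, a UGA reported by `inframe_uga_positions` whose downstream flank remains highly similar between distant genomes would be a candidate Sec codon rather than a stop; a real analysis would need proper alignments and a statistical threshold.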

2. Corà D, Herrmann C, Dieterich C, Di Cunto F, Provero P, Caselle M
Identification of human transcription factor binding sites by comparative genomics.
Meeting: BITS 2004 - Year: 2004
Topic: Comparative genomics

Abstract: Understanding transcriptional regulation of gene expression is one of the greatest challenges of modern molecular biology. A central role in this mechanism is played by transcription factors (TFs), which typically bind to specific, short DNA sequence motifs that are usually located in the upstream region of the regulated genes. We discuss here a simple and powerful approach for the identification of these cis-regulatory motifs based on human-mouse genomic comparison. Using the catalogue of conserved upstream sequences collected in the CORG database [1], we construct sets of genes sharing the same overrepresented motif in their upstream regions in both human and mouse. We perform this construction for all possible words from 5 to 8 nucleotides in length and then filter the resulting sets, looking for two types of evidence for coregulation: first, we analyse the Gene Ontology annotation of the genes in the set, looking for statistically significant common annotation; second, we analyse the expression profiles of the genes in the set as measured by microarray experiments, looking for evidence of coexpression. The sets which pass one or both of these filters are conjectured to contain a significant fraction of coregulated genes, and the upstream motifs characterizing the sets are thus good candidates to be the binding sites of the TFs involved in such regulation. In this way we find various known motifs (which we use to validate our approach) and also some new candidate binding sites.
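
The set-construction step described in this abstract can be sketched in Python as follows. The abstract does not say which overrepresentation statistic was used, so the binomial upper-tail test, the `bg_freq` background frequency, the `alpha` cut-off and all function names below are assumptions made for illustration only; the Gene Ontology and coexpression filters are not shown.

```python
from itertools import product
from math import comb


def all_words(kmin=5, kmax=8):
    """Yield every DNA word with length between kmin and kmax inclusive."""
    for k in range(kmin, kmax + 1):
        for letters in product("ACGT", repeat=k):
            yield "".join(letters)


def count_word(seq, word):
    """Count occurrences of word in seq, overlapping ones included."""
    k = len(word)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == word)


def binom_upper_tail(n, c, p):
    """P(X >= c) for X ~ Binomial(n, p), computed as 1 - P(X < c)."""
    below = sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(c))
    return max(0.0, 1.0 - below)


def genes_sharing_word(word, human, mouse, bg_freq, alpha=0.01):
    """Genes whose upstream region is enriched for `word` in BOTH species.

    human, mouse: dicts mapping a gene identifier to its upstream sequence
    (orthologs are assumed to share the same key).
    bg_freq: assumed per-position background frequency of the word.
    """
    selected = set()
    for gene in human.keys() & mouse.keys():
        enriched_in_both = True
        for seq in (human[gene], mouse[gene]):
            n = len(seq) - len(word) + 1  # number of word start positions
            c = count_word(seq, word)
            if n <= 0 or c == 0 or binom_upper_tail(n, c, bg_freq) >= alpha:
                enriched_in_both = False
                break
        if enriched_in_both:
            selected.add(gene)
    return selected
```

Looping `genes_sharing_word` over `all_words()` would give one candidate gene set per word; a uniform background such as `bg_freq = 0.25 ** len(word)` is the simplest possible choice and is unlikely to be adequate for real promoter sequences.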

3. Ambesi-Impiombato A, Di Bernardo D
Novel Computational Method for Human Cis Regulatory Elements Prediction
Meeting: BITS 2004 - Year: 2004
Topic: Comparative genomics

Abstract: Introduction: Biological mechanisms underlying the regulation of gene expression are not completely understood. It is known that they involve binding of transcription factors to regulatory elements on gene promoters. However, attempts to computationally predict such elements in DNA sequences of gene promoters typically yield an excess of false positives. Computational identification of cis-regulatory elements (CREs) is currently based mainly on three different approaches: (1) identification of conserved motifs using interspecies sequence global alignments (Pennacchio 2001); (2) identification of conserved motifs in the promoters of coregulated genes (Hughes et al 2000, Sudarsanam et al 2002, Bussemaker et al 2001, Eskin et al 2002, Bailey et al 1994, Fujibuchi et al 2001, Palin et al 2002); (3) computational detection of known experimentally identified motifs in genes’ promoters for which binding factors are unknown (Kel et al 2003). The limitations of the first approach are caused by the high mutation, deletion and insertion rates in gene promoter regions (Ludwig 2002), which prevent a correct alignment of the promoter region. As experimental data accumulate on known DNA binding elements, an increasing amount of information can be used to search for similar elements in genes for which transcription factors are unknown. Our approach searches for consensus patterns of known regulatory elements in the 5 kb upstream of the gene transcription start site and compares the matches against a background word distribution simulated by shuffling the symbols in each consensus. The aim is to minimize false positives by using a background model of random matches of experimentally determined consensus sequences and by integrating information from the promoters of orthologous genes.
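
The idea of scoring a consensus match against a background obtained by shuffling the consensus symbols can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the IUPAC-to-regex translation, the empirical p-value, the number of shuffles and every function name are hypothetical, and the integration of orthologous promoters is left out.

```python
import random
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def count_matches(consensus, seq):
    """Count positions in seq where the IUPAC consensus matches.
    A lookahead is used so that overlapping matches are all counted."""
    body = "".join(IUPAC[symbol] for symbol in consensus)
    return sum(1 for _ in re.finditer("(?=" + body + ")", seq))


def shuffled_background(consensus, seq, n_shuffles=200, seed=0):
    """Match counts obtained with randomly shuffled versions of the consensus."""
    rng = random.Random(seed)
    symbols = list(consensus)
    counts = []
    for _ in range(n_shuffles):
        rng.shuffle(symbols)
        counts.append(count_matches("".join(symbols), seq))
    return counts


def empirical_p(consensus, seq, n_shuffles=200, seed=0):
    """Observed match count and the fraction of shuffles matching at least as often."""
    observed = count_matches(consensus, seq)
    background = shuffled_background(consensus, seq, n_shuffles, seed)
    at_least = sum(1 for c in background if c >= observed)
    return observed, (at_least + 1) / (n_shuffles + 1)
```

Shuffling keeps the base composition and degeneracy of the consensus while destroying its order, so the background reflects how often a pattern of comparable information content matches the promoter by chance.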

4. Sironi M, Riva L, Menozzi G, Pozzoli U
Silencer elements as possible inhibitors of pseudoexon splicing
Meeting: BITS 2004 - Year: 2004
Topic: Comparative genomics

Abstract: Introduction: Production of functional mRNAs in eukaryotic organisms is critically dependent upon the accuracy of pre-mRNA splicing. The presence of well-defined cis-elements, namely the 5’ and 3’ splice sites and the branch point, is necessary but not sufficient to define intron-exon boundaries (1). Sequences within exon bodies have a prominent role in promoting exon definition; the best understood exonic elements are exonic splicing enhancers (ESEs), which are binding sites for SR proteins (2). Sequences that act as exonic splicing silencers (ESSs) have also been described but are less well characterized than ESEs. It has been reported that pseudoexons (i.e. intronic sequences displaying good 3’ and 5’ splice sites) outnumber real exons by an order of magnitude (3). Recent observations (4, 5) suggest that a subpopulation of pseudoexons might exist in the human genome requiring only subtle changes to become splicing competent. Here we have applied a biocomputational approach to address the question of why pseudoexons are ignored and to identify putative splicing repressor elements.

5. Pavesi G, Mauri G, Pesole G
Weeder Web: a Web-Based Tool for the Discovery of Transcription Factor Binding Sites
Meeting: BITS 2004 - Year: 2004
Topic: Comparative genomics

Abstract: Understanding the complex mechanisms governing basic biological processes requires the characterization of regulatory motifs modulating gene expression at the transcriptional and post-transcriptional levels. In particular, the extent, chronology and cell-specificity of transcription are modulated by the interaction of transcription factors (TFs) with their corresponding binding sites (TFBSs), located in the promoter regions of the genes. The ever growing amount of genomic data, complemented by other sources of information concerning gene expression, opens new opportunities for researchers. Transcription factor binding sites are generally short (less than 12-14 bp long) and degenerate oligonucleotides, which makes their computational discovery and large-scale annotation significantly harder. Hence the need for efficient and reliable methods for detecting novel motifs that are significantly over-represented in the regulatory regions of sets of genes sharing common properties (e.g. similar expression profile, biological function, product cellular localization, etc.), and that in turn could represent binding sites for some common TF regulating the genes. We present here a Web server that provides access to a previously developed enumerative pattern discovery method [1] that is able to carry out an exhaustive search of significantly conserved degenerate oligonucleotide patterns with remarkable computational efficiency. The interface has also been designed to avoid the explicit definition of a large number of parameters that were included in the original general-case implementation of the algorithm, and to produce a simpler “user-friendly” output. The parameters have been set to default values suitable for capturing TFBSs. The interface Web address is: http://www.pesolelab.it:8080/weederWeb
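
To make the notion of an exhaustive search for degenerate oligonucleotide patterns concrete, the Python sketch below enumerates every word of a given length and counts the input sequences that contain it with a bounded number of mismatches. This is a deliberately naive illustration and not the method of [1] or the Web server's code: it has none of that method's computational efficiency or significance scoring, and the function names and the ranking by sequence count are assumptions.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))


def sequences_with_motif(motif, seqs, max_mismatches):
    """Count the sequences containing at least one approximate occurrence
    of motif, i.e. a window within max_mismatches substitutions of it."""
    k = len(motif)
    return sum(
        1
        for s in seqs
        if any(hamming(motif, s[i:i + k]) <= max_mismatches
               for i in range(len(s) - k + 1))
    )


def best_motifs(seqs, k, max_mismatches, top=5):
    """Enumerate all 4**k words and rank them by how many sequences they hit.
    Ties are broken alphabetically so the output is deterministic."""
    scored = []
    for letters in product("ACGT", repeat=k):
        motif = "".join(letters)
        scored.append((sequences_with_motif(motif, seqs, max_mismatches), motif))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top]
```

The cost grows as 4**k times the total input length, which is why brute-force enumeration of this kind is only practical for very short motifs and small inputs.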



BITS Meetings' Virtual Library
driven by Librarian 1.3 in a PHP, MySQL and Apache environment.

For information, email paolo.dm.romano@gmail.com.