Bioinformatics in the Classroom

Lots of Links

Lists of Links

ABIM

Sullivan

Symposium

Centers

Baylor College of Medicine: BCM, Human Genome Sequencing Center (BCM Sequence Utilities)

BIOSCAN

E. coli Genome Center, University of Wisconsin-Madison.

European Molecular Biology Laboratory: EMBL

European Bioinformatics Institute: EBI

ExPASy

Genome Net WWW server Japan

Harvard Biological Laboratories

Johns Hopkins University

Lawrence Berkeley Laboratory Human Genome Center

Lawrence Livermore National Laboratory's Biology: LLNL

National Center for Biotechnology Information: NCBI

National Center for Genome Resources: NCGR

National Institutes of Health: NIH

NIH GenoBase DataBase GateWay: BIMAS

SEQNET

SRS

Stanford Human Genome Center: SHGC

The Sanger Centre

Weizmann Institute of Science Genome and Bioinformatics

Whitehead Institute for Biomedical Research/MIT Center for Genome Research

Entrez

Databases
Multiple Organisms
GENBANK

EMBL

DDBJ

Inherited Disease Genes Identified by Positional Cloning

The Expressed Gene Anatomy Database: EGAD

Sequence Elements
EPD Eukaryotic promoter database

PROSITE

The CpG Isle database

TRANSFAC Transcription factor database

Genome Organism
Overview of Genome Sequencing Projects Terry Gaasterland

ACEDB (Sanger)

ACEDB C.elegans (AGIS)

BSORF: Bacillus subtilis (Japan and Pasteur Institute, France)

Cyanobase Synechocystis sp., Kazusa DNA Research Institute, Japan

DictyDB--The Soil Amoeba Dictyostelium discoideum

E.coli K-12 Genome Sequence, University of Wisconsin-Madison

E.coli Databank, NIST (Japan)

FlyBase: A Database of the Drosophila Genome

Human Genome DataBase GDB

Human sequences collections, UniGene, NCBI

MBGD - Microbial Genome Database, HGC, Univ. of Tokyo

Mouse Genome Database: MGD

MycDB -- Mycobacteria

Pyrococcus horikoshii OT3 Database, NITE, Japan

PathoGenes--Fungal pathogens of small-grain cereals

Rat Genome DataBase

Saccharomyces Genome Database (Stanford)

The TIGR Microbial Database

The TIGR Arabidopsis thaliana Database

The TIGR Mouse Gene Index: MGI

Organism Classification
The TIGR Microbial Database

The TIGR Arabidopsis thaliana Database

The TIGR Mouse Gene Index: MGI

Genome DataBase GDB

UniGene Human sequences collections

Saccharomyces Genome Database

ACEDB (Sanger)

ACEDB C.elegans (AGIS)

DictyDB--The Soil Amoeba Dictyostelium discoideum

MycDB -- Mycobacteria

PathoGenes--Fungal pathogens of small-grain cereals

The Mouse Genome Database: MGD

The Rat Genome DataBase

FlyBase: A Database of the Drosophila Genome

Training Sequence
Benchmark Sequences from Banbury Cross

Sequence Test Set (Roderic Guigó i Serra)

Bibliography
Libraries
Global Bioinformatics News Service.

Computational Gene Recognition Bibliography.

PubMed Retrieval System Free MEDLINE.

BIO-JOURNALS/bionet.journals.contents Table of Contents Archive and Search.

Search Bio-wURLd for your favourite URL.

Clio server at CSHL Cold Spring Harbor Laboratory Press.

The NIH Grant Database Search

Biologists Search Palette Medline Search, Protein Database search, Nucleotide Database search, dbEST Search

Searchable Gopher Index: Biology-journal-contents

Searchable Gopher Index: Bibliography on sequence analysis

Electronic Journals
Bioinformatics (formerly CABIOS) on Line

Cell on Line.

EMBO J on Line

Gene Combis Computing for Molecular Biology Information Service (TM).

Genome Research on Line

JBC on Line

NATURE on Line

NAR on Line

SCIENCE on Line

Other Journals
Molecular Vision A peer-reviewed journal of Molecular and Cellular Vision Research

Oxford University Press (OUP)

Springer Journals Preview Service (SVJPS)

Academic Press Journals Gopher Menu

Bio/Chemical Journals from Pedro's BioMolecular Research Tools

Courses, Workshops, News
ADVANCED GENOME SEQUENCE & ANALYSIS Course

WWW Virtual Library: Biomolecules

BioMOO Electronic Conference transcript with invited lecturer William R. Pearson.

Human Genome News

Web Servers
Predicting Gene Structure

AAT (Analysis and Annotation Tool for Finding Genes in Genomic Sequences) Michigan (USA)

CDS (Search Coding Regions), Pasteur (France)

EcoParse (mailserver for finding genes in E. coli DNA) Info (Denmark)

Eukaryotic Promoter Prediction by Neural Network LBNL (USA)

FramePlot NIH-NET (Japan)

Gene Finder, (Human, Mouse, Arabidopsis, Fission Yeast) CSHL (USA)

GeneID-3 Search Form IMIM, Barcelona, (Spain)

Genie (gene finder based on Hidden Markov Models) LBNL (USA)

GenLang (tRNA, group I intron, protein gene : "Linguistic Methods") CBIL Pennsylvania (USA)

GenScan (Identification of complete gene structures in genomic DNA) Stanford (USA)

GenView (Protein-Coding Gene Prediction) ITBA (Italy)

GRAIL ORNL Oak Ridge (Exons, repeats, Poly A sites, CpG) (USA)

HCpolya (Hamming Clustering poly-A prediction in Eukaryotic Genes) ITBA (Italy)

HCtata (Hamming-Clustering Method for TATA Signal Prediction in Eukaryotic Genes) ITBA (Italy)

NetGene2 CBS (Denmark)

NetPlantGene V2.0 (neural network prediction of splice sites in Arabidopsis thaliana DNA) CBS (Denmark)

ORF Finder (graphical analysis tool) NCBI (USA)

ORFGene (Gene Structure Prediction using Homologous Proteins) ITBA (Italy)

PredictGenes CBRG Zurich (Switzerland)

Procrustes WWW server Gene Recognition via Spliced Alignment, USC (USA)

Proscan II (predicts putative eukaryotic Pol II promoter sequences) University of Minnesota, (USA)

Putative DNA Sequencing Errors Check EMBL-Bork (Germany)

RecSta (coding region (CDS or exon) prediction program using COA) Lyon (France)

Splice Site Prediction by Neural Network LBNL (USA)

SpliceView (Splice Prediction by using Consensus Sequences) ITBA (Italy)

tRNAscan (Search for transfer RNA genes in genomic sequence) Washington (USA)

Molecular Biological Server Variety of tools; Laboratory of Theoretical Genetics, Novosibirsk, Russia

Blast
Arabidopsis (blastn, tblastn, tblastx, blastp, blastx) Stanford, (blastn, tblastn) St Louis

Aspergillus nidulans (blastn, tblastn, tblastx) Oklahoma

dbEST, NCBI's EST (blastn, tblastn, tblastx) CBIL

C.briggsae (blastn, tblastn) St Louis

C.elegans (blastn, tblastn, tblastx, blastp, blastx) Sanger, (blastn, tblastn) St Louis, EST (blastn, tblastn) NIG

Complete Genomes (blastp), BMERC

Drosophila (blastn, tblastn, tblastx, blastp, blastx) (Berkeley), BDGP, (blastn, tblastn, tblastx, blastp, blastx) EBI

E.coli (blastn) Alces, (blastn, tblastn) ECDC

Fugu rubripes (blastn) HGMP

Human (blastn, blastx) Sanger, (blastn, tblastn) St Louis

HIV (blastn, blastx, tblastn, blastp, tblastx, blast3) IGS, (blastn, blastx, tblastn, blastp) Stanford

Microbial genomes (NCBI) (tblastn, blastn) TIGR

M.leprae (blastn, tblastn, tblastx), Sanger

M.tuberculosis (blastn, tblastn, tblastx), Sanger

Neisseria gonorrhoeae (blastn, tblastn, tblastx), Oklahoma

N.meningitidis (blastn, tblastn, tblastx), Sanger

Parasite (blastn, tblastn, tblastx) EBI

Pseudomonas aeruginosa NCBI (tblastn, blastn, blastx) NCBI

P.falciparum (blastn, tblastn, tblastx), Sanger

S.cerevisiae (blastn, tblastn, tblastx), Sanger, (blastn, tblastn, tblastx, blastp, blastx) Stanford

S.coelicolor (blastn, tblastn, tblastx), Sanger

S.pombe (blastn, tblastn, tblastx), Sanger

Streptococcus pyogenes (blastn, tblastn, tblastx) Oklahoma

Synechocystis sp. (blastp, blastn) Kazusa

Yeast (tblastn, blastp) Alces

Miscellaneous Servers
CENSOR Web Server aligns sequences against a reference collection of human or rodent repeats, Genetic Information Research Institute, Palo Alto, California

Dst Alces internal human repeats (Minnesota)

Dot Plot Alces internal repeats (Minnesota)

Prodom with Blast INRA Toulouse (access by e-mail)

Repeat Masker screens DNA sequences in fasta format against a library of repetitive elements (University of Washington)

Software
Commercial Packages
DNASTAR

GCG: The Wisconsin Package Software

GENQUEST

OMIGA PCGENE compatible (Oxford Molecular Group Product)

BIONET online service (Oxford Molecular Group Product)

GENESEQ A Protein and Nucleic Acid Database (Oxford Molecular Group Product)

Miscellaneous Packages
DELILA (Logos)

EGCG: (Sanger)

GLIMMER: Gene Locator and Interpolated Markov Modeler. (A system for finding genes in microbial DNA)

H2W WWW interface to the GCG Sequence Analysis Software Package

Annotation

  • Gene family experts - these researchers are willing to provide assistance in the annotation of certain genes
  • Exon prediction analysis - how well do commonly used algorithms work?
  • Methods for identification of transposons, solo LTRs, and other genomic repeats
  • Scheme for naming genes encoded on the BAC clones

 

Annotation References and Bibliography

We use compositional and comparative methods in the analysis of genomic DNA sequence. The comparative methods primarily involve searches for homology within publicly available databases and multiple sequence analyses generated locally. Compositional analysis is concerned with the construction of gene models from computer-predicted exons and exon-splice sites.
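To make the compositional side concrete, the sketch below scans the forward strand of a sequence for open reading frames (an ATG followed in frame by a stop codon). This is only the simplest compositional signal, not the exon prediction or gene modelling programs referred to here, and it ignores introns and the reverse strand. The function name find_orfs and the min_codons threshold are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return (start, end, frame) for forward-strand ORFs.

    start is the 0-based index of the ATG, end is one past the stop codon,
    and min_codons counts every codon from ATG through the stop codon.
    Hypothetical helper: a naive scan, not a gene-prediction program.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the frame codon by codon.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 3 + "TAA" + "G"
    print(find_orfs(toy, min_codons=3))
```

On real genomic sequence a scan like this produces many spurious hits, which is why it is normally combined with the comparative (homology) evidence described above.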

Database Searching

Motif searching

Exon prediction and gene modelling programs

tRNA predictions are performed with tRNAscan-SE.

Click here for a detailed discussion of our method for identifying putative autonomous and non-autonomous transposons, solo LTRs, and LINEs.

Short tandem repeats or microsatellites are located with RepeatMasker.
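As a rough illustration of what locating a short tandem repeat means, the sketch below finds perfect repeats of a 1 to 6 bp unit with a regular expression. It is not how RepeatMasker works internally and it misses imperfect repeats. The function name find_tandem_repeats and its thresholds are assumptions chosen for the example.

```python
import re


def find_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=4):
    """Return (start, unit, copies) for perfect short tandem repeats.

    Hypothetical helper: the unit is matched lazily so the shortest
    repeating unit is reported (CA rather than CACA).
    """
    seq = seq.upper()
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    print(find_tandem_repeats("GGCACACACACACATT"))
```

A dedicated tool remains the better choice for annotation, since it also screens against a library of known repetitive elements.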

Local manipulations of the sequence are performed with the GCG package.

Multiple sequence analysis is also performed with CLUSTALW.

Resources for Arabidopsis phenotypes include Arabidopsis Information Management System (AIMS) and Nottingham Arabidopsis Stock Centre (NASC).

The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a good resource for metabolic pathways.

We also utilize NCBI's Entrez to gather relevant information. Various scripts written in-house are also used.

And some good old fun at http://jura.ebi.ac.uk:8765/ext-genequiz/genomes/dm0006/

Mesquite: a modular system for evolutionary analysis; a new tool from the makers of the Tree of Life

Does genome size matter? One place to ponder this question is the Animal Genome Size Database, where a zoology graduate student has been compiling tables of all animal genomes for which he can find data. Look up chromosome numbers and genome sizes ("C-values") for over 2900 vertebrates and invertebrates, from sponges to Homo sapiens. Includes links to similar sites for animal and plant genomes.
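C-values are usually reported as a DNA mass in picograms, so comparing them with sequenced genome lengths needs a unit conversion. The sketch below uses the commonly cited factor of about 0.978 x 10^9 base pairs per picogram. The function names are hypothetical, and the 3.5 pg input is only an illustrative value, not a figure taken from the database.

```python
# Approximate number of base pairs in one picogram of double-stranded DNA.
BP_PER_PICOGRAM = 0.978e9


def c_value_to_base_pairs(c_value_pg):
    """Convert a C-value in picograms to an approximate base-pair count."""
    return c_value_pg * BP_PER_PICOGRAM


def base_pairs_to_c_value(n_bp):
    """Convert a base-pair count to an approximate C-value in picograms."""
    return n_bp / BP_PER_PICOGRAM


if __name__ == "__main__":
    # Illustrative input only.
    print("%.3g bp" % c_value_to_base_pairs(3.5))
```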