via USDA - provides citations to agricultural literature
The broad mission of ChromDB is to display, annotate, and curate sequences of two broad functional classes of biologically important proteins: chromatin-associated proteins (CAPs) and RNA interference-associated proteins. Plant proteins are the major focus of the work, which is supported by the Plant Genome Research Program (PGRP) of the National Science Foundation. Our intent is to produce intensively curated sequence information and make it available to the research and teaching community in support of comparative analyses toward understanding the chromatin proteome in plants, especially in important crop species. Comparative analysis requires including non-plant proteins in the database. Non-plant genes are not curated to the same degree as plant genes; to automate data import, they are taken from the RefSeq database of NCBI. We reason that the inclusion of non-plant model organisms will broaden the relevance and usefulness of ChromDB to the entire chromatin community and will provide a more complete data set for phylogenetic analyses of the evolution of the plant chromatin proteome.
Genomic Diversity and Phenotype Connection (GDPC)
The Genomic Diversity and Phenotype Connection (GDPC) simplifies access to genomic diversity and phenotype data, thereby encouraging reuse of these data. GDPC accomplishes this by retrieving data from one or more data sources and by allowing researchers to analyze the integrated data in a standard format. GDPC provides access to genomic diversity data (such as SNPs, SSRs, and sequences) and to phenotypic data that may be collected in field, genetic, or physiological experiments.
Extensive research over the past two decades has shown a remarkably consistent conservation of gene order within large segments of linkage groups in agriculturally important grasses such as rice, maize, sorghum, barley, oats, wheat, and rye. Grass genomes are substantially colinear with each other at both large and small scales, opening the possibility of using syntenic relationships to rapidly isolate and characterize homologues in maize, wheat, barley and sorghum.
As an information resource, Gramene's purpose is to provide added value to data sets available within the public sector. This helps researchers understand the grass genomes and use genomic sequence known in one species to identify and understand corresponding genes, pathways and phenotypes in other grass species. This is achieved by building automated and curated relationships between cereals for both sequence and biology. The automated and curated relationships are queried and displayed using controlled vocabularies and web-based displays. The controlled vocabularies (ontologies) currently in use include the Gene Ontology, Plant Ontology, Trait Ontology, Environment Ontology and Gramene Taxonomy Ontology. The web-based displays for phenotypes include the Genes and Quantitative Trait Loci (QTL) modules. Sequence-based relationships are displayed in the Genomes module using the genome browser adapted from Ensembl, in the Maps module using the comparative map viewer (CMap) from GMOD, and in the Proteins module displays. BLAST is used to search for similar sequences. Literature supporting all the above data is organized in the Literature database.
MAGI (Maize Assembled Genomic Island)
The MAGI website summarizes some of our investigations of the maize genome.
We have assembled gene-enriched (MF and HC; Whitelaw et al., 2003; Palmer et al., 2003) and random Whole Genome Shotgun (WGS) GSSs (Genome Survey Sequences) of maize and sorghum into MAGIs (Maize Assembled Genomic Islands) and SAMIs (Sorghum Assembled genoMic Islands), respectively. Based on computational and biological quality assessments, it appears that a very high percentage of genic MAGIs and SAMIs accurately reflect the structures of the maize (Fu et al., 2005) and sorghum genomes. We have similarly assembled maize ESTs into MECs (maize expressed contigs).
It is possible to BLAST MAGIs, 454-ESTs, MECs, SAMIs and the 16,819 B73 maize BACs that as of 10/09/2009 have been at least partially sequenced by the maize genome sequencing project (DBI-0527192; Rick Wilson, PI). MAGIs have been annotated via sequence similarity, repeats, alignments to Sanger and 454 ESTs, and an ab initio gene prediction tool. A repeat masker is available to facilitate primer design and annotation.
Our latest genetic map, ISU_IBM Map7, contains ~6,000 genic markers integrated with ~4,000 additional markers from other projects. This map has been fully integrated into the MAGI website. It is possible to BLAST sequences against the ~6,000 sequence-defined, genic, genetic markers generated by the ISU maize mapping project.
Maize Full-Length cDNA Project
This project will span three years and involve two academic institutions: the University of Arizona and Stanford. The overall goal is to sequence 30,000 FLcDNA clones from two cDNA libraries of varied tissues and stress treatments. Both clone libraries use the maize inbred B73 background, the same inbred line being used for full genome sequencing. Specifically, the supporting aims of this project are:
- Sequence 5' and 3' ESTs from 130,000 random cDNA clones in library #1.
- Sequence 5' and 3' ESTs from 50,000 random cDNA clones in library #2.
- Select ~30,000 unique clones with both a 5' and 3' EST for full-length sequencing from libraries #1 and #2.
- Annotate the expression of these FLcDNAs using microarray hybridizations, locate FLcDNAs on the physical map of maize chromosomes, and display results using a web-based genome browser.
- Distribute clones and amplified FLcDNA libraries to the research community.
- Involve high school teachers and undergraduates in genomics projects and analysis; develop classroom exercises using maize genomics resources.
MaizeGDB is a community-oriented, long-term, federally funded informatics service to researchers focused on the crop plant and model organism Zea mays.
The MIPS plant genomics group focuses on the analysis of plant genomes, using bioinformatic techniques. To store and manage the data, we developed a database, PlantsDB, that aims to provide a data and information resource for individual plant species. In addition PlantsDB provides a platform for integrative and comparative plant genome research. Currently PlantsDB provides the following databases:
- The Triticeae genome project
- The maize genome database (MGSP)
- The rice genome database (MOsDB)
- The sorghum genome database
- The brachypodium genome database
- The MIPS Arabidopsis thaliana genome database
- The Medicago truncatula genome database
- The Lotus japonicus genome database
- The tomato genome database
- mips Repeat Element database (mips-REdat) and mips Repeat Element catalog (mips-REcat)
Oryza Map Alignment Project (OMAP)
The golden path to unlocking the genetic potential of wild rice species.
The long-term goal of this project is to develop an experimentally tractable and closed model system to globally unravel and understand the evolution, physiology and biochemistry of the genus Oryza.
The specific objectives of this proposal are to:
- Construct DNA fingerprint/BAC-end sequence physical maps from 11 deep coverage BAC libraries that represent the 11 wild genomes of Oryza (830,000 fingerprints; 1,659,000 BAC ends).
- Align the 11 physical maps with the sequenced reference subspecies japonica and indica.
- Construct high-resolution physical maps of rice chromosomes 1, 3 and 10 across the 11 wild genomes using a combination of hybridization and in silico anchoring strategies.
- Provide convenient bioinformatics research and educational tools (FPC and web-based) to rapidly access and understand the collective Oryza genome.
Panzea is the bioinformatics arm of a project investigating the Genetic Architecture of Maize and Teosinte (NSF 0820619). The project is funded by the National Science Foundation.
The project is describing the genetic architecture of complex traits in maize and teosinte. We will identify genes that control domestication traits and three key agronomic traits: flowering time, plant height, and kernel quality. We will characterize allelic series at these genes, examine their epistatic and environmental interactions, and take a step toward the ultimate goal of predicting phenotype from genotype. The genetic, germplasm, and bioinformatic resources created by this project will help maize researchers worldwide to discover the genetic basis of any trait of interest.
The Panzea website provides access to the project database and bioinformatics module.
Rice Annotation Project Database
The Rice Annotation Project (RAP) was conceptualized upon the completion of the rice genome sequencing in 2004 with the aim of providing the scientific community with an accurate and timely annotation of the rice genome sequence. One of the major activities of RAP is to hold jamboree-style annotation meetings on a regular basis to facilitate the manual curation of all gene structures and functions in rice. The overall objective also includes a comprehensive analysis of the sequence based on the annotation results, and the construction of a public database.
The Arabidopsis Information Resource (TAIR) collects information and maintains a database of genetic and molecular biology data for Arabidopsis thaliana, a widely used model plant.
TAIR is located at the Carnegie Institution for Science Department of Plant Biology, Stanford, California.
Wheat SNP Database
The primary goal of the project is to discover and map single nucleotide polymorphisms in tetraploid and hexaploid wheat and to develop appropriate bioinformatic tools for public access to this resource. The secondary goal is to employ this resource in a preliminary characterization of the genetic structure of the gene pools of tetraploid and hexaploid wheat and of wheat's diploid ancestors.
The Wheat SNP Database is now available. This database includes conserved and genome-specific PCR primers for amplification of STSs from genomic DNA of wheat and its diploid and tetraploid ancestors, DNA sequences, sequence annotations, intron/exon predictions, electropherograms, haplotypes, SNPs, and positions of SNP markers on wheat linkage maps.