TransPLANT: Trans-national Infrastructure for Plant Genomic Science

Description

Food and energy security are major challenges facing humanity in the coming decades. The falling costs of nucleotide sequencing are opening up significant opportunities for crop improvement through plant breeding and increased understanding of plant biology, in particular through interpreting the growing volume of plant genomics data in the context of phenotype. However, at present, there is no adequate infrastructure for plant genomic data. transPLANT will develop a new infrastructure for this data, leveraging the experience of medical informatics while addressing the particular challenges and opportunities of plant genomics.

Compared with vertebrate genomes, plant genomes may be large and have complex evolutionary histories, which makes their analysis a hard problem, both in terms of theory and in terms of the compute resources required for data storage and analysis. Issues include genome size, polyploidy, and the quantity, diversity and dispersed nature of the data in need of integration.

To address these problems, transPLANT developed distributed solutions, exploiting the expertise of the project partners in particular species and problems to provide a seamless set of computational and interactive services to the plant research community. These services were built on the outputs of RTD activities designed to build new repositories and develop new algorithms, and with input from the plant science and related communities gathered through extensive networking activities. A series of training workshops educated the community in the use of transPLANT tools and data.

transPLANT was built on standard technologies for data exchange and representation, service provision, virtual compute infrastructure, and interface development; where such standards were lacking (as in phenotype description), they were developed in the context of the project.

Funding