"The source for European pharmaceutical biotechnology news..."
26 May 2011

Even plants catch viruses – rapid detection of invading pathogens using next-generation sequencing technologies

By GenomeQuest

GenomeQuest | www.genomequest.com


In 2007, researchers from a prominent academic institution were facing a new type of disease that they suspected was caused by a previously uncharacterized RNA virus. The virus had not been identified among the host of other viral and bacterial infections that are present in healthy plants.

Viral infections in plants represent a severe problem in agriculture. The US Department of Agriculture estimates that 30-50% of crops are lost to disease pathogens in climates conducive to disease development. Viral infections in plants can drastically lower output or put the entire crop at risk. Since treating a viral disease is usually not practical, the grower focuses instead on screening out sick plants and maintaining a population of healthy plants that can be used to replace them.

Virus hunters, however, face the difficult task of keeping their molecular biology screens up to date while new viruses appear almost daily. Current screens all work by detecting a specific viral genome sequence, so they cannot identify new viruses whose genome sequence is unknown.

Challenge

When the academic researchers suspected a previously uncharacterized virus, they decided to try a new approach. Employing a Roche-454 next-generation DNA sequencer, they sampled and sequenced the overall RNA content of both the sick plant cells and a set of healthy controls.

Together, the two sequencing runs on the 454 sequencer generated an overwhelming number of sequence reads. These reads represented all RNA inside the cells of both the healthy and the sick plants, including both self-expressed RNA transcripts and RNA originating from any bacteria or virus inside the cells.

Filtering the enormous volumes of data to identify a previously uncharacterized virus was their major challenge. Each of the sequences needed to be compared against all of the public sequence databases. Those sequences that could not be associated with any known organism could then become suspect transcripts for the virus of interest. Unfortunately, the search would create tens of millions of alignments and take weeks on a typical computer, making it an essentially intractable problem for the academic researchers.
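The filtering idea described above can be sketched in a few lines. This is only an illustration, not the researchers' or GenomeQuest's actual pipeline: it assumes the alignment results are available as a tab-separated table whose first column is the read identifier, and the function name `reads_without_hits` and the toy identifiers are hypothetical.

```python
import csv
import io


def reads_without_hits(all_read_ids, hits_tsv):
    """Return the read IDs that have no alignment in a tabular hit file.

    Assumption: `hits_tsv` is a text stream of tab-separated rows whose
    first column is the read (query) identifier.
    """
    matched = {row[0] for row in csv.reader(hits_tsv, delimiter="\t") if row}
    # Keep the original read order while dropping every matched read.
    return [read_id for read_id in all_read_ids if read_id not in matched]


# Hypothetical toy data: four reads, two of which align to known sequences.
reads = ["read1", "read2", "read3", "read4"]
hits = io.StringIO("read1\thost_gene_A\t99.1\nread3\tknown_virus_X\t97.5\n")
suspects = reads_without_hits(reads, hits)
print(suspects)
```

The filter itself is trivial; the expensive part is producing the hit table, which is where the tens of millions of alignments come from.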

Considerations

The academic researchers were aware of the enormity of the informatics challenge. They needed a solution that would let them survey the entire set of reads from their two samples to find the suspect virus, and they did not have access to a sufficiently large informatics resource to solve this problem on their own. In addition, the researchers recognized that being the first team to identify the virus would be a significant achievement for the university, and they looked forward to publishing their results in a peer-reviewed journal.

454 Life Sciences saw the scale of the informatics challenge facing the academic research team and offered suggestions for a solution. 454 was aware that this task was well within the capabilities of the GenomeQuest team. After an introduction to the academic research team, the GenomeQuest team began working with them to define a workflow that would let the researchers identify the suspect virus accurately and in a timely manner.

GenomeQuest

In just a matter of days, GenomeQuest was able to set up and execute a comparison of every sequence against all of the public databases. GenomeQuest accomplished this using its proprietary algorithms, compute cloud, team of bioinformatics scientists, and prior experience in analyzing next-generation high-throughput sequencing data. The search resulted in a huge number of alignments, many of which were to repetitive genomic elements, and the vast majority of which were indeed genetic sequence from the host plant.

However, GenomeQuest found that the remaining sequences had some level of alignment to known types of publicly characterized viruses. An additional group of sequences was simply unannotated: these did not align to anything in the known public data. The challenge was now to separate the known viruses from the possible new, but similar, viruses. Using its proprietary alignment algorithms, GenomeQuest was able to select a very small number of reads that were similar, but not identical, to known viruses. GenomeQuest then assembled those reads together with the unannotated ones. This process resulted in several contiguous regions of sequence, or contigs. These contigs represent the partially assembled putative genomic sequence of the new virus afflicting the plant.
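The selection step can be pictured as sorting reads into bins by their best alignment. GenomeQuest's algorithms are proprietary and not described in the article, so the following is a stand-in sketch only: the `bin_reads` and `assembly_candidates` functions, the category labels, and the 99% identity cut-off for "identical" are all illustrative assumptions.

```python
from collections import defaultdict


def bin_reads(best_hits, all_read_ids, identical_at=99.0):
    """Sort reads into host, known-virus, similar-virus and unannotated bins.

    Assumptions: `best_hits` maps read ID -> (category, percent_identity),
    where category is "host" or "virus"; `identical_at` is an illustrative
    threshold, not a value taken from the article.
    """
    bins = defaultdict(list)
    for read_id in all_read_ids:
        if read_id not in best_hits:
            bins["unannotated"].append(read_id)
            continue
        category, identity = best_hits[read_id]
        if category == "host":
            bins["host"].append(read_id)
        elif identity >= identical_at:
            bins["known_virus"].append(read_id)
        else:
            bins["similar_virus"].append(read_id)
    return dict(bins)


def assembly_candidates(bins):
    """Reads handed to assembly: near-matches to viruses plus unannotated."""
    return bins.get("similar_virus", []) + bins.get("unannotated", [])


# Hypothetical toy data.
read_ids = ["r1", "r2", "r3", "r4", "r5"]
best = {
    "r1": ("host", 100.0),
    "r2": ("virus", 99.8),   # effectively a known virus
    "r3": ("virus", 82.0),   # similar, but not identical
    "r5": ("host", 97.0),
}
binned = bin_reads(best, read_ids)
candidates = assembly_candidates(binned)
print(candidates)
```

Only the candidate reads would then go to an assembler to build contigs; the host and known-virus bins are set aside.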

Back to the Lab

Ultimately, combining their extensive experience in virology with GenomeQuest’s powerful analysis platform and experienced bioinformaticists, the academic researchers identified relevant contigs, from which they designed PCR primers and amplified the entire genome of the new virus.

To further prove that this virus was the cause of the disease, they applied the same process to the healthy plants and showed that the virus was not present in healthy plant tissue. In the end, a combination of virology expertise and sophisticated computational tools led to the discovery of a new virus.
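The article does not say exactly how the healthy-plant check was carried out. A purely computational analogue, offered here only as an assumption-laden sketch, is to count how many reads in each sample share a short subsequence (k-mer) with a candidate contig. The function names, the value of `k`, and the toy sequences are hypothetical, and the sketch ignores reverse-complement matches and sequencing errors.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def count_supporting_reads(contig, reads, k=8):
    """Count reads that share at least one exact k-mer with the contig."""
    contig_kmers = kmers(contig, k)
    return sum(1 for read in reads if kmers(read, k) & contig_kmers)


# Hypothetical toy data: one sick-sample read overlaps the contig,
# and no healthy-sample read does.
contig = "ATGCGTACGTTAGCCGATAGGCTA"
sick_reads = ["GCGTACGTTAGC", "TTTTTTTTTTTT"]
healthy_reads = ["CCCCCCCCCCCC", "GGGGAAAACCCC"]

sick_support = count_supporting_reads(contig, sick_reads)
healthy_support = count_supporting_reads(contig, healthy_reads)
print(sick_support, healthy_support)
```

A contig supported by reads from the sick sample but by none from the healthy control is consistent with a pathogen present only in diseased tissue, although laboratory confirmation is still needed.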

Conclusion

The GenomeQuest software platform was instrumental in turning a time-prohibitive computational search and analysis problem into an efficient, streamlined process that met both the client’s information needs and its time frame. GenomeQuest and the academic research team continue to work together on projects to advance research and quicken the pace of scientific discovery.

The lead researcher at the university said, “This successful identification of the suspected organism is truly groundbreaking research which will be of great practical benefit – a fact that is a source of great pride to our research team.”

The above research effort is a model for how GenomeQuest’s On-Demand Informatics and your wet lab research can be paired to take full advantage of next-generation sequencing technology. Advances in sequencing technology have enabled scientists to generate trillions of bases of raw sequence data and invent virtually any kind of sequence-based experiment, ultimately to enable personalized genomics-based medicine. At the core of each of these opportunities is a massive challenge: the management and analysis of more and more sequence data. GenomeQuest, the leading provider of On-Demand Informatics capabilities to process, annotate, and assemble raw next-generation sequencing data, addresses exactly this need. GenomeQuest has encapsulated this analytical workflow, as well as many other next-generation sequencing analysis workflows, into its sequence informatics web portal, available at www.genomequestlive.com. For more information on GenomeQuest, please see www.genomequest.com/nowwhat.

