iPlant User Creates Integrated Framework for High-Throughput Sequencing

Duitama with his team collaborating on the development of NGSEP (left to right): Jorge Duitama, Juan Fernando de la Hoz, Daniel Cruz, Claudia Perea, and Juan David Lobaton. (Image courtesy of Claudia Perea)

By Shelley Littin, iPlant Collaborative

iPlant has always been of, by, and for the community, crafting our products to suit the scientific community’s needs. Sometimes, community members become leaders, taking the initiative to create products to empower their colleagues.

Jorge Duitama, an iPlant collaborator and bioinformatics researcher from the International Center for Tropical Agriculture, is one such scientist. Duitama’s research focuses on using bioinformatics to accelerate plant breeding. He recognized a setback for biologists seeking to compare variance in genetic traits, and set out to find a solution.

“A big problem for most biologists is that the tools available for genetic analyses are difficult to operate, and you have to kind of glue them together to achieve a complete analysis,” Duitama said. “It’s a real pain for biologists to integrate different tools written in different coding languages in different computing environments. The information doesn’t always transfer directly. Biologists now have to spend between four to six months dealing with technical issues, and that should not be so.”

To address the problem, Duitama’s team created NGSEP, the Next Generation Sequencing Eclipse Plugin. The tool’s main functionality is its variant detector, which simultaneously discovers several kinds of genetic variation: single nucleotide variants (SNVs), insertions and deletions, and copy number variations (CNVs). NGSEP also includes functions to filter, export, and compute statistics on files containing genomic variation data.

Duitama has published a manuscript describing NGSEP’s functionalities in the journal Nucleic Acids Research, and expects to publish additional papers in coming months.

He approached iPlant about hosting the new tool in an effort to bring it to a larger community of researchers. “We wanted to make this tool available for more people. It’s free, it’s open-source, and it’s available to anybody with an Internet connection. We want people to use it.”

“The advantage of using NGSEP is that different approaches to analyze sequencing data are already integrated into a single framework,” Duitama explained. “The goal is to facilitate the work for biologists.”

And you don’t need to be a computer expert to use it: “Anyone with knowledge of biology, genetics, and sequencing can use this tool. The idea of integration with iPlant is to make it so that users could avoid technical complications as much as possible, so people without technical computing knowledge can still use the program to process data and also take advantage of the hardware resources provided by iPlant.”

“Of course, if you have technical skills it will go faster,” he added. To help users get started, Duitama has partnered with iPlant to offer a free webinar on using NGSEP this Friday, November 20, at noon Eastern Daylight Time. To register, visit: http://www.iplantcollaborative.org/blog/events/webinar-analyzing-high-throughput-sequencing-reads-ngs-eclipse-plugin