Enabling High-throughput Image-based Phenotyping

By Shelley Littin, CyVerse

A key aspect of CyVerse infrastructure, in the eyes of its designers, is its flexibility – the extent to which community members can mix and match various CyVerse resources to suit their development, research, or educational agenda.

Nathan Miller, a scientist in the Spalding Lab at the University of Wisconsin–Madison’s Department of Botany and Center for High Throughput Computing, works closely with CyVerse as he develops software that analyzes images and videos of plants to determine phenotype, the measurable physical characteristics of a plant.

“Many aspects of modern phenotyping are now image-based,” Miller said. “Computers are more objective than humans, and also faster. Computers can measure with a precision that a human cannot achieve.”

Geneticists, agronomists, and botanists have found digital imaging a useful tool to provide detailed data on the measurable proportions of seeds, root structures, the shape of plants as they grow, and plant parts after they’ve been harvested.

Such data can be used to identify plant varieties exhibiting traits that are preferable for agriculture, such as larger fruits, more robust stems, or more efficient root systems. “Humans were selecting for preferable phenotypic traits long before they were aware that they were modifying the genetics by doing so,” Miller noted.

Now, modern technology has taken the Mendelian science of selectively breeding the best vegetables to the next level, in which every minute detail of a plant’s physical makeup is precisely recorded to identify and select for desirable traits.

Miller currently manages sixteen pieces of software for various image-based phenotyping projects.

One such project focuses on analyzing ear, cob, and kernel characteristics of maize (corn) varieties. Researchers place ears, cobs, and individual kernels on flatbed image scanners to generate high-resolution images, and Miller’s software then makes minute measurements of the length and width of each item.
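As a rough illustration of how length and width can be read off a scanned image, the sketch below measures the bounding box of a single object in a binary mask and converts pixels to millimeters using the scanner resolution. It is not Miller’s published method: the function name `measure_object`, the axis-aligned bounding box, and the assumption that the object has already been separated from the background are all hypothetical simplifications.

```python
from typing import List, Tuple

MM_PER_INCH = 25.4


def measure_object(mask: List[List[int]], dpi: float) -> Tuple[float, float]:
    """Return (length_mm, width_mm) of the foreground region's bounding box.

    mask: rows of 0/1 values, where 1 marks a pixel belonging to the object.
    dpi:  scanner resolution in dots (pixels) per inch.
    """
    # Rows that contain at least one foreground pixel.
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    # Columns that contain at least one foreground pixel.
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]

    height_px = rows[-1] - rows[0] + 1
    width_px = cols[-1] - cols[0] + 1

    # Convert pixels to millimeters; the longer side is reported as length.
    mm_per_px = MM_PER_INCH / dpi
    long_px, short_px = max(height_px, width_px), min(height_px, width_px)
    return long_px * mm_per_px, short_px * mm_per_px


if __name__ == "__main__":
    # Toy mask: an object 5 pixels long and 2 pixels wide.
    toy = [
        [0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 0],
        [1, 1, 1, 1, 1, 0],
    ]
    print(measure_object(toy, dpi=254))
```

A bounding box overestimates the size of an item that lies at an angle on the scanner, so a real pipeline would likely align the object or measure along its principal axis first.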

A flatbed image scanner is used to generate a high-resolution image of a carrot.

“There are around eight groups using this pipeline at various levels and for various reasons. The researchers extract features or information from the images, and then upload their data to CyVerse’s Data Store,” Miller said. Miller’s program connects CyVerse with the Open Science Grid (OSG), a National Science Foundation-funded project that facilitates access to distributed high-throughput computing resources across the United States.

“I’m enabling the two pieces of technology to communicate. CyVerse manages the users, the disks, and the data storage, and makes the data accessible through the Open Science Grid.” Miller’s high-throughput method for computing maize characteristics has been published in The Plant Journal.

Miller has noticed growing interest in the image-based phenotyping approach among researchers in the maize phenotyping and breeding communities. But, “collaborators don’t always use the program for its original purpose,” he observed. “CyVerse – Open Science Grid software enables researchers to find creative ways to repurpose these tools for different investigations.”

Another research group uses one of Miller’s software programs to measure gravitropism, the process by which roots bend to grow in the direction of gravity, using roots of the plant Arabidopsis, which is related to cabbage and mustard. Gravitropism is a still-mysterious process controlled at the cellular level so that plant structures grow in the optimal direction, which for roots is down.

“The plant is placed in front of a camera, which takes very high-resolution pictures of the roots growing down and measures their growth rate and tip angle. CyVerse serves as a data repository for the numerous collaborators using this method,” Miller said.
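To make the two measurements concrete, here is a minimal sketch of how growth rate and tip angle could be derived once the root tip has been located in each frame. It is an assumption-laden simplification rather than the software Miller describes: the function name `growth_and_angle` is hypothetical, the tip coordinates are taken as given, and the angle is approximated from the tip’s displacement between frames instead of from the shape of the root itself.

```python
import math
from typing import List, Tuple


def growth_and_angle(
    tips: List[Tuple[float, float]], minutes_per_frame: float
) -> List[Tuple[float, float]]:
    """Return (growth_rate, angle_deg) for each pair of consecutive frames.

    tips: (x, y) tip positions in pixels, one per frame, in image
          coordinates where y increases downward.
    growth_rate is in pixels per minute.
    angle_deg is measured from straight down: 0 means the tip moved
    vertically downward, positive values mean it drifted toward +x.
    """
    results = []
    for (x0, y0), (x1, y1) in zip(tips, tips[1:]):
        dx, dy = x1 - x0, y1 - y0
        rate = math.hypot(dx, dy) / minutes_per_frame
        angle = math.degrees(math.atan2(dx, dy))
        results.append((rate, angle))
    return results


if __name__ == "__main__":
    # Hypothetical tip track: straight down, then diagonally down and right.
    track = [(0.0, 0.0), (0.0, 10.0), (10.0, 20.0)]
    for rate, angle in growth_and_angle(track, minutes_per_frame=2.0):
        print(f"{rate:.2f} px/min at {angle:.1f} degrees from vertical")
```

Converting pixels to physical units would require the camera’s spatial calibration, which is not specified here.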

Researchers using Miller’s gravitropism software are able to upload thousands of root images each day into the CyVerse infrastructure, where the images are stored securely until a researcher is ready to send them to OSG for analysis.

Sarah Turner, a PhD candidate working with Philipp Simon in the Department of Horticulture at the University of Wisconsin–Madison, also works with roots. She is studying how carrot leaves affect root development. “One of the challenges I was having was that it takes a lot of time to collect the data, so we were looking for a quicker way to extract measurements from field-harvested carrots,” she said.

Turner began working with Miller to estimate biomass and measure root shape of carrots. “Carrots have a variety of root shapes, which determine how they’re marketed and what purposes they’re used for,” Turner explained.

Miller developed a custom algorithm to collect measurements on carrot phenotypes. Turner collaborates with Miller through the CyVerse infrastructure. “I upload images to him and he processes them, and then I retrieve the data,” she said. “This pipeline improves how we collect the data and allows us to better investigate the underlying genetics of carrot shape.”

Carrots are a relatively underfunded crop compared with maize, Turner added. “With a lower-value crop we have fewer resources, so it’s significant progress for us to be able to do this research.”

Gabriele Monshausen, an associate professor of biology at Penn State University, uses Miller’s software in a course for incoming graduate students, titled “Modern Techniques and Concepts in Plant Cell Biology.” The class introduces the students to automated phenotyping processes, and provides them with many skills that are becoming increasingly important for a career in plant biology, Monshausen said.

Students use Miller’s tools to collect image-based phenotyping data for Arabidopsis, then analyze the data and use the CyVerse infrastructure to share data and results. One of Miller’s tools allows the group to analyze the growth pattern of roots at a spatial and temporal resolution unachievable without computational technologies. Another tool can be used to analyze the size and shape of hundreds of Arabidopsis seeds from a single scanned image.
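One common way to measure many seeds in a single scan is to label each connected blob of foreground pixels and record its area. The sketch below shows that general technique in plain Python; it is not drawn from Miller’s tools, and the function name `seed_areas`, the 4-connectivity rule, and the pre-thresholded 0/1 mask are assumptions made for illustration.

```python
from collections import deque
from typing import List


def seed_areas(mask: List[List[int]]) -> List[int]:
    """Return the pixel area of each 4-connected foreground region.

    mask: rows of 0/1 values, where 1 marks a seed pixel.
    Regions are reported in the order their first pixel is met in a
    row-by-row scan.
    """
    if not mask:
        return []
    n_rows, n_cols = len(mask), len(mask[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    areas = []

    for r in range(n_rows):
        for c in range(n_cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one seed.
            seen[r][c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < n_rows and 0 <= nx < n_cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


if __name__ == "__main__":
    # Toy scan containing two separate "seeds".
    scan = [
        [1, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 1],
    ]
    print(seed_areas(scan))  # prints [3, 2]
```

Seeds that touch one another would be merged into one region by this approach, so a real pipeline would need an additional step to separate them.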

“It’s really incredible to have access to Miller’s programs, which allow us to achieve results that we wouldn’t be able to gain without these tools,” Monshausen said. She believes the course curriculum could be adapted to engage interested high school students in computational plant biology techniques as well.

“The ability for creative research engineers like Nate Miller to take components that are most relevant and suitable for their project to build platforms for communities is an important emphasis of our mission,” said Nirav Merchant, a CyVerse co-principal investigator. “It’s very exciting to see how Miller is giving scientists access to cutting-edge data and tools for research, education, and training.”

Images courtesy of Sarah Turner/University of Wisconsin–Madison.