J-AI-C
 

Artificial intelligence and machine learning are no longer confined to science fiction, and they are here to stay. We speak with our plant and microbial scientists for their reflections on the impact of these technologies on research, their potential and their pitfalls.

Dr Haopeng Yu, postdoctoral scientist

Haopeng worked closely with Dr Yiliang Ding, John Innes Centre group leader, and computer scientists at the University of Exeter, to develop a Plant RNA large language model. Following a similar methodology to ChatGPT, this Artificial Intelligence (AI) model was trained on a dataset of 1,124 plant species. It ‘learnt’ the patterns and logic of plant RNA sequences and structures, essentially learning the ‘language’ of plants.

Haopeng said: “I’m excited about the potential of AI in plant and microbial science and we’re still in the early stages. Our models are a work in progress, and I’m lucky to work with a group leader keen to push the boundaries.

“AI is transforming medical bioimaging, for example, by accurately predicting cancers; the tech is promising for plant science.

“Personally, I think the main blocker for AI development is scepticism. We need to develop our understanding and trust. AI is a powerful tool that can help accelerate discovery and show us where to look. I don’t think it will replace scientists! We need to validate predictions and ask the right questions.

“We are currently finalising a plant DNA foundation model, using the 157 plant genomes that have been sequenced across the world. This will aid crop genetics and transformation, to pinpoint where to look for useful genetic variations.

“In the future, my dream is for our models to help unlock the nucleotide sequence, and solve plant gene regulation puzzles. I would also like to develop a virtual cell, to simulate stresses and model cell development. To scientists considering working with AI, I say, ‘Try!’ It could surprise you.”

Dr Haopeng Yu

Miguel Gonzalez Sanchez, research software engineer, and Dr Martin Vickers, senior scientist

Miguel and Martin are part of the informatics platform at the institute. Martin is developing AI applications to help science work more efficiently. Miguel is exploring how to record data more consistently across the institute, to get the most out of the information collated.

Martin said: “AI is not magic, even though it often looks like it. It is a useful tool that can quickly process vast amounts of information based on probabilities. In science, we always need to check and validate results to ensure accuracy, as you can only train a model so far.

“One area in which I am using AI is our field trials at Church Farm. Researchers spend a lot of time manually recording data. Our drones capture huge amounts of data from these field trials. With the help of AI, we can train models to assist the researcher to measure, classify and score various crop traits.

“Millions of pounds of research and development efforts go into these field trials. We are capturing these images not only as a record for posterity; they are also available for future advances in AI, whatever they may be.

 

“Similarly, I have been training a model to classify cabbage stem flea beetle larvae. Whereas now a post-doc will spend months classifying larvae stages, our model is learning to accurately process the images of larvae in hours.

“The infrastructure to support AI is vital to this capability and is a challenge facing all research institutes. As sequencing technologies improve, data volumes increase. It is commonplace to work on projects that require hundreds, if not thousands, of terabytes of data, particularly as more researchers are embracing AI.”

Miguel said: “I want to teach more of our colleagues how they can best utilise AI for themselves, as well as future-proof their data collection. We need to upskill everyone on coding, and I anticipate that the huge amounts of data we’re collecting now could be used effectively in the future, but only if our institute becomes consistent in how it records and stores information.

“For example, we could collate data on wheat which spans years of field trials, and learn about it using AI. The possibilities are vast, but if we don’t start now, we will be behind and limit the advances we could make in the future.

“Therefore, I am building repositories to standardise the way we record and save scientific images and data. This will break down data silos, so data from different groups can be used together.”

Dr Martin Vickers, senior scientist, flying a drone at Church Farm

Dr Michael Webster, group leader

Michael leads a structural biology group at the John Innes Centre, and is optimistic about the capabilities of AI in his field, as well as its ability to make his area of science more accessible and support student learning.

Michael said: “Structural biology is being transformed by the availability of fast and accurate AI tools. Structural biologists have long been aware there is more information in our data than we have been able to access, and new algorithms are opening the door to greater insight.

“Structural biology, in essence, is the study of visualising molecules. We identify how molecules are physically built, and what their chemical architecture is, sometimes at atomic level precision. This can be DNA, or proteins, for example. We also try to understand how atoms come together to create a molecule, and how molecules ‘work’ through their interactions with each other.

“For example, one of our more recent projects used an advanced microscopy method called cryo-EM to explore how photosynthetic proteins are made in chloroplasts (the organelles that make plants green). Photosynthesis is the process in plants that produces energy-rich sugar molecules as well as oxygen.

“To do this, we made an atomic model of the chloroplast’s unique RNA polymerase. The polymerase is made up of more than 70,000 atoms in 21 proteins. This new understanding of a fundamental process in plants contributes to our wider understanding of plant growth and photosynthetic ‘robustness’, in crops and other plants.

An image of the 3D structure of the chloroplast RNA polymerase

“Our group’s method of visualising molecules, cryo-EM, is a technique using very powerful microscopes, capable of imaging at incredibly cold temperatures and high magnification. After we take images of molecules, AI tools can help us to reconstruct the 2D images into what the molecules look like in 3D, with a high degree of accuracy. We then use other AI tools to build an atomic model of the molecule that is consistent with the shape we observed.

“Determining the structure of a single protein can take years. New AI-based algorithms both accelerate the process and mean more molecules can be visualised in parallel; we looked at many 3D reconstructions in the past year, which have revealed how molecules move and perform their unique role. However, I feel the most far-reaching benefit of AI is that it has increased access to, and the general literacy of scientists in, structural biology. It can be seen as an intimidating field to understand.

“With new generations of scientists who are increasingly confident using machine learning to explore large datasets, I think the enthusiasm and energy being brought into structural biology is really exciting.

“Opportunities are now coinciding for plant structural biology. With the increase in plant genome resources, and the capabilities of AI in structural biology, we can now attempt things that before seemed too difficult. With access to a genome sequence and AI algorithms for interpreting 3D information we can now identify proteins directly by looking at them. This means we can look at more complex samples, including those isolated directly from plants.

“With advances in both AI and imaging, one of the next frontiers will be to build pictures of what molecules look like – and are doing – inside the cell. That is now within the realms of possibility.”
