Ph.D. position titled: Physics-inspired machine learning for plant-based food supply chains
Our groups: Empa's Simulating Biological Systems Group aims to reduce food loss in postharvest supply chains by understanding and steering these systems in silico. We do this by pioneering physics-based modeling at multiple scales, bridging the virtual to the real world by multi-parameter sensing, and creating digital twins that can live together with their real-world counterparts. We are an interdisciplinary team of mechanical, biomedical, and agricultural engineers, food scientists, and environmental scientists. We are part of the Laboratory for Biomimetic Membranes and Textiles in Empa's Research Focal Area Health and Performance.
The SDSC applies and develops machine learning algorithms in collaboration with partners across the ETH Domain institutions, such as ETHZ, EPFL, Empa, PSI, WSL and EAWAG. We have applied deep neural networks to dark matter distribution as well as to architecture design. We have worked on Natural Language Processing for analyzing political speeches. We are also working on time series modeling for understanding turbulence in drones, among many other research projects.
Background. Up to 30% of the world's fresh fruit and vegetables are lost on the way from farm to consumer. Minimizing this postharvest loss is essential to reduce greenhouse gas emissions from supply chains of fresh horticultural produce and avoid waste of natural resources embedded in these products. The key strategy to preserve fresh produce and extend its shelf life is to cool down the food after harvest and maintain the cold chain during transport and storage until it is consumed. In postharvest supply chains of fresh fruit and vegetables, each cold chain is unique in terms of temperature history and duration. As a result, every shipment has a different, yet unknown, loss of its quality attributes from farm to retailer. Knowledge of where, why, and how much fresh-produce quality is reduced is critical to further optimize cooling operations and cold chain logistics to maximize shelf life at the retailer's store and reduce food waste.
To do so, we have started a large project on data science for social impact. This data.org initiative was funded by MasterCard's Center for Inclusive Growth and the Rockefeller Foundation. Project partners are BASE (Basel Agency for Sustainable Energy) and the Swiss Data Science Center (SDSC). This team will work together to create an open-access, data-science-based mobile application to enable smallholder farmers to access sustainable cooling facilities. The solution will provide smallholders with easy access to pre- and postharvest expertise and market intelligence, and will be made inclusive and accessible through the innovative servitization business model Cooling as a Service, supported by a scalable financial structure. The objective is to enable smallholders to make decisions on cooling based on lifecycle benefits rather than upfront costs, and to give them easy-to-use information for making optimal decisions on produce and farm management. This will help smallholders break the negative cycle of poverty while also improving food security and minimizing food production's impact on the global climate.
Scientific and technical objective. The candidate will be responsible for developing supervised machine learning models with the following objectives (the importance of each of these subtasks will be decided together with the student, based on their background):
- Develop an ML model to identify initial fruit quality based on images of fruit and vegetables.
- Develop a physics-based digital food twin to estimate shelf life from the initial quality estimates provided by the ML model.
- Develop a physics-inspired machine learning model that learns from our physics-based digital food models by relying on feature engineering using the physics-based twin data. By integrating our physics-based model results into a machine learning pipeline, we will increase the robustness of its quantitative predictions by guiding it to learn physics-based constraints, leveraging the benefits of both methods.
- Data pre-processing to ensure data homogeneity, quality, and adequate labeling.
- Develop new algorithms in unsupervised deep learning (mainly normalizing flows and variational autoencoders) that capture the distribution of damaged/low-grade fruits and provide accurate likelihood models.
- Apply an existing computational multiphysics model – a digital twin – for two plant-based foods to generate new data for the machine learning model. This includes hygrothermal transport in the food during cooling, convective exchange with the environment, biochemical quality attribute evolution within the products, and thermally-driven damage.
- Dissemination of your work in scientific publications and educational activities.
- Supervision of undergraduate students.
- Support and further develop the research activities on digital twins.
Requirements.
- Completed MSc degree in computer science, agricultural engineering, process engineering, or food technology.
- Proven experience in machine learning.
- Experience in ML-based computer vision is considered an advantage.
- Experience in food quality measurements (firmness, micronutrients, …), food process experiments, thermal measurements (temperature, infrared, anemometry) is considered an advantage.
- Experience in physics-based modeling (finite element modeling, computational fluid dynamics) is considered an advantage.
- Excellent communication skills and fluency in English (both written and oral) are mandatory.
Administration. A project duration of 4 years is envisaged to carry out the above research tasks in the form of a Ph.D. project. The project is supported by Empa under the supervision of Thijs Defraeye, and involves a joint affiliation with SDSC (Fernando Perez Cruz). The candidate will perform their research at Empa in St. Gallen, with regular on-site exchanges with SDSC. The desired starting date is 1st of May 2021 or upon mutual agreement.