An interview with Estibaliz Gómez de Mariscal

Posted by Helen Zenner, on 27 February 2024

At the end of 2023, DL4MicEverywhere was released. The platform aims to make deep learning methods in bioimage analysis more accessible and more reproducible. We caught up with Estibaliz Gómez de Mariscal (Esti), one of the co-leads of the project, to find out more about the platform, and to chat about Esti’s career so far and her future plans.

What first inspired you to become a scientist?

I don’t think I have considered myself a scientist until now; it was more that I was in training to become one. To be honest, there was not a specific trigger or something that I planned early on. I had already worked in industry when I decided to do a PhD, because I wanted to try to become a university professor. If you meet me in a course or a conference, you will quickly notice that I really love teaching! Very naively, I thought that getting a PhD was the main step to becoming a professor.

With some perspective, I think the true inspiration to become a scientist has been the people I have worked with so far, especially my supervisors and closest collaborators. When you ask yourself whether you want to become a scientist, they are the closest reference, and what you are truly asking is whether you want to be in a similar place in the future. For me, it all started when I received a general email from Arrate Muñoz-Barrutia, my PhD supervisor, who was looking for someone to develop microscopy image processing methods in collaboration with Denis Wirtz. I consider myself lucky: the topic was quite interesting from many angles, Arrate and Denis were very accessible and open to trying different things, and they had deep expertise. Their passion and the fact that they liked sharing it, sharing the momentum, was clear from the beginning. This let me enjoy the process and seeded a researcher spirit in my soul. That is why I wanted to be in a research institute for my postdoc, where all the activity was around research and lighter on teaching responsibilities. Remarkably, during my PhD and now in my postdoc with Ricardo Henriques, my supervisors have taken the time needed to guide my projects; they have promoted discussions and have taught me how to be critical in a friendly and inclusive manner, which I find quite stimulating. In the same way, they have introduced me to grant writing and made me aware of the intricacies of the academic system, which are tricky but fundamental. While deciding if you want to be a scientist, it’s important to understand what it really means, in all its different aspects, the good and the bad ones. Also, I constantly go through the mental exercise of comparing it with my work in industry, which I very much enjoyed. The key differences for me are the satisfaction of shaping your contributions and building something with them, the fact that I can keep teaching, the intellectual and personal flexibility, and the continuous stimulus to curiosity.

You mention working in industry before beginning your PhD; can you tell us a little about your career path from your undergraduate in Mathematics?

Yes, I worked in a bank before my master’s, and in the pharmaceutical trials industry after it. For my master’s, I really loved geometry and topology, but I found it too specific. I felt that I wasn’t smart enough and that going into theoretical math would be risky. On the other hand, while being in the bank, I worked for the first time in a multidisciplinary team, and I enjoyed the approach of sharing knowledge and working with people with different backgrounds. That is why I enrolled in a master’s in applied math, where I discovered image processing, and I discovered a true passion. Then, in the pharmaceutical trial company, I felt a lot of satisfaction in bringing math and health sciences together. A PhD project with image processing and life sciences sounded like a good match. Yet I only knew machine learning and I had to learn almost everything: microscopy, biology, image data analysis. It was a lot of fun, and I felt that I could bring different perspectives. During my PhD, I visited Denis’ lab at Johns Hopkins and Daniel Sage in Michael Unser’s lab at EPFL, Switzerland. I also stayed in Thomas Brox’s computer vision group in Freiburg, Germany, where the U-Net was originally developed and was extremely novel back at that time. Those experiences were enriching and stimulating once again. In Thomas Brox’s lab, perspectives were completely different and I could see the future impact of computer vision in life sciences. It also helped me think about the type of group I wanted to join for my postdoc. For this, I spoke with different researchers, Ricardo and some of his team members among them. I had recently started collaborating with him within the BioImage Model Zoo and he was working on the sort of advances that were very interesting to me. I must admit that I also had a good inner feeling.

Was this when Ricardo moved back to Portugal?

Yes, and actually I had some doubts. I could have moved to a more consolidated group, or to a research institution that was much more powerful, but I prioritise who I am going to work with. That is important for me, and it’s going to drive most of my motivation. At the time, Ricardo had recently been awarded an ERC Consolidator grant and he was expected to have quite a multidisciplinary team in an international biomedical research institution; now we have biologists, mathematicians, developers and physicists, all working together. This is something unique and I was convinced that it could be a very good opportunity for my training. Indeed, I was wisely advised: if you want to be a scientist, you have to keep working on how to formulate good questions; if you don’t, you lose this ability. For this, I felt that I needed to be somewhere where we could understand each other, our skills could be complementary and the knowledge that I would gain would be out of my comfort zone but also stimulating. By coming to the Instituto Gulbenkian de Ciência (IGC) with him, I also learnt about growing a new lab and I got to work in an extremely collaborative atmosphere that enabled me to rapidly conceive my own project, for which I was granted an EMBO postdoctoral fellowship, and very recently, an individual project in Portugal. On the other hand, I very much enjoy the daily work life and there are always promising projects floating in the air. It seems like it was the right choice!

You’ve spoken a lot about collaboration, it’s obviously really important in your work. What are your recommendations for getting the most out of a collaboration? And to follow on, do you have to speak or communicate in a different way to engineers or computer scientists versus wet lab scientists or physicists, and how does that all work for you?

I will go with the second question first. Yes, you do communicate differently. With computer scientists or engineers, the objective is to develop code or infrastructure, or to design a new analytical method. This means that you usually discuss optimising backend code, parameters or new analytical functions and approaches, without paying much attention to the biological phenomenon that is being observed. Thus, when supervising computer scientists or engineers, I try to explain to them what they are looking at in the images, why their work can be important or how obtaining those images is experimentally so impressive. With biologists, the analysis needs to be contextualised. Most discussions are about robustly conceptualising the biological events that need to be quantified, so we can reduce the ambiguity and process the data automatically. Communicating with different people is a lot of fun for me. For the first part of the question, what do you mean by getting the most out of a collaboration?

I guess I meant, how can you make your collaboration most efficient in terms of output and, hopefully, a publication?

In collaboration with biologists, many times, they already have the data, but it might not be optimal for the type of analysis needed to answer their question. Here, I think it is helpful to have some interest in the issue they want to solve. If you understand each other, they can sometimes reacquire some data in a more tailored way or work with you to acquire it; you can reshape the project and come up with new proposals. Redesigning the analytical part with combined expertise can enable the extraction of more robust information and improve the quality of results, which is, in the end, a strong point for publications. If you can visit your collaborators and see their working routines, it’s also great! Seeing how an imaging acquisition is done, or how the experiments with living specimens are conducted gives you a rich perspective about the limitations and possibilities.

How about the other side of the collaboration; the personal side and building a network and a career?

In all my collaborations, and especially now in the lab, I have learned biology and microscopy, which is crucial for writing projects. For the EMBO proposal, I read a lot about the biological problem I wanted to address and spent a lot of time discussing it with other researchers. This helped me understand how I could contribute to the field with my expertise. I knew what to do with image analysis, but in biology, there are many open questions! You need to decide what to study, the right model for it and how to actually plan the experiments. This was different from the PhD, and the perspective of experts in different fields is essential in this case. As an early-career researcher (ECR) finding your way in a multidisciplinary field, your curiosity is key in getting this expert knowledge from collaborators or researchers from different fields. I think that by knocking on their door to discuss different ideas and by listening to them you can learn how to pose hypotheses and phrase good questions. Moreover, all this can give you a better idea of the impact of the methods and technology that you are developing, which is important in conceiving more competitive project proposals and sometimes can open the door to new publications. Personally, I love to learn from people and to share the excitement of the different daily activities – it’s my driving force. Luckily, Arrate and Ricardo introduced me to their collaborative networks, where I’ve met very inspiring researchers! Likewise, the IGC community is quite vibrant in bringing international visitors and scheduling informal meetings with them, which impressed me at the beginning. It is a privilege.

A lot of researchers are interested in using deep learning methods to analyse their images, but are there any common pitfalls and what are your top tips for getting started?

Don’t be scared and be patient! The researcher should first ask if they really need to use deep learning. Besides requiring high computational power, with its negative environmental impact, it’s a data-driven method and data is expensive in terms of acquisition, annotation and curation. I would invest some time searching for useful tools. There’s a huge ecosystem and picking the right one can sometimes be difficult. I strongly recommend the Image.sc forum to see if someone else has tried solving the same issue, and if not, don’t hesitate to ask for advice there. There are also recent reviews and perspectives online. For example, being biased, there’s ‘Live-cell imaging in the deep learning era’ from Joanna Pylvänäinen et al., Virginie Uhlmann also has some interesting publications around bioimage analysis, and we just published an extended perspective about the use of these methods to prevent phototoxicity. These publications give technical advice to solve different tasks and point to useful tools. I would be very careful about learning adequate evaluation metrics and quality checks for the results, as here both the biological and computational aspects are equally important.
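To make the point about evaluation metrics concrete, here is a minimal sketch of two overlap scores that are commonly used to compare a predicted segmentation mask with a manual annotation: intersection over union (IoU) and the Dice coefficient. The function name and the tiny masks are made up for illustration and do not come from any particular tool; real images would normally be handled with an array library rather than nested lists.

```python
def iou_and_dice(prediction, reference):
    """Compare two binary masks given as equal-sized nested lists of 0/1."""
    intersection = 0
    pred_area = 0
    ref_area = 0
    for pred_row, ref_row in zip(prediction, reference):
        for p, r in zip(pred_row, ref_row):
            intersection += 1 if (p and r) else 0
            pred_area += 1 if p else 0
            ref_area += 1 if r else 0
    union = pred_area + ref_area - intersection
    if union == 0:
        # Both masks are empty: treat them as a perfect match.
        return 1.0, 1.0
    iou = intersection / union
    dice = 2 * intersection / (pred_area + ref_area)
    return iou, dice


# Hypothetical 3x4 masks: the annotation and a slightly wrong prediction.
reference = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
prediction = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

iou, dice = iou_and_dice(prediction, reference)
print(f"IoU = {iou:.2f}, Dice = {dice:.2f}")  # IoU = 0.60, Dice = 0.75
```

The same prediction scores differently under the two metrics, which is one reason to decide which metric matches the biological question before comparing methods.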

Can you tell us about your new platform, DL4MicEverywhere?

DL4MicEverywhere is a new project conceived between the labs of Guillaume Jacquemet and Ricardo that builds upon their experience with ZeroCostDL4Mic. It brings deep learning in a user-friendly manner, ensuring that the methods are portable and reproducible on any workstation, laptop, or the cloud. That’s the key component. To do so, we use Docker containers and zero-code, user-friendly Jupyter Notebooks. Setting up the containers means that we automatically ensure that the user does not need to deal with any programmatic installation and the same configuration can be ensured across systems.

UPDATE: Check out DL4MicEverywhere v2.0.0!

🚀 Exciting news! We've just released #DL4MicEverywhere v2.0.0! 🎉 This update brings a lot of new features including automatic requirements installation, smoother interaction with the GUI, improved documentation and videos, and more! 🌟 Check it out now 👉 https://t.co/wstxmkn8Vk
— Iván Hidalgo Cenalmor (@IvanHCenalmor) February 21, 2024

Estibaliz Gómez-de-Mariscal, Iván Hidalgo-Cenalmor, Guillaume Jacquemet and Ricardo Henriques

For people that are unfamiliar with these terms, what is a Jupyter Notebook and what do you mean by containers?

Jupyter Notebooks are electronic notebooks to interact with code written in Python. There you can write plain text and code, so they provide a way to disseminate programmatic methods in a more digested, documented and appealing manner for a non-expert user. They provide an easier layer to interact with code. Normally, one installs the required software dependencies and then opens a notebook to interact with a more basic part of the code without struggling with underlying complex functions. Containers are very similar to virtual machines, where we can install any requirement needed in an isolated manner without affecting the configuration of the entire computer. They are like a mini-machine inside your computer, which already has the specific software requirements, packages and versions installed for a piece of code to work. The key advantage is that once these containers are set up, they become static and portable across systems, which means that the same method will be run in the same way everywhere, so the method becomes reproducible.
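As a rough illustration of the kind of information a container keeps fixed, the sketch below records the Python version and the installed versions of a few packages using only the standard library. It is not how DL4MicEverywhere builds its containers, and a container freezes far more than this (system libraries, drivers, and so on); the function name is hypothetical and the package names are placeholders that may or may not be installed on a given machine.

```python
import platform
from importlib import metadata


def describe_environment(package_names):
    """Return the Python version and the installed version of each package."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return {"python": platform.python_version(), "packages": versions}


# Placeholder package names, chosen only for illustration.
report = describe_environment(["numpy", "torch", "tensorflow"])
print("python", report["python"])
for name, version in report["packages"].items():
    print(f"{name}=={version}" if version else f"{name}: not installed")
```

Running this on two computers will often print different versions, which is the variation that a container removes by shipping one fixed environment.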

What are the problems with reproducibility when using AI? And does that link with the FAIR (Findability, Accessibility, Interoperability, and Reusability) principles?

AI relies on machine learning algorithms, which aren’t closed, unique mathematical solutions. They are experimental and the whole process, especially the training, has a lot of stochasticity in it. You can train two networks for segmentation with the same architecture, and while they both do a good job segmenting an image, they will not provide the same result. This means that obtaining exactly the same trained solution on two different computers is, in my opinion, methodologically impossible. Yet another bigger but solvable issue is that the programs use configurations with an extended list of versioned packages that are not always completely reported. From version to version, the operations can vary, leading to methods that may perform in a slightly different way or simply stop working. With DL4MicEverywhere we enhance the reproducibility by freezing all the dependencies in a container. There, we ensure that once the method is working, it will keep working, and the sources of variation will only come from the stochasticity of machine learning algorithms.
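The stochasticity of training can be seen even in a toy problem. The sketch below is an illustration only, far simpler than a segmentation network and unrelated to any DL4MicEverywhere notebook: it fits a straight line with stochastic gradient descent, where the random seed controls both the initial parameters and the order in which the data are visited. The function name, data and hyperparameters are assumptions made for the example.

```python
import random


def train_line(seed, epochs=100, learning_rate=0.05):
    """Fit y = weight * x + bias to points on y = 2x + 1 with plain SGD."""
    rng = random.Random(seed)
    data = [(x / 10, 2 * (x / 10) + 1) for x in range(11)]
    # Random initialisation: the first source of run-to-run variation.
    weight = rng.uniform(-1, 1)
    bias = rng.uniform(-1, 1)
    for _ in range(epochs):
        # Random sample order: the second source of variation.
        rng.shuffle(data)
        for x, y in data:
            error = (weight * x + bias) - y
            weight -= learning_rate * error * x
            bias -= learning_rate * error
    return weight, bias


run_a = train_line(seed=0)
run_b = train_line(seed=1)
run_a_repeat = train_line(seed=0)

print("seed 0:", run_a)
print("seed 1:", run_b)
print("same seed gives identical parameters:", run_a == run_a_repeat)
```

Both runs end up close to the true line (weight 2, bias 1) without agreeing exactly, while repeating a run with the same seed in the same environment reproduces it. In real deep learning frameworks, fixing the seed is often not enough, because hardware and library versions can also change the result.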

It is noticeable that the bioimage analysis community is making a big effort to ensure that AI is used in ways that are reproducible, which, yes, is related to the FAIR principles. Here, accessibility is probably my favourite one! A lot of people equate accessibility with open access, but in my opinion it’s not the same. For example, I may have the code for my method freely available, but if nobody is able to make it work and exploit it, it’s inaccessible. This is particularly important when we develop methods for the life sciences because it is often non-expert programmers who need to access the whole potential of the technology. Of course, remarkable works in this direction do exist, but it’s still a challenge as it implies user-friendly tools and also a dedicated transfer of knowledge. Indeed, accessibility in these terms might not be a pure scientific requirement, but rather an important principle to tackle at institutional or funding level. I personally believe it must always be considered and so far, I have been happy to put a lot of energy towards it.

Why is it such an important issue for you?

There are many reasons; I’ll share a few! Research on methods development advances at a very high speed. We have new methodologies published almost every hour! The fact is that scientific progress relies on building upon someone’s previous research work and often methods comparison is needed. How can I do this without access to this recent technology? So, accessibility is important for speeding up research and it facilitates methodological benchmarks. It also supports the contribution of researchers from under-represented backgrounds, or less powerful institutions, who bring different points of view into the scientific discovery. And in research, the more diverse the community the richer it becomes, as creativity does also rely on such diversity. Additionally, from a very practical point of view, you want your work to be recognised and become part of the state of the art. Compared with keeping it inaccessible, making your work accessible increases the probability of it being used and cited.

Going back to DL4MicEverywhere, you have a lot of models that are part of the platform, and you mentioned training the models on your own data – is there a typical amount of data required or does this depend on what you are looking at?

That’s the golden question of this century! We don’t know. I always recommend starting with a small dataset. So, you acquire the images, annotate a few, start training these models or fine tuning them, and look at the results. You can see if the method is learning or if it learnt something that doesn’t make sense, which could be because the annotations are wrong or insufficient. There’s not a specific number. Sometimes 30 images might be enough if they are large enough and contain quite an extended field of view or gather most of the variability of the problem. Of course, if you have access to 1 million annotated images, it would be great!

So, would you describe it as an iterative process?

Yes, definitely! Many times, the errors come from the quality of the annotation rather than the difficulty of the analysis. Indeed, images can be terrible, not visually, but for the task! For example, they may hamper determinism with what you want to quantify or they can be unsuitable for robust annotation. Actually discussing the results with a colleague or showing them to experts in the imaging facility can be extremely helpful. I think facilities have an essential role in this matter as well and I really hope that image analysis facilities start proliferating everywhere.

Can you give us some examples of deep learning approaches that are available in DL4MicEverywhere?

You can do semantic segmentation that allows you to distinguish, for example, cell membranes and nuclei in the image; instance segmentation, with Cellpose or StarDist, to identify each independent cell in an image. There’s virtual labelling, which is one of my favourites, and image denoising and restoration. There are also some notebooks for virtual super-resolution. We are now working on the integration of more advanced ones, that I hope will be ready soon.

Estibaliz Gómez-de-Mariscal and Iván Hidalgo-Cenalmor

How can the wider community contribute to the platform?

As a developer, you can contribute your method. You don’t necessarily have to follow the ZeroCostDL4Mic standard format, as long as it is an easy-to-use notebook. If you have a deep learning method that requires some packaging, you can go to the GitHub repository and follow the guidelines for contributions. We will directly help you to do this. And there’s my favourite way to contribute: the best way to contribute to any open-source initiative is by using it and providing feedback. Indeed, in this spirit, I recently heard Robert Haase saying that he calls end-users ‘collaborators’ rather than ‘users’. If I know where an end-user struggles, or what they need, I can improve the experience with the tool and even get new ideas. That is why showcasing the tools in courses and conferences is an opportunity to improve them. Using the platform, giving feedback and opening issues in GitHub are also great ways to contribute!

Perfect! Is there anything else you’d like to say about DL4MicEverywhere that we haven’t covered?

I’ve often been asked how we know that we’re making a tool user-friendly. While developing DL4MicEverywhere, for example, Ricardo and Guillaume gave a lot of input in this direction to Iván Hidalgo-Cenalmor, the main developer of DL4MicEverywhere. Both have remarkable experience with image processing for microscopy imaging in collaborative projects. I think that’s key to developing the intuition about where and when non-expert researchers will struggle. Another interesting fact is that working on projects such as this one is usually intense and quite dynamic. On one side, we are teams with quite different expertise working together, which requires fluent and good communication. On the other side, it implies a lot of interaction with users with different backgrounds through different platforms, which is exciting and chaotic at the same time!

What else are you working on now and what’s next for you?

Within Ricardo’s group, I’m working on smart microscopy, phototoxicity and how to overcome it using deep learning-augmented microscopy. I’m still very much involved in the BioImage Model Zoo, and the AI4Life project, which I was working on before it was even funded! For the future, I’m spending a bit of time with Ricardo this year, but I will already try applying for some independent positions. I’m still deciding on the specific topic, whether, for example, I want to go back to studying cell migration. Certainly, I know that I don’t want to purely develop AI mathematical methods. Ideally, and motivated by my current position, one day I would love to lead a multidisciplinary team running image-driven discovery, where we can keep fusing life sciences, math and technology.

So, what do you call yourself? Are you a developer, a bioimage analyst, a biologist, or just a scientist!?

I think I’m just a scientist if I must choose one! I do most of them. I’m a developer and I’m a bioimage analyst, but most of it is driven by biological curiosity! I really want to understand the different cellular behaviours I’ve been observing so far while analysing images. Indeed, the analysis is much more than just processing the images. It has a lot of statistics, and often mathematical modelling.

Do you see modelling in your future; it seems like the natural progression from where you are?

Being a mathematician, this is a very tricky question because modelling can get really complicated and scary! There are different types of modelling. For example, 3D dynamic modelling, like surface modelling, would bring me back to my beloved geometry and topology. There is also mechanical modelling, which helps identify cell migration patterns or understand tissue formation. This, combined with some smart microscopy, sounds quite interesting, so I don’t close the door on it! Still, this would be in the future and will depend on the questions, opportunities and collaborations appearing at the time.

Is there anything else you’d like to add that we haven’t spoken about?

Yes! I’d like to highlight the session on careers at the JCS Imaging Cell Dynamics Journal Meeting with Melike Lakadamyali and Christophe Leterrier. As an ECR, listening to the different routes that others took and how they felt about it at the time was inspiring; it made me reconsider some preconceived ideas. Melike, for example, showed that you can always find a way to achieve your goals, even if it was not as you wished from the beginning or in the way planned. These testimonials can have an impact and are truly inspiring, especially if you have fewer chances to discuss these topics.

Finally, what do you like doing away from work?

I like to keep my private life separated from work. I’m a CrossFit addict, food lover and I very much enjoy visiting museums. I really like to learn about new cultures and languages, which means that living abroad is kind of a hobby!