Feature extraction in napari

Posted by Mara Lampert, on 3 May 2023

This blog post revolves around extracting and selecting features from segmented images. We will define the terms feature extraction and selection. Also, we will learn how to categorize features and can look up specific features in a glossary. Furthermore, we will explore how to extract features in napari.

Definition of feature extraction

During feature extraction, quantitative measurements are assigned to image sets, images or image regions. Feature type involves both feature detection and feature
description. Importantly, features should be independent
of location, rotation, spatial scale or illumination levels (Gonzalez and Woods, 2018).

Nevertheless, there are connections, e.g between location and position, spatial scale and size as well as illumination levels and intensity that one needs to be aware of.

Categories of features

We can differentiate features into different categories:

Schematic of different feature categories

Size (siz)-based parameters relate to magnitude or number of dimensions. Examples are area and volume.

Intensity (int)-based parameters are all parameters related to the power transferred per unit area. Examples are minimum, mean, maximum intensity.

Shape (sh) -based parameters refer to the geometry of the object and relate to the outline of the object and surface. Examples are roundness, aspect ratio and solidity.

Position (po)-based parameters refer to where an object is at a particular time. This is specified relative to a frame of reference.

Moment (mom)-based parameters refer to a physical quantity which has a certain distance from a reference point. Hereby, the moment relates to the location or arrangement of the physical quantity. Typically, these physical quantities are forces, masses or electric charge distributions. One example is torque which is the moment of force.

Texture (tex)-based parameters relate to the crystallographic orientation within a sample. It can be described as random, weak, moderate or strong texture depending on the amount of objects with the preferred orientation. An example is standard deviation of intensity.

Note: In the Bio-image Analysis Notebooks, there is a chapter on feature extraction.

Glossary

This is a collection of features provided by napari-skimage-regionprops (nsr) and napari-simpleitk-image-processing (n-SimpleITK). Hereby, note that some of the mentioned parameters may be implemented differently in both libraries.

Feature extraction in napari

Now, we use napari to extract features using an image and a label image.

Requirements

Note: I would recommend to install devbio-napari. It is a collection of Python libraries and Napari plugins maintained by the BiAPoL team, that are useful for processing fluorescent microscopy image data. Importantly, it contains napari-plugins we use in this blog post. If you need help with the installation, check out this blogpost.

In the following paragraphs, we are going to use the following napari-plugins:

Now, we will explore a dataset of the marine annelid Platynereis dumerilii from Ozpolat, B. et al licensed by CC BY 4.0. When we open our image and our label image in napari in the gallery view, it looks like this:

Napari graphical user interface with label image (left) and image (right)

As you can see, the image is 3D and we are concentrating on a rescaled single timepoint and channel for our feature extraction.

Napari-skimage-regionprops

First, we select Tools → Measurement tables → Regionprops (nsr):

Structure of the Tools menu in napari (when following the installation instructions under “Requirements”)

Next, we can select the image, the labels as well as the feature categories that we want to measure. Note that perimeter

-based parameters are not supported in 3D, so we cannot measure them in this example dataset:

Our output is a table with all parameters:

Output table of napari-skimage-regionprops (nsr)

We can save the table by clicking Save as csv.... Let us close the Regionprops (scikit-image, nsr) window by clicking on the eye symbol. That way we have more space for the table. Personally, I like to increase the table window to have a better overview:

Closing widgets in napari allows to keep an overview

Next, we are interested in individual labels. To visualize them individually, we can

Activate the Pick mode (4)
Make the image visible again
Tick the show selected checkbox
Click on an object we are interested in

Now, we see only the label we selected and the row in the table it corresponds to:

The Pick mode allows to visualize individual labels and the corresponding table cell

The cool thing is that if we click on another label, it will be automatically updated in the image and table:

Interestingly, the first object we clicked on seems to be way bigger than the rest. Lets visualize the area of all labels to investigate this. Therefore, we can double click the table header and get a visualization.

For better understanding and visibility, let us:

Go into gallery view
Untick the show selected checkbox. In this way, we can see again all labels in the label image
Double click on the table header of the column we want to investigate

The default colormap is called jet which I personally find sometimes confusing, so I will change the colormap to plasma. But you have plenty of colormaps to choose from under layer controls → colormap:

The layer controls allows for a variety of colormap choices.

Personally, choosing a colormap that is easy to understand helps me to explore features more easily.

Analyzing timelapse data with napari-skimage-regionprops

If we are analyzing a timelapse dataset with several frames, we need to select Tools → Measurement tables → Regionprops of all frames (nsr):

Now, you have a table with a column named “frame”. Hereby, note that if you want to import a custom table into napari, you also need to provide this “frame” column to specify the timepoint:

The “frame” column is needed to specify timepoints

Note: For more in-depth information, see also the documentation of napari-skimage regionprops.

Napari-simpleitk-image-processing

You can get the measurement table under Tools → Measurement tables → Measurements (n-SimpleITK) :

Again, perimeter-based parameters are not supported in 3D. So, for this example we are measuring all other feature categories.

Napari-simpleitk-image-processing widget

The following steps are exactly the same as for napari-skimage-regionprops (see above)

Analyzing timelapse data with napari-simpleitk-image-processing

If you are working with timelapse data, then select Tools → Measurement tables → Measurements of all frames (n-SimpleITK) :

Note: This plugin also provides filters and segmentation algorithms. You can find them under Tools. They have the suffix (n-SimpleITK). See the documentation of napari-simpleitk-image-processing for more detailed information

Morphometrics

You can install morphometrics for example via mamba:

Installation instruction: mamba install morphometrics -c conda-forge

Morphometrics allows us to get the parameters from these two plugins and the plugin napari-pyclesperanto-assistant at the same time. You can find this option under Tools → Measurement tables → Region properties (morphometrics):

Region properties widget from morphometrics

After you selected all measurements you want to derive, just hit the Run button and you will get a table with all measurements like in the examples above.

Using correlation matrices to reduce the number of features

If we now have a table with lots of parameters, it might make sense to investigate whether they are similar to reduce the number of features. We can investigate the similarity of features using correlation filtering. Hereby, our aim is to reduce the number of dimensions for consideration in a given dataset which is called dimensionality reduction.

In napari, we can use napari-accelerated-pixel-and-object-classification (APOC) to get a Feature correlation matrix. We can find it under Tools → Measurement tables → Feature correlation matrix (pandas, APOC). Then, we need to select our Label image as layer:

Feature correlation matrix (pandas, APOC) widget

After pressing the run button, we receive a correlation matrix:

Feature correlation matrix in napari-APOC: strongly correlating features (green) and weakly/ not correlating features (magenta)

Now, we can investigate which features are strongly correlating. By setting a threshold (e.g. to 0.95), we could select one of these strongly correlating measurements for downstream analysis. For example area and bbox_area are correlation with a factor of 0.994, so it would make sense to only choose one of them .

If you are later on interested to plot your measurements in napari, you can read the FocalPlane post “Explorative image data science with napari” by Robert.

Handling csv-files in napari

If you want to open a csv-file in napari, use Tools → Measurements → Load from csv (nsr).

Furthermore, you can reopen closed tables with Tools → Measurements → Show table (nsr).

A few things I would like to share with you along the way

Know the features you are working with.
Features can be implemented differently depending on the library.
When you don’t understand what a feature means, the best way to figure out is writing a jupyter notebook to test and visualize it.
It is important to know which range your feature can have, so you avoid measuring nonsense.

Feedback welcome

Some of the napari-plugins used above aim to make intuitive tools available that implement methods, which are commonly used by bioinformaticians and Python developers. Moreover, the idea is to make those accessible to people who are no hardcore-coders, but want to dive deeper into Bio-image Analysis. Therefore, feedback from the community is important for the development and improvement of these tools. Hence, if there are questions or feature requests using the above explained tools, please comment below, in the related github-repo or open a thread on image.sc. Thank you!

Acknowledgements

I want to thank Dr. Marcelo Zoccoler, Dr. Robert Haase and Dr. Kevin Yamauchi as the developers behind the tools shown in this blogpost. This project has been made possible by grant number 2022-252520 from the Chan Zuckerberg Initiative DAF, an advised fund of the Silicon Valley Community Foundation. This project was supported by the Deutsche Forschungsgemeinschaft under Germany’s Excellence Strategy – EXC2068 – Cluster of Excellence “Physics of Life” of TU Dresden.

Reusing this material

This blog post is open-access, figures and text can be reused under the terms of the CC BY 4.0 license unless mentioned otherwise.

(3 votes, average: 1.00 out of 1)

Tags: napari, feature extraction, glossary, microscopy, bioimage analysis, image data science
Categories: Bio-image Analysis with Napari, How to, Blog series

Get involved

Create an account or log in to post your story on FocalPlane.

feature	schematics	category	explanation	formula	implementation
area (or volume in 3D)		siz	= sum of pixels/ voxels of the region scaled by pixel-area/ voxel-volume.		nsr
aspect_ratio		sh	= the ratio between object width and object length.		nsr
bbox		po	= minimum range in each spatial dimension		nsr
bbox_area (or bbox_volume in 3D)		siz	= number of pixels/ voxels of bounding box scaled by pixel-area/ voxel-volume		nsr
centroid		po	= geometric center. It is the arithmetic mean position of all points in the surface of the figure. Therefore, the x and y coordinates of all image pixels are averaged.		nsr
circularity		sh	=the area-to-perimeter-ratio. It takes local irregularities into account.		nsr
convex_area		siz	= area of the convex hull of the region. (only available in 2D)		nsr
convex hull		siz	= smallest region that is convex and contains the original region.
eccentricity		sh	= describes how “elongated” a shape is compared to a perfect circle.		nsr
elongation		sh	= ratio between length and width of the object bounding box		n-SimpleITK
equivalent_diameter		siz	= diameter of a circle with same area as region.		nsr
equivalent_ellipsoid_diameter		siz	= diameter of an ellipsoid that has the same volume as the object.	uses principal axes for computation	n-SimpleITK
equivalent_spherical_perimeter		siz	= comparison of the perimeter of the object with the perimeter of a sphere with similar geometric properties.		n-SimpleITK
equivalent_spherical_radius		siz	= comparison of the radius of the object with the equivalent radius of a sphere with similar geometric properties		n-SimpleITK
extent		po/ siz	= Ratio of pixels in region to pixels in the total bounding box.		nsr
feret_diameter		sh	= distance between the two parallel planes restricting the object perpendicular to that direction		n-SimpleITK
feret_diameter_max		sh	= longest distance between points around a region’s convex hull contour (only available in 2D)		nsr
flatness		sh	= degree to which the surface of the object approximates a mathematical plane		n-SimpleITK
local_centroid		po	= average location of all the points within bbox of the region.		nsr
major_axis_length		sh	= longest diameter. It is a line segment that passesthrough the center as well as both foci and terminates at the two points on the perimeter that are the furthest apart. Therefore, it is a measure of object length.		nsr
max_intensity/ maximum		int	= highest intensity value in the region		nsr/ n-SimpleITK
mean_intensity/ mean		int	= mean intensity value in the region		nsr/ n-SimpleITK
median		int	= median intensity value in the region		n-SimpleITK
min_intensity/ minimum		int	= lowest intensity value in the region		nsr/ n-SimpleITK
minor_axis_length		sh	= shortest diameter. It is a line segment that passes through the center and terminates at the two points on the perimeter that are closest to one another. Therefore, it is a measure of object width.		nsr
moments		mom	= Spatial moments up to 3rd order	`m_ij = sum{ array(row, col) * row^i * col^j }`	nsr
moments_central		mom	= Central moments (translation invariant) up to 3rd order.	`mu_ij = sum{ array(row, col) * (row - row_c)^i * (col - col_c)^j }`	nsr
moments_hu		mom	= Hu moments (translation, scale and rotation invariant)		nsr
moments_normalized		mom	= Normalized moments (translation and scale invariant) up to 3rd order	`nu_ij = mu_ij / m_00^[(i+j)/2 + 1]`	nsr
number_of_pixels (or number of voxels in 3D)		siz	= count of pixels/ voxels within a label		n-SimpleITK
number_of_pixels_on_border (or number of voxels on border in 3D		siz	= count of pixels/ voxels within a label that is located at the image border		n-SimpleITK
orientation		po	= overall direction of shape. It is calculated as the angle between the major axis of an ellipse that has same second moments as the region and a reference axis, such as the x-axis.		nsr
perimeter		siz/ sh	= uses a 4- connectivity to represent the contour as a line through the center of border pixels (only available in 2D)		nsr/ n-SimpleITK
perimeter_crofton		sh	= perimeter approximated by Crofton formula in 4 directions (only available in 2D)		nsr
perimeter_on_border		sh	= number of pixels in the objects which are on the border of the image (only available in 2D)		n-SimpleITK
perimeter_on_border_ratio		sh	= describes the ratio between the number of pixels at the image border divided by the number of pixels on the object’s perimeter (only available in 2D)		n-SimpleITK
principal_axes		mom	= Principal (major and minor) axes are those axes passing through the centroid, about which the moment of inertia of the maximal or minimal region. It is the axis around which the object rotates the easiest or the most stable.		n-SimpleITK
principal_moments		mom	= values that describe how much the object resists rotation around each of its principal axes.		n-SimpleITK
roundness		sh	= describes the area-to-perimeter- ratio. In contrast to circularity it excludes local irregularities by using the convex perimeter instead of the perimeter.		nsr/n-SimpleITK
sigma (intensity)		int	= information on structure scale or local contrast (higher values → smoother/ more globally homogeneous region; lower values → sharper/ more locally distinct structures.		n-SimpleITK
solidity		sh	= measures the density of an object. (only available in 2D)		nsr
standard_deviation_intensity		tex	= standard deviation of gray values used to generate the mean gray value.		nsr
sum (intensity)		int	= sum of the intensity values in the region		n-SimpleITK
variance (intensity)		int	= measure of dispersion of numbers from their average value		n-SimpleITK
weighted_centroid		po	= some parts of an object get higher ‘weight’ than others. Therefore, the centroid coordinate is weighted with the intensity image.		nsr

Feature extraction in napari

Definition of feature extraction

Categories of features

Glossary

Feature extraction in napari

Requirements

Napari-skimage-regionprops

Analyzing timelapse data with napari-skimage-regionprops

Napari-simpleitk-image-processing

Analyzing timelapse data with napari-simpleitk-image-processing

Morphometrics

Using correlation matrices to reduce the number of features

Handling csv-files in napari

A few things I would like to share with you along the way

Further reading

Feedback welcome

Acknowledgements

Reusing this material

Leave a Reply Cancel reply

Get involved

More posts like this

Filter by

Definition of feature extraction

Categories of features

Glossary

Feature extraction in napari

Requirements

Napari-skimage-regionprops

Analyzing timelapse data with napari-skimage-regionprops

Napari-simpleitk-image-processing

Analyzing timelapse data with napari-simpleitk-image-processing

Morphometrics

Using correlation matrices to reduce the number of features

Handling csv-files in napari

A few things I would like to share with you along the way

Further reading

Feedback welcome

Acknowledgements

Reusing this material

Share this:

Leave a Reply Cancel reply

Get involved

More posts like this

Filter by

Share this: