List of Data Sets

Columns: identifier, preview, imaging details, biological terms, format, ground truth

Segmentation of membrane of mouse, sea urchin and human oocytes from transmitted light images

1908 Oocyte in transmitted light Oocyte, Label free, Maturation, cell membrane, mouse, human oocyte, sea urchin oocyte

Binary images associated with each source image of the oocyte cytoplasmic contour.

Movies of mouse oocyte maturation in transmitted light

1907 Oocyte image

2D images, 1 channel (transmitted light).

Temporal resolution: every 3 min.

Spatial resolution: 0.227 µm/pixel

Oocyte, Meiosis, Maturation, mouse tiff

ShareLoc

1834 Smlmshareloc

DNA-PAINT

STORM

 

microtubules, nuclear pores, actin, vimentin, cytoskeleton txt

LIVECell

1824 LIvecelldatabase

Several wells for each cell type were seeded in 96-well plates (Corning) and imaged over the course of 3–5 d, every 4 h using an Incucyte S3 Live-Cell Analysis system (Sartorius) equipped with its standard CMOS camera (Basler acA1920-155um). The Incucyte HD phase-contrast imaging algorithm allows visualization of phase delays produced by cells without the phase annulus or phase ring found in conventional Zernike phase images. Because of this, LIVECell phase-contrast images are characterized by less pronounced halo artifacts and more high-frequency content than other phase-contrast modalities. Phase-contrast images were acquired using a ×10 objective from two positions in each well adding up to a total of 1,310 images (1,408 × 1,040 pixels corresponding to 1.75 × 1.29 mm²) that were each cropped into four equally sized images (704 × 520 pixels corresponding to 0.875 × 0.645 mm²) that were then annotated.
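The cropping described above can be sketched in a few lines. This is only an illustration of splitting a 1,408 × 1,040 pixel frame into four equal quadrants; the function name `quadrant_boxes` and the tile order are assumptions, not the LIVECell authors' actual pipeline.

```python
def quadrant_boxes(width, height):
    """Return (left, top, right, bottom) pixel boxes for four equal quadrants.

    Hypothetical helper: the tile order (row by row, left to right) is an
    assumption and may differ from the order used for the published crops.
    """
    half_w, half_h = width // 2, height // 2
    return [
        (x, y, x + half_w, y + half_h)
        for y in (0, half_h)
        for x in (0, half_w)
    ]


# Full-frame size stated in the dataset description.
boxes = quadrant_boxes(1408, 1040)

# Approximate pixel size implied by the stated field of view
# (1.75 mm across 1,408 pixels), in micrometres per pixel.
pixel_size_um = 1750 / 1408
```

Each box is 704 × 520 pixels, which matches the crop size given in the description.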

cells, glioblastoma, breast cancer, Microglia, neuroblastoma, ovarian cancer tif

Annotations are in JSON format and follow the COCO - Common Objects in Context (cocodataset.org) format.
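As a rough sketch of how such annotations might be inspected, the snippet below counts annotations per image using only the Python standard library. It assumes the usual COCO layout with top-level `images` and `annotations` entries; the function name and the in-memory example are hypothetical, and a real file would first be read with `json.load`.

```python
import json


def annotations_per_image(coco):
    """Count annotations per image file name in a COCO-style dictionary."""
    id_to_name = {img["id"]: img["file_name"] for img in coco["images"]}
    counts = {name: 0 for name in id_to_name.values()}

    annotations = coco["annotations"]
    # Defensive assumption: accept a mapping of id -> annotation as well as
    # the list that the COCO format normally uses.
    if isinstance(annotations, dict):
        annotations = annotations.values()

    for ann in annotations:
        counts[id_to_name[ann["image_id"]]] += 1
    return counts


# Tiny made-up example; these file names are not from the dataset.
example = json.loads("""
{
  "images": [{"id": 1, "file_name": "a.tif"}, {"id": 2, "file_name": "b.tif"}],
  "annotations": [
    {"id": 10, "image_id": 1},
    {"id": 11, "image_id": 1},
    {"id": 12, "image_id": 2}
  ]
}
""")
per_image = annotations_per_image(example)
```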

Nanotomy: Large-scale electron microscopy (EM) datasets

1783 nanoscopy

Includes data from CLEM, TEM and STEM, EDX, ...

brain, cells, cell organelles jpg

No segmentation mask available, but all data associated papers are available.

SNEMI3D: 3D Segmentation of neurites in EM images

1719 SNEMI3D

Volume EM

neuron, neurites tif

All labels (2D membrane probabilities and 3D labels) for 3D segmentation are provided for one data set; the remaining data have no public ground truth because they are used for the competition.

Cell-IDR

1718 Cell-IDR

Wide range of imaging modalities, from wide-field and super-resolution microscopy to spatial DNA sequencing.

cells tif

Tissue-IDR

1717 Tissue-IDR

Both stained brightfield and fluorescence datasets of tissues.

tissue tiff

EMPIAR

1716 empiar

Contains 3D cryo-EM tomograms and maps, volume EM data sets (FIB-SEM, SBF-SEM), and hard and soft X-ray tomography.

It also contains correlative light and electron microscopy (CLEM) experiments. See the experiment list EMDB < Search results (ebi.ac.uk) for the full list of modalities.

cells, proteins tiff

Cytomine Open Collection

1715 cytomine collection

From brightfield slide scanning with different stains. Available in TIFF, SVS or NDPI.

tissue tiff

No annotations associated

CXIDB

1714 CXIDB

Only from Coherent X-Ray imaging

cells, yeast, Virus

No ground truth associated

Cell Image Library

1713 CIL

All microscopy formats accepted

cells, cell organelles, tissue tif

Not associated with ground truth (such as segmentation masks)

Cell Tracking Challenge Dataset

1667 example of 3D cell tracking data set

Microscopy settings for each dataset are available by clicking on the "More Details" button.

cell motility, cell migration tif

Reference Annotations – Cell Tracking Challenge

Systems Science of Biological Dynamics database (SSBD:database)

1644

Diverse microscopes (mostly light microscopy, but some electron microscopy as well). SSBD tries to store and curate 4D datasets, i.e. 3D images with an added time dimension.

It also stores all the quantitative data (i.e. segmented data or ROIs) separately as numerical datasets. Quantitative data are represented in a unified data format, the Biological Dynamics Markup Language.

Nuclei segmentation in histopathology images

1602 histopathology, nuclei

The dataset contains ground truth annotation for the segmentation of the nuclei.

MoNuSeg - Multi-organ nuclei segmentation challenge

1601 histopathology, nuclei

FAIRsharing Euro-BioImaging collection

1579 FAIRsharing logo

Diverse

Some have ground truth available, such as the Broad Bioimage Benchmark Collection (BBBC, https://doi.org/10.25504/FAIRsharing.j766zb, https://data.broadinstitute.org/bbbc). 

muscle cross-sections

1577 mosaick

Immunofluorescent sections were imaged on a Nikon AR1 confocal or Nikon Widefield CCD Microscope. Each confocal image is a composite of maximum projections, derived from stacks of optical sections.

Muscle Stem Cells, Muscle tiff

No. Data used to exemplify muscleQNT: Muscle fiber counting

Mouse embryos

1576 embryo DIC

There are 15 images. The images were acquired using a Nikon Eclipse TE200 microscope with a 20x, 0.45 NA objective lens and a 0.52 NA condenser lens, and are provided courtesy of the W.M. Keck 3D Fusion Microscope Facility at Northeastern University. Each image contains 640 x 480 pixels with an approximate pixel size of 0.42 x 0.42 μm.

embryo, cells tiff

For the purpose of collecting ground truth, the samples were Hoechst-stained and imaged by confocal microscopy, and the cells were counted by a single human. A tab-delimited text file contains cell counts in each of the 15 images.

two-photon images of dendritic spines

1575 MIP example

Two-photon imaging was performed using a galvanometer-based scanning system (Prairie Technologies, acquired by Bruker Inc.) on an Olympus BX61WI equipped with 60X water immersion objective (0.9 NA), using a Ti:sapphire laser (Coherent Inc.) controlled by PrairieView software at 910 nm. Z-stacks (0.3 μm axial spacing) from secondary or tertiary dendrites from CA1 neurons were collected every 5 min for up to 4 h. The field of view was 19.8 × 19.8 μm at 1024 × 1024 pixels.

Dendritic Spine

Annotated data and mask labels provided

Human colon tissue

1574 Colon tissue

The dataset was generated using the virtual microscope imitating the microscope Zeiss S100 (objective Zeiss 63x/1.40 Oil DIC) attached to confocal unit Atto CARV and CCD camera Micromax 1300-YHS.

tissue, cells tiff

Ground Truth is provided as binary masks.

Clustered Cell Nuclei Data

1573 HL60 cells

The dataset was generated using the virtual microscope imitating the microscope Zeiss S100 (objective Zeiss 63x/1.40 Oil DIC) attached to confocal unit Atto CARV and CCD camera Micromax 1300-YHS

nuclei, HL60 cells tiff

Ground Truth is provided as segmentation mask.

CSIRO science image library

1571 plant, textiles, minerals

Breast Cancer Histopathological Database (BreakHis)

1569

Histopathology images, 700 × 460 pixels, 8-bit RGB, stored as PNG

breast cancer, histopathology

Image patch classified as benign or malignant

Zebrafish larvae - Widefield/Brightfield

1559 zebrafish tif

Medaka embryo in 96 well plate - Widefield Brightfield

1558 Preview medaka, embryo jpg

ANHIR: Automatic Non-rigid Histological Image Registration

1472 Logo pathology

The Ground truth is denoted using landmarks - key points that are marked consistently for each set of images with different stains.

BBBC002v1

1434 Drosophila Kc167 cells

There are 10 fields of view of each sample, for a total of 50 fields of view. The images were acquired on a Zeiss Axiovert 200M microscope. The images provided here are a single channel, DNA. The image size is 512 x 512 pixels. The images are provided as 8-bit TIFF files.

Drosophila melanogaster, RNAi tiff

A tab-delimited text file contains the number of cells in each image, as determined by two different human counters. To compare an algorithm's results to these, first compute for each sample the algorithm's mean cell count over the 10 images of the sample. Next, calculate the absolute difference between this mean and the average of the humans' counts for the sample, then divide by the latter to obtain the deviation from ground truth (in percent). The mean of these values over all 5 samples is the final result.

Note: The counts of the two human observers differ by 16% for this image set.
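The comparison procedure described above can be written as a short function. This is a minimal sketch, assuming the counts have already been read from the tab-delimited file into dictionaries keyed by sample; the function name, the dictionary layout, and the numbers in the example are illustrative, not part of the benchmark.

```python
from statistics import mean


def mean_percent_deviation(algorithm_counts, human_counts):
    """Mean percent deviation from the human counts over all samples.

    algorithm_counts: sample name -> list of per-image algorithm counts
    human_counts:     sample name -> list of human counts for that sample
                      (assumed here to pool both counters and all images)
    """
    deviations = []
    for sample, counts in algorithm_counts.items():
        algorithm_mean = mean(counts)            # mean over the sample's images
        human_mean = mean(human_counts[sample])  # average of the humans' counts
        deviations.append(100 * abs(algorithm_mean - human_mean) / human_mean)
    # Final result: mean deviation (in percent) over all samples.
    return mean(deviations)


# Made-up numbers for two samples, only to show the call.
algorithm = {"s1": [10, 12], "s2": [20, 20]}
humans = {"s1": [10, 10], "s2": [25, 25]}
result = mean_percent_deviation(algorithm, humans)
```

In this made-up example the per-sample deviations are 10% and 20%, so the final result is 15%.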

Diadem challenge

1191 Diadem challenge

Imaging varies by neuron type and species, includes:

  1. Transmitted light brightfield
  2. Confocal
  3. In vivo two-photon laser scanning

neurite arbor, neuron, fiber

Manually traced digital neural reconstructions.

CREMI challenge

1182 cremi samples A and B

Serial section Transmission Electron Microscopy (ssTEM).

hdf5

  • Neuron segmentation
  • Synaptic cleft segmentation
  • Synaptic partner annotation

Arabidopsis plants (low resolution)

1146 arabidopsis plant (low resolution) Arabidopsis thaliana jpg

Arabidopsis plants (high resolution)

1145 arabidopsis plant (high resolution) Arabidopsis thaliana

Leaves stained with GFP and RFP

1143 leaf infection image plant virus, leaf infection

Artemia color images

1139 artemia color image Artemia tif

Microtubules 3D

1137 3D microtubules microtubules, cytoskeleton

Arabidopsis thaliana seedlings

1131 thumbnail of Arabidopsis thaliana seedlings Arabidopsis thaliana seedlings jpg

2D bright field yeast cell images with ground truth annotations

41 2D bright field yeast cell images

Bright field microscopy

yeast

Yes.

  1. Binary segmentation masks
  2. Text files with the center-point coordinates of in-focus cells