Documentation: Roles, Entry standard and how to curate (read first)

lascholz Mon, 02/04/2019 - 18:30

Roles

Newbie tagger: When you create an account, you become a newbie tagger. Newbie taggers can create new content but can edit only their own content. A newbie tagger who wishes to edit an existing entry created by someone else may ask to become a confirmed tagger.

Confirmed tagger: A confirmed tagger can revert revisions, delete content, edit any content, and publish comments that have been submitted for publication. To review submitted comments and publish them, confirmed taggers go to the shortcut "validate/edit comments".

ALL: Please use the forum to give us feedback on additional features, report problems in user experience, suggest use cases...

Entry information standard

Software entries (not datasets or training materials) will be classified into 3 main tiers according to their degree of completeness.

The standard described here (still under discussion) provides guidelines to support BIII.eu webtool curators to monitor the webtool content and tagging. This standard was adapted from the Tool information standard documentation from ELIXIR bio.tools. Thanks to Jon Ison for referencing this documentation.

This standard comprises a list of entry attributes to be specified for a software entry to be classified in a 3-tier rating of entry completeness and quality in BIII.eu.

BIII.eu includes two ontologies in its framework: BISE-core-ontology and EDAM-Bioimaging.

In addition, we provide curation guidelines describing how each attribute should be specified to ensure the quality of BIII.eu entries. These guidelines are not limited to the syntactic and semantic constraints defined by the EDAM-Bioimaging ontology and BISE-core-ontology.

The standard provides a basis for monitoring of content and labelling of BIII.eu entries initially by:

Entry information standard in 3 Tiers

Entry information standard attribute groups

The standard is applied to BIII.eu as follows (condensed information available from the tables):

Guidelines to BIII.eu taggers and curators

BIII.eu is a web-based database of bioimage analysis tools, covering Software, Training material and Datasets. The guidelines presented here will help newbie taggers and confirmed taggers to add and curate software entries in BIII.eu. Software entries in the webtool range from simple components (e.g. Gaussian filter) to image processing libraries, collections of components, and workflows (e.g. single particle tracking).

The detailed description of the types of tools that can be included in the BIII.eu webtool is still under discussion. However, we ask curators and taggers to include only tools that can actually be used and that relate to image analysis problems in biology (a.k.a. bioimage analysis). For example, we do not seek publications without a publicly available implementation of the code. Commercial software can be accepted if it is specified as such.

The tools are described using two ontologies, BISE-core-ontology and EDAM-Bioimaging. BISE-core-ontology contains the structure of description of entries in BIII.eu, not only software, and includes entry properties such as author, reference publication, curator and so on. EDAM-Bioimaging is used as a source of terms to describe the entry with Bioimaging related vocabulary.

In BISE, a software entry describes a bioimage analysis tool, which can be classified as a component, a collection or a workflow. A component is an implementation of certain image or data processing / analysis algorithms. A component alone does not solve a bioimage analysis problem; such problems are addressed by combining components into workflows. A workflow is a set of components assembled in some specific order to process bioimages and estimate some numerical parameters relevant to the biological system under study. Workflows take image data as input and output either processed images or other types of data (usually numeric values). Workflows can combine components from the same or different software packages. Finally, a collection is software that encapsulates a set of bioimage components and/or workflows, e.g. libraries such as imglib2 and scikit-image or general purpose software such as Fiji and Icy.

OBSERVATION: These curation guidelines were inspired by biotoolsDocs documentation. Part of it has been adapted or used as is from the original biotoolsDocs. Thanks to Jon Ison for referencing biotools documentation efforts.

If you wish to suggest changes or additions to this documentation, please raise an issue to begin a discussion.

How and what to curate?

Try to fill in as many fields as possible. If a field definition is unclear, report it in the BIII.eu forum or raise an issue in the GitHub bise documents repository. Also note that there is an Entry Information Standard documentation for BISE, where you can check how your entry will be interpreted and curated by confirmed taggers. The entry information standard classifies entries by their degree of completeness, from 'sparse' to 'comprehensive'.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Before you start

When you add new content, you will see whether similarly named tools already exist. Please check whether one of them is the tool you wanted to tag; if not, name your entry differently (example: Erosion (Icy) vs Erosion (MorpholibJ)). The purpose is to have a unique title for each entry. Whenever possible, try to add both the collection and an entry for each component: e.g. MorpholibJ is an ImageJ plugin, and erosion in MorpholibJ, watershed in MorpholibJ, etc. are all components inside MorpholibJ that can each have their own entry. The purpose is to help analysts find the component they need when constructing a bioimage analysis workflow.

Consider the following before adding a software entry in BIII.eu

  1. Are one or more entries required to describe the software?
  2. What if the software is already registered?
  3. Are there version-specific considerations?
  4. Plan how to describe the entry functions (under construction).
  5. Read the general EDAM annotations guidelines (under construction).

Attribute Guidelines

The guidelines below are organized into sections as they appear in the create software page in BIII.eu webtool.

Name (Software)

Canonical software name assigned by the tagger, preferably the software developer or service provider, e.g. "Fiji"

1. MUST use name in common use, e.g. in the tool homepage or publication.

2. MUST use short form if available e.g. MaMuT not MaMuT: A Fiji plugin for the annotation of massive, multi-view data.

3. MUST NOT include general or technical terms ("software", "application", "server", "service", "plugin", "app", "add-on" etc.) unless these are part of the common name

4. MUST NOT misappropriate the names of other tools, e.g. there are many Erosion implementations; calling any of them "Erosion" would be wrong.

5. MUST NOT include version information unless this is part of common name (under discussion - there is still no field for version)

6. SHOULD preserve original capitalisation e.g. MaMuT not mamut.

7. SHOULD follow the naming patterns (see below)

Naming Patterns

For components that are part of a collection, use the pattern

{collectionName} toolName

For tools that simply wrap or provide an interface to some other tool (NOTE: this is still uncommon in bioimage analysis and was a particular situation in biotools. It needs more discussion), use the pattern

{collectionName} toolName {API|WS}{(providerName)} e.g. EMBOSS water API (ebi)

where:

In exceptional cases (i.e. when registering, as separate entries, versions of a tool with fundamental differences), substitute the following for toolName in the pattern above:

toolname versionID e.g. ilastik 0.5

where versionID is the version number.

Tip:

  • In case of multiple related entries, be consistent, e.g. Open PHACTS and Open PHACTS API.

  • Be wary of names that are very long (>25 characters). If shortening the name is necessary, don't truncate it in a way (e.g. in the middle of a word) that would render it meaningless or unintuitive.
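Some of the name rules and the length tip above lend themselves to a quick automated check before an entry is submitted. The following is only a minimal sketch, not part of the BIII.eu webtool: the function name lint_entry_name is hypothetical, the list of generic terms is copied from rule 3, and the 25-character limit comes from the tip. Because rule 3 allows such terms when they are part of the common name, the function returns warnings for a human to review rather than rejecting the name.

```python
from __future__ import annotations

import re

# Generic terms listed in rule 3 of the Name guidelines.
GENERIC_TERMS = ("software", "application", "server", "service",
                 "plugin", "app", "add-on")
# The tip treats names longer than this as "very long".
MAX_NAME_LENGTH = 25


def lint_entry_name(name: str) -> list[str]:
    """Return warnings for a proposed entry name (hypothetical helper)."""
    warnings = []
    if not name.strip() or name != name.strip():
        warnings.append("name is empty or has surrounding whitespace")
    if len(name) > MAX_NAME_LENGTH:
        warnings.append(f"name is longer than {MAX_NAME_LENGTH} characters")
    for term in GENERIC_TERMS:
        # Whole-word, case-insensitive match, so "Mapper" does not trigger "app".
        pattern = rf"(?<![\w-]){re.escape(term)}(?![\w-])"
        if re.search(pattern, name, flags=re.IGNORECASE):
            warnings.append(f"contains generic term '{term}'")
    return warnings


if __name__ == "__main__":
    print(lint_entry_name("MaMuT"))
    print(lint_entry_name(
        "MaMuT: A Fiji plugin for the annotation of massive, multi-view data"))
```

A check like this cannot decide whether a name is "in common use" or misappropriates another tool's name; those rules still need a curator's judgement.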

Description

Textual description of the software, e.g. "The neuTube is a collection of neuron reconstruction tools from fluorescence microscope images. It has an interactive system with a 3D viewer, which can be clicked in 3D and perform neuron tracing automatically and semi-automatically. It can automatically recognize branching points as junctions. Traced neurons can be exported to swc format, which could be imported by various software packages. neuTube has Win and Mac OS standalone executable builds and may also be installed by manual compilation."

Example 2: "All-path-pruning 2.0 (APP2) is neuron tracing (fully automated) component of Vaa3D. APP2 prunes an initial reconstruction tree of a neuron’s morphology using a long-segment-first hierarchical procedure instead of the original termini-first-search process in APP. APP2 computes the distance transform of all image voxels directly for a gray-scale image, without the need to binarize the image before invoking the conventional distance transform. APP2 uses a fast-marching algorithm to compute the initial reconstruction trees without pre-computing a large graph. This method allows to trace large images. This method can be used with default parameters or user-defined parameters."

1. MUST provide a concise summary of the purpose / function of the tool. We RECOMMEND that the description be 1-2 short paragraphs.

2. MUST begin with a capital letter and end with a period ('.')

4. SHOULD NOT include any of the following, unless essential to distinguish the tool from other entries:

  • provenance information e.g. software provider, institute or person name

5. SHOULD NOT describe how good the software is (mentions of applicability are OK)

6. SHOULD NOT include URLs
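The mechanical parts of the description rules (capital letter, final period, no URLs, 1-2 paragraphs) can be checked automatically. The sketch below is an illustration under assumptions, not an existing BIII.eu feature: lint_description is a hypothetical name, a "URL" is approximated as anything starting with http://, https:// or www., and paragraphs are assumed to be separated by blank lines.

```python
from __future__ import annotations

import re

# Rough approximation of "a URL" for the purpose of this check (assumption).
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)


def lint_description(text: str) -> list[str]:
    """Return the mechanical rule violations found in a description."""
    stripped = text.strip()
    if not stripped:
        return ["description is empty"]
    problems = []
    if not stripped[0].isupper():
        problems.append("does not begin with a capital letter")
    if not stripped.endswith("."):
        problems.append("does not end with a period")
    if URL_PATTERN.search(stripped):
        problems.append("contains a URL")
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", stripped) if p.strip()]
    if len(paragraphs) > 2:
        problems.append("longer than the recommended 1-2 paragraphs")
    return problems


if __name__ == "__main__":
    print(lint_description("Traces neurons in 3D fluorescence images."))
    print(lint_description("see http://example.org for details"))
```

The remaining rules (conciseness, no provenance information, no claims about how good the software is) are matters of content and are left to the tagger and curator.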

Author

One or more strings that identify the author(s) of the tool.

1. Each author item MUST correspond to a single individual. In case an individual is not known to be the author, the name of an institution is RECOMMENDED.

2. Each author item MUST follow the pattern LastName, FirstName, where LastName MUST NOT be abbreviated and FirstName SHOULD NOT be abbreviated. When the first name is long, it is RECOMMENDED to abbreviate it fully or partially, e.g. SCHOLZ, LEANDRO A. (orcid.org/0000-0002-2411-0429).

3. SHOULD (if available) include the author's ORCID iD using the template LastName, FirstName (orcid.org/xxxx-xxxx-xxxx-xxxx), so that the author can be contacted more easily. To find the ORCID iD, search the web for "orcid" plus the author's name.
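The author template above is regular enough to be parsed. The following is a minimal sketch under assumptions: parse_author and orcid_checksum_ok are hypothetical helper names, and the regular expression is one possible reading of the template (it rejects items without a comma and items with more than one comma). The checksum step uses the ISO 7064 MOD 11-2 scheme that ORCID documents for the final character of an iD, which may be a digit or X.

```python
from __future__ import annotations

import re
from typing import Optional

# "LastName, FirstName" with an optional "(orcid.org/xxxx-xxxx-xxxx-xxxx)" suffix.
AUTHOR_PATTERN = re.compile(
    r"(?P<last>[^,()]+),\s*(?P<first>[^,()]+?)"
    r"(?:\s*\(orcid\.org/(?P<orcid>\d{4}-\d{4}-\d{4}-\d{3}[\dX])\))?"
)


def orcid_checksum_ok(orcid: str) -> bool:
    """Verify the final check character of an ORCID iD (ISO 7064 MOD 11-2)."""
    chars = orcid.replace("-", "")
    total = 0
    for ch in chars[:-1]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[-1] == expected


def parse_author(item: str) -> Optional[dict]:
    """Split an author item into its parts, or return None if it is malformed."""
    match = AUTHOR_PATTERN.fullmatch(item.strip())
    if match is None:
        return None
    orcid = match.group("orcid")
    if orcid is not None and not orcid_checksum_ok(orcid):
        return None
    return {
        "last": match.group("last").strip(),
        "first": match.group("first").strip(),
        "orcid": orcid,
    }


if __name__ == "__main__":
    print(parse_author("SCHOLZ, LEANDRO A. (orcid.org/0000-0002-2411-0429)"))
    print(parse_author("Jane Doe"))  # no comma, so not in the required order
```

A parser of this kind cannot tell whether a name is abbreviated or whether an item is an institution rather than an individual; it only confirms the shape of the string and the internal consistency of the ORCID iD.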

Illustrative image

An illustrative image that represents the main functionality of the software entry.

1. It SHOULD represent the main software functionality or a screenshot of the UI in use. In cases where a single image cannot show the main functionality of the software entry (usually happens for general purpose software and libraries) the illustrative image SHOULD be the logo. The software entry will not be promoted on the front page without an illustrative image.

License/Openness

There are only 4 discrete values for this attribute: Commercial, Free and open source, Free but not open source, and I do not know.

  • Commercial is used when the software needs to be purchased in order to be used.

  • Free and open source is selected when the source code is available and the software does not need to be purchased in order to be used.

  • Free but not open source is used when the source code is not available (closed source) but the software is free to use.

  • I do not know is used when the License/Openness is not known. This value SHOULD be avoided.

1. If an entry is shareware with different License/Openness values (e.g. a commercial version and a free version with fewer features), it SHOULD have both values selected. However, we discourage users from adding entries of this type.

Entry curator

A link to a BIII.eu user who is either: (1) the user who added the entry but is not the rightful owner of the tool (when the tool is first added), (2) a confirmed tagger who checked and curated the entry, or (3) the rightful owner of the entry (i.e. the tool developer or provider of the online service). After a tool is added to BIII.eu, the value of Entry curator changes upon curation, either to the confirmed tagger who curated the tool or to the rightful owner of the tool.

Download page

Homepage of the software from which it is possible to download the software, or some URL that best serves this purpose, e.g. "http://icy.bioimageanalysis.org/".

1. MUST resolve to a web page from the developer / provider that most specifically provides a downloadable version of the software or has a link to its source code.

2. MUST be restricted to http(s?)://[^\s/$.?#].[^\s]*

3. The link to a Download page that does not work anymore SHOULD be removed and replaced by a new, working link.

TIP: In case a tool lacks its own website, a URL of its code repository is OK. Do not use a general URL such as an institutional homepage.
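Rule 2 above gives the allowed form of a Download page value as a pattern. As a minimal sketch, assuming that the pattern is meant as a regular expression that must match the whole value, it can be applied as follows (is_valid_download_url is a hypothetical helper name, not a BIII.eu function):

```python
import re

# The pattern from rule 2, used verbatim.
DOWNLOAD_URL_PATTERN = re.compile(r"http(s?)://[^\s/$.?#].[^\s]*")


def is_valid_download_url(url: str) -> bool:
    """Check the syntactic form only; this does not test that the link resolves."""
    return DOWNLOAD_URL_PATTERN.fullmatch(url) is not None


if __name__ == "__main__":
    for candidate in ("http://icy.bioimageanalysis.org/",
                      "ftp://example.org/tool.zip",
                      "https://"):
        print(candidate, is_valid_download_url(candidate))
```

The pattern only restricts the scheme and excludes whitespace. Whether the page still works and actually offers a download (rules 1 and 3) has to be verified by opening the link.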

Reference publication

A URL that links to a reference publication that presents the tool.

1. MUST resolve to a web page of a journal article or web page of a preprint server that most specifically links to a reference publication.

2. SHOULD preferably be a DOI link in the form https://doi.org/+DOI (e.g. https://doi.org/10.1371/journal.pbio.1002128). Normal URL links to the reference publication are OK but discouraged.

3. SHOULD preferably resolve to a publication where the tool was first introduced.

4. MAY receive more than one Reference Publication (use Add another item button).

5. The link to the reference publication that does not work anymore SHOULD be removed and replaced by a new, working link.
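References are often copied in different forms (a bare DOI, a "doi:" prefix, an older dx.doi.org link). The sketch below shows one way to normalise them to the preferred https://doi.org/+DOI form from rule 2. It rests on assumptions: to_doi_link is a hypothetical helper, the list of prefixes is illustrative, and the DOI shape test (10. followed by 4-9 digits, a slash and a suffix) is a common heuristic rather than a complete definition of a DOI.

```python
from __future__ import annotations

import re
from typing import Optional

# Heuristic shape of a DOI: "10.", a 4-9 digit registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")
# Prefixes that are stripped before the DOI itself is examined (illustrative).
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def to_doi_link(reference: str) -> Optional[str]:
    """Return the https://doi.org/ link for a DOI-like reference, else None."""
    candidate = reference.strip()
    lowered = candidate.lower()
    for prefix in KNOWN_PREFIXES:
        if lowered.startswith(prefix):
            candidate = candidate[len(prefix):].strip()
            break
    if DOI_PATTERN.fullmatch(candidate):
        return "https://doi.org/" + candidate
    return None


if __name__ == "__main__":
    print(to_doi_link("doi:10.1371/journal.pbio.1002128"))
    print(to_doi_link("https://example.org/article/42"))  # not a DOI: None
```

A None result does not make the reference invalid, since normal URL links are accepted (though discouraged); it only signals that no DOI link could be derived from the given value.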

Documentation

A URL that links to a source of information about the use, installation and applications of the software. Accepts more than one documentation attribute entry.

1. MUST resolve to a web page from which one can obtain information about how to use the software. For example, a link to a user’s manual (e.g. the Neural Circuit Tracer main page, or the link to the User guide itself), a wiki page (e.g. the Anamorf Wiki), or another link from which similar information can be obtained (for example, the Using Fiji page or a readme page with detailed information about the software).

2. From all the options above, it is RECOMMENDED that, if only one Documentation attribute is given, the URL to the documentation resolves to the most used and comprehensive source of information about the software, no matter its type (wiki, pdf file, web page linking to other pages, etc.).

3. MAY receive more than one Documentation link (use Add another item button).

4. The link to a documentation page that does not work anymore SHOULD be removed and replaced by a new, working link.

Has usage example

A URL that links to a usage example, such as a case study document (pdf, web page, video or other type of media), training material (also of any type of media, but preferably existing in BIII.eu), or a workflow in which the tool is used (for components).

1. MAY link to an existing node in BIII.eu database (e.g. a workflow, a training material, a dataset).

2. MAY receive more than one usage example link (use Add another item button).

3. The link to a usage example that does not work anymore SHOULD be removed and replaced by a new, working link.

Has comparison

A URL that links to a document, preferably in written media (pdf, slide deck), showing a comparison of the tool against other tools that perform the same or a very similar job. Examples: a link to a research paper that benchmarks several tools, a link to the web page of a Challenge in which the tool is included, or a reference to other web pages where the comparison is available. BIAFLOWS, the benchmarking webtool from NEUBIAS WG5, could be a source for this attribute. However, we are still discussing how to interact with it.

1. MUST resolve to a web page that shows results of a comparison of the tool against other similar tools.

2. It is RECOMMENDED that the description (Link text) of the URL indicates the part of the document (a figure, a page or a reference in that document) where the results of the comparison are.

3. MAY link to an existing node in BIII.eu database (e.g. a training material).

4. MAY receive more than one comparison link (use Add another item button).

5. The link to a comparison that does not work anymore SHOULD be removed and replaced by a new, working link.

A single DOI link to the software, related to the correct version of the tool described in the entry. There are many options out there (see this blog post from Datacite), but the most commonly used is Zenodo.

1. MUST be a DOI link in the form https://doi.org/+DOI (e.g. https://doi.org/10.5281/zenodo.30769).

Has Training material

A link to an existing training material node in the BIII.eu database.

1. MUST link to an existing training material node in the BIII.eu database (e.g. http://biii.eu/node/1366).

2. MAY receive more than one training material link (use Add another item button).

The following entry attributes (Has function, Has Topic, Has biological terms) may be considered the most important in BIII.eu. It is with them that the database will be able to connect the tools and make them searchable by bioimage analysts, developers and biologists.

Has function (EDAM-Bioimaging)

Details of a function the tool provides, expressed in concepts from the EDAM-Bioimaging Operation ontology, e.g. image classification and model-based segmentation.

1. MUST correctly specify operations performed by the tool, or (if version indicated), those specific version(s) of the tool.

2. MAY receive more than one Function, especially when the tool has multiple modes of operation.

3. SHOULD describe all the primary operations and SHOULD NOT describe secondary or minor operations. In case there are any questions, start a discussion in the BIII.eu forum.

Has Topic (EDAM-Bioimaging)

General scientific domain the tool serves or other general category (EDAM Bioimaging Topic), e.g. Tissue image analysis, Microscopy, Machine Learning.

1. MUST specify the MOST IMPORTANT and relevant scientific topics; we RECOMMEND referring at least to the single most important scientific topic.

2. MUST correctly specify Topics the tool relates to, or (if version indicated), those specific version(s) of the tool.

3. MAY receive more than one Topic. (use Add another item button)

4. SHOULD NOT exhaustively specify all the topics of secondary relevance. Include Topic(s) that place the tool in the pool of related tools (those with the same Topic) and, if applicable, one or more Topics that distinguish the tool from the others in the pool (see the discussion in EDAM-Bioimaging).

Has biological terms

1. MUST specify the most important and relevant biological terms.

2. MAY receive more than one biological term (use Add another item button).

Additional keywords

A string in which the user may add keywords to the entry in case she/he did not find existing keywords in EDAM-Bioimaging functions or topics. This field is important to support further improvements and discussions on new versions of EDAM-Bioimaging or modifications in BIII.eu.

1. MUST be a concise keyword, namely the most commonly used keyword for the intended theme/subject to the best of the user's knowledge. We do not expect the user to know the best term, which is also not always agreed upon by the scientific community.

2. MAY receive more than one keyword (use Add another item button).

Requires

A link to an existing software node in BIII.eu that shows on which platform the tool can be run or which dependencies it has. For example, the tool 3D intensity profile requires ImageJ to run. The DeconvolutionLab2 ImageJ plugin, on the other hand, requires not only ImageJ but also other libraries, such as (under construction).

Execution platform

A discrete attribute that defines on which main execution platforms the tool can be used. There are only 4 values for this attribute: Linux, Mac, Windows, Unsure.

Implementation type

A discrete attribute that defines the type of implementation of the tool. A more detailed discussion of implementation types is in "Workflows and Components of Bioimage Analysis: The NEUBIAS Concept". The 4 values for this attribute are Collection, Component, Workflow and I do not know.

1. Component is an implementation of an image or data analysis/processing algorithm that may be used as a part of an image analysis workflow. A component alone does not solve a Bioimage Analysis problem.

2. Workflow is a set of components assembled in some specific order to process bioimages and estimate some numerical parameters relevant to the biological system under study. Workflows take image data as input and output either processed images or other type of data (usually numeric values). Workflows can be a combination of components from the same or different software.

3. Collection is software comprising a group of Components or Workflows. Collections are often image analysis platforms or libraries, e.g. the scikit-image library and the ImageJ platform.

4. I do not know is used for a tool whose type is unknown. Ideally, the user adding the entry SHOULD identify the tool type prior to adding it to BIII.eu.

License

Software or data usage license, e.g. "GPL-3.0"

1. MUST accurately describe the license used.

2. SHOULD use "Proprietary" in cases where the software is under license whereby it can be obtained from the provider (e.g. for money), and then owned, i.e. definitely not an open-source or free software license.

3. SHOULD use "Unlicensed" for software which is not licensed and is not "Proprietary".

4. SHOULD either use "Other" or the License name (if known) if the software is available under an uncommon license not listed below and which is not "Proprietary".

Most Common Licenses (if you know of an important one to add feel free to suggest):

Has programming language

Comprises both programming language used for the implementation of the entry and the programming languages supported by the entry. We still do not have a controlled vocabulary, so if you type a new language, which was not previously added in BIII.eu, it will create a new node with that name.

Is compatible with

A link to an existing software node in BIII.eu for which the tool was not originally developed, but from which the tool can be called.

Supported image dimension

A discrete attribute value that defines with which image dimensions the tool can be used. The four discrete values are 2D, 3D, Multi-channel and time-series. OBSERVATION: Some may see an overlap between these values, since a 2D RGB image is also a Multi-channel image and a 3D image could be a 2D + time-series image, etc.

Interaction level

A discrete attribute value that defines the interaction level between user and the tool. There are four discrete values, which are:

1. Automated: a tool that returns the output with a single command call (or selection in a GUI) and that may or may not accept the definition of parameters.

2. Manual: a tool built around a user interface, most commonly a Graphical User Interface (GUI), to help users perform manual image analysis tasks (e.g. the ImageJ multi-point tool).

3. Semi-automated: a tool that requires multiple calls or user interactions in order to deliver the output. For example, tracing filaments individually until all filaments of an image are traced, or identifying central points of cells in order to segment them. The interaction may occur prior to the execution of the tool or several times while the tool is being used (e.g. Simple Neurite Tracer).

4. I do not know: a tool that does not have a known interaction level. Ideally, the tagger adding the entry to BIII.eu SHOULD identify the interaction level prior to adding the entry.

Comments

Comments are a valuable part of BIII.eu. The comments allow users to talk more specifically about a certain tool beyond the forum.

Users may leave comments on each BIII.eu entry page. The content of a comment may include an opinion about the tool's performance (how well the tool does what it claims to do), use cases, or an update on the current status of the tool (deprecated, legacy, etc.). Users MUST be polite and avoid rude or violent comments.

The information contained here is available in the NEUBIAS GitHub repository under bise-documents.