The CellOntology plugin distributed with FlowJo v10 allows you to determine the name and etiology of a gated cell or population in question.
Almost all applications in flow cytometry rely on phenotype markers to identify specific cell types and their associated functions, etiologies or biological contexts. The CellOntology database maintains a record of published definitions of cell types with a myriad of phenotypic markers. Using the CellOntology plugin, you can query the database to identify the cell or population you have in a FlowJo workspace.
The plugin uses the flowCL package [1] within an R statistical computing environment. The plugin directs a query to the CellOntology database and retrieves suggested nomenclature based on a ranking score. Two outputs come from the database query: (1) a single CSV file containing suggested name(s) for your cell type and (2) a cladogram detailing your cell’s relationship to other known cell types.
This page will show you how to use the CellOntology plugin and explore an example of its use for discovery.
- If you have never used a FlowJo plugin, please see the Installing Plugins page for detailed information on plugin setup before continuing.
- This plugin requires R and the Bioconductor package “flowCL”. Install flowCL from your R console before continuing.
- If you are already familiar with Plugins:
- The CellOntology.jar file is distributed in the installation package of FlowJo v10.1r7.
- Verify CellOntology.jar is located in your plugins folder.
- Verify the path to your plugins folder is specified in Diagnostics preferences.
- Restart FlowJo and CellOntology should appear as an option within the Plugins menu.
- See Installing Plugins for further details.
- Open a workspace and construct a gating tree using standard nomenclature for the antigens you are staining for (e.g., CD3, CD4, CCR7).
- If the workspace has not been saved previously, save it before running CellOntology (see the note on plugin derivatives below for details on why).
- Select/Highlight a population within a FlowJo gating hierarchy you wish to run CellOntology on.
- Navigate to the Workspace tab, populations band and click on CellOntology within the Plugins menu.
- A prompt will appear, with radio buttons indicating the positive “+” or negative “-” nature of each marker with respect to the selected population.
- (Optional) Check the box to display ontology names. Note this will show the ontological name as opposed to the short name.
- (Optional) Check the box to show output of Bioconductor flowCL execution. Checking this box will show the request sent to R to run the flowCL package (see the Options section below for details). This can be useful for troubleshooting whether instructions are being passed to R.
- Review the +/- designations for the markers that will be input into flowCL and click OK. The plugin code will call the locally installed version of R and initiate the CL query.
- A node will be created on the population highlighted within the FlowJo workspace gating tree. Initially this node will say (calculating). When the program completes its inquiry with the database and returns results, the (calculating) will disappear.
- Open the Layout Editor and drag the node created in the previous step into it.
- Additionally, two external derivative files will be created: (1) listPhenotypes.Markers.CSV, a scored list of the top cell types returned from the CL query; and (2) tree_Markers.png, the tree diagram result.
Note: In FlowJo 10.1r7, plugin derivatives are written to a folder with the same name as the FlowJo workspace, in the same directory as the workspace. For example, if your workspace is called My Workspace.wsp, then when you run the CellOntology plugin, a folder called My Workspace will be created in the same directory as the workspace file itself. Within that My Workspace folder, a CellOntology folder will contain all output files (derivatives) from CellOntology plugin runs on data contained within that workspace.
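The folder convention in the note above can be expressed as a short path calculation. The following Python sketch is only an illustration of that convention and is not part of FlowJo; the function names `cellontology_output_dir` and `find_outputs` are hypothetical.

```python
from pathlib import Path


def cellontology_output_dir(workspace_file):
    """Return the folder where CellOntology derivatives are expected.

    Follows the convention described in the note: a folder named after the
    workspace (without the .wsp extension), next to the workspace file,
    containing a CellOntology subfolder.
    """
    ws = Path(workspace_file)
    return ws.parent / ws.stem / "CellOntology"


def find_outputs(workspace_file):
    """List CSV and PNG derivatives in the expected folder (empty if absent)."""
    out_dir = cellontology_output_dir(workspace_file)
    if not out_dir.is_dir():
        return []
    return sorted(
        p for p in out_dir.iterdir() if p.suffix.lower() in {".csv", ".png"}
    )
```

For a workspace saved as My Workspace.wsp, `cellontology_output_dir` returns the path My Workspace/CellOntology in the same directory as the workspace file.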
Display ontology names – When checked, a list of possible names for the queried cell population will be reported.
Show output of Bioconductor flowCL execution – When checked, the user will receive a notification window containing the information entered into the R session to execute the flowCL program.
The CellOntology plugin is extremely useful for discovery work, especially when investigating new populations of cells that you may be unfamiliar with. Consider the following example:
We have a multi-parametric data set with a panel that includes the standard T, B, and monocyte subsets. In addition to the standard panels, we have included markers that we are less familiar with, and we would like to know whether their expression pattern follows a particular lineage. We would also like to know whether these markers, in conjunction with the standard panels, define a known cell type.
Step 1 – Clean up gates
To begin our analysis, we will create a series of gates to remove debris, doublets and dead cells. We will begin our “discovery” analysis on the cleaned up live cells population.
Step 2 – Terminal populations
We will continue creating our gating hierarchy by defining broad categories (T-cells, B-cells) and any familiar terminal populations we have markers for (e.g., CD3+/HLADR-/CD45RA+/CD8+ and CD3-/HLADR-/CD19+ cells). The broad and/or terminal populations will help identify the context of the cells we are “discovering” when compared in an overlay or NxN plot.
Step 3 – Downsample
We intend to use a clustering algorithm to reduce the dimensional space of our data set. Since this is computationally expensive, we will reduce the number of events fed into the clustering algorithm to 15,000. We begin our data reduction by using the Downsample plugin on the live cells population.
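To make the idea of downsampling concrete, here is a minimal Python sketch of uniform random sampling without replacement. It is an assumption for illustration only: this page does not describe how the Downsample plugin selects events, and the function name `downsample` and its `seed` parameter are hypothetical.

```python
import random


def downsample(events, n=15000, seed=0):
    """Return a uniform random subset of at most n events, without replacement.

    A fixed seed makes the selection reproducible between runs.
    """
    events = list(events)
    if len(events) <= n:
        # Nothing to reduce: keep every event.
        return events
    rng = random.Random(seed)
    return rng.sample(events, n)
```

Each event here can be any Python object, for example a tuple of fluorescence values for one cell.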
Step 4 – Dimensionality reduction/ Clustering
With the remaining 20+ parameters, we would like to cluster cells into groups that share common features. There are several algorithms we could choose for this, but we will stick with tSNE (t-distributed Stochastic Neighbor Embedding) for this example. We start the tSNE algorithm by selecting the downsampled population from Step 3 and choosing the tSNE plugin from the populations band.
Step 5 – Define the tSNE space
The data now contain tSNE-X and tSNE-Y parameters. Using these in a graph window, we can see how our data have separated (quite nicely) into distinct “continents”. The distance between continents implies a difference in the phenotype of the cells in one group versus another. At this point, we would like to know where our known subsets of cells map onto the tSNE space. To do this, we will create an overlay in the Layout Editor:
- Open the Layout Editor.
- Drag in the population containing the tSNE parameters (e.g. Downsample-Live cells).
- The tSNE plot should appear.
- Drag a defined population (e.g. CD8 T-cells, B-cells, Naive CD4) over the tSNE plot in the Layout editor.
- The overlay will depict where the defined population exists in the context of the reduced dimensionality space (continents). Unknown populations remain as red clusters; a representative group is indicated by the black arrows in the figure.
At this point, the phenotype of the unknown populations depicted in the overlay is: CD3-/CD4-/CD8-/CD45RA-/CD19-/CD14-
Step 6 – Identify the unknown continents
To identify which markers are responsible for creating the remaining continents (indicated by black arrows in the figure above; also any red clusters) we will create a gate on a continent and explore the remaining markers’ expression using an NxN plot. To do this:
- Drag the Downsample-Live population into a fresh layout.
- Right click on the graph in the Layout Editor and select “Multigraph Overlays”, then select “NxN” plot.
- Half of a 10×10 plot should appear next to the plot.
- Right-click/control-click on the NxN plot to modify the relevant markers. (Remove the markers that already defined continents in Step 5 – Define the tSNE space.)
- Return to the graph window from step 1.
- Create a gate on an “unknown” continent within the downsample population and give it a name. (Note: it may be easiest to use the autogating tool).
- Drag this population over the plot from step 1. The graph and the NxN plot should become an overlay.
- Interrogate the NxN plot for expression of markers that seem to correlate with one another. (For the sake of simplicity, we have removed extraneous negative markers from the NxN plot.)
In this example, one of the continents was found to be CD11c+/CD38+/CD14-/CD16-.
Step 7 – Identify the cell type
From the example above, our collective marker set is: CD3-/CD4-/CD8-/CD19-/CD14-/CD16-/CD11c+/CD38+. To identify what this cell type might be, we will return to the FlowJo workspace and create a generic gate at any level in the hierarchy. We will name the gate “CD3-CD4-CD8-CD19-CD14-CD16-CD11c+CD38+”. Next, we will use the CellOntology plugin to find the name of this population (if it exists).
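Typing a long gate name by hand is error-prone, so it can help to assemble it from the marker list. The Python sketch below is an illustrative helper, not a FlowJo feature; the function name `gate_name` is hypothetical.

```python
def gate_name(markers):
    """Join (marker, sign) pairs into a gate name such as 'CD3-CD4-CD11c+'."""
    parts = []
    for marker, sign in markers:
        if sign not in ("+", "-"):
            raise ValueError(f"sign for {marker} must be '+' or '-', got {sign!r}")
        parts.append(f"{marker}{sign}")
    return "".join(parts)


# The collective marker set from the example above.
phenotype = [
    ("CD3", "-"), ("CD4", "-"), ("CD8", "-"), ("CD19", "-"),
    ("CD14", "-"), ("CD16", "-"), ("CD11c", "+"), ("CD38", "+"),
]
name = gate_name(phenotype)
print(name)  # CD3-CD4-CD8-CD19-CD14-CD16-CD11c+CD38+
```

The printed string can be pasted directly as the name of the generic gate.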
At this point, we will open the CSV file produced by running the CellOntology plugin. The CSV contains several columns, including whether your query matched an item in the database, the gene and protein names of the markers, a ranking score, and a Cell ID.
Format the CSV to view the column contents. Ensure that the short marker names match your generic gate name (leftmost column). Next, check the score. The closer the value is to 1, the higher the probability that your cell or population type is the one found in the Cell ID column. Copy the Cell ID corresponding to the highest ranking score and paste it into your web browser. Search results containing your Cell ID, including links to the CellOntology database, will appear. Click on a link to see what is known about your cell type.
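If you prefer to pick the top-ranked row programmatically rather than by eye, something like the following Python sketch could work. The column headers `"Score"` and `"Cell ID"` are assumptions made for illustration; check the header row of your own CSV and pass the actual names. The function name `best_match` is hypothetical.

```python
import csv


def best_match(csv_path, score_col="Score", id_col="Cell ID"):
    """Return (cell_id, score) for the row with the highest ranking score.

    score_col and id_col are assumed header names; adjust them to match
    the header row of the CSV you are reading. Returns None if no row
    has a numeric score.
    """
    best = None
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):
            try:
                score = float(row[score_col])
            except (KeyError, TypeError, ValueError):
                continue  # skip rows without a numeric score
            if best is None or score > best[1]:
                best = (row.get(id_col, ""), score)
    return best
```

Calling `best_match` with the path to the listPhenotypes CSV in your CellOntology output folder returns the Cell ID to look up, along with its score.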
In our example, it appears the CL_0001026 Cell ID corresponds to a common myeloid progenitor, with some relationship to dendritic cells.
We can repeat the procedure from Steps 6 & 7 to identify more unknown cell types.
Functionally, flowCL decomposes gated population names from a FlowJo gating hierarchy (e.g., CD3+CD4+CD8-) into their individual markers (CD3, CD4, CD8) and translates their relative abundance into a relation used in the Cell Ontology database (CL) [2] (such as “has plasma membrane part” for +), then performs the following steps:
- A SPARQL (http://www.w3.org/TR/rdf-sparql-query/) query against the CL fetches the labels and IDs corresponding to the input markers by text matching to the label or synonyms fields in the CL;
- The marker labels are used to retrieve a list of cell types that contain (or lack) the marker labels;
- The set of markers that make up each cell type is then retrieved;
- A final query retrieves all parents up to the root of the CL for each cell type to build a tree diagram of the results.
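The decomposition step described above can be sketched in a few lines of Python. This is not flowCL's actual code (flowCL is an R package); it only illustrates the idea. The “+” relation is taken from the text above, while the “-” relation name is an assumption, and the function names `decompose` and `to_relations` are hypothetical.

```python
import re

# One marker: a run of letters/digits followed by a "+" or "-" sign.
MARKER_RE = re.compile(r"([A-Za-z0-9]+)([+-])")

# "+" is given in the text above; the "-" relation name is an assumption.
RELATIONS = {
    "+": "has plasma membrane part",
    "-": "lacks plasma membrane part",
}


def decompose(population):
    """Split a name like 'CD3+CD4+CD8-' into [('CD3', '+'), ('CD4', '+'), ('CD8', '-')]."""
    cleaned = population.replace("/", "").strip()
    pairs = MARKER_RE.findall(cleaned)
    # Reject names containing characters the pattern could not account for.
    if "".join(marker + sign for marker, sign in pairs) != cleaned:
        raise ValueError(f"cannot parse population name: {population!r}")
    return pairs


def to_relations(population):
    """Map each marker in the population name to its relation label."""
    return [(marker, RELATIONS[sign]) for marker, sign in decompose(population)]
```

Note that this simple pattern expects marker names without hyphens or spaces (HLADR rather than HLA-DR), which matches the naming used in the examples on this page.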
1. Courtot, M. et al. (2014) flowCL: ontology-based cell population labelling in flow cytometry. Bioinformatics, 31(8), 1337-1339.
2. Diehl, A.D. et al. (2011) Hematopoietic cell types: prototype for a revised cell ontology. J. Biomed. Inf., 44, 75-79.
Questions about plugins or FlowJo? Send us an email at TechSupport [at] FlowJo [dot] com