The subset_ord_plot function is a “convenience function” intended to make it easier to retrieve a plot-derived data.frame with a subset of points according to a threshold and method. The meaning of the threshold depends upon the method.
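To make the threshold/method relationship concrete, here is a minimal base-R sketch on made-up coordinates. It reflects one reading of the subset_ord_plot documentation and is not the function's actual implementation: "radial" is taken to keep points beyond a radial distance from the origin, "square" to keep points with at least one coordinate beyond the threshold, and "farthest" to keep a fraction of the points farthest from the origin. The data frame coords, the use of absolute values for "square", and the rounding used for "farthest" are assumptions made for illustration.

```r
# Hypothetical coordinates standing in for two ordination axes
coords <- data.frame(
  CA1 = c(0.1, 0.8, -0.9, 0.5, 0.6),
  CA2 = c(0.2, 0.1, 0.3, -0.6, 0.6)
)
threshold <- 0.7

# Euclidean distance of each point from the origin
r <- sqrt(coords$CA1^2 + coords$CA2^2)

# "radial" (assumed meaning): keep points farther than `threshold` from the origin
radial_keep <- coords[r > threshold, ]

# "square" (assumed meaning): keep points with at least one coordinate
# whose absolute value exceeds `threshold`
square_keep <- coords[abs(coords$CA1) > threshold | abs(coords$CA2) > threshold, ]

# "farthest" (assumed meaning): keep the `threshold` fraction of points
# that lie farthest from the origin; rounding up is an assumption
n_keep <- ceiling(threshold * nrow(coords))
farthest_keep <- coords[order(r, decreasing = TRUE)[seq_len(n_keep)], ]
```

Under these assumptions the same threshold of 0.7 is a distance for "radial" and "square" but a fraction for "farthest", which is why the three methods can return different numbers of points.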
Load the necessary packages and data.
library("phyloseq"); packageVersion("phyloseq")
## [1] '1.22.3'
library("ggplot2"); packageVersion("ggplot2")
## [1] '2.2.1'
data(GlobalPatterns)
Set the ggplot2 package theme. See the ggplot2 online documentation for further help.
theme_set(theme_bw())
Some subsetting and light massaging of the GlobalPatterns dataset.
Remove taxa whose total count across all samples is zero.
GP <- GlobalPatterns
GP <- prune_taxa(taxa_sums(GP) > 0, GP)
Add a "human" variable to GP to indicate human-associated samples.
sample_data(GP)$human <- get_variable(GP, "SampleType") %in% c("Feces", "Mock", "Skin", "Tongue")
Subset to just the Bacteroidetes phylum.
GP <- subset_taxa(GP, Phylum=="Bacteroidetes")
Perform a correspondence analysis and then create some plots to demonstrate using subset_ord_plot. We want to make a species topo (density contour) plot with a subset of points layered on top. Start by performing the correspondence analysis.
gpca <- ordinate(GP, "CCA")
Now make a basic plot of just the species points in the correspondence analysis.
p1 = plot_ordination(GP, gpca, "species", color="Class")
p1
Re-draw this as a topo plot without points, faceted by Class.
p0 = ggplot(p1$data, p1$mapping) + geom_density2d() + facet_wrap(~Class)
p0
Re-draw this, but include the points.
p1 = p1 + geom_density2d() + facet_wrap(~Class)
p1
Add a layer containing the subset of species points that are farthest from the origin, once for each method.
p0 + geom_point(data=subset_ord_plot(p1, 0.7, "square"), size=1)
## Warning: Removed 8 rows containing missing values (geom_point).
p0 + geom_point(data=subset_ord_plot(p1, 0.7, "farthest"), size=1)
## Warning: Removed 6 rows containing missing values (geom_point).
p0 + geom_point(data=subset_ord_plot(p1, 0.7, "radial"), size=1)
## Warning: Removed 8 rows containing missing values (geom_point).
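One way to compare the three methods directly is to count how many points each one retains at the same threshold. The sketch below rebuilds the ordination plot so that it can run on its own; the names p, methods, and n_kept are introduced here for illustration only, and no output is shown because the counts depend on the installed phyloseq version and data.

```r
library("phyloseq")

# Rebuild the species ordination plot used in this tutorial
data(GlobalPatterns)
GP <- prune_taxa(taxa_sums(GlobalPatterns) > 0, GlobalPatterns)
GP <- subset_taxa(GP, Phylum == "Bacteroidetes")
gpca <- ordinate(GP, "CCA")
p <- plot_ordination(GP, gpca, "species", color = "Class")

# Count the rows returned by subset_ord_plot for each method at threshold 0.7
methods <- c("square", "farthest", "radial")
n_kept <- vapply(
  methods,
  function(m) nrow(subset_ord_plot(p, 0.7, m)),
  integer(1)
)
n_kept
```

The result is a named integer vector with one entry per method, which makes it easy to see how differently the same threshold value is interpreted.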
Here is what the data retrieved by subset_ord_plot actually looks like:
head(subset_ord_plot(p1, 0.7, "radial"))
##              CA1      CA2  Kingdom        Phylum           Class
## 554668 1.5118124 1.008842 Bacteria Bacteroidetes Sphingobacteria
## 154451 1.5385689 1.080271 Bacteria Bacteroidetes Sphingobacteria
## 244965 0.9385059 0.364721 Bacteria Bacteroidetes Sphingobacteria
## 62487  1.5268418 1.044843 Bacteria Bacteroidetes Sphingobacteria
## 139513 1.5244189 1.037502 Bacteria Bacteroidetes Sphingobacteria
## 330672 0.8197026 0.149123 Bacteria Bacteroidetes Sphingobacteria
##                     Order       Family    Genus             Species
## 554668 Sphingobacteriales Balneolaceae Balneola                <NA>
## 154451 Sphingobacteriales Balneolaceae Balneola Balneolaalkaliphila
## 244965 Sphingobacteriales Balneolaceae Balneola Balneolaalkaliphila
## 62487  Sphingobacteriales Balneolaceae Balneola                <NA>
## 139513 Sphingobacteriales Balneolaceae Balneola                <NA>
## 330672 Sphingobacteriales Balneolaceae     <NA>                <NA>