幻域竞界 - 全球网游活动速递站

首页 > 全球速报 > 空间转录组数据注释分析:RCTD反卷积(Nature Biotechnology IF: 33.1)

空间转录组数据注释分析:RCTD反卷积(Nature Biotechnology IF: 33.1)

RCTD(Robust Cell Type Decomposition),是一种用于将单细胞RNA测序数据中的细胞类型注释转移到空间转录组学数据上的方法,于2021年2月18号发表在Nature Biotechnology杂志上,标题为《Robust decomposition of cell type mixtures in spatial transcriptomics》。RCTD 通过整合单细胞和空间转录组学数据,能够较为精确地为空间 spots 分配细胞类型,以便更好地理解空间组织结构中的基因表达情况。

参考教程链接:https://github.com/dmcable/spacexr/tree/master/vignettes

0.环境准备代码语言:javascript复制# 加载R包

library(spacexr)

library(Matrix)

library(doParallel)

library(ggplot2)

knitr::opts_chunk$set(

collapse = TRUE,

comment = "#>",

cache = TRUE,

out.width = "100%"

)

1.读取数据本次使用数据为 hippocampus Visium 数据,首先,使用一个已经注释好的单细胞数据海马体数据集对空间数据用RCTD进行细胞注释,然后做区域差异表达分析

full mode可以对每个spots反卷积成任意细胞数,对于Visium空间数据比较合适。

1.1 读取空间数据空间转录组数据需要的是counts矩阵与空间坐标

counts:矩阵的行名应为基因,列名应代表 barcode/pixel 名称。counts应为未经转换的原始counts数据。coords:一个数值型数据框(或矩阵),用于表示空间像素的位置。行名是barcode/像素名称,且应包含两列,分别用于表示“x”坐标和“y”坐标。nUMI:可选,每个空间位置的总umi数代码语言:javascript复制# 包中的示例数据

# 读取counts

file.path(datadir,"counts.csv")

counts <- as.data.frame(readr::read_csv(file.path(datadir,"counts.csv"))) # load in counts matrix

counts[1:10,1:5]

rownames(counts) <- counts[,1];

counts <- counts[,-1] # Move first column to rownames

# 读取坐标

file.path(datadir,"coords.csv")

coords <- read.csv(file.path(datadir,"coords.csv"))

head(coords)

rownames(coords) <- coords[,1];

coords <- coords[,-1] # Move first column to rownames

# 计算每个位置的umi

nUMI <- colSums(counts) # In this case, total counts per pixel is nUMI

head(nUMI)

1.2 构建对象代码语言:javascript复制# 构建一个对象

puck <- SpatialRNA(coords, counts, nUMI)

puck

barcodes <- colnames(puck@counts) # pixels to be used (a list of barcode names).

plot_puck_continuous(puck, barcodes, puck@nUMI,

ylimit = c(0,round(quantile(puck@nUMI,0.9))),

size = 2,

title ='plot of nUMI')

空间每个spot中表达的nUMI值:

看一下构建好的SpatialRNA对象

代码语言:javascript复制str(puck)

Formal class 'SpatialRNA' [package "spacexr"] with 3 slots

..@ coords:'data.frame': 313 obs. of 2 variables:

.. ..$ x: int [1:313] 4125412546043047412530474005412536464484...

.. ..$ y: int [1:313] 5477300041013895602848585684396457523069...

..@ counts:Formal class 'dgCMatrix' [package "Matrix"] with 6 slots

.. .. ..@ i : int [1:85720] 0123456789...

.. .. ..@ p : int [1:314] 0275551827110113771650192622022478...

.. .. ..@ Dim : int [1:2] 276313

.. .. ..@ Dimnames:List of 2

.. .. .. ..$ : chr [1:276] "Rpl7""Cox5b""Rpl31""Rpl37a"...

.. .. .. ..$ : chr [1:313] "AAAGGGATGTAGCAAG-1""AAAGGGCAGCTTGAAT-1""AACAACTGGTAGTTGC-1""AACCCAGAGACGGAGA-1"...

.. .. ..@ x : num [1:85720] 8178117864915...

.. .. ..@ factors : list()

..@ nUMI : Named num [1:313] 481813002715656125844...

.. ..- attr(*, "names")= chr [1:313] "AAAGGGATGTAGCAAG-1""AAAGGGCAGCTTGAAT-1""AACAACTGGTAGTTGC-1""AACCCAGAGACGGAGA-1"...

1.3 读取单细胞参考数据单细胞需要的是counts矩阵,以及做好的细胞注释信息,和nUMI数据。

代码语言:javascript复制### 2.读取单细胞参考数据

refdir <- system.file("extdata",'Reference/Visium_Ref',package = 'spacexr') # directory for the reference

# count值

file.path(refdir,"counts.csv")

counts <- read.csv(file.path(refdir,"counts.csv")) # load in cell types

rownames(counts) <- counts[,1];

counts <- counts[,-1] # Move first column to rownames

counts[1:5,1:5]

简单看一下数据:原始counts

细胞类型:

代码语言:javascript复制# 细胞类型:

file.path(refdir,"cell_types.csv")

cell_types <- read.csv(file.path(refdir,"cell_types.csv")) # load in cell types

cell_types <- setNames(cell_types[[2]], cell_types[[1]])

cell_types <- as.factor(cell_types) # convert to factor data type

head(cell_types)

nUMI数据:

代码语言:javascript复制# umi

file.path(refdir,"nUMI.csv")

nUMI <- read.csv(file.path(refdir,"nUMI.csv")) # load in cell types

nUMI <- setNames(nUMI[[2]], nUMI[[1]])

head(nUMI)

参考对象:

代码语言:javascript复制# 构建对象

reference <- Reference(counts, cell_types, nUMI)

str(reference)

Formal class 'Reference' [package "spacexr"] with 3 slots

..@ cell_types: Factor w/ 17 levels "Astrocyte","CA1",..: 1111111111...

.. ..- attr(*, "names")= chr [1:510] "P60HippoRep6P2_GACTTGAAATGC""P60HippoRep2P2_ACTGAAGAATTN""P60HippoRep2P2_CCTTAGCCCTGG""P60HippoRep5P3_GTCAAGGCGGGC"...

..@ counts :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots

.. .. ..@ i : int [1:69901] 1249111216222326...

.. .. ..@ p : int [1:511] 011221538456666874688010321163...

.. .. ..@ Dim : int [1:2] 307510

.. .. ..@ Dimnames:List of 2

.. .. .. ..$ : chr [1:307] "Acta2""Actb""Ahi1""Aldoa"...

.. .. .. ..$ : chr [1:510] "P60HippoRep6P2_GACTTGAAATGC""P60HippoRep2P2_ACTGAAGAATTN""P60HippoRep2P2_CCTTAGCCCTGG""P60HippoRep5P3_GTCAAGGCGGGC"...

.. .. ..@ x : num [1:69901] 416212410111...

.. .. ..@ factors : list()

..@ nUMI : Named int [1:510] 9961199231339298845861378176814861331...

.. ..- attr(*, "names")= chr [1:510] "P60HippoRep6P2_GACTTGAAATGC""P60HippoRep2P2_ACTGAAGAATTN""P60HippoRep2P2_CCTTAGCCCTGG""P60HippoRep5P3_GTCAAGGCGGGC"...

2.运行RCTD数据都准备好了之后开始运行RCTD:需要注意这里的mode选择full模式

代码语言:javascript复制# 运行

myRCTD <- create.RCTD(puck, reference, max_cores = 10)

myRCTD <- run.RCTD(myRCTD, doublet_mode = 'full')

saveRDS(myRCTD,'myRCTD_visium_full.rds')

看看myRCTD的结构:

代码语言:javascript复制str(myRCTD)

Formal class 'RCTD' [package "spacexr"] with 9 slots

..@ spatialRNA :Formal class 'SpatialRNA' [package "spacexr"] with 3 slots

.. .. ..@ coords:'data.frame': 313 obs. of 2 variables:

.. .. .. ..$ x: int [1:313] 4125412546043047412530474005412536464484...

.. .. .. ..$ y: int [1:313] 5477300041013895602848585684396457523069...

.. .. ..@ counts:Formal class 'dgCMatrix' [package "Matrix"] with 6 slots

.. .. .. .. ..@ i : int [1:37939] 0123456789...

.. .. .. .. ..@ p : int [1:314] 01222443664886107308529741096...

.. .. .. .. ..@ Dim : int [1:2] 122313

.. .. .. .. ..@ Dimnames:List of 2

.. .. .. .. .. ..$ : chr [1:122] "Aldoc""Apoe""Atp1a2""Atp5f1"...

.. .. .. .. .. ..$ : chr [1:313] "AAAGGGATGTAGCAAG-1""AAAGGGCAGCTTGAAT-1""AACAACTGGTAGTTGC-1""AACCCAGAGACGGAGA-1"...

.. .. .. .. ..@ x : num [1:37939] 101887625725504...

.. .. .. .. ..@ factors : list()

.. .. ..@ nUMI : Named num [1:313] 481813002715656125844...

.. .. .. ..- attr(*, "names")= chr [1:313] "AAAGGGATGTAGCAAG-1""AAAGGGCAGCTTGAAT-1""AACAACTGGTAGTTGC-1""AACCCAGAGACGGAGA-1"...

..@ originalSpatialRNA:Formal class 'SpatialRNA' [package "spacexr"] with 3 slots

.. .. ..@ coords:'data.frame': 313 obs. of 2 variables:

.. .. .. ..$ x: int [1:313] 4125412546043047412530474005412536464484...

.. .. .. ..$ y: int [1:313] 5477300041013895602848585684396457523069...

.. .. ..@ counts:Formal class 'dgCMatrix' [package "Matrix"] with 6 slots

.. .. .. .. ..@ i : int [1:85720] 0123456789...

.. .. .. .. ..@ p : int [1:314] 0275551827110113771650192622022478...

.. .. .. .. ..@ Dim : int [1:2] 276313

.. .. .. .. ..@ Dimnames:List of 2

.. .. .. .. .. ..$ : chr [1:276] "Rpl7""Cox5b""Rpl31""Rpl37a"...

.. .. .. .. .. ..$ : chr [1:313] "AAAGGGATGTAGCAAG-1""AAAGGGCAGCTTGAAT-1""AACAACTGGTAGTTGC-1""AACCCAGAGACGGAGA-1"...

.. .. .. .. ..@ x : num [1:85720] 8178117864915...

.. .. .. .. ..@ factors : list()

.. .. ..@ nUMI : Named num [1:313] 481813002715656125844...

.. .. .. ..- attr(*, "names")= chr [1:313] "AAAGGGATGTAGCAAG-1""AAAGGGCAGCTTGAAT-1""AACAACTGGTAGTTGC-1""AACCCAGAGACGGAGA-1"...

..@ reference :Formal class 'Reference' [package "spacexr"] with 3 slots

.. .. ..@ cell_types: Factor w/ 17 levels "Astrocyte","CA1",..: 1111122222...

.. .. .. ..- attr(*, "names")= chr [1:85] "P60HippoRep3P1_ATTATACTTGCG""P60HippoRep1P2_TCACCCTCAAGC""P60HippoRep2P2_CCTTAGCCCTGG""P60HippoRep4P3_CTCCCCTTAATC"...

.. .. ..@ counts :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots

.. .. .. .. ..@ i : int [1:12442] 12410121316212325...

.. .. .. .. ..@ p : int [1:86] 095262431653756905109712981551...

.. .. .. .. ..@ Dim : int [1:2] 30785

.. .. .. .. ..@ Dimnames:List of 2

.. .. .. .. .. ..$ : chr [1:307] "Acta2""Actb""Ahi1""Aldoa"...

.. .. .. .. .. ..$ : chr [1:85] "P60HippoRep3P1_ATTATACTTGCG""P60HippoRep1P2_TCACCCTCAAGC""P60HippoRep2P2_CCTTAGCCCTGG""P60HippoRep4P3_CTCCCCTTAATC"...

.. .. .. .. ..@ x : num [1:12442] 2110216111121...

.. .. .. .. ..@ factors : list()

.. .. ..@ nUMI : Named int [1:85] 755193623133614119914142512400282535677...

.. .. .. ..- attr(*, "names")= chr [1:85] "P60HippoRep3P1_ATTATACTTGCG""P60HippoRep1P2_TCACCCTCAAGC""P60HippoRep2P2_CCTTAGCCCTGG""P60HippoRep4P3_CTCCCCTTAATC"...

..@ config :List of 22

.. ..$ gene_cutoff : num 0.000125

.. ..$ fc_cutoff : num 0.5

.. ..$ gene_cutoff_reg : num 2e-04

.. ..$ fc_cutoff_reg : num 0.75

.. ..$ UMI_min : num 100

.. ..$ UMI_min_sigma : num 300

.. ..$ max_cores : num 2

.. ..$ N_epoch : num 8

.. ..$ N_X : num 50000

.. ..$ K_val : num 100

.. ..$ N_fit : num 1000

.. ..$ N_epoch_bulk : num 30

.. ..$ MIN_CHANGE_BULK : num 1e-04

.. ..$ MIN_CHANGE_REG : num 0.001

.. ..$ UMI_max : num 2e+07

.. ..$ counts_MIN : num 10

.. ..$ MIN_OBS : num 3

.. ..$ MAX_MULTI_TYPES : num 4

.. ..$ CONFIDENCE_THRESHOLD: num 10

.. ..$ DOUBLET_THRESHOLD : num 25

.. ..$ RCTDmode : chr "full"

.. ..$ doublet_mode : chr "full"

..@ cell_type_info :List of 2

.. ..$ info :List of 3

.. .. ..$ :'data.frame': 307 obs. of 17 variables:

.. .. .. ..$ Astrocyte : num [1:307] 2.31e-052.53e-032.09e-047.12e-048.77e-03...

.. .. .. ..$ CA1 : num [1:307] 0.002.85e-039.55e-041.08e-038.91e-05...

.. .. .. ..$ CA3 : num [1:307] 00.0018910.0003620.0010630.0001...

.. .. .. ..$ Cajal_Retzius : num [1:307] 0.002.51e-037.78e-041.25e-033.78e-05...

.. .. .. ..$ Choroid : num [1:307] 1.96e-061.53e-037.45e-059.64e-043.55e-05...

.. .. .. ..$ Denate : num [1:307] 0.003.95e-031.06e-031.43e-039.47e-05...

.. .. .. ..$ Endothelial_Stalk : num [1:307] 0.006.15e-037.55e-052.48e-042.73e-04...

.. .. .. ..$ Endothelial_Tip : num [1:307] 00.0032040.0003120.0002230.000186...

.. .. .. ..$ Entorihinal : num [1:307] 00.0018530.0012460.0010980.000276...

.. .. .. ..$ Ependymal : num [1:307] 0.0009250.0033460.0002470.000410.000506...

.. .. .. ..$ Interneuron : num [1:307] 00.0017790.0012510.0010120.000114...

.. .. .. ..$ Microglia_Macrophages: num [1:307] 00.0080270.0003610.0005020.000234...

.. .. .. ..$ Mural : num [1:307] 0.0104680.0098880.0001450.0010870.000281...

.. .. .. ..$ Neurogenesis : num [1:307] 00.0075950.0003140.0001040.000133...

.. .. .. ..$ Neuron.Slc17a6 : num [1:307] 3.99e-052.01e-031.54e-031.60e-031.12e-04...

.. .. .. ..$ Oligodendrocyte : num [1:307] 0.005.35e-035.36e-056.72e-044.75e-04...

.. .. .. ..$ Polydendrocyte : num [1:307] 00.0042450.000230.0002650.000367...

.. .. ..$ : chr [1:17] "Astrocyte""CA1""CA3""Cajal_Retzius"...

.. .. ..$ : int 17

.. ..$ renorm:List of 3

.. .. ..$ :'data.frame': 122 obs. of 17 variables:

.. .. .. ..$ Astrocyte : num [1:122] 0.039460.061510.020670.002470.00384...

.. .. .. ..$ CA1 : num [1:122] 0.0004010.0004830.0003930.0007710.000516...

.. .. .. ..$ CA3 : num [1:122] 4.52e-042.87e-046.84e-057.67e-041.66e-04...

.. .. .. ..$ Cajal_Retzius : num [1:122] 0.000170.0003040.0009330.00190.001271...

.. .. .. ..$ Choroid : num [1:122] 0.000160.016030.001050.002760.00112...

.. .. .. ..$ Denate : num [1:122] 0.0004260.001250.0001610.0018550.000682...

.. .. .. ..$ Endothelial_Stalk : num [1:122] 0.001230.0057320.0006460.0014260.001425...

.. .. .. ..$ Endothelial_Tip : num [1:122] 0.0008380.0288240.0168660.0009220.002654...

.. .. .. ..$ Entorihinal : num [1:122] 1.24e-033.07e-047.69e-051.43e-035.16e-04...

.. .. .. ..$ Ependymal : num [1:122] 0.002280.027730.001430.001010.00151...

.. .. .. ..$ Interneuron : num [1:122] 0.0005150.000660.0001940.0011440.000779...

.. .. .. ..$ Microglia_Macrophages: num [1:122] 0.0010520.0128690.0001310.0011450.003646...

.. .. .. ..$ Mural : num [1:122] 0.001260.002830.00840.00150.00267...

.. .. .. ..$ Neurogenesis : num [1:122] 0.0006010.0019410.0003440.0016370.000988...

.. .. .. ..$ Neuron.Slc17a6 : num [1:122] 0.0005050.0003720.0001150.0015640.00067...

.. .. .. ..$ Oligodendrocyte : num [1:122] 0.0021390.0101660.0004550.0013870.003402...

.. .. .. ..$ Polydendrocyte : num [1:122] 0.001650.013030.002250.001530.00394...

.. .. ..$ : chr [1:17] "Astrocyte""CA1""CA3""Cajal_Retzius"...

.. .. ..$ : int 17

..@ internal_vars :List of 8

.. ..$ gene_list_reg : chr [1:102] "Aldoc""Apoe""Atp1a2""Cd81"...

.. ..$ gene_list_bulk : chr [1:122] "Aldoc""Apoe""Atp1a2""Atp5f1"...

.. ..$ proportions : Named num [1:17] 1.21e-012.54e-072.60e-015.61e-019.39e-03...

.. .. ..- attr(*, "names")= chr [1:17] "Astrocyte""CA1""CA3""Cajal_Retzius"...

.. ..$ class_df :'data.frame': 17 obs. of 1 variable:

.. .. ..$ class: chr [1:17] "Astrocyte""CA1""CA3""Cajal_Retzius"...

.. ..$ cell_types_assigned: logi TRUE

.. ..$ sigma : num 0.21

.. ..$ Q_mat : num [1:103, 1:2536] 1.001.43e-051.60e-069.72e-076.88e-07...

.. .. ..- attr(*, "dimnames")=List of 2

.. .. .. ..$ : chr [1:103] "V1""V2""V3""V4"...

.. .. .. ..$ : NULL

.. ..$ X_vals : num [1:2536] 1.00e-052.83e-055.20e-058.00e-051.12e-04...

..@ results :List of 1

.. ..$ weights:Formal class 'dgCMatrix' [package "Matrix"] with 6 slots

.. .. .. ..@ i : int [1:5321] 0123456789...

.. .. .. ..@ p : int [1:18] 0313626939125215651878219125042817...

.. .. .. ..@ Dim : int [1:2] 31317

.. .. .. ..@ Dimnames:List of 2

.. .. .. .. ..$ : chr [1:313] "AAAGGGATGTAGCAAG-1""AAAGGGCAGCTTGAAT-1""AACAACTGGTAGTTGC-1""AACCCAGAGACGGAGA-1"...

.. .. .. .. ..$ : chr [1:17] "Astrocyte""CA1""CA3""Cajal_Retzius"...

.. .. .. ..@ x : num [1:5321] 0.04360.01130.04730.03940.0574...

.. .. .. ..@ factors : list()

..@ de_results : list()

..@ internal_vars_de : list()

3.探索一下RCTD结果RCTD full mode的结果保存在@results$weights中。接下来使用normalize_weights函数标化这个权重,使得每个spots中的各种细胞类型比例加起来等于1

代码语言:javascript复制## 探索一下RCTD结果

## result

barcodes <- colnames(myRCTD@spatialRNA@counts)

weights <- myRCTD@results$weights

norm_weights <- normalize_weights(weights)

dim(norm_weights)

norm_weights[1:5,1:6]

head(rowSums(norm_weights))

看一下norm_weights:

每一行代表一个spots,每一列代表一个细胞类型,值表示这个spot中细胞类型所占的比例,每一行的和为1

4.可视化反卷积结果有了这个结果,再加上空间对象的rds文件,可以借助其他反卷积的绘图函数绘制出其他反卷积结果中一样的那种饼图了,比如SPOTlight、STdeconvolve 软件可以绘制的饼图:

教程中可以展示某一个细胞类型在空间数据中反卷积出来的比例可视化:

总共有17中细胞类型,这里展示Denate

代码语言:javascript复制# plot Dentate weights

p <- plot_puck_continuous(myRCTD@spatialRNA,

barcodes, norm_weights[,'Denate'], ylimit = c(0,0.5),

title ='plot of Dentate weights', size=4.5, alpha=0.8)

p

ggsave("Spaital_weights.png", width=8, height=6, plot=p,bg="white")

结果图:

借用STdeconvolve绘制一下饼图看看:

代码语言:javascript复制## 借用STdeconvolve绘制一下饼图看看:

library(STdeconvolve)

library(ggplot2)

library(ggsci)

packageVersion("STdeconvolve")

m <- as.matrix(norm_weights)

p <- coords

plt <- vizAllTopics(theta = m,

pos = p,

topicOrder=seq(ncol(m)),

topicCols=rainbow(ncol(m)),

groups = NA,

group_cols = NA,

r = 56, # size of scatterpies; adjust depending on the coordinates of the pixels

lwd = 0.3,

showLegend = TRUE,

plotTitle = "scatterpies")

## function returns a `ggplot2` object, so other aesthetics can be added on:

plt <- plt +

ggplot2::guides(fill=ggplot2::guide_legend(ncol=2))

plt

ggsave(Spaital_scatterpies.png, width=12, height=6, plot=plt, bg="white")

结果图:

这个反卷积结果可以看到,某些区域细胞类型占比比较大,具有区域特征。绘图的时候还有个小技巧:同种类型的细胞的不同亚型,可以设置同种色系,可以方便更清晰的看出来区域特征。这里就不展示了,毕竟还要去找色系来对应。




央视历届春晚小品大全打包下载【84部/mp4视频】全集下载 历届央视春晚经典小品 – 素材兔
旧钱包丢弃前,清理财气,如何放置才不破财?