Parkinson’s disease (PD) is a neurodegenerative disorder. It consists of neuronal loss in the substantia nigra, which causes striatal dopamine deficiency. Important hallmarks of PD are aggregates of α-synuclein (α-syn). In this project, RNA-seq data from mice with induced α-syn accumulation is analyzed in order to learn more about microglia role in PD (Poewe et al. 2017).
It is known that activated microglia is usually present in PD and it can promote α-syn aggregation. Nevertheless, it is not yet completely understood how microglia participates in α-syn cell-to-cell transfer.
In order to study so, a research group analyzed α-syn propagation between cells in a mouse model in conditions that activated or depleted microglial cells (George et al. 2019). They stimulated microglia with either lipopolysaccharide (LPS) or with interleukin-4 (IL-4), and used PBS as a control. By doing so, they generated situations of depleted (PBS), low-activated (IL-4) and high-activated (IL-4) microglia.
I analyzed data generated during the research previously mentioned. The data is available at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE130683.
Stimulation of microglia affects α-syn transfer between neuronal cells.
To analyze RNA-seq data from mice with induced α-syn accumulation in order to determine the effect of microglia stimulation on α-syn transfer between cells.
First, the libraries needed for the analysis are imported. They include libraries for data manipulation, normalization, and visualization.
The data is selected from the available projects in recount3. The project selected is SRP194918, which contains data from mice with induced α-syn accumulation.
## 2025-02-07 22:27:47.837752 caching file sra.recount_project.MD.gz.
# Extract the project information
proj_info <- subset(
mouse_projects,
project == "SRP194918" & project_type == "data_sources"
)
# Create a recount object for the SRP058181 project
rse_gene_SRP194918 <- create_rse(proj_info)
## 2025-02-07 22:27:51.966258 downloading and reading the metadata.
## 2025-02-07 22:27:52.57307 caching file sra.sra.SRP194918.MD.gz.
## 2025-02-07 22:27:53.205212 caching file sra.recount_project.SRP194918.MD.gz.
## 2025-02-07 22:27:53.920731 caching file sra.recount_qc.SRP194918.MD.gz.
## 2025-02-07 22:27:54.629545 caching file sra.recount_seq_qc.SRP194918.MD.gz.
## 2025-02-07 22:27:55.377235 caching file sra.recount_pred.SRP194918.MD.gz.
## 2025-02-07 22:27:55.59794 downloading and reading the feature information.
## 2025-02-07 22:27:56.246265 caching file mouse.gene_sums.M023.gtf.gz.
## 2025-02-07 22:27:56.728348 downloading and reading the counts: 17 samples across 55421 features.
## 2025-02-07 22:27:57.258379 caching file sra.gene_sums.SRP194918.M023.gz.
## 2025-02-07 22:27:57.570603 constructing the RangedSummarizedExperiment (rse) object.
## class: RangedSummarizedExperiment
## dim: 55421 17
## metadata(8): time_created recount3_version ... annotation recount3_url
## assays(1): raw_counts
## rownames(55421): ENSMUSG00000079800.2 ENSMUSG00000095092.1 ...
## ENSMUSG00000096850.1 ENSMUSG00000099871.1
## rowData names(11): source type ... havana_gene tag
## colnames(17): SRR9007214 SRR9007215 ... SRR9007216 SRR9007217
## colData names(177): rail_id external_id ...
## recount_pred.curated.cell_line BigWigURL
## DataFrame with 55421 rows and 11 columns
## source type bp_length phase gene_id
## <factor> <factor> <numeric> <integer> <character>
## ENSMUSG00000079800.2 ENSEMBL gene 1271 NA ENSMUSG00000079800.2
## ENSMUSG00000095092.1 ENSEMBL gene 366 NA ENSMUSG00000095092.1
## ENSMUSG00000079192.2 ENSEMBL gene 255 NA ENSMUSG00000079192.2
## ENSMUSG00000079794.2 ENSEMBL gene 255 NA ENSMUSG00000079794.2
## ENSMUSG00000094799.1 ENSEMBL gene 366 NA ENSMUSG00000094799.1
## ... ... ... ... ... ...
## ENSMUSG00000095366.2 HAVANA gene 1181 NA ENSMUSG00000095366.2
## ENSMUSG00000095134.2 HAVANA gene 1248 NA ENSMUSG00000095134.2
## ENSMUSG00000096768.8 HAVANA gene 4501 NA ENSMUSG00000096768.8
## ENSMUSG00000096850.1 ENSEMBL gene 309 NA ENSMUSG00000096850.1
## ENSMUSG00000099871.1 HAVANA gene 548 NA ENSMUSG00000099871.1
## gene_type gene_name level mgi_id
## <character> <character> <character> <character>
## ENSMUSG00000079800.2 protein_coding AC125149.3 3 NA
## ENSMUSG00000095092.1 protein_coding AC125149.5 3 NA
## ENSMUSG00000079192.2 protein_coding AC125149.1 3 NA
## ENSMUSG00000079794.2 protein_coding AC125149.2 3 NA
## ENSMUSG00000094799.1 protein_coding AC125149.4 3 NA
## ... ... ... ... ...
## ENSMUSG00000095366.2 lncRNA Gm21860 2 MGI:5434024
## ENSMUSG00000095134.2 unprocessed_pseudogene Mid1-ps1 2 MGI:5780070
## ENSMUSG00000096768.8 lncRNA Gm47283 2 MGI:6096131
## ENSMUSG00000096850.1 protein_coding Gm21748 3 MGI:5433912
## ENSMUSG00000099871.1 unprocessed_pseudogene Gm21742 2 MGI:5433906
## havana_gene tag
## <character> <character>
## ENSMUSG00000079800.2 NA NA
## ENSMUSG00000095092.1 NA NA
## ENSMUSG00000079192.2 NA NA
## ENSMUSG00000079794.2 NA NA
## ENSMUSG00000094799.1 NA NA
## ... ... ...
## ENSMUSG00000095366.2 OTTMUSG00000074801.1 NA
## ENSMUSG00000095134.2 OTTMUSG00000047373.1 NA
## ENSMUSG00000096768.8 OTTMUSG00000074820.2 NA
## ENSMUSG00000096850.1 NA NA
## ENSMUSG00000099871.1 OTTMUSG00000047374.1 NA
## GRanges object with 55421 ranges and 11 metadata columns:
## seqnames ranges strand | source type
## <Rle> <IRanges> <Rle> | <factor> <factor>
## ENSMUSG00000079800.2 GL456210.1 9124-58882 - | ENSEMBL gene
## ENSMUSG00000095092.1 GL456210.1 108390-110303 - | ENSEMBL gene
## ENSMUSG00000079192.2 GL456210.1 123792-124928 + | ENSEMBL gene
## ENSMUSG00000079794.2 GL456210.1 135395-136519 - | ENSEMBL gene
## ENSMUSG00000094799.1 GL456210.1 147792-149707 + | ENSEMBL gene
## ... ... ... ... . ... ...
## ENSMUSG00000095366.2 chrY 90752427-90755467 - | HAVANA gene
## ENSMUSG00000095134.2 chrY 90753057-90763485 + | HAVANA gene
## ENSMUSG00000096768.8 chrY 90784738-90816465 + | HAVANA gene
## ENSMUSG00000096850.1 chrY 90838869-90839177 - | ENSEMBL gene
## ENSMUSG00000099871.1 chrY 90837413-90844040 + | HAVANA gene
## bp_length phase gene_id
## <numeric> <integer> <character>
## ENSMUSG00000079800.2 1271 <NA> ENSMUSG00000079800.2
## ENSMUSG00000095092.1 366 <NA> ENSMUSG00000095092.1
## ENSMUSG00000079192.2 255 <NA> ENSMUSG00000079192.2
## ENSMUSG00000079794.2 255 <NA> ENSMUSG00000079794.2
## ENSMUSG00000094799.1 366 <NA> ENSMUSG00000094799.1
## ... ... ... ...
## ENSMUSG00000095366.2 1181 <NA> ENSMUSG00000095366.2
## ENSMUSG00000095134.2 1248 <NA> ENSMUSG00000095134.2
## ENSMUSG00000096768.8 4501 <NA> ENSMUSG00000096768.8
## ENSMUSG00000096850.1 309 <NA> ENSMUSG00000096850.1
## ENSMUSG00000099871.1 548 <NA> ENSMUSG00000099871.1
## gene_type gene_name level
## <character> <character> <character>
## ENSMUSG00000079800.2 protein_coding AC125149.3 3
## ENSMUSG00000095092.1 protein_coding AC125149.5 3
## ENSMUSG00000079192.2 protein_coding AC125149.1 3
## ENSMUSG00000079794.2 protein_coding AC125149.2 3
## ENSMUSG00000094799.1 protein_coding AC125149.4 3
## ... ... ... ...
## ENSMUSG00000095366.2 lncRNA Gm21860 2
## ENSMUSG00000095134.2 unprocessed_pseudogene Mid1-ps1 2
## ENSMUSG00000096768.8 lncRNA Gm47283 2
## ENSMUSG00000096850.1 protein_coding Gm21748 3
## ENSMUSG00000099871.1 unprocessed_pseudogene Gm21742 2
## mgi_id havana_gene tag
## <character> <character> <character>
## ENSMUSG00000079800.2 <NA> <NA> <NA>
## ENSMUSG00000095092.1 <NA> <NA> <NA>
## ENSMUSG00000079192.2 <NA> <NA> <NA>
## ENSMUSG00000079794.2 <NA> <NA> <NA>
## ENSMUSG00000094799.1 <NA> <NA> <NA>
## ... ... ... ...
## ENSMUSG00000095366.2 MGI:5434024 OTTMUSG00000074801.1 <NA>
## ENSMUSG00000095134.2 MGI:5780070 OTTMUSG00000047373.1 <NA>
## ENSMUSG00000096768.8 MGI:6096131 OTTMUSG00000074820.2 <NA>
## ENSMUSG00000096850.1 MGI:5433912 <NA> <NA>
## ENSMUSG00000099871.1 MGI:5433906 OTTMUSG00000047374.1 <NA>
## -------
## seqinfo: 45 sequences from an unspecified genome; no seqlengths
The data is transformed to read counts and important data for analysis is added. The data types are converted to factors and the data is summarized.
# Transform nucleotide counts to read counts
assay(rse_gene_SRP194918, "counts") <- compute_read_counts(rse_gene_SRP194918)
# Add important data for analysis
rse_gene_SRP194918 <- expand_sra_attributes(rse_gene_SRP194918)
# Check the data format
rse_gene_SRP194918$sra.sample_attributes[1:3]
## [1] "age;;10-14 weeks old|injected with;;PBS|source_name;;brain-striatum|strain;;C57BL/6J|tissue;;ipsilateral striatum|treatment;;AAV over expression, neural graft, inflammatory injection"
## [2] "age;;10-14 weeks old|injected with;;PBS|source_name;;brain-striatum|strain;;C57BL/6J|tissue;;ipsilateral striatum|treatment;;AAV over expression, neural graft, inflammatory injection"
## [3] "age;;10-14 weeks old|injected with;;LPS|source_name;;brain-striatum|strain;;C57BL/6J|tissue;;ipsilateral striatum|treatment;;AAV over expression, neural graft, inflammatory injection"
# Review data of interest
colData(rse_gene_SRP194918)[
,
grepl("^sra_attribute", colnames(colData(rse_gene_SRP194918)))
]
## DataFrame with 17 rows and 6 columns
## sra_attribute.age sra_attribute.injected_with
## <character> <character>
## SRR9007214 10-14 weeks old PBS
## SRR9007215 10-14 weeks old PBS
## SRR9007218 10-14 weeks old LPS
## SRR9007219 10-14 weeks old LPS
## SRR9007220 10-14 weeks old LPS
## ... ... ...
## SRR9007228 10-14 weeks old IL4
## SRR9007229 10-14 weeks old IL4
## SRR9007230 10-14 weeks old IL4
## SRR9007216 10-14 weeks old PBS
## SRR9007217 10-14 weeks old PBS
## sra_attribute.source_name sra_attribute.strain sra_attribute.tissue
## <character> <character> <character>
## SRR9007214 brain-striatum C57BL/6J ipsilateral striatum
## SRR9007215 brain-striatum C57BL/6J ipsilateral striatum
## SRR9007218 brain-striatum C57BL/6J ipsilateral striatum
## SRR9007219 brain-striatum C57BL/6J ipsilateral striatum
## SRR9007220 brain-striatum C57BL/6J ipsilateral striatum
## ... ... ... ...
## SRR9007228 brain-striatum C57BL/6J ipsilateral striatum
## SRR9007229 brain-striatum C57BL/6J ipsilateral striatum
## SRR9007230 brain-striatum C57BL/6J ipsilateral striatum
## SRR9007216 brain-striatum C57BL/6J ipsilateral striatum
## SRR9007217 brain-striatum C57BL/6J ipsilateral striatum
## sra_attribute.treatment
## <character>
## SRR9007214 AAV over expression,..
## SRR9007215 AAV over expression,..
## SRR9007218 AAV over expression,..
## SRR9007219 AAV over expression,..
## SRR9007220 AAV over expression,..
## ... ...
## SRR9007228 AAV over expression,..
## SRR9007229 AAV over expression,..
## SRR9007230 AAV over expression,..
## SRR9007216 AAV over expression,..
## SRR9007217 AAV over expression,..
# Convert data types to factors
rse_gene_SRP194918$sra_attribute.age <-
factor(rse_gene_SRP194918$sra_attribute.age)
rse_gene_SRP194918$sra_attribute.injected_with <-
factor(rse_gene_SRP194918$sra_attribute.injected_with)
rse_gene_SRP194918$sra_attribute.source_name <-
factor(rse_gene_SRP194918$sra_attribute.source_name)
rse_gene_SRP194918$sra_attribute.strain <-
factor(rse_gene_SRP194918$sra_attribute.strain)
rse_gene_SRP194918$sra_attribute.tissue <-
factor(rse_gene_SRP194918$sra_attribute.tissue)
rse_gene_SRP194918$sra_attribute.treatment <-
factor(rse_gene_SRP194918$sra_attribute.treatment)
# Summarize data
summary(as.data.frame(colData(rse_gene_SRP194918)[
,
grepl("^sra_attribute", colnames(colData(rse_gene_SRP194918)))
]))
## sra_attribute.age sra_attribute.injected_with sra_attribute.source_name
## 10-14 weeks old:17 IL4:8 brain-striatum:17
## LPS:5
## PBS:4
## sra_attribute.strain sra_attribute.tissue
## C57BL/6J:17 ipsilateral striatum:17
##
##
## sra_attribute.treatment
## AAV over expression, neural graft, inflammatory injection:17
##
##
There are 17 samples. While the only relevant difference is the injection, I analyzed the effects that it has in gene expression.
The data is filtered to remove genes with low expression.
# Check quality of the data
rse_gene_SRP194918$assigned_gene_prop <-
rse_gene_SRP194918$recount_qc.gene_fc_count_all.assigned /
rse_gene_SRP194918$recount_qc.gene_fc_count_all.total
summary(rse_gene_SRP194918$assigned_gene_prop)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.6815 0.6975 0.7143 0.7105 0.7232 0.7448
# Observe differences among samples injected with different substances
with(colData(rse_gene_SRP194918),
tapply(assigned_gene_prop,
sra_attribute.injected_with,
summary))
## $IL4
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.6975 0.7089 0.7163 0.7147 0.7231 0.7241
##
## $LPS
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.6815 0.6839 0.6878 0.7025 0.7143 0.7448
##
## $PBS
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.6862 0.6974 0.7133 0.7122 0.7282 0.7360
# Plot the distribution of the assigned gene proportion
hist(rse_gene_SRP194918$assigned_gene_prop,
col = "#fd8d3c",
main = "Assigned Gene Proportion",
xlab = "Assigned Gene Proportion")
# Save the unfiltered data
rse_gene_SRP194918_unfiltered <- rse_gene_SRP194918
# Filter genes with low expression
rse_gene_SRP194918 <-
rse_gene_SRP194918[edgeR::filterByExpr(rse_gene_SRP194918),]
## Warning in filterByExpr.DGEList(y, design = design, group = group, lib.size =
## lib.size, : All samples appear to belong to the same group.
# Plot the distribution of the assigned gene proportion after filtering
hist(rse_gene_SRP194918$assigned_gene_prop,
col = "#fd8d3c",
main = "Filtered Assigned Gene Proportion",
xlab = "Assigned Gene Proportion")
# Check the expression of genes that were not filtered out
gene_means <- rowMeans(assay(rse_gene_SRP194918, "counts"))
summary(gene_means)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 12.5 93.4 507.2 2083.3 1720.7 1701927.8
# Check the percentage of genes that were not filtered out
round(nrow(rse_gene_SRP194918) / nrow(rse_gene_SRP194918_unfiltered) * 100, 2)
## [1] 39.88
The differential expression analysis is performed using the limma package.
Assigned gene proportion was plotted according to the injection group, showing considerable differences among them.
Model matrix was created according to the injection group and the assigned gene proportion.
Most differentially expressed genes were calculated using the Bayes method and were identified in a Volcano plot. Also a heatmap was created with the 50 top genes to analyze differential expression.
# Plot gene expression according to the injection group
ggplot(as.data.frame(colData(rse_gene_SRP194918)),
aes(y = assigned_gene_prop,
x = sra_attribute.injected_with)) +
geom_boxplot() +
ylab("Assigned Gene Prop") +
xlab("Injection Group") +
theme_classic()
# Create the model matrix according to the injection group
mod <- model.matrix(~ sra_attribute.injected_with + assigned_gene_prop,
data = colData(rse_gene_SRP194918)
)
colnames(mod)
## [1] "(Intercept)" "sra_attribute.injected_withLPS"
## [3] "sra_attribute.injected_withPBS" "assigned_gene_prop"
# Compute statistics by Bayes method
eb_results <- eBayes(lmFit(vGene))
# Extract the top genes
de_results <- topTable(
eb_results,
coef = c(2, 3),
number = nrow(rse_gene_SRP194918),
sort.by = "none"
)
# Calculate the number of significant genes
table(de_results$adj.P.Val < 0.05)
##
## FALSE TRUE
## 21829 274
# Plot the differentially expresed genes when injected with LPS
volcanoplot(eb_results, coef = 2, highlight = 3, names = de_results$gene_name)
# Review the top genes information
de_results[de_results$gene_name %in% c("Jchain", "Lox", "Slc2a5"), ]
## source type bp_length phase gene_id
## ENSMUSG00000024529.14 HAVANA gene 4778 NA ENSMUSG00000024529.14
## ENSMUSG00000028976.10 HAVANA gene 3402 NA ENSMUSG00000028976.10
## ENSMUSG00000067149.6 HAVANA gene 2517 NA ENSMUSG00000067149.6
## gene_type gene_name level mgi_id
## ENSMUSG00000024529.14 protein_coding Lox 2 MGI:96817
## ENSMUSG00000028976.10 protein_coding Slc2a5 2 MGI:1928369
## ENSMUSG00000067149.6 protein_coding Jchain 2 MGI:96493
## havana_gene tag sra_attribute.injected_withLPS
## ENSMUSG00000024529.14 OTTMUSG00000073373.1 <NA> 2.632772
## ENSMUSG00000028976.10 OTTMUSG00000010389.2 <NA> -1.372050
## ENSMUSG00000067149.6 OTTMUSG00000036864.2 <NA> 4.452633
## sra_attribute.injected_withPBS AveExpr F
## ENSMUSG00000024529.14 0.66166712 0.4088484 63.29376
## ENSMUSG00000028976.10 -0.09109408 1.1191055 46.69235
## ENSMUSG00000067149.6 1.51578719 2.0405254 40.01448
## P.Value adj.P.Val
## ENSMUSG00000024529.14 1.330165e-08 0.0002940064
## ENSMUSG00000028976.10 1.243272e-07 0.0013740021
## ENSMUSG00000067149.6 3.720316e-07 0.0020557536
# Heatmap of the 50 top genes
exprs_heatmap <- vGene$E[rank(de_results$adj.P.Val) <= 50, ]
# Create a dataframe with the injection group
df <- as.data.frame(colData(rse_gene_SRP194918)$sra_attribute.injected_with)
# Rename the columns and rows
colnames(df) <- c("Injection")
rownames(df) <- colnames(exprs_heatmap)
# Change the gene ID for the gene name
row.names(exprs_heatmap) <-
de_results$gene_name[rank(de_results$adj.P.Val) <= 50]
# Create the heatmap
pheatmap(
exprs_heatmap,
cluster_rows = TRUE,
cluster_cols = TRUE,
show_colnames = FALSE,
annotation_col = df,
scale = "row",
)
# MDS plot according to the injection group
col.group <- df$Injection
levels(col.group) <- brewer.pal(nlevels(col.group), "Set1")
col.group <- as.character(col.group)
plotMDS(vGene$E, labels = df$Injection, col = col.group)
In the assignment gene proportion boxplot, it is observed that the group injected with LPS has a median significantly different to other groups. This suggests that the injection with LPS has a significant effect on gene expression, and this is because LPS was a potent activator of microglia.
The differential expression analysis showed that the group injected with LPS was slightly more affected than the group injected with PBS. This is because LPS is a potent activator of microglia, and it is known that activated microglia can promote α-syn aggregation. It is imortant to mention that model was built using coefficients 2 and 3, because those were the columns indicating injections with either PBS or LPS (if none of them, that means that the sample was injected with IL-4) Also, the 3 top genes that were most differentially expressed in the group injected with LPS were Jchain, Lox, and Slc2a5. These genes are related to the immune response, oxidative stress, and glucose transport, respectively; and they are also involved in inflammation, which is a process that is usually present in PD.
Finally, the heatmap showed that the group injected with LPS had a different expression pattern compared to the group injected with PBS and IL-4. This suggests that the injection with LPS has a significant effect on expression of many genes related to inflammation and oxidative stress. The MDS plot also showed a clear separation between the group injected with LPS and the other groups.
With all of this, it could be concluded that activation of microglia does have an effect on α-syn transfer between cells, and that it could be a potential target for PD treatment.