Analysis of Slide-seqV2 mouse olfactory bulb dataset#

In this tutorial, we demonstrate SpaMetric on the analysis of Slide-seqV2 mouse olfactory bulb dataset including

Spatial reconstruction
Mini-batch metric learning
Spatial domain identification
Finding differentially expressed genes
Spatial domain annotation

The dataset is available at Single Cell Portal (Puck_200127_15).

[1]:

import numpy as np
import pandas as pd
import scanpy as sc
import matplotlib.pyplot as plt
import seaborn as sns

import SpaMetric as spm

Data loading and preprocessing#

We load the dataset and find top 8000 highly variable genes.

[2]:

adata = sc.read_h5ad('./data/Puck_200127_15.h5ad')
adata

[2]:

AnnData object with n_obs × n_vars = 21724 × 21220
    obsm: 'spatial'

[3]:

sc.pp.highly_variable_genes(adata, n_top_genes=8000, flavor='seurat_v3')

Spatial reconstruction#

We perform spatial reconstruction to aggregate expression from spatial neighbors.

[4]:

spm.spatial_reconstruction(adata, n_neighbors=30)

Mini-batch metric learning#

We select 4000 reference centers and then perform mini-batch metric learning on the reconstructed data.

[5]:

sc.pp.pca(adata)
spm.reference_centers(adata, n_centers=4000)

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\cluster\_kmeans.py:887: UserWarning: MiniBatchKMeans is known to have a memory leak on Windows with MKL, when there are less chunks than available threads. You can prevent it by setting batch_size >= 4096 or by setting the environment variable OMP_NUM_THREADS=1
  warnings.warn(

Since this step takes some time, we load previouly computed AnnData.

[6]:

# spm.metric_learning_minibatch(adata)

adata = sc.read_h5ad('./data/Puck_200127_15_results.h5ad')
adata

[6]:

AnnData object with n_obs × n_vars = 21724 × 21220
    obs: 'reference_centers'
    var: 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm'
    uns: 'hvg', 'metric', 'pca', 'reference_centers', 'spatial_reconstruction'
    obsm: 'X_pca', 'metric', 'spatial'
    varm: 'PCs'
    layers: 'counts', 'log1p'

Spatial domain identification#

We create a temporary AnnData using the sample-by-reference metric matrix and then interoperate with SCANPY to identify spatial domains.

[7]:

adata_V = sc.AnnData(adata.obsm['metric'])
adata_V.obs_names = adata.obs_names
adata_V.obsm['spatial'] = adata.obsm['spatial']
adata_V

[7]:

AnnData object with n_obs × n_vars = 21724 × 3933
    obsm: 'spatial'

[8]:

sc.pp.pca(adata_V)
sc.pp.neighbors(adata_V)
sc.tl.louvain(adata_V, resolution=0.4)

[9]:

adata.obs['louvain'] = adata_V.obs['louvain']

We exclude domain 1 and 7 from downstream analysis, since their average counts are lower than 100.

[10]:

sns.barplot(y=np.sum(adata.layers['counts'].toarray(), axis=1), x=adata.obs['louvain'])

[10]:

<AxesSubplot:xlabel='louvain'>

../_images/tutorials_slide_seq_olfactory_bulb_21_1.png

[11]:

adata_subset = adata[~adata.obs['louvain'].isin(['1', '7']),:]

[12]:

fig, axs = plt.subplots(figsize=(8, 7))

sc.pl.embedding(
    adata_subset,
    basis='spatial',
    color='louvain',
    size=50,
    palette=sc.pl.palettes.default_102,
    legend_loc='right margin',
    show=False,
    ax=axs,
)

plt.tight_layout()

Trying to set attribute `.uns` of view, copying.

../_images/tutorials_slide_seq_olfactory_bulb_23_1.png

Finding differentially expressed genes#

We find the differentially expressed (DE) genes across identified domains and show the domains and their DE gene expression patterns in spatial coordinates.

[13]:

sc.tl.rank_genes_groups(adata_subset, groupby='louvain', use_raw=False, layer='counts', method='t-test_overestim_var')

[14]:

de_genes = pd.DataFrame(adata_subset.uns['rank_genes_groups']['names']).iloc[:10,:]
de_genes

[14]:

	0	2	3	4	5	6	8	9	10	11
0	Pcp4	mt-Nd4	Nrsn1	S100a5	Ptgds	Mbp	Cpne6	Gap43	Rab3b	S100a5
1	Camk2n1	Doc2g	Cck	Fabp7	Hbb-bs	Plp1	Hap1	Snca	Doc2g	Omp
2	Pcp4l1	mt-Nd1	S100a5	Apod	Hba-a1	Mobp	Macrod2	Doc2g	Map1b	Stoml3
3	Camk2b	mt-Cytb	mt-Cytb	Gng13	Hbb-bt	Cnp	Nos1	Uchl1	Slc17a7	Atf5
4	Meis2	mt-Rnr1	Calb2	Omp	Igf2	Mal	Synpr	Fxyd6	Atp1b1	Scgb1c1
5	Calm2	mt-Nd2	Olfm1	Atf5	Hba-a2	Cldn11	Tac1	Stmn2	Olfm1	Vmo1
6	Gng4	mt-Nd5	mt-Nd1	Npy	Mgp	Fth1	Pnmal2	Calb2	Syp	Cbr2
7	Malat1	mt-Co1	mt-Nd4	Ptn	Aebp1	Trf	Atp2b4	Rasgrp1	Stmn3	Tmsb4x
8	Nrxn3	mt-Rnr2	Trh	Kctd12	Col1a2	Mog	Nrxn3	Tuba1a	Thy1	Gng13
9	Ppp3ca	Slc6a11	Sparcl1	Stoml3	Slc6a20a	Mag	Necab2	Map1b	Shisa3	Calm1

[15]:

fig, axs = plt.subplots(figsize=(8, 7))

sc.pl.embedding(
    adata_subset,
    basis='spatial',
    color='louvain',
    groups='0',
    layer='log1p',
    size=50,
    palette=sc.pl.palettes.default_102,
    legend_loc='right margin',
    show=False,
    ax=axs,
)

plt.tight_layout()

../_images/tutorials_slide_seq_olfactory_bulb_28_0.png

[16]:

fig, axs = plt.subplots(figsize=(8, 7))

sc.pl.embedding(
    adata_subset,
    basis='spatial',
    color=de_genes.iloc[0,0],
    layer='log1p',
    size=50,
    color_map='Reds',
    vmax='p99.9',
    show=False,
    ax=axs,
)

plt.tight_layout()

../_images/tutorials_slide_seq_olfactory_bulb_29_0.png

[17]:

fig, axs = plt.subplots(figsize=(8, 7))

sc.pl.embedding(
    adata_subset,
    basis='spatial',
    color='louvain',
    groups='4',
    layer='log1p',
    size=50,
    palette=sc.pl.palettes.default_102,
    legend_loc='right margin',
    show=False,
    ax=axs,
)

plt.tight_layout()

../_images/tutorials_slide_seq_olfactory_bulb_30_0.png

[18]:

fig, axs = plt.subplots(figsize=(8, 7))

sc.pl.embedding(
    adata_subset,
    basis='spatial',
    color=de_genes.iloc[0,3],
    layer='log1p',
    size=50,
    color_map='Reds',
    vmax='p99.9',
    show=False,
    ax=axs,
)

plt.tight_layout()

../_images/tutorials_slide_seq_olfactory_bulb_31_0.png

Spatial domain annotation#

We annotate the spatial domains based on the anatomical annotation from Allen Reference Atlas and marker genes.

[19]:

map_dict = {
    '0': 'GCL',
    '2': 'EPL',
    '3': 'GL_1',
    '4': 'ONL',
    '5': 'GL_2',
    '6': 'RMS',
    '8': 'AOBgr',
    '9': 'AOB',
    '10': 'MCL',
    '11': 'UNK',
}

[20]:

adata_subset.obs['annotation'] = adata_subset.obs['louvain'].map(map_dict).astype('category')

[21]:

fig, axs = plt.subplots(figsize=(8.3, 7))

sc.pl.embedding(
    adata_subset,
    basis='spatial',
    color='annotation',
    size=50,
    palette=sc.pl.palettes.default_102,
    legend_loc='right margin',
    show=False,
    ax=axs,
)

plt.tight_layout()

../_images/tutorials_slide_seq_olfactory_bulb_36_0.png