1 Welcome

Welcome to the SRTsim project! It is composed of:

The web application allows you to design spatial patterns and generate SRT data with the patterns of interest.

2 Install SRTsim

R is an open-source statistical environment whose functionality can be extended via packages. SRTsim is an R package available via CRAN. R can be installed on any operating system from CRAN, after which you can install SRTsim with the following command in your R session:

 install.packages("SRTsim")
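
Before installing, it can be useful to check whether the package is already present. The following is a minimal sketch using only base R; the variable names `needed`, `is_installed`, and `missing_pkgs` are illustrative and not part of SRTsim.

```r
# Packages this walkthrough relies on
needed <- c("SRTsim")

# requireNamespace() returns FALSE (without erroring) when a package is absent
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install from CRAN:
  # install.packages(missing_pkgs)
}
```

The install call is left commented out so that the check itself has no side effects.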

3 Run Reference-Based Simulation

To get started, please load the SRTsim package.

library("SRTsim")

Once the package is loaded, we can perform a reference-based tissue-wise simulation with the example data.

## explore example SRT data 
str(exampleLIBD)
#> List of 2
#>  $ count:Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
#>   .. ..@ i       : int [1:241030] 1 2 8 9 10 11 13 14 15 16 ...
#>   .. ..@ p       : int [1:3612] 0 67 122 182 252 322 392 462 534 609 ...
#>   .. ..@ Dim     : int [1:2] 80 3611
#>   .. ..@ Dimnames:List of 2
#>   .. .. ..$ : chr [1:80] "ENSG00000175130" "ENSG00000159176" "ENSG00000168314" "ENSG00000080822" ...
#>   .. .. ..$ : chr [1:3611] "AAACAAGTATCTCCCA-1" "AAACAATCTACTAGCA-1" "AAACACCAATAACTGC-1" "AAACAGAGCGACTCCT-1" ...
#>   .. ..@ x       : num [1:241030] 1 1 1 7 10 1 5 2 1 1 ...
#>   .. ..@ factors : list()
#>  $ info :'data.frame':   3611 obs. of  6 variables:
#>   ..$ row     : int [1:3611] 50 3 59 14 43 47 73 61 45 42 ...
#>   ..$ col     : int [1:3611] 102 43 19 94 9 13 43 97 115 28 ...
#>   ..$ imagerow: num [1:3611] 381 126 428 187 341 ...
#>   ..$ imagecol: num [1:3611] 441 260 183 417 153 ...
#>   ..$ tissue  : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
#>   ..$ layer   : chr [1:3611] "Layer3" "Layer1" "WM" "Layer3" ...

example_count   <- exampleLIBD$count
example_loc     <- exampleLIBD$info[,c("imagecol","imagerow","layer")]
colnames(example_loc) <- c("x","y","label")

## create a SRT object
simSRT <- createSRT(count_in = example_count, loc_in = example_loc)


## Set a seed for reproducible simulation
set.seed(1)

## Estimate model parameters for data generation
simSRT1 <- srtsim_fit(simSRT,sim_scheme="tissue")

## Generate synthetic data with estimated parameters
simSRT1 <- srtsim_count(simSRT1)

## Explore the synthetic data
simCounts(simSRT1)[1:5,1:5]
#> 5 x 5 sparse Matrix of class "dgCMatrix"
#>                 AAACAAGTATCTCCCA-1 AAACAATCTACTAGCA-1 AAACACCAATAACTGC-1
#> ENSG00000175130                  .                  .                 10
#> ENSG00000159176                  1                  3                  5
#> ENSG00000168314                  1                  .                  6
#> ENSG00000080822                  .                  .                  3
#> ENSG00000091513                  .                  .                  5
#>                 AAACAGAGCGACTCCT-1 AAACAGCTTTCAGAAG-1
#> ENSG00000175130                  .                  2
#> ENSG00000159176                  .                  1
#> ENSG00000168314                  2                  1
#> ENSG00000080822                  1                  .
#> ENSG00000091513                  1                  3
simcolData(simSRT1)
#> DataFrame with 3611 rows and 3 columns
#>                            x         y       label
#>                    <numeric> <numeric> <character>
#> AAACAAGTATCTCCCA-1   440.639   381.098      Layer3
#> AAACAATCTACTAGCA-1   259.631   126.328      Layer1
#> AAACACCAATAACTGC-1   183.078   427.768          WM
#> AAACAGAGCGACTCCT-1   417.237   186.814      Layer3
#> AAACAGCTTTCAGAAG-1   152.700   341.269      Layer5
#> ...                      ...       ...         ...
#> TTGTTTCACATCCAGG-1   254.410   422.862          WM
#> TTGTTTCATTAGTCTA-1   217.147   433.393          WM
#> TTGTTTCCATACAACT-1   208.416   352.430      Layer6
#> TTGTTTGTATTACACG-1   250.720   503.735          WM
#> TTGTTTGTGTAAATTC-1   284.293   148.110      Layer2
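
To use your own data instead of `exampleLIBD`, the inputs need the same layout as above: a gene-by-spot count matrix and a location table with columns `x`, `y`, and `label`. The sketch below builds a toy pair of inputs in that shape. All values and names (`toy_count`, `toy_loc`, the gene and spot IDs) are made up for illustration, and it is an assumption that `createSRT` expects the spot IDs to line up between the two inputs as they do in the example data.

```r
library(Matrix)  # ships with standard R installations

set.seed(1)
n_genes <- 4
n_spots <- 6

# Toy count matrix: genes in rows, spots in columns, stored sparsely
toy_count <- Matrix(
  matrix(rpois(n_genes * n_spots, lambda = 2), nrow = n_genes),
  sparse = TRUE
)
rownames(toy_count) <- paste0("gene", seq_len(n_genes))
colnames(toy_count) <- paste0("spot", seq_len(n_spots))

# Toy location table: one row per spot, with columns x, y, and label
toy_loc <- data.frame(
  x     = runif(n_spots, 0, 100),
  y     = runif(n_spots, 0, 100),
  label = rep(c("A", "B"), each = n_spots / 2),
  row.names = colnames(toy_count)
)

# The spot IDs should line up between the two inputs
stopifnot(identical(colnames(toy_count), rownames(toy_loc)))

# With SRTsim attached, the object would then be created as in the text:
# toySRT <- createSRT(count_in = toy_count, loc_in = toy_loc)
```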

We can also perform a reference-based domain-specific simulation with the same example data.


## Set a seed for reproducible simulation
set.seed(1)

## Estimate model parameters for data generation
simSRT2 <- srtsim_fit(simSRT,sim_scheme='domain')

## Generate synthetic data with estimated parameters
simSRT2 <- srtsim_count(simSRT2)

## Explore the synthetic data
simCounts(simSRT2)[1:5,1:5]
#> 5 x 5 sparse Matrix of class "dgCMatrix"
#>                 AAACAAGTATCTCCCA-1 AAACAATCTACTAGCA-1 AAACACCAATAACTGC-1
#> ENSG00000175130                  .                  .                 11
#> ENSG00000159176                  1                  2                  7
#> ENSG00000168314                  1                  .                  7
#> ENSG00000080822                  .                  .                  3
#> ENSG00000091513                  .                  .                  6
#>                 AAACAGAGCGACTCCT-1 AAACAGCTTTCAGAAG-1
#> ENSG00000175130                  .                  2
#> ENSG00000159176                  .                  1
#> ENSG00000168314                  2                  1
#> ENSG00000080822                  1                  .
#> ENSG00000091513                  2                  3
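
The difference between the two schemes can be pictured with a toy parameter-estimation example in base R. This is only a conceptual sketch and not SRTsim's fitting code: the Poisson model, the two domain labels, and all variable names are assumptions made for illustration. It assumes that a tissue-wise fit estimates one set of parameters per gene from all spots, while a domain-specific fit estimates one set per domain label. How SRTsim then assigns synthetic counts to locations is not modelled here.

```r
set.seed(1)

# Hypothetical reference counts for one gene across 200 spots in two domains
label <- rep(c("Layer1", "WM"), each = 100)
ref   <- rpois(200, lambda = ifelse(label == "WM", 8, 2))

# Tissue-wise: a single rate estimated from all spots
lambda_tissue <- mean(ref)

# Domain-specific: one rate per domain label
lambda_domain <- tapply(ref, label, mean)

lambda_tissue
lambda_domain
```

In this toy setting the single tissue-wide rate falls between the two domain-level rates, which is why the two schemes can yield slightly different synthetic counts for the same gene.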

4 Comparison Between Reference Data and Synthetic Data

4.1 Summarized Metrics

After data generation, we can compare summary metrics of the reference data and the synthetic data.


## Compute metrics 
simSRT1   <- compareSRT(simSRT1)

## Visualize Metrics
visualize_metrics(simSRT1)
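
To give a feel for what a metric comparison involves, the sketch below computes a few gene-wise summaries for a reference and a synthetic count matrix in base R. The text does not list which metrics `compareSRT` computes, so the three used here (mean, variance, proportion of zeros) are assumed examples, and the matrices and the helper `gene_metrics` are hypothetical.

```r
set.seed(1)

# Hypothetical reference and synthetic count matrices (genes x spots)
ref_counts <- matrix(rnbinom(5 * 50, size = 1, mu = 3), nrow = 5,
                     dimnames = list(paste0("gene", 1:5), NULL))
sim_counts <- matrix(rnbinom(5 * 50, size = 1, mu = 3), nrow = 5,
                     dimnames = list(paste0("gene", 1:5), NULL))

# Gene-wise summaries: mean, variance, and proportion of zeros
gene_metrics <- function(counts) {
  data.frame(
    mean      = rowMeans(counts),
    variance  = apply(counts, 1, var),
    zero_prop = rowMeans(counts == 0)
  )
}

ref_m <- gene_metrics(ref_counts)
sim_m <- gene_metrics(sim_counts)

# Side-by-side comparison of gene-wise means
cbind(reference = ref_m$mean, synthetic = sim_m$mean)
```

Synthetic data that resembles the reference should give similar values in each column for the same gene.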

4.2 Expression Patterns For Genes of Interest

visualize_gene(simsrt=simSRT1,plotgn = "ENSG00000183036",rev_y=TRUE)