PROJECT ID: PRJNA797645


Data source SRA: PRJNA797645
Description Single-cell RNA-sequencing reads from 70 hpf stickleback embryos. The dataset includes scRNAseq reads (generated by Illumina short-read sequencing) and ScISOr-Seq reads (generated by PacBio long-read sequencing).
Keywords PacBio; emerging model organisms; 10x Genomics; threespine stickleback; evolution; protocol
Publication Healey, H. M., S. Bassham, and W. A. Cresko. "Single-cell Iso-Sequencing enables rapid genome annotation for scRNAseq analysis." Genetics (2022).
Abstract Single-cell RNA sequencing is a powerful technique that continues to expand across various biological applications. However, incomplete 3'-UTR annotations can impede single-cell analysis resulting in genes that are partially or completely uncounted. Performing single-cell RNA sequencing with incomplete 3'-UTR annotations can hinder the identification of cell identities and gene expression patterns and lead to erroneous biological inferences. We demonstrate that performing single-cell isoform sequencing in tandem with single-cell RNA sequencing can rapidly improve 3'-UTR annotations. Using threespine stickleback fish (Gasterosteus aculeatus), we show that gene models resulting from a minimal embryonic single-cell isoform sequencing dataset retained 26.1% greater single-cell RNA sequencing reads than gene models from Ensembl alone. Furthermore, pooling our single-cell sequencing isoforms with a previously published adult bulk Iso-Seq dataset from stickleback, and merging the annotation with the Ensembl gene models, resulted in a marginal improvement (+0.8%) over the single-cell isoform sequencing only dataset. In addition, isoforms identified by single-cell isoform sequencing included thousands of new splicing variants. The improved gene models obtained using single-cell isoform sequencing led to successful identification of cell types and increased the reads identified of many genes in our single-cell RNA sequencing stickleback dataset. Our work illuminates single-cell isoform sequencing as a cost-effective and efficient mechanism to rapidly annotate genomes for single-cell RNA sequencing.


Dataset Information


Dataset ID | Species | Tissue / Organ | Experiment type | Sample | Source dataset ID
1. PRJNA797645 | Gasterosteus aculeatus | embryo | baseline | 70 hpf, whole embryo, untreated | SRA: SRR17665713

Clustering Result


Cluster | Cell type | Gene id (symbol) | Marker class | Evidence
2 | Neuron | ENSGACG00000018407 (sncb) | marker | DOI:10.1016/j.neuron.2020.09.023
3 | Progenitors | ENSGACG00000017181 (sox3) | marker | DOI:10.1016/j.neuron.2020.09.023
4 | Neuron | ENSGACG00000002784 (jagn1a) | marker | DOI:10.1016/j.neuron.2020.09.023
5 | Mesoderm | ENSGACG00000020298 (pcolcea) | marker | DOI:10.1016/j.neuron.2020.09.023
5 | Mesoderm | ENSGACG00000016468 (foxc1a) | marker | DOI:10.1016/j.neuron.2020.09.023
6 | Muscle cells | ENSGACG00000000349 (myog) | marker | DOI:10.1016/j.neuron.2020.09.023
6 | Muscle cells | ENSGACG00000012875 (hsp90aa1.1) | marker | DOI:10.1016/j.neuron.2020.09.023
7 | Glia | ENSGACG00000018292 (plp1a) | marker | DOI:10.7554/eLife.60005
7 | Glia | ENSGACG00000013679 (pou3f1) | marker | DOI:10.7554/eLife.60005
8 | Progenitors | ENSGACG00000016665 (ccna2) | marker | DOI:10.1016/j.neuron.2020.09.023
8 | Progenitors | ENSGACG00000012186 (nusap1) | marker | DOI:10.1016/j.neuron.2020.09.023
9 | Neuron | ENSGACG00000013839 (elavl3) | marker | DOI:10.1016/j.neuron.2020.09.023
9 | Neuron | ENSGACG00000013224 (myt1b) | marker | DOI:10.1016/j.neuron.2020.09.023
9 | Neuron | ENSGACG00000009065 (sox11a) | marker | DOI:10.1016/j.bbi.2022.03.006
10 | Somite | ENSGACG00000004901 (angptl7) | marker | DOI:10.1093/g3journal/jkac062
10 | Somite | ENSGACG00000002409 (col5a2a) | marker | DOI:10.1093/g3journal/jkac062
10 | Somite | ENSGACG00000008558 (comp) | marker | DOI:10.1093/g3journal/jkac062
11 | Epidermal cells | ENSGACG00000002632 (epcam) | marker | DOI:10.1016/j.cell.2008.01.025
11 | Epidermal cells | ENSGACG00000008104 (krt4) | marker | DOI:10.1016/j.cell.2008.01.025
11 | Epidermal cells | ENSGACG00000019298 (myh9a) | marker | DOI:10.1016/j.cell.2008.01.025
11 | Epidermal cells | ENSGACG00000020320 (cldn7b) | marker | DOI:10.1016/j.cell.2008.01.025
13 | Neuron | ENSGACG00000013839 (elavl3) | marker | DOI:10.7554/eLife.44431
13 | Neuron | ENSGACG00000007835 (pax10) | marker | DOI:10.1093/g3journal/jkac062
14 | Erythroid cells | ENSGACG00000006807 (alas2) | marker | DOI:10.1016/j.neuron.2020.09.023
14 | Erythroid cells | ENSGACG00000009622 (slc4a1a) | marker | DOI:10.1016/j.neuron.2020.09.023
14 | Erythroid cells | ENSGACG00000006837 (nmt1b) | marker | DOI:10.1016/j.neuron.2020.09.023
15 | Fibroblasts | ENSGACG00000010957 (tcf21) | marker | DOI:10.15252/embr.202152901
15 | Fibroblasts | ENSGACG00000005143 (col1a1a) | marker | DOI:10.15252/embr.202152901
16 | Retina | ENSGACG00000019038 (six7) | marker | DOI:10.1016/j.neuron.2020.09.023
17 | Muscle cells | ENSGACG00000004198 (tnnc2) | marker | DOI:10.7554/eLife.60005
18 | Chondrocytes | ENSGACG00000014845 (olfml2bb) | marker | DOI:10.1002/jbmr.4042
19 | Neural crest | ENSGACG00000018278 (slc2a15b) | marker | DOI:10.1016/j.neuron.2020.09.023
20 | Hyp | ENSGACG00000004914 (dlx2b) | marker | DOI:10.1038/nbt.4103
21 | Macrophages | ENSGACG00000001966 (ncf2) | marker | DOI:10.1038/s41559-021-01580-3
22 | Neural crest | ENSGACG00000007318 (sox10) | marker | DOI:10.1016/j.neuron.2020.09.023
23 | Endothelium | ENSGACG00000013350 (pecam1) | marker | DOI:10.1016/j.stemcr.2021.05.014
23 | Endothelium | ENSGACG00000010806 (sox7) | marker | DOI:10.1016/j.stemcr.2021.05.014
24 | Enterocytes | ENSGACG00000002429 (apoa4a) | marker | DOI:10.1038/s41586-021-03484-5
25 | Epidermal cells | ENSGACG00000020076 (cd9b) | marker | DOI:10.1016/j.cell.2008.01.025
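
Example: using the marker table as a cluster annotation lookup

The clustering result above assigns one cell type to each cluster, supported by one or more marker genes. The sketch below shows one way such rows could be grouped into a lookup for labelling clusters in downstream analysis. It is illustrative only: the rows are a small subset copied from the table, and the names MARKER_ROWS, build_cluster_annotation and clusters_by_cell_type are hypothetical helpers, not part of this record or of any published pipeline.

```python
from collections import defaultdict

# Subset of rows copied from the table above:
# (cluster, cell type, Ensembl gene ID, gene symbol)
MARKER_ROWS = [
    (2, "Neuron", "ENSGACG00000018407", "sncb"),
    (5, "Mesoderm", "ENSGACG00000020298", "pcolcea"),
    (5, "Mesoderm", "ENSGACG00000016468", "foxc1a"),
    (7, "Glia", "ENSGACG00000018292", "plp1a"),
    (7, "Glia", "ENSGACG00000013679", "pou3f1"),
    (9, "Neuron", "ENSGACG00000013839", "elavl3"),
    (14, "Erythroid cells", "ENSGACG00000006807", "alas2"),
]


def build_cluster_annotation(rows):
    """Group marker rows by cluster.

    Returns {cluster: {"cell_type": str, "markers": [gene symbols]}}.
    Raises ValueError if one cluster is given two different cell types.
    """
    annotation = {}
    for cluster, cell_type, _gene_id, symbol in rows:
        entry = annotation.setdefault(
            cluster, {"cell_type": cell_type, "markers": []}
        )
        if entry["cell_type"] != cell_type:
            raise ValueError(f"cluster {cluster} has conflicting cell types")
        entry["markers"].append(symbol)
    return annotation


def clusters_by_cell_type(annotation):
    """Invert the lookup: {cell type: [clusters in ascending order]}."""
    grouped = defaultdict(list)
    for cluster in sorted(annotation):
        grouped[annotation[cluster]["cell_type"]].append(cluster)
    return dict(grouped)


if __name__ == "__main__":
    annotation = build_cluster_annotation(MARKER_ROWS)
    for cluster in sorted(annotation):
        entry = annotation[cluster]
        print(cluster, entry["cell_type"], ", ".join(entry["markers"]))
```

Keying the lookup by cluster number rather than by cell type matters here because several cell types (for example Neuron and Epidermal cells) are assigned to more than one cluster in the table.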