L3R-seq (Long-read 3’ RACE-seq) is a targeted nanopore sequencing method that uses unique molecular identifiers (UMIs) to build one high-accuracy consensus sequence per original RNA molecule. L3Rseq is its companion bioinformatics pipeline: it takes raw Oxford Nanopore FASTQ files and produces per-molecule CSV tables quantifying RNA editing, alternative splicing, 3’ end cleavage position, and poly(A) tail status.
The pipeline was developed for Arabidopsis thaliana mitochondrial ccmC mRNA, but is adaptable to any target RNA on nanopore platforms.
What you can measure with L3R-seq
Each row in the output CSV represents one original RNA molecule. Per-molecule columns include:
- RNA editing events (e.g., C-to-U) with configurable pattern matching
- 3’ end cleavage position on the reference
- Poly(A) tail length and sequence from the non-templated 3’ extension
- Splice status per intron (spliced / retained / not spanned)
- Noise count separating biological editing from residual sequencing error
A secondary pattern option (--count-pattern TC) enables SLAM-seq T-to-C counting alongside primary editing in the same run.
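To make the pattern idea concrete, here is a minimal sketch of how a two-letter pattern such as `CT` or `TC` could be counted for one molecule. It assumes that the first letter is the reference base and the second is the base observed in the consensus read (C-to-U editing appears as C-to-T in cDNA), and that the two sequences are already aligned to equal length with `-` marking gaps. The function names `count_pattern` and `count_noise` are hypothetical and are not taken from the L3Rseq code base; the pipeline's own counting rules may differ.

```python
def count_pattern(ref: str, read: str, pattern: str) -> int:
    """Count aligned positions where ref has pattern[0] and read has pattern[1]."""
    if len(pattern) != 2:
        raise ValueError("pattern must be two bases, e.g. 'CT'")
    if len(ref) != len(read):
        raise ValueError("ref and read must be aligned to the same length")
    ref_base, read_base = pattern[0], pattern[1]
    return sum(
        1
        for r, q in zip(ref.upper(), read.upper())
        if r == ref_base and q == read_base
    )


def count_noise(ref: str, read: str, patterns: list) -> int:
    """Count mismatches not explained by any pattern; gap columns are ignored."""
    allowed = {(p[0], p[1]) for p in patterns}
    return sum(
        1
        for r, q in zip(ref.upper(), read.upper())
        if r != q and "-" not in (r, q) and (r, q) not in allowed
    )


# Toy aligned pair (illustrative data, not real ccmC sequence).
ref = "ACGTCCATTC"
read = "ATGTCTATCG"

primary = count_pattern(ref, read, "CT")     # editing-like C-to-T events
secondary = count_pattern(ref, read, "TC")   # SLAM-seq-like T-to-C events
noise = count_noise(ref, read, ["CT", "TC"])
print(primary, secondary, noise)
```

In this toy pair there are two C-to-T positions, one T-to-C position, and one mismatch (C-to-G) that neither pattern explains, which is the kind of residual error a noise count is meant to capture.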
Key capabilities
| Capability | Description |
|---|---|
| UMI consensus calling | Groups reads by UMI, polishes each cluster into a single high-accuracy sequence |
| CIGAR-walk correction | Recovers 3’ ends that aligners mis-clip due to editing near the transcript boundary |
| Intron splicing detection | Classifies reads as spliced or unspliced; can auto-discover intron coordinates |
| Translocation filtering | BLAST-based chimera detection separates real poly(A) tails from artifacts |
| Built-in alignment viewer | Browser-based IGV.js viewer with sorting and coloring by any SAM tag |
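The CIGAR-walk idea in the table can be illustrated with a short sketch. It is not the pipeline's actual algorithm: it assumes a forward-strand alignment, a standard SAM CIGAR string, and a simple greedy rule that extends the alignment into the soft-clipped tail while each clipped base either matches the reference or fits the editing pattern. The names `three_prime_end` and `extend_into_clip` are hypothetical.

```python
import re
from typing import List, Tuple

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference


def parse_cigar(cigar: str) -> List[Tuple[int, str]]:
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def three_prime_end(pos: int, cigar: str) -> Tuple[int, int]:
    """Return (ref_end, tail_len) for a forward-strand alignment.

    pos is the 1-based leftmost position (SAM POS); ref_end is the 1-based
    last reference base covered; tail_len is the 3' soft-clip length.
    """
    ops = parse_cigar(cigar)
    ref_end = pos - 1 + sum(n for n, op in ops if op in REF_CONSUMING)
    tail_len = 0
    for n, op in reversed(ops):
        if op == "H":      # skip a trailing hard clip, if any
            continue
        if op == "S":
            tail_len = n
        break              # only the last non-H operation matters
    return ref_end, tail_len


def extend_into_clip(ref_seq: str, ref_end: int, tail: str,
                     pattern: str = "CT") -> Tuple[int, str]:
    """Greedily extend the 3' end into the soft-clipped tail.

    A clipped base is accepted if it equals the reference base or if the
    pair (reference, read) equals the editing pattern. Extension stops at
    the first other mismatch; the rest is returned as the untemplated tail.
    """
    extended = 0
    for offset, base in enumerate(tail):
        idx = ref_end + offset  # 0-based index of the next reference base
        if idx >= len(ref_seq):
            break
        ref_base = ref_seq[idx]
        if base == ref_base or (ref_base == pattern[0] and base == pattern[1]):
            extended += 1
        else:
            break
    return ref_end + extended, tail[extended:]


end, clip = three_prime_end(101, "5S50M2D30M1I19M12S")
new_end, poly_a = extend_into_clip("GGGACGTCAGTTTT", 6, "TTAGAAAA")
print(end, clip, new_end, poly_a)
```

In the second call, the aligner stopped at reference position 6 because an edited base (reference C, read T) sits near the boundary. The walk accepts that position as an edit, moves the 3' end to position 10, and leaves `AAAA` as the non-templated extension. A greedy rule like this can absorb a tail base that matches the reference by chance, so a production implementation would need a stricter stopping criterion.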
Getting started
The fastest way to try L3Rseq is GitHub Codespaces: click “Code”, then “Codespaces”, on the repository page to get a fully configured Linux environment in your browser with all dependencies pre-installed.
Alternatively, pull the Docker image:
```bash
docker pull ghcr.io/akihitomamiya-del/l3rseq:latest
```
Then run the pipeline:
```bash
L3Rseq run \
  --input data/fastq/ \
  --outdir results/ \
  --ref refs/my_gene.fasta \
  --rpi-fasta refs/barcodes.fasta \
  --pattern CT
```
See the full documentation for detailed installation and usage instructions.
Pipeline at a glance
L3Rseq runs ten steps, from raw reads to annotated CSV:
- Concatenate per-barcode FASTQ files
- Trim adapters (cutadapt, 3-pass)
- Demultiplex by sample barcode
- Extract UMIs and group reads
- Call consensus sequences (Racon-based polishing)
- Extract the target region
- Map to the reference (minimap2)
- Call variants (LoFreq)
- Correct 3’ tails with CIGAR-walk
- Export CSV tables and quality reports
Use --start-at and --stop-at to begin or end the run at any step.
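The UMI grouping and consensus steps are the core of the per-molecule accuracy, so a toy illustration may help. The sketch below groups reads by exact UMI match and then takes a column-wise majority vote. This is a deliberate simplification: the pipeline itself uses Racon-based polishing, real reads differ in length and must be aligned first, and UMI matching in practice may tolerate errors. The function names and the read data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_umi(reads: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collect read sequences under their (exactly matching) UMI."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)
    return dict(groups)


def majority_consensus(seqs: List[str]) -> str:
    """Column-wise majority vote over equal-length sequences.

    Stands in for Racon-based polishing purely for illustration.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must have equal length in this sketch")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


# Six toy reads from two molecules, each group carrying one sequencing error.
reads = [
    ("AAGT", "ACGTTA"),
    ("AAGT", "ACGTTA"),
    ("AAGT", "ACCTTA"),
    ("CCTA", "ACGTCA"),
    ("CCTA", "ACGTCA"),
    ("CCTA", "ATGTCA"),
]

consensus = {
    umi: majority_consensus(seqs) for umi, seqs in group_by_umi(reads).items()
}
for umi, seq in consensus.items():
    print(umi, seq)
```

Each UMI yields one consensus sequence, with the isolated error in each group voted out. That mirrors the one-row-per-molecule structure of the output tables.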
Requirements
All dependencies ship inside the Docker image. No manual installation of bioinformatics tools is needed. The pipeline uses conda environments for minimap2, samtools, cutadapt, racon, vsearch, LoFreq, BLAST+, and more.
Citation
If you use L3Rseq in your research, please cite:
Mamiya, A. L3Rseq: bioinformatics pipeline for Long-read 3’ RACE-seq. https://github.com/akihitomamiya-del/L3R-seq
License
L3Rseq is released under the GPL-3.0 license.