Name: Mohsen Zakeri

Specialty: Computer Science and Computational Biology

Email: mohs.zakeri [AT] gmail.com

About me

I am a computer scientist with a background in bioinformatics and computational biology. Holding a Ph.D. in Computer Science from the University of Maryland, I specialize in developing efficient data structures and algorithms for analyzing complex biological datasets.

My experience encompasses a wide range of projects, from designing and implementing efficient classification methods for nanopore reads to contributing to the development of tools like Alevin-fry and Puffaligner for pre-processing single and bulk RNA-seq short reads. My passion lies in creating solutions that bridge the gap between computational challenges and biological insights.

I am excited about the intersection of computer science and biology, and I look forward to contributing to advancements in the field.

Find my CV here.

Projects

that I have contributed to

Movi

Movi is a full text index for indexing large pangenomes. It is designed based on move data structure.


I was the main developer on this project.


alevin-fry

A suite of tools for the rapid, accurate and memory-frugal processing single-cell and single-nucleus sequencing data.


I contributed to the development of the pseudo-alignment with structural constraints algorithm for mapping short reads to the transcriptome.

Puffaligner

Puffaligner is a fast, sensitive and accurate aligner built on top of the Pufferfish index for short sequencing reads.


I was one of the two main developers (with Fatemeh Almodaresi) of this project.

Salmon

Salmon is a wicked-fast program to produce a highly-accurate, transcript-level quantification estimates from RNA-seq data.


I contributed to the development of the data-driven factorization and the selective-alignment algorithm to improve the quantification accuracy.

AGAMEMNON

A framework to analyzing metagenomic/metatranscriptomic and host-specific samples.


I contributed to the development of the Cedar algorithm used for the quantification of the samples in AGAMEMNON.

mudskipper

A tool for converting genomic BAM/SAM files to BAM/RAD alignment files with transcriptomic coordinates.


I started the development this project during my internship at Ocean Genomics in summer 2021. It was the first project I did in Rust.

News

[February 2024] We released a new version of the Movi which exhibits even faster queries by implementing a latency hiding approach. We provide the new results in the new version of the preprint. Find it on BioRxv

[December 2023] I presented our latest work about using move structure for adaptive sampling of nanopore long reads in Genome Informatics 2023. Find my slides here.

[November 2023] We posted a pre-print about implementing move data stucture for cache-efficient string matching. Find it on BioRxv

[Fall 2023] I instructed a HEART course for incoming undergraduate students. The course was about an introduction to common exact and approximate string matching algorithms that are widely used for searching sequencing data. Find my slides for the course here.