About · what LingoCell is

A multimodal foundation model for cells.

LingoCell aligns gene expression, natural language and spatial context in a single 512-dimensional space — so that every cell can be embedded, retrieved by phrase, and described back in clear scientific English.

In short · 03

One model, three signals.

A 12-layer transformer reads the gene-token sequence; PubMedBERT reads the text; a spatial branch reads the neighbour graph and H&E patches. All three are projected onto the same unit sphere.
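As a rough illustration of what "the same unit sphere" means in practice, the sketch below L2-normalizes three placeholder vectors so that cosine similarity reduces to a dot product. The 512 dimensions come from the text above; the function names and the placeholder vectors are assumptions for illustration only, not LingoCell's actual code or outputs.

```python
import math

EMBED_DIM = 512  # shared embedding size stated on this page


def l2_normalize(vec):
    """Scale a vector to unit length so it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(u, v):
    """For unit-length vectors, cosine similarity is just the dot product."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical branch outputs (placeholders, not real model outputs).
gene_vec = l2_normalize([float(i % 7 + 1) for i in range(EMBED_DIM)])
text_vec = l2_normalize([float(i % 5 + 1) for i in range(EMBED_DIM)])
spatial_vec = l2_normalize([float(i % 3 + 1) for i in range(EMBED_DIM)])

# Because all three live on the same sphere, any pair can be compared directly.
gene_text_similarity = cosine(gene_vec, text_vec)
gene_spatial_similarity = cosine(gene_vec, spatial_vec)
```

Normalizing every branch to unit length is what lets a gene-derived vector be compared with a text-derived one using a single similarity measure.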

Trained at scale, openly.

Pretraining used 11.5 M scRNA-seq cells from CELLxGENE and 1.57 M Visium HD cells across eleven tissues. All checkpoints, code and data manifests are public.

Useful out of the box.

The same checkpoint annotates cell types, recovers niche structure, retrieves cells by text, and writes a faithful one-paragraph description per cell.
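Retrieving cells by text can be pictured as a nearest-neighbour search in the shared space: embed the phrase, then rank cells by cosine similarity. The sketch below is a minimal version under that assumption. `retrieve_top_k`, the cell IDs and the 3-dimensional toy vectors are hypothetical stand-ins; in practice the vectors would be 512-dimensional embeddings produced by the model.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def retrieve_top_k(query_vec, cell_vecs, k=3):
    """Rank cells by cosine similarity to the query and return the top k.

    cell_vecs maps a cell ID to its embedding. Returns a list of
    (cell_id, similarity) pairs, most similar first.
    """
    q = l2_normalize(query_vec)
    scored = []
    for cell_id, vec in cell_vecs.items():
        c = l2_normalize(vec)
        scored.append((cell_id, sum(a * b for a, b in zip(q, c))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical); a real query would come from the text encoder.
cells = {
    "cell_a": [1.0, 0.0, 0.0],
    "cell_b": [0.9, 0.1, 0.0],
    "cell_c": [0.0, 1.0, 0.0],
}
query = [1.0, 0.0, 0.0]

hits = retrieve_top_k(query, cells, k=2)
```

With these toy vectors, `cell_a` ranks first (it points in exactly the query's direction) and `cell_b` second.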

Why · we built this

Single-cell foundation models tend to be black boxes — they produce vectors, not explanations.

LingoCell pairs every embedding with a natural-language description grounded by cross-attention over the genes the cell actually expresses, so a reviewer can check the model's claim against the data it saw.

It is designed for biologists who want to interrogate the model, not just consume its predictions.

That is why every workflow on the server returns intermediate artefacts: gene attention, cosine-similarity histograms, niche-purity scores, neighbour composition.
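To make two of these artefacts concrete, here is a small sketch of how neighbour composition and a purity score could be computed from the cell-type labels of a cell's spatial neighbours. This page does not define niche purity, so the definition used here (the fraction of neighbours that share the most common label) is an assumption, and both function names are hypothetical.

```python
from collections import Counter


def neighbour_composition(neighbour_labels):
    """Return the fraction of neighbours carrying each cell-type label."""
    counts = Counter(neighbour_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def niche_purity(neighbour_labels):
    """Assumed definition: share of neighbours with the most common label."""
    composition = neighbour_composition(neighbour_labels)
    return max(composition.values()) if composition else 0.0


# Hypothetical labels for the four nearest spatial neighbours of one cell.
labels = ["T cell", "T cell", "B cell", "T cell"]

composition = neighbour_composition(labels)
purity = niche_purity(labels)
```

For this toy neighbourhood the composition is 75% T cell and 25% B cell, so the purity under the assumed definition is 0.75.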

Developed at

FAIS · HGC
Developer Lab

LingoCell is developed at the Functional Analysis of Information Systems group, Human Genome Center, Institute of Medical Science, The University of Tokyo.

Visit the lab
group: FAIS · HGC
institute: Institute of Medical Science
university: The University of Tokyo
cluster: Miyabi · 4× H100
license: MIT · weights CC-BY-NC 4.0
https://fais.hgc.jp/