LingoCell aligns gene expression, natural language and spatial context in a single 512-dimensional space — embed any cell, retrieve it by phrase, and read back what the model thinks it is.
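As a rough illustration of retrieval by phrase in a shared space, the sketch below ranks cell embeddings by cosine similarity to a query embedding. It does not use LingoCell's actual API, which is not shown here. The function names (`l2_normalize`, `cosine`, `retrieve`), the choice of cosine similarity, and the toy 4-dimensional vectors standing in for 512-dimensional embeddings are all assumptions made for the example.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_vec, cell_vecs, k=3):
    """Return the k cell ids most similar to the query, best first.

    query_vec: embedding of a text phrase (hypothetical; assumed to live
               in the same space as the cell embeddings).
    cell_vecs: dict mapping cell id -> embedding.
    """
    scored = [(cosine(query_vec, vec), cell_id) for cell_id, vec in cell_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(cell_id, score) for score, cell_id in scored[:k]]

# Toy 4-dimensional stand-ins for 512-dimensional embeddings (made-up values).
cells = {
    "cell_a": [0.9, 0.1, 0.0, 0.0],
    "cell_b": [0.0, 1.0, 0.2, 0.0],
    "cell_c": [0.7, 0.3, 0.1, 0.0],
}
query = [1.0, 0.0, 0.0, 0.0]  # stands in for an embedded phrase
top = retrieve(query, cells, k=2)
```

With real embeddings, the same ranking would usually be done as a single matrix product over pre-normalized vectors rather than a Python loop.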
Stage 2 spatial training reaches a normalized mutual information (NMI) of 0.71 against pathologist niche labels.
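For readers unfamiliar with the metric, here is a minimal sketch of how an NMI score between predicted clusters and reference labels can be computed. It is not the project's evaluation code. Normalizing by the arithmetic mean of the two entropies is an assumption (other normalizations exist), and the function names are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two labelings of the same items."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) )
        mi += (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
    return mi

def nmi(a, b):
    """NMI normalized by the arithmetic mean of the two entropies (assumed)."""
    if len(a) != len(b):
        raise ValueError("labelings must have the same length")
    denom = (entropy(a) + entropy(b)) / 2
    if denom == 0.0:
        # Both labelings put every item in one group, so they agree trivially.
        return 1.0
    return mutual_information(a, b) / denom

# Cluster ids are arbitrary, so a pure relabeling still scores as a perfect match.
perfect = nmi([0, 0, 1, 1], ["a", "a", "b", "b"])
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

A score of 1 means the two labelings define the same partition, and 0 means they share no information.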
New checkpoint best_phase2.pt available on the lab page.
Benchmarks across breast, lung, colon and tonsil show consistent gains over scGPT and Geneformer baselines.
Open on bioRxiv ↗
HGC opens a small allocation for community users to re-run Stage 1.5 and Stage 2 on their own splits.
Request access →