Biological foundation models illuminate annotation blind spots in evolutionarily divergent genomes
Chromosome-scale assemblies are increasingly available for non-model organisms, but functional annotation remains limited when deep evolutionary divergence erodes primary amino-acid sequence identity even though protein structural similarity can remain conserved. We present a hybrid annotation framework that decouples gene-model discovery from cross-species similarity assignment by combining Evo2-based ab initio prediction of exon-intron structures with ESM-2 protein-embedding-based structural similarity mapping. Applied to the sea lamprey, the framework derives high- or medium-confidence cross-species similarity assignments for 73,485 Evo2-derived translated protein models, including 35,395 high-confidence calls, and expands the deduplicated structural catalog to 31,286 loci, including 20,871 additions absent from the Ensembl baseline. A joint alignment-structure classification identifies 21,391 structurally supported catalog loci that a fixed human DIAMOND protein search does not con
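The similarity-assignment step described above can be illustrated with a minimal sketch. It assumes each translated protein model and each reference protein has already been reduced to one fixed-length embedding vector (for example by mean-pooling ESM-2 residue embeddings). The cosine scoring, the cutoffs `HIGH_CUTOFF` and `MEDIUM_CUTOFF`, and the function names are hypothetical choices for illustration, not the scoring scheme or thresholds used by the authors.

```python
import math
from typing import Dict, List, Optional, Tuple

Vector = List[float]

# Hypothetical cutoffs for illustration only; the excerpt does not state
# how confidence tiers are defined.
HIGH_CUTOFF = 0.9
MEDIUM_CUTOFF = 0.7


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions differ")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_similarity(
    query: Vector, references: Dict[str, Vector]
) -> Tuple[Optional[str], float, str]:
    """Return (best reference id, score, tier) for one query embedding.

    The best hit is the reference with the highest cosine similarity.
    The tier is "high", "medium", or "low" according to the assumed cutoffs.
    """
    best_id: Optional[str] = None
    best_score = -1.0
    for ref_id, ref_vec in references.items():
        score = cosine_similarity(query, ref_vec)
        if score > best_score:
            best_id, best_score = ref_id, score

    if best_id is None:
        # Empty reference set: nothing to assign.
        return None, 0.0, "low"
    if best_score >= HIGH_CUTOFF:
        tier = "high"
    elif best_score >= MEDIUM_CUTOFF:
        tier = "medium"
    else:
        tier = "low"
    return best_id, best_score, tier


if __name__ == "__main__":
    # Toy 3-dimensional embeddings; real ESM-2 vectors have hundreds of dimensions.
    refs = {"ref_A": [1.0, 0.0, 0.0], "ref_B": [0.0, 1.0, 0.0]}
    print(assign_similarity([1.0, 1.0, 0.0], refs))
```

Because the score depends only on the embeddings, a query can receive a confident assignment even when its primary sequence has diverged too far for an alignment search to report a hit, which is the situation the abstract targets.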