AI News Archive: May 25, 2026 — Part 9
Sourced from 500+ daily AI sources, scored by relevance.
- GrowRanko
AI visibility, keywords, and traffic — with an AI consultant
- ClearView Studio V2.0
On-device video enhancement for foggy/low-visibility scenes
- AI Nano Banana : Image & Video
AI-powered art & video creation app
- Pathnovo
AI for Process Engineering
- AI Tools
Help Your Business Grow! 🚀 Get premium AI tools
- NaLU AI
Stop saving "Good Morning" as your customer's name in bots
- LingoHop
Translate your self-published children's book with AI
- HumanizeMyAI
Free AI humanizer trained on 2,590 real student essays
- The Instant Creative Brief Generator
Generate agency-grade creative briefs with AI in 3 seconds.
- Coverred.app
AI cover letters. Paste the job ad, generate cover letter.
- AdEstate Virtual Staging AI
AI virtual staging that helps real estate listings sell fast
- HUBAPI
One API layer for teams building with multiple LLM providers
- Coder.ltd — Describe It. We Build It.
AI Agent making complete software from one prompt
- TokenScope
Real-time, privacy-first monitoring of your LLM API usage
- Hirel AI
AI Recruitment Platform (AI-powered interviews)
- Visiblo
Track your brand visibility in AI answers
- UseFinLit
AI-powered financial literacy for smarter money moves
- SignalShield: Verify Market Narratives
Credibility analysis for investing content
- WP The Moon - WordPress AI Plugin
The Ultimate WordPress AI Plugin for Modern Content Creators
- AI: The Complete Guide 2027
Your roadmap from Machine Learning to AGI by 2027.
- Multi-Agent Coordination Adaptation via Structure-Guided Orchestration
As large language model (LLM)-based multi-agent systems scale to handle increasingly complex tasks, balancing structural stability and dynamic adaptability becomes increasingly challenging. Existing systems typically adopt either structure-centric methods, committing to structures determined upfront...
- A Multi-Agent LLM Framework for Rating the Quality of Surgical Feedback
Verbal feedback delivered by attending surgeons in the operating room plays a critical formative role in resident trainee skill acquisition. Yet, assessing the quality of trainer feedback and its effectiveness in influencing trainee behavior during live surgery remains a challenge. Prior studies ass...
- Evo-Attacker: Memory-Augmented Reinforcement Learning for Long-Horizon Tool Attacks on LLM-MAS
While Large Language Model-based Multi-Agent Systems (LLM-MAS) demonstrate remarkable capabilities in solving complex tasks by orchestrating specialized agents and external tools, the implicit trust in tool outputs creates a critical attack surface. Existing tool attacks are limited by domain specif...
- Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent Collaboration Network
Agent-to-Agent (A2A) networks enable autonomous AI agents to collaborate by sharing reusable problem-solving instructions. However, how these decentralized ecosystems operate in practice remains largely unexplored. We present the first large-scale empirical study of EvoMap, a prominent A2A collabora...
- Collaborative Threat-Aware Autonomy (CTAA)
Navigating teams of unmanned vehicles through environments containing dynamic, adversarial Weapon Engagement Zones (WEZs) poses a fundamental challenge to mission success: a single vehicle, however capable its onboard guidance, remains a single point of failure. This paper presents a role-differenti...
- When Agents Control Robots: A Zero Trust Policy Model for Agentic Cyber-Physical Systems
Multi-agent systems powered by large foundation models (LFMs) are increasingly deployed to control industrial robots through natural language, creating deployments in which security failures produce physical consequences. We analyse this threat landscape through Cobot-Claw, a deployed four-agent sys...
- KYA: A Framework-Agnostic Trust Layer for Autonomous Systems with Verifiable Provenance and Hierarchical Policy Composition
Observability tells operators when an agent is slow. KYA tells operators when an agent is wrong, drifting, leaking, or quietly going rogue. We present KYA (Know Your Agents), an open-source trust and governance layer for autonomous systems composed of five primitives: (1) a four-gate inbound apply p...
- Towards Reliable Fetal Ultrasound Interpretation with Multi-Agent Collaboration
Automated fetal ultrasound interpretation requires a workflow from visual perception, including plane recognition and anatomical segmentation, to clinical understanding, including biometric measurement and diagnostic reporting. However, the prevailing "one-task, one-model" paradigm limits systematic...
- Recursive Multi-Agent Trading System: Iterative Optimized Portfolio Strategy Under Geopolitical Uncertainty
Recursive Multi-Agent Trading System (RMATS) integrates four specialized agents -- Sentiment, Report, Analysis, and Risk -- coordinated through a recursive Manager Agent with iterative feedback loops. Experimental evaluation over a 561-trading-day period (January 2023 to March 2025) across a 24-asse...
- SIREN: Unified Multi-Granularity Semantic Interaction for Multi-Modal Lifelong User Interest Modeling
Industrial recommender systems increasingly leverage lifelong user behavior histories and rich multi-modal content to capture evolving user preferences. However, effectively integrating multi-modal features into lifelong interest modeling remains challenging due to the inherent misalignment between ...
- SemBridge: Language Transfer in Sparse Encoders via Multilingual Semantic Bridges
Sparse encoders offer high-precision retrieval by representing term importance within a vocabulary space, yet their English-centric structures pose a critical impediment to language transfer for non-English languages. To overcome this structural limitation, we propose SemBridge, a novel embedding in...
- DeGRe: Dense-supervised Generative Reranking for Recommendation
In multi-stage recommender systems, reranking optimizes overall utility by capturing intra-list contextual dependencies, yet its central challenge lies in exploring optimal sequences within an exponentially large permutation space. Recent studies have shifted towards end-to-end generative frameworks...
- GCIB: Graph Contrastive Information Bottleneck for Multi-Behavior Recommendation
With the rapid emergence of multi-behavior learning in recommender systems, leveraging auxiliary user behaviors has proven effective for mitigating target-behavior data sparsity. Yet auxiliary behavior graphs frequently contain noisy or irrelevant interactions that do not align with the target task,...
- LENS: A Staged Design for Interaction Granularity in Sequential CTR Prediction
In sequential CTR prediction, a central design question is at what granularity the target should interact with the user behaviour sequence. Existing models mainly follow two routes. Raw-item architectures such as DIN let the target score each item in the sequence directly. This relies on well-traine...
- From Item-Only to Query-Item: Query-Conditioned Generative Search with QGS in Quark
Generative sequence models have shown strong results in recommendation. Applying them to search ranking is more challenging. Search behavior is inherently query-driven. Each query switch introduces a sharp topic shift in the user's interaction history. Existing generative methods flatten queries and...
- RAG-Match: Retrieval-Augmented Knowledge Injection and Hierarchical Reasoning for Calibrated Semantic Relevance
Semantic relevance judgment for search is particularly challenging in knowledge-intensive scenarios, where accurate ranking requires not only semantic matching but also background grounding, multi-step reasoning, and well-calibrated decision boundaries. Existing relevance models mainly rely on direc...
- How Reliable Are Semantic-ID Tokenizer Comparisons in Generative Recommendation?
In Semantic-ID (SID) based generative recommendation, each item is represented as a sequence of discrete codes, and an autoregressive model is trained to generate the SID sequence of the next item; top-K performance is then measured by checking whether the SID sequence of the target item appears amo...
- Label-Free Multimodal Volumetric Imaging of Colon Cancer Tissue via Registration of Propagation-Based Phase-Contrast CT, Light-Sheet, and Three-Photon Microscopy
Multimodal 3D imaging has emerged as a powerful approach for investigating complex tissue architecture in pathological specimens. Techniques such as propagation-based phase-contrast computed tomography (PCT), light-sheet microscopy (LSM), and three-photon microscopy (3PM) provide complementary information on unlabeled tissue morphology based on distinct intrinsic contrast mechanisms. However, integrating these heterogeneous datasets into a unified spatial framework remains challenging due to differences in imaging geometry, spatial resolution, and modality-specific distortions. In this study, we present a registration pipeline for spatially aligning volumetric datasets acquired with PCT, LSM, and 3PM from formalin-fixed paraffin-embedded (FFPE) human colon cancer specimens. Biopsies from these specimens were optically cleared and imaged sequentially using the three high-resolution modalities. To compensate for large positional differences between acquisitions, a three-stage cascade registration strategy was developed, consisting of coarse global alignment on down-sampled data, followed by rigid refinement at intermediate resolution. Mutual information was used as the similarity metric to ensure robust multimodal registration. The resulting framework enables the generation of spatially aligned multi-channel 3D datasets that combine structural information from X-ray phase-contrast imaging with complementary optical contrast signals. Beyond registration, we demonstrate that the fused six-dimensional feature space can be further exploited for unsupervised tissue characterization using a Gaussian Mixture Model (GMM), enabling data-driven identification of spatially coherent tissue regions without manual annotation. Qualitative evaluation confirms consistent alignment of major anatomical structures across modalities, while the unsupervised clustering reveals biologically meaningful patterns despite modality-specific noise and resolution differences.
While further optimization and validation across larger datasets will enhance its computational efficiency and breadth of application, the approach already demonstrates strong potential for comprehensive tissue analysis and enables scalable, label-free 3D characterization of colon cancer tissue architecture.
- Interpreting the WaveSeekerNet model to reveal the evolution and biology of influenza A virus
Background: Influenza A virus (IAV) is a major public health burden, causing seasonal epidemics and occasional pandemics. Its transmission from avian species to mammals and subsequent spread requires adaptive changes in the viral genome. Understanding these molecular adaptations is essential for pandemic preparedness, and machine learning offers a powerful approach to uncover the evolution and biology of IAV. Results: Our calibrated WaveSeekerNet model accurately predicted the host source of 8 IAV segments (Macro F1-score: 0.9728), significantly improving the reliability of predicted probabilities, with calibration errors approaching zero. Interpretation showed that avian-adapted IAVs consistently activated G/C content, whereas mammalian-adapted IAVs generally activated A/T content. This distinction was confirmed by codon-level analysis, in which G/C-rich codons were rewarded for the avian hosts and A/T-rich codons for the mammalian hosts. We defined host-adaptive distance to quantify species barriers and proposed it as a risk-assessment metric. We hypothesized the Mammalian Adaptation Zone (MAZ), a zone where the virus is expected to adjust its host-adaptive distance to reach, thereby helping it establish persistent mammalian lineages. The analysis also revealed the Hard Distance of avian-origin viruses (e.g., H5Nx, H9N2), indicating they have not yet established persistent mammalian lineages. Finally, analysis of human H7N9 (2013, China) and non-human mammalian H5Nx (North America) viruses showed that WaveSeekerNet accurately identified key mammalian-adaptive mutations, including PB2-E627K and PB2-D701N. Conclusions: WaveSeekerNet elucidated IAV host-adaptation mechanisms in silico, providing insights into the underlying mechanisms of host adaptation and informing improved surveillance and intervention strategies.
- Assessing Foundation Models for Computational Pathology in Endometrial Cancer
Computational pathology leverages deep learning to extract clinically relevant information from digitized tumor slides, predicting histopathological subtypes, molecular alterations, and patient outcomes. Recent pipelines increasingly rely on foundation models trained on large pan-cancer datasets to generate generalizable features. In endometrial cancer (EC), their comparative performance for clinical diagnostic tasks remains unexplored. For the first time, this study evaluates the performance of seven state-of-the-art foundation models across morphological, molecular, and prognostic tasks using a large EC dataset of 3,293 patients from randomized trials and clinical cohorts. In addition, their performance was compared to one model (EsVIT) exclusively trained on EC. The foundation models H-OPTIMUS-0, CONCH, and VIRCHOW2 achieved the highest mean performance, but the best-performing foundation model varied by task. The top-performing foundation model outperformed the EC-specific feature extractor EsVIT across all tasks. This study highlights the superiority of foundation models over a domain-specific feature extractor in EC. Selecting the optimal foundation model for novel tasks remains challenging due to performance plateaus and limited information on the training datasets, requiring rigorous benchmarking and domain insight to reach maximum potential.
- Cross-Model Variability in Large Language Model Triage Behavior for Potential Stroke Symptoms
Background: Stroke is a time-sensitive neurological emergency in which early EMS activation and presentation to definitive care are cornerstones of effective therapy. Large language models (LLMs) are increasingly consulted by the public for medical advice, but the veracity of the guidance provided by commercially available models responding to potential stroke symptoms is not well understood. Methods: We performed a cross-model benchmarking study comparing the triage choices of three frontier LLMs (Claude Sonnet 4.6, GPT-4o, and Llama 3.3-70b-versatile) on first-person vignettes describing a unilateral arm symptom on waking, across 10 symptom descriptors, and two clinical phases (before and after a partially reassuring self-examination), with or without a clinical distractor (n=50 per condition). Results: Claude sought emergency care most often, Llama least, and GPT-4o in between, diverging most sharply in the post-examination phase where Claude called 911 in 100% of runs, Llama called for non-emergency help in 100%, and GPT-4o was symptom-dependent. A distractor shifted behavior away from emergency care in almost all conditions: calling 911 fell from 37.9% to 14.6% and waiting rose from 0% to 45.9% in the post-examination vignette. Responses were also sensitive to symptom word: weak, limp, heavy, and clumsy generated higher alarm, whereas numb, tingly, odd, strange, and weird generated less urgent responses. Conclusions: The increasing use of LLMs for medical advice has significant public health implications. Commercially available LLMs show significant model-to-model variability and framing sensitivity when confronted with potential stroke symptoms, including under-recognition of canonical CDC warning descriptors, underscoring the need for systematic benchmarking as these tools become de facto first points of contact for patients experiencing neurological emergencies.
- Extraction of Human Phenotype Ontology (HPO) Concepts from Clinical Notes Utilizing Large Language Models (LLM) with Model Context Protocol (MCP)
Background: Accurate extraction of Human Phenotype Ontology (HPO) terms from clinical notes is essential for variant prioritization and genetic diagnosis. Large language models (LLMs) often struggle to balance precision, hallucination avoidance, and ontology mapping accuracy, and prior work has shown that retrieval-based grounding can improve performance for individual models. We hypothesized that real-time ontology grounding through external tools would improve these metrics across heterogeneous LLMs, and we evaluated the Model Context Protocol (MCP), a standardized open framework for integrating external tools, as a vendor-agnostic mechanism for delivering such grounding. Methods: Five LLMs (Claude Sonnet 4.5, GPT-5.1, Gemini 2.5 Pro, Grok 4.1, and Qwen3 30B) extracted HPO terms from four synthetic clinical genetics notes under two conditions: baseline ("No Tools," internal knowledge only) and tool-augmented ("With Tools"), with real-time HPO retrieval delivered through MCP for models with native support and through functionally equivalent native tool-calling interfaces otherwise. Each model performed ≥50 runs per note per condition (>2,000 total runs). Performance was evaluated using Precision, Recall, and F1-score. Outputs were manually adjudicated to classify mapping errors and hallucinations. Results were benchmarked against a commercial EHR-based HPO extraction tool. Results: Tool augmentation significantly improved performance across all models. Mean aggregate F1-score increased from 0.46 (SD 0.22) in the baseline condition to 0.72 (SD 0.15) with tools (p < 0.001). Mapping Error Rate decreased from 40.9% to 7.8% (p < 0.001), and Precision increased from 56% to 90%. Performance gains were observed across all model families, including the open-weight Qwen3 model (F1 0.11→0.50). For inferred phenotypes, F1 improved from 0.20 to 0.34 (p < 0.001) without a significant increase in hallucination rate (p = 0.08).
Compared with the commercial benchmark, tool-augmented LLMs achieved higher F1-scores and substantially greater recall for inferred phenotypes. Conclusions: Real-time ontology grounding substantially improves HPO extraction across diverse LLMs by reducing mapping errors and enhancing phenotype inference. The Model Context Protocol provides a standardized, interoperable mechanism for delivering such grounding, supporting reproducible, vendor-agnostic deployment of clinical LLM pipelines in genomic medicine.
- Machine Learning Estimation of Gestational Age at Delivery Using Linked Mother-Infant Electronic Health Records Across Two Health Systems
Objective: This study aimed to train and evaluate supervised machine learning algorithms using electronic health record (EHR) data to accurately estimate gestational age at delivery. Materials and Methods: We trained random forest, gradient boosting, and ensemble models on EHR data of mother-infant dyads from Vanderbilt University Medical Center (VUMC) and replicated the analyses at University of Michigan (UMich). We further analyzed EHR predictors of gestational age, assessed temporal drift in EHR data elements, and evaluated model performance stratified by delivery status. Results: The study included pregnancies corresponding to 54,344 and 34,345 mother-infant dyads at VUMC (2005-2025) and UMich (2012-2024), respectively. The gestational age predictions of the ensemble models achieved the highest agreement with the reference standard on the VUMC dataset (±1 week: 85.2%, ±2 weeks: 94.3%, MAE: 4.4 days) and demonstrated stronger generalization on the UMich dataset (±1 week: 93.1%, ±2 weeks: 97.8%, MAE: 2.8 days). Further, performance was better among pregnancies delivered in more recent years, and among full- and late-term deliveries compared with preterm deliveries. Discussion: The results indicate that supervised machine learning methods leveraging linked mother-infant EHRs can accurately estimate gestational age at delivery, while demonstrating the generalizability of the modeling approach and the portability of the analytic workflow across healthcare sites. Conclusion: This study presents a robust and generalizable machine learning framework to estimate gestational age at delivery. The framework can be reliably used to impute gestational age in large-scale, real-world clinical studies to support maternal and neonatal health research, in which accurate estimation of pregnancy onset is critical.
- Input design for unsupervised cross-national branded food database alignment using large language models
Cross-national alignment of branded food databases is essential for international nutritional epidemiology but lacks standardized methods. Existing approaches - including food ontologies, domain-specific fine-tuned language models, and manual expert mapping - require either substantial infrastructure or do not scale to thousands of items. We propose an unsupervised evaluation framework for large language model (LLM)-based food database alignment that requires no ground-truth labels. Using the Japan Branded Food Database (JBFD; 9,519 items, 71 mid-level categories) and USDA FoodData Central (448 categories) as a case study, we introduce two complementary metrics: weighted centroid distance (nutritional proximity between matched category pairs) and dominant category share (structural consistency of category-level assignments). We then conducted a systematic ablation study across eight input conditions (A-H), varying combinations of product name, nutrient profile, and semantic category label. Results showed that nutrient-only inputs yielded poor structural consistency despite low centroid distances, while semantic category labels achieved the highest dominant category share (89.3%) but introduced circularity due to their LLM-derived origin. Among circularity-free conditions, product name combined with minimal nutrient information (energy, protein, salt; condition E) achieved the best balance of centroid distance (0.471) and dominant category share (65.8%). Model comparison across Claude Haiku, Sonnet, and Opus confirmed that NO_MATCH rates were consistent across model sizes (12-14%), suggesting that prompt design contributes more to alignment quality than model scale. These findings provide practical guidance for input design in LLM-based food database alignment without ground-truth annotation.
- Normative modeling for quantitative brain MRI phenotyping and biomarker discovery for pediatric leukodystrophies
Importance: Leukodystrophies are a heterogeneous group of genetic disorders affecting the white matter of the brain, often presenting with overlapping clinical features but differing in neuroanatomical involvement. There is a critical need for quantitative tools to characterize disease burden and support diagnosis, severity stratification, and clinical trial readiness. Objective: To characterize shared and distinct neuroanatomical patterns across six genetically confirmed leukodystrophies using anatomical MRI-derived phenotypes benchmarked against brain growth charts, and to assess the utility of this methodological approach for identifying imaging biomarkers of disease severity. Design, Setting, and Participants: Cross-sectional neuroimaging study using retrospective clinical MRI data. Setting: Multicenter study incorporating data from the Global Leukodystrophy Initiative Clinical Trials Network (GLIA-CTN) and control data from the Children's Hospital of Philadelphia. Participants: The study included 434 MRI scan sessions from 274 patients with genetically confirmed leukodystrophies (Pelizaeus-Merzbacher disease, Metachromatic leukodystrophy, Alexander disease, Aicardi-Goutieres syndrome, TUBB4A-related leukodystrophies, and POLR3-related leukodystrophy). Control MRI data (7628 scans from 7205 subjects) were drawn from the Scans with Limited Imaging Pathology cohort at the Children's Hospital of Philadelphia. Exposures: All MRI scans underwent automated segmentation using deep learning segmentation tools to derive global and regional brain volumes. Normative models of brain development ("brain growth charts") were generated for the control cohort using generalized additive models for location, scale, and shape. Centile scores were then calculated for leukodystrophy subjects to quantify deviations from typical development.
Main Outcomes and Measures: Centile scores for global and regional brain volumes were compared across leukodystrophy subtypes to identify disease-specific neuroanatomical patterns and to evaluate their potential utility for severity stratification. Results: Distinct patterns of neuroanatomical deviation were observed across leukodystrophy subtypes. Certain leukodystrophies showed preferential involvement of specific cortical or subcortical regions, while others displayed more diffuse volume loss. Centile scores demonstrated potential for differentiating disease subtypes and stratifying individuals by severity. Preliminary longitudinal data suggest centile scores may also track progression over time. Conclusions and Relevance: This study demonstrates the feasibility and utility of MRI profiling of individuals with leukodystrophy using anatomical MRI-derived phenotypes benchmarked against brain growth charts. The approach enables data-driven, quantitative characterization of structural brain abnormalities, offering a scalable method for phenotyping, diagnosis, and future use in clinical trials.
- Cumulative In-Context Learning versus Simple Historical Weighting for Real-Time Geographic Origin Identification of Ongoing Epidemic Waves: A Comparative Evaluation Using Eight COVID-19 Waves in Japan
Background: Identifying the geographic origin of epidemic waves early is critical for targeted public health responses. Conventional statistical methods for wave origin estimation rely on fixed algorithms applied to case count time-series data and treat each wave independently. Large language models (LLMs) offer a novel alternative through cumulative learning: the ability to incorporate confirmed epidemiological findings from prior waves into predictions for subsequent waves. Whether this approach outperforms conventional statistical baselines in early detection, and whether the same cumulative learning principle can be implemented in transparent statistical methods, remains unknown. Methods: We compared three computational approaches across eight COVID-19 epidemic waves in Japan (Waves 2-8, 2020-2023): (1) non-cumulative statistical baselines (B0-B5) treating each wave independently; (2) a cumulative-learning LLM (Claude Haiku) receiving confirmed origins from all prior waves as in-context historical knowledge; and (3) cumulative calculation statistical baselines implementing the identical historical weighting mechanism as a transparent arithmetic score. We additionally evaluated a non-cumulative LLM condition, receiving only current-wave data, to isolate the contribution of intrinsic LLM geographic reasoning from accumulated historical knowledge. All approaches were evaluated at 7, 14, 21, and 28 days after wave onset and validated against genomically confirmed wave origins. Results: Cumulative calculation statistical baselines (B1, B3) achieved mean F1 = 0.51 at 14 days after wave onset, performing comparably to the cumulative-learning LLM (F1 = 0.52) and outperforming all non-cumulative statistical baselines (F1 = 0.41-0.46). Wave 7 (Omicron BA.5) was correctly identified at 14 days by both methods (F1 = 1.00). Wave 6 (Omicron BA.1) was undetectable by all methods (F1 = 0.00), consistent with an origin outside the domestic surveillance system.
Conclusions: The cumulative historical weighting mechanism, not LLM reasoning per se, drives performance improvement, as transparent arithmetic implementation matches LLM accuracy. However, the non-cumulative LLM achieves F1 = 0.46 without any historical context, suggesting substantial intrinsic geographic reasoning capacity. These findings advance understanding of when and why in-context learning confers advantage, and provide a deployable, spreadsheet-implementable method for real-time epidemic origin identification requiring no AI infrastructure.
- Google tops OpenAI's math breakthrough — 9 to 1
PLUS: Build an AI secretary that plans your day
- Dreem
Generate studio-quality product visuals instantly from one image.
- Pope Leo XIV: Unchecked AI Development Risks Building a New Tower of Babel
PCMag
- sensai
AI agents your customers will love talking to