AI News Archive: May 4, 2026 — Part 10
Sourced from 500+ daily AI sources, scored by relevance.
- Trump administration considering safety review for new AI models
The Trump administration is considering a new plan that would require the Pentagon to safety-test AI models that are deployed to federal, state and local governments, Axios has learned. Why it matters: In a post-Mythos world, the Trump administration appears to be re-evaluating its hard line against the AI safety and security measures it once shrugged off. Driving the news: The White House's Office of the National Cyber Director (ONCD) hosted two meetings last week, one with tech and cyber companies and another with tech trade groups, to discuss the broader security concerns raised by advanced AI models, including Anthropic's Mythos Preview, according to two sources familiar with the matter. The office has also been discussing an AI security framework that would require the Pentagon to lead safety testing for AI deployments at the federal, state and local government levels, the two sources said. That would be an additional layer of responsibility for the government to assess the secu
- Trump Administration Weighs New AI Model Guardrails
Source: The Information
- White House May Review New AI Models Before Public Release, Report Says
A possible executive order may increase oversight of the emerging technology.
- White House considers government reviews for AI models, NYT reports
Source: Reuters
- Cerebras targets $26.6 billion valuation in US IPO as AI chip demand surges
Source: Reuters
- Cerebras targets $26.6 billion valuation in US IPO as AI chip demand surges
Chipmaker Cerebras is aiming for a significant valuation in its US initial public offering. The company's advanced AI chips are in high demand due to heavy spending on AI infrastructure. This IPO will test investor appetite for AI hardware companies. Cerebras' revenue has grown, and it has turned a profit. The IPO market is showing renewed strength.
- Cerebras Plans Up to $3.5 Billion IPO
Chip startup Cerebras Systems will offer 28 million shares in its planned initial public offering at a price of $115 to $125 a share, which would raise $3.5 billion at the upper end of the range.
- European Commission is in contact with Anthropic on Mythos, Dombrovskis says
Source: Reuters
- ServiceNow lays out path to $30 billion in annual subscription revenue as AI bets accelerate
Source: Business Insider
- OpenAI’s Brockman Says Stake is Worth Close to $30 Billion
Source: The Information
- 'Nature' Retracts Paper on the Benefits of ChatGPT in Education
“What educators, parents and policy officials really needed was high quality data and evidence to help guide them. What they have had to deal with instead is some substandard research.”
- OpenAI, Anthropic ramp up enterprise push
OpenAI and Anthropic are both partnering with private equity firms in a bid to deploy their AI products to more businesses.
- Sierra is raising $950 million at a $15.8 billion valuation
The customer service AI startup reached $150 million in annual recurring revenue in its first eight quarters.
- AI Startup Sierra Says It’s Raising at Over $15 Billion Valuation
Source: The Information
- BNY CEO says AI is a jobs creator, not a destroyer
CEO Robin Vince defended the custody bank's use of artificial intelligence Monday, saying that the deployment of AI allows firms to increase their investment capacity.
- Heterogeneity of Insulin Resistance Surrogates in Thousands of Non-Diabetic Adults: Multi-Modal Data Reveals Discordant Metabolic Phenotypes
The transition from metabolic health to type 2 diabetes unfolds through progressive insulin resistance (IR), yet the gold-standard hyperinsulinemic-euglycemic clamp is inapplicable at population scale and fasting insulin is not uniformly available. Several surrogate measures have been described in the literature, but whether these surrogates identify the same individuals, and whether continuous glucose monitoring (CGM) or NMR metabolomics carry information beyond conventional markers, remains unresolved. Here, we analyzed IR surrogates in 10,114 non-diabetic adults (35-75 y) from the Human Phenotype Project (HPP), integrated with 14-day CGM, dual-energy X-ray absorptiometry (DEXA) body composition, liver and carotid ultrasound, sleep monitoring, and NMR metabolomics, and established sex-specific, age-resolved reference ranges. IR surrogates were moderately inter-correlated but captured distinct metabolic facets. We next focused on DEXA-derived visceral adipose tissue (VAT), one of the stronges
- Can large language models approximate human perceptions of disease severity? An evaluation using Global Burden of Disease 2010 disability weights
Background: Disability weights (DWs) quantify the severity of health loss and are essential for estimating disability-adjusted life years in the Global Burden of Disease (GBD) framework. Conventional DW estimation relies on resource-intensive population surveys that are difficult to update or adapt to emerging health states. Large language models (LLMs) may offer a scalable alternative by approximating human perceptions of disease severity through structured judgment tasks. Methods: This exploratory study evaluated the alignment between LLM-derived and human-derived DW rankings using 222 health states from GBD 2010. All possible pairwise comparisons (24,531 pairs, each repeated three times) were conducted across four LLMs (GPT-5 mini, GPT-5, Claude Haiku 4.5, and Claude Sonnet 4.5). DWs were estimated via probit regression and evaluated using Spearman's rank correlation and Steiger's z test. The effects of prompt language (English vs. Korean), cultural role prompting, and medical speci
- Self-Care Competence and AI-Supported Learning as Predictors of Enhanced Clinical Decision-Making Skills Among Nurses
Clinical decision-making is a critical competency for nurses, particularly in resource-constrained healthcare systems, where frontline practitioners must integrate clinical knowledge, judgment, and contextual constraints to ensure optimal patient outcomes. Although prior research highlights the benefits of artificial intelligence-supported learning and individual competencies, it largely assumes a direct relationship between technological support and decision quality, overlooking the cognitive-regulatory mechanisms through which such effects occur. This study addresses this gap by examining self-care competence as a mediating pathway linking perceived AI-based learning support to enhanced clinical decision-making among nurses in Benue State, Nigeria. A descriptive cross-sectional design was employed, with data collected from 600 registered nurses across public and private healthcare facilities using the Self-Care Competence Scale (SCCS), the AI-Based Learning Support Scale (PAILS), and the Clinical
- BNY and Domyn execs share details behind deal to strengthen financial AI goals
- Development and Validation of Machine Learning Models for Predicting Mortality in Hospitalised Systemic Lupus Erythematosus Patients in Dr. Sardjito Hospital, Indonesia
Objectives: This study aimed to develop and validate machine learning models to predict in-hospital mortality among systemic lupus erythematosus (SLE) patients using administrative claims data in a tertiary referral center in Indonesia. Methods: We conducted a retrospective cohort study of 327 SLE hospital admissions between January 2019 and June 2025. Predictor variables included demographics, hospitalisation characteristics, and the ten most frequent comorbidities. We developed Logistic Regression, Random Forest, and Extreme Gradient Boosting (XGBoost) models. Class imbalance was addressed using the Synthetic Minority Over-sampling Technique. Results: The overall in-hospital mortality rate was 7.7%. While models achieved comparable discrimination (Area Under the Curve ~0.71), XGBoost was selected for its superior sensitivity (0.93) compared to Logistic Regression (0.80) and Random Forest (0.97). Feature importance analysis revealed pneumonia as the most significant predictor, followed b
- Effectiveness of a school-based multimodal suicide and depression prevention program: cluster-randomized pragmatic trial
Background: Suicide is one of the leading causes of death among adolescents. Suicidal thoughts and behaviors are significant risk factors for suicide, whilst depressive symptoms are significant risk factors for suicidal thoughts and behaviors. Multimodal school-based interventions that address these risk factors are considered the appropriate approach to suicide prevention. Method: 1,593 adolescents (aged 11-15 years) from 15 high schools in the Netherlands took part in the study, which was designed as a pragmatic cluster-randomized trial. The experimental condition consisted of (1) screening for depressive symptoms and suicidal thoughts and behaviors, followed by referral of those who scored positive; (2) gatekeeper training for school mentors; (3) a serious game to reduce stigma, promote health literacy and improve help seeking behavior; and (4) eight CBT-based sessions for adolescents with elevated depressive symptoms. The control group consisted of screening and the gatekeeper trai
- Naturalistic acceptance-based emotion regulation in adolescents with NSSI: altered prefrontal activation and amygdala-prefrontal connectivity
Background: Non-suicidal self-injury (NSSI) represents a growing public health concern, particularly in adolescents. Emotion dysregulation is central to prevailing NSSI models, yet it remains unclear whether acceptance-based emotion regulation (ER) and its underlying neural processes are disrupted in naturalistic, dynamic contexts. Methods: Pre-registered neuroimaging trial in recently diagnosed and treatment-free adolescents with NSSI (n=25) and healthy controls (n=25) using an ER paradigm with dynamic video clips and concomitant functional magnetic resonance imaging. Behavioral, neural activity, and connectivity indices during emotion reactivity and acceptance-based regulation were compared between groups. Results: Adolescents with NSSI experienced elevated negative feelings during neutral clips, reflecting heightened baseline negativity. In comparison to controls, they displayed reduced temporal and ventrolateral prefrontal engagement during emotional reactivity, but increased engag
- Wall Street's $1.5 Billion Plan to Build the 'McKinsey of AI'
Source: Business Insider
- Anthropic is getting pulled directly into Wall Street
Anthropic's involvement with Wall Street
- Building a new enterprise AI services company with Blackstone, Hellman & Friedman, and Goldman Sachs
- Tibetan-TTS: Low-Resource Tibetan Speech Synthesis with Large Model Adaptation
Tibetan text-to-speech (TTS) has long been challenged by scarce speech resources, significant dialectal variation, and the complex mapping between written text and spoken pronunciation. To address these issues, this work presents, to the best of our knowledge, the first large-model-based Tibetan TTS...
- GRAIL: A Deep-Granularity Hybrid Resonance Framework for Real-Time Agent Discovery via SLM-Enhanced Indexing
As the ecosystem of Large Language Model (LLM)-based agents expands rapidly, efficient and accurate Agent Discovery becomes a critical bottleneck for large-scale multi-agent collaboration. Existing approaches typically face a dichotomy: either relying on heavy-weight LLMs for intent parsing, leading...
- PC-MNet: Dual-Level Congruity Modeling for Multimodal Sarcasm Detection via Polarity-Modulated Attention
Multimodal sarcasm detection, which aims to precisely identify pragmatic incongruities between literal text and nonverbal cues, has gained substantial attention in multimodal understanding. Recent advancements have predominantly relied on naïve similarity-based attention mechanisms and uniform late ...
- HalluScan: A Systematic Benchmark for Detecting and Mitigating Hallucinations in Instruction-Following LLMs
Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse natural language processing tasks, yet they remain susceptible to hallucinations -- generating content that is factually incorrect, unfaithful to provided context, or misaligned with user instructions. We present H...
- InfoLaw: Information Scaling Laws for Large Language Models with Quality-Weighted Mixture Data and Repetition
Upweighting high-quality data in LLM pretraining often improves performance, but in data-limited regimes, especially under overtraining, stronger upweighting increases repetition and can degrade performance. However, standard scaling laws do not reliably extrapolate across mixture recipes or under re...
- MolViBench: Evaluating LLMs on Molecular Vibe Coding
Molecular Vibe Coding, a paradigm where chemists interact with LLMs to generate executable programs for molecular tasks, has emerged as a flexible alternative to chemical agents with predefined tools, enabling chemists to express arbitrarily complex, customized workflows. Unlike general coding tasks...
- Decoding-Time Debiasing via Process Reward Models: From Controlled Fill-in to Open-Ended Generation
Large language models pick up social biases from the data they are trained on and carry those biases into downstream applications, often reinforcing stereotypes around gender, race, religion, disability, age, and socioeconomic status. The standard fixes (retraining on curated data or fine-tuning wit...
- Structural Dilemmas and Developmental Pathways of Legal Argument Mining in the Era of Artificial Intelligence
Against the backdrop of rapid advances in artificial intelligence, legal argument mining has emerged as an important research area linking legal texts with intelligent analysis, carrying significant theoretical and practical implications. Existing studies have primarily developed along three dimensi...
- OpenWithAI
Open selected text in your AI tools with one click.
- SpecKV: Adaptive Speculative Decoding with Compression-Aware Gamma Selection
Speculative decoding accelerates large language model (LLM) inference by using a small draft model to propose candidate tokens that a larger target model verifies. A critical hyperparameter in this process is the speculation length $\gamma$, which determines how many tokens the draft model proposes per s...
- Enhancing RL Generalizability in Robotics through SHAP Analysis of Algorithms and Hyperparameters
Despite significant advances in Reinforcement Learning (RL), model performance remains highly sensitive to algorithm and hyperparameter configurations, while generalization gaps across environments complicate real-world deployment. Although prior work has studied RL generalization, the relative cont...
- Standing on the Shoulders of Giants: Stabilized Knowledge Distillation for Cross-Language Code Clone Detection
Cross-language code clone detection (X-CCD) is challenging because semantically equivalent programs written in different languages often share little surface similarity. Although large language models (LLMs) have shown promise for semantic clone detection, their use as black-box systems raises conce...
- HAAS: A Policy-Aware Framework for Adaptive Task Allocation Between Humans and Artificial Intelligence Systems
Deciding how to distribute work between humans and AI systems is a central challenge in organisational design. Most approaches treat this as a binary choice, yet the operational reality is richer: humans and AI routinely share tasks or take complementary roles depending on context, fatigue, and the ...
- Compress Then Adapt? No, Do It Together via Task-aware Union of Subspaces
Adapting large pretrained models to diverse tasks is now routine, yet the two dominant strategies of parameter-efficient fine-tuning (PEFT) and low-rank compression are typically composed in sequence. This decoupled practice first compresses and then fine-tunes adapters, potentially misaligning the ...
- First-Order Efficiency for Probabilistic Value Estimation via A Statistical Viewpoint
Probabilistic values, including Shapley values and semivalues, provide a model-agnostic framework to attribute the behavior of a black-box model to data points or features, with a wide range of applications including explainable artificial intelligence and data valuation. However, their exact comput...
- SCPRM: A Schema-aware Cumulative Process Reward Model for Knowledge Graph Question Answering
Large language models excel at complex reasoning, yet evaluating their intermediate steps remains challenging. Although process reward models provide step-wise supervision, they often suffer from a risk compensation effect, where incorrect steps are offset by later correct ones, assigning high rewar...
- AIs and Humans with Agency
This paper compares agency in humans with potential agency in AI programs. Human agency takes many years to develop, as the frontal lobe is activated. Early attempts to endow LLMs with agency have met serious obstacles. Progress requires a new architecture where actions and plans are formulated jointly w...
- When Audio-Language Models Fail to Leverage Multimodal Context for Dysarthric Speech Recognition
Automatic speech recognition (ASR) systems remain brittle on dysarthric and other atypical speech. Recent audio-language models raise the possibility of improving performance by conditioning on additional clinical context at inference time, but it is unclear whether these models can make use of such...
- A decoupled diffusion planner that adapts to changing cost limits by using cost-conditioned generation for safety and reward gradients for performance
Offline safe reinforcement learning often requires policies to adapt at deployment time to safety budgets that vary across episodes or change within a single episode. While diffusion-based planners enable flexible trajectory generation, existing guidance schemes often treat reward improvement and co...
- U-Define: Designing User Workflows for Hard and Soft Constraints in LLM-Based Planning
LLMs are increasingly used for end-user task planning, yet their black-box nature limits users' ability to ensure reliability and control. While recent systems incorporate verification techniques, it remains unclear how users can effectively apply such rigid constraints to represent intent or adapt ...
- Bolek: A Multimodal Language Model for Molecular Reasoning
Molecular property models increasingly support high-stakes drug-discovery decisions, but their outputs are often difficult to audit: classical predictors return scores without rationale, while language models can produce fluent explanations weakly grounded in the input molecule. We introduce Bolek...
- AI-Generated Smells: An Analysis of Code and Architecture in LLM and Agent-Driven Development
The promise of Large Language Models in automated software engineering is often measured by functional correctness, overlooking the critical issue of long term maintainability. This paper presents a systematic audit of technical debt in AI-generated software, revealing that AI does not eliminate fla...
- Perceptual Flow Network for Visually Grounded Reasoning
Despite the success of Large Vision-Language Models (LVLMs), general optimization objectives (e.g., standard MLE) fail to constrain visual trajectories, leading to language bias and hallucination. To mitigate this, current methods introduce geometric priors from visual experts as additional supervis...
- ORPilot: A Production-Oriented Agentic LLM-for-OR Tool for Optimization Modeling
This paper presents ORPilot, an open-source agentic AI system that translates real-world business problems into solver-ready optimization models. Unlike academic LLM-for-OR tools that assume clean problem specifications with preformatted inline data, ORPilot is designed for production conditions: am...
- OphMAE: Bridging Volumetric and Planar Imaging with a Foundation Model for Adaptive Ophthalmological Diagnosis
The advent of foundation models has heralded a new era in medical artificial intelligence (AI), enabling the extraction of generalizable representations from large-scale unlabeled datasets. However, current ophthalmic AI paradigms are predominantly constrained to single-modality inference, thereby c...