AI News Archive: May 12, 2026 — Part 16
Sourced from 500+ daily AI sources, scored by relevance.
- Vulcanizare Mobilă
First AI-native roadside assistance API, open SDK, MCP ready
- Polymind
Ask multiple AIs. Get one synthesized answer.
- Xaloia AI
Privacy-first AI chat with live avatar conversations
- Agent Stack
5 tiny npm libs to stop AI agents misbehaving in production
- MacMind
Run private AI on your Mac
- ResumeForge
AI resume optimizer that gets you more interviews
- Fill the GAP: A Granular Alignment Paradigm for Visual Reasoning in Multimodal Large Language Models
Visual latent reasoning lets a multimodal large language model (MLLM) create intermediate visual evidence as continuous tokens, avoiding external tools or image generators. However, existing methods usually follow an output-as-input latent paradigm and yield unstable gains. We identify evidence for ...
- QAP-Router: Tackling Qubit Routing as Dynamic Quadratic Assignment with Reinforcement Learning
Qubit routing is a fundamental problem in quantum compilation, known to be NP-hard. Its dynamic nature makes local routing decisions propagate and compound over time, making global efficient solutions challenging. Existing heuristic methods rely on local rules with limited lookahead, while recent le...
- A Family of Quaternion-Valued Differential Evolution Algorithms for Numerical Function Optimization
The numerical optimization of continuous functions is a fundamental task in many scientific and engineering domains, ranging from mechanical design to training of artificial intelligence models. Among the most effective and widely used algorithms for this purpose is Differential Evolution (DE), know...
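For context, the classic real-valued DE scheme this paper generalizes can be sketched in a few lines. This is a minimal DE/rand/1/bin illustration under our own assumptions (function and parameter names are ours), not the paper's quaternion-valued variant:

```python
import random

def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9, iters=200, seed=0):
    # Classic real-valued DE/rand/1/bin: mutate with a scaled difference of
    # two random members, recombine, and keep the trial point if it is no worse.
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    for _ in range(iters):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            # Mutation: v = x_a + F * (x_b - x_c)
            v = [pop[a][d] + F * (pop[b][d] - pop[c][d]) for d in range(dim)]
            # Binomial crossover, forcing at least one mutated coordinate
            jr = rng.randrange(dim)
            u = [v[d] if (rng.random() < CR or d == jr) else pop[i][d]
                 for d in range(dim)]
            u = [min(max(u[d], bounds[d][0]), bounds[d][1]) for d in range(dim)]
            fu = f(u)
            if fu <= fit[i]:  # greedy selection
                pop[i], fit[i] = u, fu
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]
```

The quaternion-valued family replaces the real vectors above with quaternion components while keeping the same mutate-crossover-select loop.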
- MedHopQA: A Disease-Centered Multi-Hop Reasoning Benchmark and Evaluation Framework for LLM-Based Biomedical Question Answering
Evaluating large language models (LLMs) in the biomedical domain requires benchmarks that can distinguish reasoning from pattern matching and remain discriminative as model capabilities improve. Existing biomedical question answering (QA) benchmarks are limited in this respect. Multiple-choice forma...
- BSO: Safety Alignment Is Density Ratio Matching
Aligning language models for both helpfulness and safety typically requires complex pipelines: separate reward and cost models, online reinforcement learning, and primal-dual updates. Recent direct preference optimization approaches simplify training but incorporate safety through ad-hoc modification...
- EHR-RAGp: Retrieval-Augmented Prototype-Guided Foundation Model for Electronic Health Records
Electronic Health Records (EHR) contain rich longitudinal patient information and are widely used in predictive modeling applications. However, effectively leveraging historical data remains challenging due to long trajectories, heterogeneous events, temporal irregularity, and the varying relevance ...
- Reinforcing VLAs in Task-Agnostic World Models
Post-training Vision-Language-Action (VLA) models via reinforcement learning (RL) in learned world models has emerged as an effective strategy to adapt to new tasks without costly real-world interactions. However, while using imagined trajectories reduces the sample complexity of policy training, ex...
- Towards Automated Air Traffic Safety Assessment Around Non-Towered Airports Using Large Language Models
We investigate frameworks for post-flight safety analysis at non-towered airports using large language models (LLMs). Non-towered airports rely on the Common Traffic Advisory Frequency (CTAF) for air traffic coordination and experience frequent near mid-air collisions due to the pilot self-announcem...
- LISA: Cognitive Arbitration for Signal-Free Autonomous Intersection Management
Large language models (LLMs) show strong potential for Intelligent Transportation Systems (ITS), particularly in tasks requiring situational reasoning and multi-agent coordination. These capabilities make them well suited for cooperative driving, where rule-based approaches struggle in complex and d...
- Transferable Delay-Aware Reinforcement Learning via Implicit Causal Graph Modeling
Random delays weaken the temporal correspondence between actions and subsequent state feedback, making it difficult for agents to identify the true propagation process of action effects. In cross-task scenarios, changes in task objectives and reward formulations further reduce the reusability of pre...
- Executable Agentic Memory for GUI Agent
Modern GUI agents typically rely on a model-centric and step-wise interaction paradigm, where LLMs must re-interpret the UI and re-decide actions at every screen, which is fragile in long-horizon tasks. In this paper, we propose Executable Agentic Memory (EAM), a structured Knowledge Graph (KG) that...
- PriorZero: Bridging Language Priors and World Models for Decision Making
Leveraging the rich world knowledge of Large Language Models (LLMs) to enhance Reinforcement Learning (RL) agents offers a promising path toward general intelligence. However, a fundamental prior-dynamics mismatch hinders existing approaches: static LLM knowledge cannot directly adapt to the complex...
- Iterative Audit Convergence in LLM-Managed Multi-Agent Systems: A Case Study in Prompt Engineering Quality Assurance
Prompt specifications for multi-agent large language model (LLM) systems carry data contracts and integration logic across many interdependent files but are rarely subjected to structured-inspection rigor. This paper reports a single-system empirical case study of iterative, agent-driven auditing ap...
- NARA: Anchor-Conditioned Relation-Aware Contextualization of Heterogeneous Geoentities
Geospatial foundation models have primarily focused on raster data such as satellite imagery, where self-supervised learning has been widely studied. Vector geospatial data instead represent the world as discrete geoentities with explicit geometry, semantics, and structured spatial relations, includ...
- How Useful Is Cross-Domain Generalization for Training LLM Monitors?
Using prompted language models as classifiers enables classification in domains with limited training data, but misses some of the robustness and performance benefits that fine-tuning can bring. We study whether training on multiple classification tasks, each with its own prompt, improves performanc...
- Long Horizon
Your coding agent writes the feature and runs the tests
- Reconnecting Fragmented Citation Networks with Semantic Augmentation
Citation graphs are fundamental tools for modeling scientific structure, but are often fragmented due to missing citations of scientifically connected articles. To address this issue, we propose a computationally efficient hybrid framework integrating citation topology with large language model (LLM...
- Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training
In settings where labeled verifiable training data is the binding constraint, each checked example should be allocated carefully. The standard practice is to use this data directly on the model that will be deployed, for example by running GRPO on the deployment student. We argue that this is often ...
- ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents
Computer Use Agents (CUAs) can act through both atomic GUI actions, such as click and type, and high-level tool calls, such as API-based file operations, but this hybrid action space often leaves them uncertain about when to continue with GUI actions or switch to tools, leading to suboptimal executi...
- KV-Fold: One-Step KV-Cache Recurrence for Long-Context Inference
We introduce KV-Fold, a simple, training-free long-context inference protocol that treats the key-value (KV) cache as the accumulator in a left fold over sequence chunks. At each step, the model processes the next chunk conditioned on the accumulated cache, appends the newly produced keys and values...
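The fold structure described above can be sketched abstractly: the KV cache is the accumulator, each chunk is processed conditioned on it, and the new keys/values are appended. This toy sketch uses placeholder values in place of a real transformer forward pass (all names are ours, not the paper's API):

```python
from functools import reduce

def process_chunk(cache, chunk):
    # Hypothetical stand-in for a transformer forward pass: a real model
    # would attend over the accumulated cache plus the new chunk, then
    # append the chunk's freshly computed keys/values to the cache.
    new_kv = [tok * 2 for tok in chunk]  # placeholder "keys/values"
    return cache + new_kv

def kv_fold(chunks):
    # One-step KV-cache recurrence: the cache is the accumulator of a
    # left fold over sequence chunks, so each step only sees the running
    # cache and one chunk rather than the full sequence at once.
    return reduce(process_chunk, chunks, [])
```

The point of the fold framing is that long-context inference becomes a sequence of bounded-size steps over an accumulated state, with no training required.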
- Solve the Loop: Attractor Models for Language and Reasoning
Looped Transformers offer a promising alternative to purely feed-forward computation by iteratively refining latent representations, improving language modeling and reasoning. Yet recurrent architectures remain unstable to train, costly to optimize and deploy, and constrained to small, fixed recurre...
- Enabling AI-Native Mobility in 6G: A Real-World Dataset for Handover, Beam Management, and Timing Advance
To address the issues of high interruption time and measurement-report overhead under user equipment (UE) mobility, especially in high-speed 5G use cases, the use of AI/ML techniques (AI/ML beam management and mobility procedures) has been proposed. These techniques rely heavily on data that are most...
- Classifier Context Rot: Monitor Performance Degrades with Context Length
Monitoring coding agents for dangerous behavior using language models requires classifying transcripts that often exceed 500K tokens, but prior agent monitoring benchmarks rarely contain transcripts longer than 100K tokens. We show that when used as classifiers, current frontier models fail to notic...
- δ-mem: Efficient Online Memory for Large Language Models
Large language models increasingly need to accumulate and reuse historical information in long-term assistants and agent systems. Simply expanding the context window is costly and often fails to ensure effective context utilization. We propose δ-mem, a lightweight memory mechanism that augments a ...
- A New Technique for AI Explainability using Feature Association Map
Lack of transparency in AI systems poses challenges in critical real-life applications. It is important to be able to explain the decisions of an AI system to ensure trust in the system. Explainable AI (XAI) algorithms play a vital role in achieving this objective. In this paper, we propose a ...
- KAN-CL: Per-Knot Importance Regularization for Continual Learning with Kolmogorov-Arnold Networks
Catastrophic forgetting remains the central obstacle in continual learning (CL): parameters shared across tasks interfere with one another, and existing regularization methods such as EWC and SI apply uniform penalties without awareness of which input region a parameter serves. We propose KAN-CL, a ...
- TokenRatio: Principled Token-Level Preference Optimization via Ratio Matching
Direct Preference Optimization (DPO) is a widely used RL-free method for aligning language models from pairwise preferences, but it models preferences over full sequences even though generation is driven by per-token decisions. Existing token-level extensions typically decompose a sequence-level Bra...
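The sequence-level DPO objective the abstract refers to is standard and easy to write down; note that the sequence log-ratio is a sum of per-token log-ratios, which is exactly what token-level variants decompose further. A minimal sketch (standard DPO, not TokenRatio itself; all names are ours):

```python
import math

def dpo_loss(logp_w, ref_logp_w, logp_l, ref_logp_l, beta=0.1):
    # Sequence-level DPO loss from per-token log-probs of the chosen (w)
    # and rejected (l) responses under the policy and reference models.
    # The sequence log-ratio is just the sum of per-token log-ratios.
    ratio_w = sum(logp_w) - sum(ref_logp_w)
    ratio_l = sum(logp_l) - sum(ref_logp_l)
    margin = beta * (ratio_w - ratio_l)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)
```

When policy and reference agree on both responses the margin is zero and the loss is log 2; widening the chosen-over-rejected margin drives the loss toward zero.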
- Why Conclusions Diverge from the Same Observations: Formalizing World-Model Non-Identifiability via an Inference
When people share the same documents and observations yet reach different conclusions, the disagreement often shifts into a judgment that the other party is cognitively defective, irrational, or acting in bad faith. This paper argues that such divergence is better described as a form of non-identifi...
- Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation
We introduce Pion, a spectrum-preserving optimizer for large language model (LLM) training based on orthogonal equivalence transformation. Unlike additive optimizers such as Adam and Muon, Pion updates each weight matrix through left and right orthogonal transformations, preserving its singular valu...
- Task-Adaptive Embedding Refinement via Test-time LLM Guidance
We explore the effectiveness of an LLM-guided query refinement paradigm for extending the usability of embedding models to challenging zero-shot search and classification tasks. Our approach refines the embedding representation of a user query using feedback from a generative LLM on a small set of d...
- MEME: Multi-entity & Evolving Memory Evaluation
LLM-based agents increasingly operate in persistent environments where they must store, update, and reason over information across many sessions. While prior benchmarks evaluate only single-entity updates, MEME defines six tasks spanning the full space defined by the multi-entity and evolving axes, ...
- Approximation Theory of Laplacian-Based Neural Operators for Reaction-Diffusion System
Neural operators provide a framework for learning solution operators of partial differential equations (PDEs), enabling efficient surrogate modeling for complex systems. While universal approximation results are now well understood, approximation analysis specific to nonlinear reaction-diffusion sys...
- Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs
The continued improvements in language model capability have unlocked their widespread use as drivers of autonomous agents, for example in coding or computer use applications. However, the core of these systems has not changed much since early instruction-tuned models like ChatGPT. Even advanced AI ...
- TextSeal: A Localized LLM Watermark for Provenance & Distillation Protection
We introduce TextSeal, a state-of-the-art watermark for large language models. Building on Gumbel-max sampling, TextSeal introduces dual-key generation to restore output diversity, along with entropy-weighted scoring and multi-region localization for improved detection. It supports serving optimizat...
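The Gumbel-max sampling that TextSeal builds on is a well-known watermarking primitive: derive pseudorandom uniforms from a secret key and the recent context, then sample via the Gumbel-max trick so the choice is detectable by anyone holding the key. A minimal single-key sketch (illustrative only; TextSeal's dual-key generation, scoring, and localization are more involved, and all names here are ours):

```python
import hashlib
import math
import random

def keyed_uniforms(key, context, vocab_size):
    # Per-token pseudorandom uniforms derived from the secret key and
    # the recent context, so the detector can reproduce them.
    seed = hashlib.sha256(f"{key}|{context}".encode()).digest()
    rng = random.Random(seed)
    return [max(rng.random(), 1e-12) for _ in range(vocab_size)]

def gumbel_max_sample(logits, key, context):
    # Gumbel-max trick: argmax_i (logit_i + g_i) with g_i = -log(-log(u_i))
    # is an exact sample from softmax(logits); keying u_i embeds the
    # watermark without changing the per-step output distribution.
    us = keyed_uniforms(key, context, len(logits))
    scores = [l - math.log(-math.log(u)) for l, u in zip(logits, us)]
    return max(range(len(logits)), key=scores.__getitem__)
```

Because the uniforms are a function of key and context, the same context always yields the same token under one key, which is the diversity loss that a dual-key scheme aims to repair.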
- ORCE: Order-Aware Alignment of Verbalized Confidence in Large Language Models
Large language models (LLMs) often produce answers with high certainty even when they are incorrect, making reliable confidence estimation essential for deployment in real-world scenarios. Verbalized confidence, where models explicitly state their confidence in natural language, provides a flexible ...
- ORBIT: Preserving Foundational Language Capabilities in GenRetrieval via Origin-Regulated Merging
Despite the rapid advancements in large language model (LLM) development, fine-tuning them for specific tasks often results in the catastrophic forgetting of their general, language-based reasoning abilities. This work investigates and addresses this challenge in the context of the Generative Retrie...
- Aligning Flow Map Policies with Optimal Q-Guidance
Generative policies based on expressive model classes, such as diffusion and flow matching, are well-suited to complex control problems with highly multimodal action distributions. Their expressivity, however, comes at a significant inference cost: generating each action typically requires simulatin...
- Model-based Bootstrap of Controlled Markov Chains
We propose and analyze a model-based bootstrap for transition kernels in finite controlled Markov chains (CMCs) with possibly nonstationary or history-dependent control policies, a setting that arises naturally in offline reinforcement learning (RL) when the behavior policy generating the data is un...
- Trajectory-Agnostic Asteroid Detection in TESS with Deep Learning
We present a novel method for extracting moving objects from TESS data using machine learning. Our approach uses two stacked 3D U-Nets with skip connections, which we call a W-Net, to filter background and identify pixels containing moving objects in TESS image time-series data. By augmenting the tr...
- Events as Triggers for Behavioral Diversity in Multi-Agent Reinforcement Learning
Effective multi-agent cooperation requires agents to adopt diverse behaviors as task conditions evolve, and to do so at the right moment. Yet, current Multi-Agent Reinforcement Learning (MARL) frameworks that facilitate this diversity are still limited by the fact that they bind fixed behaviors to fi...
- A Semi-Supervised Framework for Speech Confidence Detection using Whisper
Automatic detection of speaker confidence is critical for adaptive computing but remains constrained by limited labelled data and the subjectivity of paralinguistic annotations. This paper proposes a semi-supervised hybrid framework that fuses deep semantic embeddings from the Whisper encoder with a...
- MetaColloc: Optimization-Free PDE Solving via Meta-Learned Basis Functions
Solving partial differential equations (PDEs) with machine learning typically requires training a new neural network for every new equation. This optimization is slow. We introduce MetaColloc. It is an optimization-free and data-free framework that removes this bottleneck completely. We decouple bas...
- From Message-Passing to Linearized Graph Sequence Models
Message-passing based approaches form the default backbone of most learning architectures on graph-structured data. However, the rapid progress of modern deep learning architectures in other domains, particularly sequence modeling, raises the question of how graph learning can benefit from these adv...
- In-context learning to predict critical transitions in dynamical systems
Critical transitions - abrupt, often irreversible changes in system dynamics - arise across human and natural systems, often with catastrophic consequences. Real-world observations of such shifts remain scarce, preventing the development of reliable early warning systems. Conventional statistical an...