AI News Archive: May 13, 2026 — Part 20

Sourced from 500+ daily AI sources, scored by relevance.

Probing Persona-Dependent Preferences in Language Models
Large language models (LLMs) can be said to have preferences: they reliably pick certain tasks and outputs over others, and preferences shaped by post-training and system prompts appear to shape much of their behaviour. But models can also adopt different personas which have radically different pref...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13339v1
Tracing Persona Vectors Through LLM Pretraining
How large language models internally represent high-level behaviors is a core interpretability question with direct relevance to AI safety: it determines what we can detect, audit, or intervene on. Recent work has shown that traits such as evil or sycophancy correspond to linear directions in the in...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13329v1
CANTANTE: Optimizing Agentic Systems via Contrastive Credit Attribution
LLM-based multi-agent systems have demonstrated strong performance across complex real-world tasks, such as software engineering, predictive modeling, and retrieval-augmented generation. Yet automating their configuration remains a structural challenge, as scores are available only at the system lev...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13295v1
What properties of reasoning supervision are associated with improved downstream model quality?
Validating training data for reasoning models typically requires expensive trial-and-error fine-tuning cycles. In this work, we investigate whether the utility of a reasoning dataset can be reliably predicted prior to training using intrinsic data metrics. We propose a suite of quantitative measures...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13290v1
The Readability Spectrum: Patterns, Issues, and Prompt Effects in LLM-Generated Code
As Large Language Models (LLMs) are transforming software development, the functional quality of generated code has become a central focus, leaving readability, one of the critical non-functional attributes, understudied. Given that LLM-generated code still needs human review before adoption, it is impo...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13280v1
IndexedAI
Your site scores X/100 for AI agents with next steps
🧰 Tools May 13, 2026 https://www.producthunt.com/products/indexedai?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
claude-share
Securely share your Claude Code with your friends
🧰 Tools May 13, 2026 https://www.producthunt.com/products/claude-share?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
Zen Reports
See how much traffic your website gets from ChatGPT
🧰 Tools May 13, 2026 https://www.producthunt.com/products/zen-reports?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
Mi
30-line zero-config CLI agent for bug fixes + refactoring
🧰 Tools May 13, 2026 https://www.producthunt.com/products/mi?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
Linchpin
Open-source, self-hostable runtime for managed AI agents
🧰 Tools May 13, 2026 https://www.producthunt.com/products/linchpin-2?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
SurfBuddy
AI sidebar companions for cross-app workflows and automation
🧰 Tools May 13, 2026 https://www.producthunt.com/products/surfbuddy-2?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
Mycelis
Serverless AI workspace with smart routing & MCP agents
🧰 Tools May 13, 2026 https://www.producthunt.com/products/mycelis?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
Jootle
The AI-Native Operations Platform
🧰 Tools May 13, 2026 https://www.producthunt.com/products/jootle?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
Pipecat
Build AI workflows and assistants for your business
🧰 Tools May 13, 2026 https://www.producthunt.com/products/pipecat-2?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
BossHogg
Agent-first CLI for PostHog analytics and feature flags
🧰 Tools May 13, 2026 https://www.producthunt.com/products/bosshogg?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
SearchScore AI
See if AI search can find and recommend your brand.
🧰 Tools May 13, 2026 https://www.producthunt.com/products/searchscore-ai-v7?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
Compact Latent Manifold Translation: A Parameter-Efficient Foundation Model for Cross-Modal and Cross-Frequency Physiological Signal Synthesis
The analysis of physiological time series, such as electrocardiograms (ECG) and photoplethysmograms (PPG), is persistently hindered by modality and frequency gaps stemming from heterogeneous recording devices. Existing foundation models typically rely on continuous latent spaces, which frequently su...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13248v1
It's not the Language Model, it's the Tool: Deterministic Mediation for Scientific Workflows
Language models can produce convincing scientific analyses, but repeated generations on the same data do not guarantee the same result. A researcher may regenerate an identical query and receive a different fit, a different peak position or a different analysis procedure, without an obvious way to d...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13245v1
Teacher-Guided Policy Optimization for LLM Distillation
The convergence of reinforcement learning and imitation learning has positioned Reverse KL (RKL) as a promising paradigm for on-policy LLM distillation, aiming to unify exploration with teacher supervision. However, we identify a critical limitation: when the student and teacher distributions diverg...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13230v1
Improving Code Translation with Syntax-Guided and Semantic-aware Preference Optimization
LLMs have shown immense potential for code translation, yet they often struggle to ensure both syntactic correctness and semantic consistency. While preference-based learning offers a promising alignment strategy, it is hindered by unreliable semantic rewards derived from sparse test cases or restri...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13229v1
ReTool-Video: Recursive Tool-Using Video Agents with Meta-Augmented Tool Grounding
Video understanding requires active evidence seeking, motivating tool-augmented video agents for temporal reasoning, cross-modal understanding, and complex question answering. Existing video agents have improved video reasoning with retrieval, memory, frame inspection, and verifier tools, but they s...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13228v1
Hierarchical Attacks for Multi-Modal Multi-Agent Reasoning
Multi-modal multi-agent systems (MM-MAS) have gained increasing attention for their capacity to enable complex reasoning and coordination across diverse modalities. As these systems continue to expand in scale and functionality, investigating their potential vulnerabilities has become increasingly i...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13213v1
STAR: Semantic-Temporal Adaptive Representation Learning for Few-Shot Action Recognition
Few-shot action recognition (FSAR) requires models to generalize to novel action categories from only a handful of annotated samples. Despite progress with vision-language models, existing approaches still suffer from semantic-temporal misalignment, where static textual prompts fail to capture decis...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13202v1
ECG-NAT: A Self-supervised Neighborhood Attention Transformer for Multi-lead Electrocardiogram Classification
Electrocardiogram (ECG) arrhythmia classification remains challenging due to signal variability, noise, limited labeled data, and the difficulty in achieving both accuracy and efficiency in models. While self-supervised learning reduces label dependency, most methods target either global contextual ...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13194v1
Stable Attention Response for Reliable Precipitation Nowcasting
Precipitation nowcasting remains challenging due to the highly localized, rapidly evolving, and heterogeneous nature of atmospheric dynamics. Although recent methods increasingly adopt attention-based architectures in both unimodal and multimodal settings, they mainly emphasize stronger representati...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13181v1
CLIP Tricks You: Training-free Token Pruning for Efficient Pixel Grounding in Large Vision-Language Models
In large vision-language models, visual tokens typically constitute the majority of input tokens, leading to substantial computational overhead. To address this, recent studies have explored pruning redundant or less informative visual tokens for image understanding tasks. However, these methods str...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13178v1
PanoWorld: Towards Spatial Supersensing in 360° Panorama World
Multimodal large language models (MLLMs) still struggle with spatial understanding under the dominant perspective-image paradigm, which inherits the narrow field of view of human-like perception. For navigation, robotic search, and 3D scene understanding, 360-degree panoramic sensing offers a form...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13169v1
EvObj: Learning Evolving Object-centric Representations for 3D Instance Segmentation without Scene Supervision
We introduce EvObj for unsupervised 3D instance segmentation that bridges the geometric domain gap between synthetic pretraining data and real-world point clouds. Current methods suffer from structural discrepancies when transferring object priors from synthetic datasets (e.g., ShapeNet) to real sca...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13152v1
GRIP-VLM: Group-Relative Importance Pruning for Efficient Vision-Language Models
In Vision-Language Models (VLMs), processing a massive number of visual tokens incurs prohibitive computational overhead. While recent training-aware pruning methods attempt to selectively discard redundant tokens, they largely rely on continuous-gradient relaxations. However, visual token pruning i...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13375v1
AI Harness Engineering: A Runtime Substrate for Foundation-Model Software Agents
Foundation models have transformed automated code generation, yet autonomous software-engineering agents remain unreliable in realistic development settings. The dominant explanation locates this gap in model capability. We propose a different locus: software-engineering capability emerges from a mo...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13357v1
Inducing Overthink: Hierarchical Genetic Algorithm-based DoS Attack on Black-Box Large Language Reasoning Models
Large Reasoning Models (LRMs) are increasingly integrated into systems requiring reliable multi-step inference, yet this growing dependence exposes new vulnerabilities related to computational availability. In particular, LRMs exhibit a tendency to "overthink", producing excessively long and redunda...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13338v1
Ego2World: Compiling Egocentric Cooking Videos into Executable Worlds for Belief-State Planning
Embodied agents in household environments must plan under partial observation: they need to remember objects, track state changes, and recover when actions fail. Existing benchmarks only partially test this ability. Egocentric video datasets capture realistic human activities but remain passive, whi...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13335v1
Stylized Text-to-Motion Generation via Hypernetwork-Driven Low-Rank Adaptation
Text-driven motion diffusion models are capable of generating realistic human motions, but text alone often struggles to express fine-level nuances of motion, commonly referred to as style. Recent approaches have tackled this challenge by attaching a style injection mechanism to a pretrained text-dr...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13333v1
What Limits Vision-and-Language Navigation?
Vision-and-Language Navigation (VLN) is a cornerstone of embodied intelligence. However, current agents often suffer from significant performance degradation when transitioning from simulation to real-world deployment, primarily due to perceptual instability (e.g., lighting variations and motion blu...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13328v1
VERA-MH: Validation of Ethical and Responsible AI in Mental Health
Chatbot usage has increased, including in fields for which they were never developed, notably mental health support. To that end, we introduce Validations of Ethical and Responsible AI in Mental Health (VERA-MH), a novel clinically-validated evaluation for safety of chatbots in the context of me...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13318v1
IdeaForge: A Knowledge Graph-Grounded Multi-Agent Framework for Cross-Methodology Innovation Analysis and Patent Claim Generation
Current AI-assisted innovation systems typically apply a single ideation methodology (such as TRIZ or Design Thinking) using sequential prompt-based workflows that do not preserve intermediate reasoning structure. As a result, insights generated across methodologies remain fragmented, limiting trace...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13311v1
Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling
Recent progress in reasoning models has substantially advanced long-horizon mathematical and scientific problem solving, with several systems now reaching gold-medal-level performance on International Mathematical Olympiad (IMO) and International Physics Olympiad (IPhO) problems. In this paper, we i...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13301v1
Discrete Diffusion for Complex and Congested Multi-Agent Path Finding with Sparse Social Attention
Multi-Agent Path Finding (MAPF) is a coordination problem that requires computing globally consistent, collision-free trajectories from individual start positions to assigned goal positions under combinatorial planning complexity. In dense environments, suboptimal initial plans induce compound confl...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13296v1
IndicMedDialog: A Parallel Multi-Turn Medical Dialogue Dataset for Accessible Healthcare in Indic Languages
Most existing medical dialogue systems operate in a single-turn question-answering paradigm or rely on template-based datasets, limiting conversational realism and multilingual applicability. We introduce IndicMedDialog, a parallel multi-turn medical dialogue dataset spanning English and nine Indic...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13292v1
"It became a self-fulfilling prophecy": How Lived Experiences are Entangled with AI Predictions in Menstrual Cycle Tracking Apps
In menstrual cycle tracking apps (MCTAs), AI-based predictions and insights have become increasingly popular. These features enable users to receive personalized information about their bodies and mental states. However, there is currently little research on how these predictive AI features and expl...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13261v1
X-Restormer++: 1st Place Solution for the UG2+ CVPR 2026 All-Weather Restoration Challenge
In this work, we present our winning solution for the 8th UG2+ Challenge (CVPR 2026) Track 1: Image Restoration under All-weather Conditions. Our method is built upon the strong baseline framework X-Restormer, which effectively captures both channel-wise global dependencies and spatially-local struc...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13258v1
An Agentic AI Framework with Large Language Models and Chain-of-Thought for UAV-Assisted Logistics Scheduling with Mobile Edge Computing
In cloud manufacturing, unmanned aerial vehicles (UAVs) can support both product collection and mobile edge computing (MEC). This joint operation forms a hybrid scheduling problem, where physical logistics decisions are coupled with computational task scheduling. In this paper, UAVs collect finished...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13221v1
McCast: Memory-Guided Latent Drift Correction for Long-Horizon Precipitation Nowcasting
Existing precipitation nowcasting methods typically adopt an autoregressive formulation, where future states are predicted from previous outputs. However, such an approach accumulates errors over long rollouts, causing forecasts to drift away from physically plausible evolution trajectories. Althoug...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13197v1
N-vium: Mixture-of-Exits Transformer for Accelerated Exact Generation
Improving the inference efficiency of autoregressive transformers typically means reducing FLOPs per token, usually through approximations that degrade model quality. We introduce N-vium, a mixture-of-exits transformer that partially parallelizes computation across depth on standard hardware, increa...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13190v1
Strikingness-Aware Evaluation for Temporal Knowledge Graph Reasoning
Temporal Knowledge Graph Reasoning (TKGR) aims at inferring missing (especially future) events from historical data. Current evaluation in TKGR uniformly weights all events, ignoring that most are trivial repetitions, which overestimates the true reasoning ability. Therefore, the rare outstanding eve...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13153v1
AcquisitionSynthesis: Targeted Data Generation using Acquisition Functions
Data quality remains a critical bottleneck in developing capable, competitive models. Researchers have explored many ways to generate top-quality samples. Some works rely on rejection sampling: generating lots of synthetic samples and filtering out low-quality samples. Other works rely on larger or ...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13149v1
LeanSearch v2: Global Premise Retrieval for Lean 4 Theorem Proving
Proving theorems in Lean 4 often requires identifying a scattered set of library lemmas whose joint use enables a concise proof -- a task we call global premise retrieval. Existing tools address adjacent problems: semantic search engines find individual declarations matching a query, while premise-s...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13137v1
GRACE: Gradient-aligned Reasoning Data Curation for Efficient Post-training
Existing reasoning data curation pipelines score whole samples, treating every intermediate step as equally valuable. In reality, steps within a trace contribute very unevenly, and selecting reasoning data well requires assessing them individually. We present GRACE, a gradient-aligned curation metho...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13130v1
Exploiting Pre-trained Encoder-Decoder Transformers for Sequence-to-Sequence Constituent Parsing
To achieve deep natural language understanding, syntactic constituent parsing plays a crucial role and is widely required by many artificial intelligence systems for processing both text and speech. A recent approach involves using standard sequence-to-sequence models to handle constituent parsing a...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13373v1
What Does LLM Refinement Actually Improve? A Systematic Study on Document-Level Literary Translation
Iterative self-refinement is a simple inference-time strategy for machine translation: an LLM revises its own translation over multiple inference-time passes. Yet document-scale refinement remains poorly understood: 1) which pipelines work best, 2) what quality dimensions improve, and 3) how refiner...
📄 Research May 13, 2026 http://arxiv.org/abs/2605.13368v1