AI News Archive: May 19, 2026 — Part 18
Sourced from 500+ daily AI sources, scored by relevance.
- What LLM Can I Run
Pick the right local LLM for your GPU.
- VirtualLotus
Your AI emotional support
- Instabio
Free AI Instagram bio generator for creators
- CartoVoxel
Create Minecraft worlds plus share-ready AI city posters.
- MikaAI
Generate Professional Contracts in 60 Seconds with AI
- Enhancio
Enhance and animate photos and videos.
- Voice Pilot
AI receptionist, qualifies & books 24/7, call leads
- Adryxa AI Studio
AI-powered workflows and content automation
- LLM Agents Make Collective Belief Dynamics Programmable: Challenges and Research Directions
Classical models of opinion dynamics assume human participants with bounded rationality and limited coordination. The rise of LLM-based agents introduces a qualitative shift: agents can now participate in online discussions at scale, maintain consistent persuasion strategies, and coordinate systemat...
- Memory-Augmented Reinforcement Learning Agent for CAD Generation
Automatic generation of computer-aided design (CAD) models is a core technology for enabling intelligence in advanced manufacturing. Existing generation methods based on large language models (LLMs) often fall short when handling complex CAD models characterized by long operation sequences, diverse ...
- EngiAI: A Multi-Agent Framework and Benchmark Suite for LLM-Driven Engineering Design
Large Language Model (LLM) agents are increasingly applied to engineering design tasks, yet existing evaluation frameworks do not adequately address multi-agent systems that combine simulation, retrieval, and manufacturing preparation. We introduce a benchmark suite with three evaluation dimensions:...
- STAR-PólyaMath: Multi-Agent Reasoning under Persistent Meta-Strategic Supervision
Frontier AI models and multi-agent systems have led to significant improvements in mathematical reasoning. However, for problems requiring extended, long-horizon reasoning, existing systems continue to suffer from fundamental reliability issues: hallucination accumulation, memory fragmentation, and ...
- AQuaUI: Visual Token Reduction for GUI Agents with Adaptive Quadtrees
Large Multimodal Models (LMMs) have recently emerged as promising backbones for GUI-agent models, where high-resolution GUI screenshots are introduced to the prompts at each iteration step. However, these screenshots exhibit highly non-uniform spatial information density: large regions may carry lit...
- CASPIAN: Online Detection and Attribution of Cascade Attacks in LLM Multi-Agent Systems via Cross-Channel Causal Monitoring
Cascade attacks in LLM multi-agent systems (MAS) arise when adversarial influence propagates across agents and leads to escalated system-level failures through complex agent interactions. Detecting such cascades is challenging, as their signals are distributed, tightly coupled across interaction cha...
- PAVE: A Cognitive Architecture for Legitimate Violation in Generative Agent Societies
Generative agents based on large language models reproduce believable human behavior in cooperative settings, but how they should reason in situations where rule-breaking may be required, such as fire evacuation or authority-supervised emergency, remains poorly characterized. We propose PAVE (Percep...
- AffectAI-Capture: A Reproducible Multimodal Protocol for Small-Group Meeting Research
We present AffectAI-Capture, a protocol for collecting synchronized multimodal data in four-person meeting-like interactions, combining eye tracking, wearable physiology, close-talk and room audio, multi-view video, event logging, and structured self-report. Sessions use fixed task blocks grounded i...
- TombWriter: Scaffolding Story Archeology through Beat-Level Interaction in Human-AI Co-Writing
The dominant paradigm for LLM interaction in AI co-writing uses disposable prompts that vanish after use. This may lead to imprecise results, cumbersome workflows, and diminished author agency and ownership. We propose LLM-based story archeology, where prompts serve as a hierarchical story instrumen...
- The Accessibility Capability Boundary: Operational Limits and Expansion Potential of AI-Generated Browser-Native Accessibility Systems
As large language models (LLMs) demonstrate increasing competence in synthesizing functional user interfaces, a fundamental question emerges in accessibility computing: how far can AI-driven accessibility systems go? This paper introduces the Accessibility Capability Boundary (ACB)...
- Toward User Comprehension Supports for LLM Agent Skill Specifications
Users often interpret and select agent skills through their SKILL.md specifications. To protect users, existing audits mainly focus on malicious or unsafe skills. We study the complementary question of whether specifications help users form bounded expectations about what a skill consumes, ...
- From Role to Person: Trust Calibration Challenges in Twin Agents
Agentic AI has taken on the role of assistant, collaborator, and decision-support tool. We argue the next role on that list is more personal: you. These are digital twins of each individual -- twin agents -- representing their knowledge, perspective, and communicative style to colleagues when they a...
- Material for Thought: Generative AI as an Active Creative Medium
Human-AI collaboration research has largely positioned the human as a judge of AI output, centering effort on evaluating whether recommendations are reliable enough to accept. This decision-support framing leaves little room for the human as creator. We argue that for creative work, this framing m...
- CutVerse: A Compositional GUI Agents Benchmark for Media Post-Production Editing
While GUI agents have made significant progress in web navigation and basic operating system tasks, their capabilities in professional creative workflows remain largely underexplored. To bridge this gap, we introduce CutVerse, a benchmark designed to systematically evaluate autonomous GUI agents in ...
- Platform architecture determines whether recommendation algorithms can shape information quality on social media
Social media platforms shape public discourse through two fundamental design choices that naturally co-occur in any field investigation: platform architecture, which defines what types of actors exist and how they interact, and recommendation algorithm, which determines what content is surfaced to u...
- BiRD: A Bidirectional Ranking Defense Mechanism for Retrieval Augmented Generation
The growing adoption of Retrieval-Augmented Generation (RAG) has led to a rise in adversarial attacks. Existing defenses, relying on semantic analysis or voting, face a trade-off between high computational cost and limited robustness under strong poisoning attacks. Their fundamental limitation is th...
- Awakening the Hydra: Stabilizing Multi-Concept Backdoor Injection in Text-to-Image Diffusion Models
Text-to-image diffusion models are increasingly developed through open-source reuse and repeated downstream fine-tuning, where reused checkpoints are difficult to verify and thus more susceptible to hidden backdoor behaviors. In such ecosystems, a single pretrained model may be sequentially adapted ...
- Exposing Functional Fusion: A New Class of Strategic Backdoor in Dynamic Prompt Architectures
Existing ViT backdoor attacks based on backbone-overwriting full-tuning are computationally expensive and inflict performance degradation. This has forced adversaries towards the Visual Parameter-Efficient Fine-Tuning (PEFT) paradigm, dominated by adapter-based (e.g., LoRA) and prompt-based (e.g., V...
- XAI FL-IDS: A Federated Learning and SHAP-Based Explainable Framework for Distributed Intrusion Detection Systems
An Intrusion Detection System (IDS) is vital in cybersecurity, detecting unauthorized activity across networks. With attacks on network layers increasing, stronger IDSs are needed. Yet most IDSs rely on centralized detection, forcing IoT nodes to ship data to a server, adding overhead and offering n...
- Exploring and Developing a Pre-Model Safeguard with Draft Models
Large Language Model (LLM) alignment remains vulnerable to jailbreak attacks that elicit unsafe responses, motivating pre-model and post-model guards. Pre-model guards audit the safety of prompts before invoking target models. However, relying solely on the prompt often leads to high false-negative ...
- Backdooring Masked Diffusion Language Models
Masked diffusion language models (MDLMs) are emerging as a compelling new paradigm for text generation, but their training-time security remains largely unexplored. Existing backdoor attacks on Gaussian diffusion models or autoregressive language models do not directly apply to MDLMs because MDLMs r...
- Detecting and Mitigating Backdoor Attacks in OTA-FL Systems: A Two-Stage Robust Aggregation Scheme
Over-the-air federated learning (OTA-FL) improves communication efficiency by exploiting the superposition property of wireless channels, but this same property also creates a critical security vulnerability: the parameter server (PS) cannot access individual local updates, making it difficult to id...
- Quantum Machine Learning for Cyber-Physical Anomaly Detection in Unmanned Aerial Vehicles: A Leakage-Free Evaluation with Proxy-Audited Feature Sets
Unmanned aerial vehicles (UAVs) are cyber-physical systems whose attack surface spans networked avionics and on-board sensor fusion: a compromised GPS or battery module can mimic a benign mission segment and evade naive anomaly detectors. We present a leakage-free evaluation of quantum machine learn...
- Token by Token, Compromised: Backdoor Vulnerabilities in Unified Autoregressive Models
Unified autoregressive models (UAMs) are transformer models that generate text as well as image tokens within a single autoregressive pass. Shared parameters and a multimodal vocabulary simplify the training pipeline and facilitate flexible multimodal generation, yet might introduce new vulnerabilit...
- Hunting Vulnerability Variants in AI Infra: Measurement and Reference-Driven Detection
AI infra has become a shared execution layer for model training, deployment, and agent orchestration. Because many projects reimplement similar model-centric workflows, a vulnerability disclosed in one repository can recur as a variant in another repository with a related design. Yet the prevalence ...
- DASM: Domain-Aware Sharpness Minimization for Multi-Domain Voice Stream Steganalysis
The growing use of information hiding in network streaming media for covert communication poses a significant security threat, necessitating the development of robust detection technologies. However, existing steganalysis methods for network voice streams mostly rely on data distributions in specifi...
- Measuring Safety Alignment Effects in Autonomous Security Agents
Do stock safety-aligned language models and their uncensored or abliterated derivatives behave differently when run as autonomous security agents? Single-turn refusal benchmarks cannot answer this question: security agents must inspect repositories, call tools, and produce vulnerability evidence ins...
- SCARA: A Semantics-Constrained Autonomous Remediation Agent for Opaque Industrial Software Vulnerabilities
Critical-infrastructure operators are increasingly expected to assess and remediate vulnerabilities in deployed industrial software. However, much of this software exists as opaque industrial software (OIS), including stripped firmware, proprietary protocol handlers, and compiled control logic witho...
- Inferring Sensitive Attributes from Knowledge Graph Embeddings: Attack and Defense Strategies
Knowledge Graphs (KGs) are a powerful representation of linked data, offering flexibility, semantic richness, and support for knowledge enrichment and reasoning. They help data owners organize and exploit heterogeneous data to provide insightful services (e.g., recommendations), yet real-world KGs a...
- Devilray: A Systematic Adversarial Model Revealing Blind Spots in Fake Base Station Detection
Fake Base Station (FBS) detection has been a critical focus of cellular security research for over two decades. However, significant financial and regulatory barriers to accessing commercial FBS (C-FBS) devices have limited direct visibility into real-world operations, forcing detection systems to b...
- Can LLMs Produce Better Object-Oriented Designs than Human-Involved Development?
Background: Large Language Models (LLMs) are increasingly used for code generation. However, their ability to generate multi-class projects that require object-oriented design (OOD) remains unclear, especially relative to projects developed with human involvement. Aims: The primary objective of this...
- Prior Knowledge or Search? A Study of LLM Agents in Hardware-Aware Code Optimization
LLM discovery and optimization systems are increasingly applied across domains, implementing a common propose-evaluate-revise loop. Such optimization or discovery progresses via context conditioning on received feedback from an environment. However, as modern LLM agents are increasingly complex in t...
- CriterAlign: Criterion-Centric Rationale Alignment for Code Preference Judging
Pairwise human preference prediction is central to evaluating code-generation systems, where quality often depends on task-specific trade-offs beyond functional correctness. While rubric-based LLM judges improve interpretability by decomposing evaluation into explicit criteria, most existing pipelin...
- Characterizing Real-World Bugs in Tile Programs for Automated Bug Detection
Tile-based programming frameworks are increasingly adopted to write high-performance GPU kernels in domains such as deep learning and scientific computing. While these frameworks enhance productivity and hardware utilization, their multi-stage compilation pipelines introduce distinct code generation...
- Provable Fairness Repair for Deep Neural Networks
Deep neural networks (DNNs) are suffering from ethical issues such as individual discrimination. In response, extensive NN repair techniques have been developed to adjust models and mitigate such undesired behaviors. However, existing fairness repair methods are typically data-centric, which often l...
- MOCHA: Multi-Objective Chebyshev Annealing for Agent Skill Optimization
LLM agents organize behavior through skills - structured natural-language specifications governing how an agent reasons, retrieves, and responds. Unlike monolithic prompts, skills are multi-field artifacts subject to hard platform constraints: description fields are truncated for routing, instructio...
- Does Code Cleanliness Affect Coding Agents? A Controlled Minimal-Pair Study
As autonomous coding agents see rapid adoption, their evaluation has primarily focused on task completion rates holding the target codebase fixed. This leaves a critical question unanswered: does the structural and stylistic quality, or "cleanliness," of the underlying code affect an agent's abilit...
- OpenComputer: Verifiable Software Worlds for Computer-Use Agents
We present OpenComputer, a verifier-grounded framework for constructing verifiable software worlds for computer-use agents. OpenComputer integrates four components: (1) app-specific state verifiers that expose structured inspection endpoints over real applications, (2) a self-evolving verification l...
- When to Answer and When to Defer: A Decision Framework for Reliable Code Predictions
Code language models are increasingly adopted for both understanding and generative tasks. Despite their success, these models frequently produce overconfident incorrect predictions and underconfident correct predictions, undermining their reliability in deployment. Practical deployment demands thre...
- On-the-Fly Input Adaptation for Reliable Code Intelligence
Code language models (CLMs) play a central role in software engineering across both generation and classification tasks. However, these models still exhibit notable mispredictions in real-world applications, even when trained on up-to-date data. Existing solutions address this by retraining the mode...
- MuMuTestUp: Mutation-based Multi-Agent Test Case Update
Modern software systems evolve rapidly under CI/CD practices, where tests are critical for quality. However, substantial code changes often render existing test cases obsolete, causing pipeline disruptions, reduced productivity, and compromised quality. Recent automatic test update approaches levera...
- When Web Apps Heal Themselves: A MAPE-K Based Approach to Fault Tolerance and Adaptive Recovery
Ensuring the reliability and resilience of modern web applications remains a critical challenge due to increasing system complexity and dynamic runtime environments. This study proposes a modular self-healing framework based on the monitor-analyze-plan-execute over a shared knowledge base (MAPE-K) m...