AI News Archive: May 4, 2026 — Part 13
Sourced from 500+ daily AI sources, scored by relevance.
- Universality in Deep Neural Networks: An approach via the Lindeberg exchange principle
We consider the infinite-width limit of a fully connected deep neural network with general weights, and we prove quantitative general bounds on the $2$-Wasserstein distance between the network and its infinite-width Gaussian limit, under appropriate regularity assumptions on the activation function....
- Visual Latents Know More Than They Say: Unsilencing Latent Reasoning in MLLMs
Continuous latent-space reasoning offers a compact alternative to textual chain-of-thought for multimodal models, enabling high-dimensional visual evidence to be integrated without explicit reasoning tokens. However, we identify a previously overlooked optimization pathology in existing latent visua...
- Federated Reinforcement Learning for Efficient Mobile Crowdsensing under Incomplete Information
Mobile crowdsensing (MCS) is a distributed sensing architecture that utilizes existing sensors on mobile units (MUs) to perform sensing tasks. A mobile crowdsensing platform (MCSP) publishes the sensing tasks and the MUs decide whether to participate in exchange for money. The MCS system is dynamic:...
- Robust and Fast Training via Per-Sample Clipping
We propose a robust gradient estimator based on per-sample gradient clipping and analyze its properties both theoretically and empirically. We show that the resulting method, per-sample clipped SGD (PS-Clip-SGD), achieves optimal in-expectation convergence rates for non-convex optimization problems ...
- ParaRNN: An Interpretable and Parallelizable Recurrent Neural Network for Time-Dependent Data
The proliferation of large-scale and structurally complex data has spurred the integration of machine learning methods into statistical modeling. Recurrent neural networks (RNNs), a foundational class of models for time-dependent data, can be viewed as nonlinear extensions of classical autoregressiv...
- Spectral Model eXplainer: a chemically-grounded explainability framework for spectral-based machine learning models
Spectral-based machine learning models have been increasingly deployed in chemometrics and spectroscopy, where predictive accuracy is as important as explainability. Currently employed eXplainable Artificial Intelligence (XAI) methods are largely adapted from tabular or generic multivariate domains, a...
- CNNs for Vis-NIR Chemometrics: From Contradiction to Conditional Design
Near-infrared (NIR; a.k.a. NIRS) deep-learning studies in chemometrics increasingly report mutually inconsistent conclusions regarding convolutional neural network (CNN) design, including small versus large kernels, shallow versus deep architectures, raw spectra versus preprocessing, and single-dom...
- Gradient-Gated DPO: Stabilizing Preference Optimization in Language Models
Preference optimization has become a central paradigm for aligning large language models with human feedback. Direct Preference Optimization (DPO) simplifies reinforcement learning from human feedback by directly optimizing pairwise preferences, removing the need for reward modeling and policy optim...
- Isotropic Fourier Neural Operators
Fourier Neural Operators are deep learning models that learn mappings between function spaces and can be used to learn and solve partial differential equations (PDEs), in some cases significantly faster than traditional PDE solvers. Within the model are Fourier layers, which apply linear transformat...
- HARMES: A Multi-Modal Dataset for Wearable Human Activity Recognition with Motion, Environmental Sensing and Sound
With each sensing modality exhibiting inherent strengths and limitations, multi-modal approaches for wearable Human Activity Recognition (HAR) are becoming increasingly relevant -- particularly for recognizing Activities of Daily Living (ADLs), where individual modalities often produce ambiguous sig...
- Gradient Boosted Risk Scores
Risk scores are an interpretable and actionable class of machine learning models with applications in medicine, insurance, and risk management. Unlike most computational methods, risk scores are designed to be computed by a human by attributing points to a data sample based on a limited set of crite...
- On Training Large Language Models for Long-Horizon Tasks: An Empirical Study of Horizon Length
Large language models (LLMs) have shown promise as interactive agents that solve tasks through extended sequences of environment interactions. While prior work has primarily focused on system-level optimizations or algorithmic improvements, the role of task horizon length in shaping training dynamic...
- ContextBeat
AI-driven context engine for engineering teams
- Recurrent Deep Reinforcement Learning for Chemotherapy Control under Partial Observability
Chemotherapy dose optimization can be formulated as a dynamic treatment regime, requiring sequential decisions under uncertainty that must balance tumor suppression against toxicity. However, most reinforcement learning approaches assume full observability of the patient state, a condition rarely me...
- Beyond Specialization: Robust Reinforcement Learning Navigation via Procedural Map Generators
Deep reinforcement learning (DRL) navigation policies often overfit to the structure of their training environments, as environmental diversity is typically constrained by the manual effort required to design diverse scenarios. While procedural map generation offers scalable diversity, no prior work...
- Physics-Informed Neural Learning for State Reconstruction and Parameter Identification in Coupled Greenhouse Climate Dynamics
Physics-informed neural networks (PINNs) have recently emerged as a promising framework for integrating data-driven learning with physical knowledge. In this work, we propose a coupled PINN approach for the joint reconstruction of indoor temperature and humidity dynamics in greenhouse environments, ...
- Evaluating Tabular Representation Learning for Network Intrusion Detection
Classic Network Intrusion Detection Systems (NIDS) often rely on manual feature engineering to extract meaningful patterns from network traffic data. However, this approach requires domain expertise and runs counter to the widely adopted principle of modern machine learning and neural networks: that...
- A Novel Preprocessing-Driven Approach to Remaining Useful Life (RUL) Prediction Using Temporal Convolutional Networks (TCN)
Accurate prediction of Remaining Useful Life (RUL) in aero-engines is vital for predictive maintenance, improved operational reliability, and reduced lifecycle costs. While deep learning approaches have demonstrated strong potential in this area, most existing methods focus primarily on model archit...
- Pretraining on Sleep Data Improves non-Sleep Biosignal Tasks
Sleep foundation models have recently demonstrated strong performance on in-domain polysomnography tasks, including sleep staging, apnea detection, and disease risk prediction. In this work, we investigate whether sleep biosignals can serve as an effective pretraining distribution for learning repre...
- Efficient Preference Poisoning Attack on Offline RLHF
Offline Reinforcement Learning from Human Feedback (RLHF) pipelines such as Direct Preference Optimization (DPO) train on a pre-collected preference dataset, which makes them vulnerable to preference poisoning attacks. We study label flip attacks against log-linear DPO. We first illustrate that flipp...
- Reference-Sampled Boltzmann Projection for KL-Regularized RLVR: Target-Matched Weighted SFT, Finite One-Shot Gaps, and Policy Mirror Descent
Online reinforcement learning with verifiable rewards (RLVR) turns checkable outcomes into a scalable training signal, but it keeps rollout generation, verifier scoring, and reference-policy evaluations on the optimization path. Static weighted supervised fine-tuning (SFT) on precomputed rollouts se...
- Dimensionality-Aware Anomaly Detection in Learned Representations of Self-Supervised Speech Models
Self-supervised speech models (S3Ms) achieve strong downstream performance, yet their learned representations remain poorly understood under natural and adversarial perturbations. Prior studies rely on representation similarity or global dimensionality, offering limited visibility into local geometr...
- MSMixer: Learned Multi-Scale Temporal Mixing with Complementary Linear Shortcut for Long-Term Time Series Forecasting
Long-term time series forecasting requires models that simultaneously capture rapid oscillations, medium-range periodicities, and slowly evolving macro-trends from a fixed look-back window. Existing lightweight MLP-based models typically operate on a single temporal resolution, limiting their abilit...
- Online Generalised Predictive Coding
This paper introduces an extension of generalised filtering for online applications. Generalised filtering refers to data assimilation schemes that jointly infer latent states, learn unknown model parameters, and estimate uncertainty in an integrated framework -- e.g., estimate state and observation...
- CARD: Coarse-to-fine Autoregressive Modeling with Radix-based Decomposition for Transferable Free Energy Estimation
Estimating free energy differences quantifies thermodynamic preferences in molecular interactions, which is central to chemistry and drug discovery. Despite fruitful progress, existing methods still face key limitations: classical computational approaches remain prohibitively expensive due to their ...
- ARA: Agentic Reproducibility Assessment For Scalable Support Of Scientific Peer-Review
Scientific peer review increasingly struggles to assess reproducibility at the scale and complexity of modern research output. Evaluating reproducibility requires reconstructing experimental dependencies, methodological choices, data flows, and result-generating procedures, which often exceeds what ...
- Selective Prediction from Agreement: A Lipschitz-Consistent Version Space Approach
We consider selective classification with abstention in the fixed-pool (or transductive) setting, where the unlabeled pool is given beforehand and only a subset of points can be queried for labels. Our main insight is to view selective prediction through agreement: given queried labels and Lipschitz...
- Gradient-Discrepancy Acquisition for Pool-Based Active Learning
The effectiveness of active learning hinges on the choice of the acquisition criterion by which a learning algorithm selects potentially informative data points whose label is subsequently queried. This paper proposes a novel gradient-based acquisition criterion, derived from a generalization bound ...
- Arrange Demo Agent
AI agent that runs your SaaS demos and converts users
- StreamIndex: Memory-Bounded Compressed Sparse Attention via Streaming Top-k
DeepSeek-V3.2 and V4 introduce Compressed Sparse Attention (CSA): a lightning indexer (a learned scoring projection over compressed keys) scores them, the top-k are selected per query, and a sparse attention kernel reads only those. Public CSA implementations materialize a [B, S, H_I, T] FP32 score ...
- MPCS: Neuroplastic Continual Learning via Multi-Component Plasticity and Topology-Aware EWC
Continual learning systems face a fundamental tension between plasticity -- acquiring new knowledge -- and stability -- retaining prior knowledge. We introduce MPCS (Multi-Plasticity Continual System), a neuroplastic architecture that integrates eleven complementary mechanisms: task-driven neuro...
- Markdownly
Clean web content into AI-ready, useable markdown instantly
- Xmagnet for Claude
AI-native B2B CRM with 35 tools that runs inside Claude
- Blink URLs
⚡ Edge-native link intelligence with AI-driven ROI insights.
- JasPing
AI support across chat, WhatsApp & voice calls
- Optiwise
AI-Native MSME factory optimisation
- ExamTickets
AI assistant for exam preparation
- Outpulse
The AI sales agent that books meetings while you sleep
- Nudge lens
Reveal hidden decision pressure in real time
- Gia
Turn thoughts into a structured day—automatically.
- Promptora AI Vibe Engine
Image Idea AI Generator
- SpeedChat Academy
Get Claude certified. Built on Anthropic's curriculum.
- nBlick
Control your brand across AI answers
- OneOver
All your AI models in one workspace
- MusicWave
AI music generator that turns ideas into full songs
- AI data dashboard
Raw data to insightful dashboard in minutes!!
- ICE - Infinite Context Engine
Virtual Memory Manager for LLMs. Infinite context. Drop-in.
- PulseBar AI
Track Claude & Codex usage in real time
- AI Checker: Humanize AI
AI Detector. Text, Image & PDF
- SEO do Futuro
AI-powered SEO analysis for Brazilian websites in 1min