AI News Archive: May 5, 2026 — Part 28
Sourced from 500+ daily AI sources, scored by relevance.
- Nora: Normalized Orthogonal Row Alignment for Scalable Matrix Optimizer
Matrix-based optimizers have demonstrated immense potential in training Large Language Models (LLMs); however, designing an ideal optimizer remains a formidable challenge. A superior optimizer must satisfy three core desiderata: efficiency, achieving Muon-like preconditioning to accelerate optimizat...
- Arcana Labs
AI studio for image and video creation
- Vanishing L2 regularization for the softmax Multi Armed Bandit
Multi Armed Bandit (MAB) algorithms are a cornerstone of reinforcement learning and have been studied both theoretically and numerically. One of the most commonly used implementations uses a softmax mapping to prescribe the optimal policy and serves as the foundation for downstream algorithms, includ...
- GEM-FI: Gated Evidential Mixtures with Fisher Modulation
Evidential Deep Learning (EDL) enables single-pass uncertainty estimation by predicting Dirichlet evidence, but it can remain overconfident and poorly calibrated, and it often fails to represent multi-modal epistemic uncertainty. We introduce Gated Evidential Mixtures (GEM), a family of models that ...
- Predicting missing values: A good idea?
Minimizing the Mean Squared Error (MSE) is a key objective in machine learning and is commonly used for imputing missing values. While this approach provides accurate point estimates, it introduces systematic biases in downstream analyses. These biases affect key parameters such as variance, prevale...
- Distribution-Free Pretraining of Classification Losses via Evolutionary Dynamics
We propose Evolutionary Dynamic Loss (EDL), a framework that learns a transferable classification loss in the probability space using unlimited synthetic prediction-label pairs, without accessing real samples during the main loss pretraining stage. EDL parameterizes the loss as a lightweight network...
- Information Plane Analysis of Binary Neural Networks
Information plane (IP) analysis has been suggested to study the training dynamics of deep neural networks through mutual information (MI) between inputs, representations, and targets. However, its statistical validity is often compromised by the difficulty of estimating MI from samples of high-dimen...
- Free Decompression with Algebraic Spectral Curves
Tools from random matrix theory have become central to deep learning theory, using spectral information to provide mechanisms for modeling generalization, robustness, scaling, and failure modes. While often capable of modeling empirical behavior, practical computations are limited by matrix size, of...
- A Few-Step Generative Model on Cumulative Flow Maps
We propose a unified, few-step generative modeling framework based on cumulative flow maps for long-range transport in probability space, inspired by flow-map techniques for physical transport and dynamics. At its core is a cumulative-flow abstraction that connects local, instantaneous update...
- Leveraging Code Automorphisms for Improved Syndrome-Based Neural Decoding
Syndrome-based neural decoding (SBND) has emerged as a promising deep learning approach for soft-decision decoding of high-rate, short-length codes. However, this approach still has substantial room for improvement. In this paper, we show how to leverage code automorphisms to enhance the ability of ...
- PHALAR: Phasors for Learned Musical Audio Representations
Stem retrieval, the task of matching missing stems to a given audio submix, is a key challenge currently limited by models that discard temporal information. We introduce PHALAR, a contrastive framework achieving a relative accuracy increase of up to approximately 70% over the state-of-the-art while re...
- Exact ReLU realization of tensor-product refinement iterates
We study scalar dyadic refinement operators on R^2 of the form (Vf)(x,y) = sum_{(j,k) in Z^2} c_{j,k} f(2x-j, 2y-k), where only finitely many mask coefficients c_{j,k} are nonzero. Under a fixed support-window hypothesis, we prove that for every compactly supported continuous piecewise linear seed g...
- Ecologically-Constrained Task Arithmetic for Multi-Taxa Bioacoustic Classifiers Without Shared Data
Training data for bioacoustics is scattered across taxa, regions, and institutions. Centralizing it all is often infeasible. We show that independently fine-tuned BEATs encoders can be composed into a unified 661-species classifier via task vector arithmetic without sharing data. We find that bioaco...
- Raising the Ceiling: Better Empirical Fixation Densities for Saliency Benchmarking
Empirical fixation densities, spatial distributions estimated from human eye-tracking data, are foundational to saliency benchmarking. They directly shape benchmark conclusions, leaderboard rankings, failure case analyses, and scientific claims about human visual behavior. Yet the standard estimatio...
- Complex Equation Learner: Rational Symbolic Regression with Gradient Descent in Complex Domain
Symbolic regression aims to discover interpretable equations from data, yet modern gradient-based methods fail for operators that introduce singularities or domain constraints, including division, logarithms, and square roots. As a result, Equation Learner-type models typically avoid these operators...
- The Manokhin Probability Matrix: A Diagnostic Framework for Classifier Probability Quality
The Brier score conflates two distinct properties of probabilistic predictions: reliability (calibration error) and resolution (discriminatory power). We introduce the Manokhin Probability Matrix, a BCG-style two-dimensional diagnostic framework that separates them. Classifiers are placed on a 2x2 g...
- Graph Convolutional Support Vector Regression for Robust Spatiotemporal Forecasting of Urban Air Pollution
Urban air quality forecasting is challenging because pollutant concentrations are nonlinear, nonstationary, spatiotemporally dependent, and often affected by anomalous observations caused by traffic congestion, industrial emissions, and seasonal meteorological variability. This study proposes a Grap...
- Training-Free Probabilistic Time-Series Forecasting with Conformal Seasonal Pools
We propose Conformal Seasonal Pools (CSP), a training-free probabilistic time-series forecaster that mixes same-season empirical draws with signed residual draws around a seasonal naive forecast. In an audited rolling-origin benchmark on the six time-series datasets where DeepNPTS was originally eva...
- Low Rank Tensor Completion via Adaptive ADMM
We consider a novel algorithm for completing partially observed low-rank tensors, as a generalization of matrix completion. The proposed low-rank tensor completion (TC) method builds on the conventional nuclear norm (NN) minimization-based low-rank TC paradigm by leveraging the alternating ...
- Tempered Guided Diffusion
Training-free conditional diffusion provides a flexible alternative to task-specific conditional model training, but existing samplers often allocate computation inefficiently: independent guided trajectories can vary widely in quality, and additional function evaluations along a single trajectory m...
- Amortized Variational Inference for Joint Posterior and Predictive Distributions in Bayesian Uncertainty Quantification
Bayesian predictive inference propagates parameter uncertainty to quantities of interest through the posterior-predictive distribution. In practice, this is typically performed using a two-stage procedure: first approximating the posterior distribution of model parameters, and then propagating poste...
- Uni-OPD: Unifying On-Policy Distillation with a Dual-Perspective Recipe
On-policy distillation (OPD) has recently emerged as an effective post-training paradigm for consolidating the capabilities of specialized expert models into a single student model. Despite its empirical success, the conditions under which OPD yields reliable improvement remain poorly understood. In...
- Most ReLU Networks Admit Identifiable Parameters
We study the realization map of deep ReLU networks, focusing on when a function determines its parameters up to scaling and permutation. To analyze hidden redundancies beyond these standard symmetries, we introduce a framework based on weighted polyhedral complexes. Our main result shows that for ev...
- SigLoMa: Learning Open-World Quadrupedal Loco-Manipulation from Ego-Centric Vision
Designing an open-world quadrupedal loco-manipulation system is highly challenging. Traditional reinforcement learning frameworks utilizing exteroception often suffer from extreme sample inefficiency and massive sim-to-real gaps. Furthermore, the inherent latency of visual tracking fundamentally con...
- Robust Visual SLAM for UAV Navigation in GPS-Denied and Degraded Environments: A Multi-Paradigm Evaluation and Deployment Study
Reliable localization in GPS-denied, visually degraded environments is critical for autonomous UAV operations. This paper presents a systematic comparative evaluation of five V-SLAM systems (ORB-SLAM3, DPVO, DROID-SLAM, DUSt3R, and MASt3R) spanning classical, deep learning, recurrent, and Vision Tra...
- Learning Reactive Dexterous Grasping via Hierarchical Task-Space RL Planning and Joint-Space QP Control
In this work, we propose a hybrid hierarchical control framework for reactive dexterous grasping that explicitly decouples high-level spatial intent from low-level joint execution. We introduce a multi-agent reinforcement learning architecture, specialized into distinct arm and hand agents, that act...
- Evaluating Generative Models as Interactive Emergent Representations of Human-Like Collaborative Behavior
Human-AI collaboration requires AI agents to understand human behavior for effective coordination. While advances in foundation models show promising capabilities in understanding and showing human-like behavior, their application in embodied collaborative settings needs further investigation. This ...
- Bridging the Embodiment Gap: Disentangled Cross-Embodiment Video Editing
Learning robotic manipulation from human videos is a promising solution to the data bottleneck in robotics, but the distribution shift between humans and robots remains a critical challenge. Existing approaches often produce entangled representations, where task-relevant information is coupled with ...
- BifrostUMI: Bridging Robot-Free Demonstrations and Humanoid Whole-Body Manipulation
High-quality data collection is a fundamental cornerstone for training humanoid whole-body visuomotor policies. Current data acquisition paradigms predominantly rely on robot teleoperation, which is often hindered by limited hardware accessibility and low operational efficiency. Inspired by the Univ...
- On Surprising Effects of Risk-Aware Domain Randomization for Contact-Rich Sampling-based Predictive Control
Domain randomization (DR) is widely used in policy learning to improve robustness to modeling error, but remains underexplored in contact-rich sampling-based predictive control (SPC), where rollout quality is highly sensitive to uncertainty. In this work, we take the first step by studying risk-awar...
- Neural Control: Adjoint Learning Through Equilibrium Constraints
Many physical AI tasks are governed by implicit equilibrium: an agent actuates a subset of degrees of freedom (boundary DoFs), while the remaining free DoFs settle by minimizing a total potential energy. Even seemingly basic tasks such as bending a deformable linear object (DLO) to a target shape ca...
- Robust Path Tracking for Vehicles via Continuous-Time Residual Learning: An ICODE-MPPI Approach
Model Predictive Path Integral (MPPI) control is a powerful sampling-based strategy for nonlinear autonomous systems. However, its performance is often bottlenecked by the fidelity of nominal dynamics. We propose ICODE-MPPI, a robust framework that leverages Input Concomitant Neural Ordinary Differe...
- Stochastic Schrödinger Diffusion Models for Pure-State Ensemble Generation
In quantum machine learning (QML), classical data are often encoded as quantum pure states and processed directly as quantum representations, motivating representation-level generative modeling that samples new quantum states from an underlying pure-state ensemble rather than re-preparing them from ...
- Lety.ai
The Infrastructure Behind AI Agencies | White-Label Platform
- Understanding Self-Supervised Learning via Latent Distribution Matching
Self-supervised learning (SSL) excels at finding general-purpose latent representations from complex data, yet lacks a unifying theoretical framework that explains the diverse existing methods and guides the design of new ones. We cast SSL as latent distribution matching (LDM): learning representati...
- A Hierarchical Sampling Framework for bounding the Generalization Error of Federated Learning
We study expected generalization bounds for the Hierarchical Federated Learning (HFL) setup using Wasserstein distance. We introduce a generalized framework in which data is sampled hierarchically, and we model it with a multi-layered tree structure that induces dependencies among the clients' datas...
- Adaptive Estimation and Optimal Control in Offline Contextual MDPs without Stationarity
Contextual MDPs are powerful tools with wide applicability in areas from biostatistics to machine learning. However, specializing them to offline datasets has been challenging due to a lack of robust, theoretically backed methods. Our work tackles this problem by introducing a new approach towards a...
- Integrating Feature Correlation in Differential Privacy with Applications in DP-ERM
Standard differential privacy imposes uniform privacy constraints across all features, overlooking the inherent distinction between sensitive and insensitive features in practice. In this paper, we introduce a relaxed definition of differential privacy that accounts for such heterogeneity, allowing ...
- On the Spectral Structure and Objective Equivalence of Orthogonal Multilabel Fisher Discriminants
We provide a unified theoretical analysis of Linear Discriminant Analysis with simultaneous multilabel scatter matrix formulations and Stiefel orthogonality constraints. Our contributions span both algebraic structure and statistical guarantees. On the algebraic side, we characterize the rank of the...
- On Model-Based Clustering With Entropic Optimal Transport
We develop a new methodology for model-based clustering. Optimizing the log-likelihood provides a principled statistical framework for clustering, with solutions found via the EM algorithm. However, because the log-likelihood is nonconvex, only convergence to stationary points can be guaranteed, and...
- Kilo Code v7 for VS Code
Parallel agents, diff reviewer, and multi-model comparisons
- Flowstep 1.0
AI design engineer to turn your thoughts into editable UI
- Waydev Agent
Prove ROI and see if your AI spend is actually paying off
- Ghostwriter
Write and publish posts on LinkedIn & X
- Oriane
The perception layer for Marketers and their AIs
- Intuned Agent
Production browser automation, built and maintained by AI
- Firstwork
Agentic AI for frontline hiring and onboarding
- Hestus
Native CAD autocomplete — 2.5x faster, 4x fewer clicks
- Dina
From screen to polished video in minutes
- Unity AI
AI agents built directly into Unity workflows