AI News Archive: June 24, 2026 — Part 18

Sourced from 500+ daily AI sources, scored by relevance.

UAE employers face AI and cyber workforce risks
UAE employers face AI and cyber workforce risks Arabian Business
🌐 Moves Jun 24, 2026 https://www.arabianbusiness.com/abnews/uae-ai-cyber-workforce-risks
Family files wrongful death suit following Tesla crash in Texas
Victim's family files wrongful death suit following Tesla crash in Texas.
🌐 Moves Jun 24, 2026 https://www.engadget.com/2200784/family-files-wrongful-death-suit-following-tesla-crash-in-texas/
Tesla sued over deadly crash into a Texas home
Tesla sued over deadly crash into a Texas home Austin American-Statesman
🌐 Moves Jun 24, 2026 https://www.statesman.com/business/article/tesla-lawsuit-katy-autopilot-fatal-crash-22318613.php
Modulate Launches AI Music Detection API to Help Platforms Verify AI-Generated Music at Scale
Modulate Launches AI Music Detection API to Help Platforms Verify AI-Generated Music at Scale USA Today
🌐 Moves Jun 24, 2026 https://www.usatoday.com/press-release/story/35467/modulate-launches-ai-music-detection-api-to-help-platforms-verify-ai-generated-music-at-scale/
Modulate Launches AI Music Detection API to Help Platforms Verify AI-Generated Music at Scale
Modulate Launches AI Music Detection API to Help Platforms Verify AI-Generated Music at Scale azcentral.com and The Arizona Republic
🌐 Moves Jun 24, 2026 https://www.azcentral.com/press-release/story/87130/modulate-launches-ai-music-detection-api-to-help-platforms-verify-ai-generated-music-at-scale/
Seattle pay software company Syndio makes first acquisition, buying AI startup
The Seattle company is making the acquisition as it seeks to build more AI tools.
💰 Money Jun 24, 2026 https://www.bizjournals.com/seattle/news/2026/06/23/syndio-maria-colacurcio-embrace-ai-microsoft.html?ana=brss_6150
Pentesting Can't Keep Up With AI Coding Report
Pentesting Can't Keep Up With AI Coding Report Computing UK
🌐 Moves Jun 24, 2026 https://www.computing.co.uk/tag/undefined/news/2026/security/pentesting-can-t-keep-up-with-ai-coding-report
IITPSA Skills Survey to probe AI's impact on SA ICT jobs
The 2026 IITPSA ICT Skills Survey is now open, with a focus on how AI is affecting recruitment, jobs and IT professionals in South Africa.
🌐 Moves Jun 24, 2026 https://www.itweb.co.za/article/iitpsa-skills-survey-to-probe-ais-impact-on-sa-ict-jobs/kYbe9MXbP6NvAWpG
Boston Dynamics aims for big growth in Waltham
Boston Dynamics aims for big growth in Waltham The Boston Globe
🌐 Moves Jun 24, 2026 https://www.bostonglobe.com/2026/06/24/business/robots-humanoid-boston-dynamics-waltham/
All the world's a robot-staging ground for tech entrepreneurs building 'physical AI'
All the world's a robot-staging ground for tech entrepreneurs building 'physical AI' Dallas News
🌐 Moves Jun 24, 2026 https://www.dallasnews.com/business/article/all-the-world-s-a-robot-staging-ground-for-tech-22318170.php
All the world’s a robot-staging ground for tech entrepreneurs building ‘physical AI’
All the world’s a robot-staging ground for tech entrepreneurs building ‘physical AI’ Boston Herald
🌐 Moves Jun 24, 2026 https://www.bostonherald.com/2026/06/24/all-the-worlds-a-robot-staging-ground-for-tech-entrepreneurs-building-physical-ai/
AI companies stabilize after rout and oil continues slide as Trump threatens major drillers
AI companies stabilize after rout and oil continues slide as Trump threatens major drillers Dallas News
🌐 Moves Jun 24, 2026 https://www.dallasnews.com/news/world/article/asian-stocks-are-mixed-after-big-tech-sell-off-22318273.php
Inventor of Crispr is skeptical about AI’s impact on medical innovation
Inventor of Crispr is skeptical about AI’s impact on medical innovation East Bay Times
🌐 Moves Jun 24, 2026 https://www.eastbaytimes.com/2026/06/24/biotech-visionary-is-skeptical-about-ais-impact-on-medical-innovation/
Senior Kubernetes Platform Engineer - AI/ML Infrastructure
Senior Kubernetes Platform Engineer - AI/ML Infrastructure Built In
🌐 Moves Jun 24, 2026 https://builtin.com/job/senior-kubernetes-platform-engineer-ai-ml-infrastructure/9882431
Staff Software Engineer, AI/ML
Staff Software Engineer, AI/ML Built In
🌐 Moves Jun 24, 2026 https://builtin.com/job/staff-software-engineer-ai-ml/9873798
Learning Action Priors for Cross-embodiment Robot Manipulation
Most Vision-Language-Action (VLA) models build on a Vision-Language Model (VLM) backbone by attaching an action module and optimizing the full policy jointly. This design inherits strong visual and linguistic priors from the VLM, but leaves the action module to learn physical motion almost from scra...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26095v1
Same Evidence, Different Answer: Auditing Order Sensitivity in Multimodal Large Language Models
Standard benchmarks for multimodal large language models (MLLMs) score each item on one canonical ordering and miss whether order-irrelevant shuffling changes the answer, a baseline reliability property called for by emerging AI evaluation guidelines. We introduce Facet-Probe, a five-facet audit (op...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26079v1
A cross-process welding penetration status prediction algorithm based on unsupervised domain adaptation in laser and TIG welding
Supervised deep learning has been widely used for weld penetration state classification; however, its performance often degrades significantly under domain shift, such as when transferring models between welding processes with distinct physical mechanisms: for instance, from arc-dominated tungsten in...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26078v1
A welding penetration prediction model for laser welding process based on self-supervised learning using physics-informed neural networks
The laser welding full-penetration is of critical importance, as it constitutes one of the fundamental factors in achieving defect-free welded joints. Accurate prediction of the penetration state is therefore essential for ensuring weld quality. To this end, this paper introduces SimPhysNet, a novel...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26059v1
FedReLa: Imbalanced Federated Learning via Re-Labeling
Federated learning has emerged as the foremost approach for decentralized model training with privacy preservation. The global class imbalance and cross-client data heterogeneity naturally coexist, and the mismatch between local and global imbalances exacerbates the performance degradation of the ag...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26037v1
TriViewBench: Controlled Complexity Scaling for Multi-View Structural Reasoning in MLLMs
Multimodal Large Language Models (MLLMs) demonstrate strong performance on standard visual question answering benchmarks, yet their scalability under controlled structural complexity remains poorly understood. We introduce TriViewBench, a controlled three-view visual reasoning benchmark constructed ...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26029v1
Taxonomy-aware deep learning for hierarchical marine species classification in underwater imagery
Automated classification of marine species from underwater imagery is essential for scalable ocean biodiversity monitoring and conservation policy. Existing approaches struggle with severe domain shift across collection platforms, fine-grained visual similarity between closely related species, and u...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25989v1
Tensorion: A Tensor-Aware Generalization of the Muon Optimizer
Common first-order optimizers, such as Adam, implicitly treat each parameter block as an unstructured vector, which disregards the multilinear weight structure present in many modern machine learning models. Recent work has shown that exploiting matrix structure can improve optimization dynamics. A ...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25975v1
In-context Region-based Drag: Drag Any Region to Any Shape
Diffusion models have shown promise in drag-style editing. Previous works mainly focus on point-based drag, which is inherently ambiguous. This paper focuses on region-based drag and introduces a novel In-Context Region-based Drag (ICRDrag) method. Under the in-context learning framework, ICRDrag co...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25907v1
OracleAnalyser: Analysing Implicit Semantics of Oracle Bone Scripts through MLLMs with Post-training
With the advancement of artificial intelligence, research on oracle bone scripts has entered a new era. However, existing methods and benchmarks remain largely confined to recognition tasks, overlooking the equally crucial aspect of oracle bone analysis. To address this gap, we propose OracleAnalyse...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25906v1
Color Matters: Trigger Color Affects Success in Federated Backdoor Attacks
Federated learning is vulnerable to backdoor attacks in which malicious clients inject poisoned updates while preserving benign-task performance. In this paper, we study a semantics-driven backdoor mechanism in which attackers use natural visual accessories as triggers and manipulate only the trigge...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25858v1
Liner Developer Platform
Build search agents with 10x cheaper web search
🧰 Tools Jun 24, 2026 https://www.producthunt.com/products/liner-developer-platform?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
Hybrid deep learning-based phase diversity method for wavefront reconstruction
The efficiency of high-power laser systems is limited by wavefront distortions in the beam, particularly non-common path aberrations, which reduce the peak intensity at the focal plane. Compensating for these aberrations requires the calibration of the adaptive optics system. Conventional calibratio...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25855v1
Graph it first! Enabling Reasoning on Long-form Egocentric Videos through Scene Graphs
Existing multi-modal large language models (MLLMs) face significant challenges in processing long video sequences due to strict input token limitations. As a result, current video understanding approaches, especially in egocentric settings characterized by complex dynamics, frequent state changes, a...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25842v1
Edges Before Embeddings: A Confidence-Aware Blur Gate for Vision-Language Pipelines
Production vision pipelines silently degrade on blurry input, wasting compute on downstream OCR, retrieval, and vision-language model (VLM) calls that cannot recover a usable output. We present MagikaDocumentFromPixel, a lightweight, CPU-friendly image quality gate that classifies a single image as ...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25838v1
ShutterMuse: Capture-Time Photography Guidance with MLLMs
Real-world photography requires capture-time guidance for both camera framing and subject pose. Yet existing aesthetic cropping benchmarks mainly evaluate post-hoc crop prediction and overlook subject-side recommendations, leaving the capture-time guidance capabilities of multimodal large language m...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25763v1
Uncertainty Quantification for Computer-Use Agents: A Benchmark across Vision-Language Models and GUI Grounding Datasets
Computer-use agents turn vision-language model (VLM) predictions into executable GUI clicks, so reliable uncertainty estimates are essential for rejection, calibration, miss-severity ranking, and spatial safety regions. Yet evidence on post-hoc uncertainty quantification (UQ) for these agents is fra...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25760v1
UniTeD: Unified Temporal Diffusion for Joint Perception and Planning in Autonomous Driving
Diffusion models have shown strong potential for multi-modal planning in end-to-end autonomous driving. However, most existing methods confine diffusion to the planning module, conditioning on fixed outputs from separate discriminative perception networks. This decoupled design propagates perception...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25736v1
Towards a Dynamic and Fixed-budget Memory Bank for Efficient Streaming Video Understanding
Currently, streaming video understanding is still a daunting task for existing multimodal large language models (MLLMs). Its difficulties not only lie in handling the ever-increasing video frames, but also in the unpredictability of future video content and input instructions. In this paper, ...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25658v1
Auto-Labelling-Based Domain Transfer for 3D Object Detection on a Bicycle-Mounted LiDAR Platform
Reliable 3D perception of vulnerable road users (VRUs) such as cyclists and pedestrians is essential for their safety in urban traffic and a core requirement for autonomous driving (AD). Alongside advances in vehicle-based perception, research increasingly equips bicycles with sensors to study traff...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25652v1
SSMNBench: Diagnosing Image-based Cross-View Human-Object Understanding via Single-View Sufficiency and Multi-View Necessity
Multimodal Large Language Models (MLLMs) have shown remarkable progress in single-image perception, yet their ability to reason about complex cross-view human-centric scenes remains largely unverified. Current multi-view benchmarks evaluate models using a fixed "bag of frames" and thus conflate a mo...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25634v1
Expresso-AI: Explainable Video-Based Deep Learning Models for Depression Diagnosis
Given the widespread prevalence of depression and its consequential impact on individuals and society, it is crucial to obtain objective measures for early diagnosis and intervention. As a multidisciplinary topic, these objective measures should be interpretable and accessible to health care profess...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25606v1
TryOnCrafter: Unleashing Camera Trajectories for Realistic Video Virtual Try-on via a Renderable 4D Try-on Proxy
While Video Virtual Try-on (VVT) has achieved remarkable progress in synthesizing realistic garment overlays on dynamic subjects, existing paradigms remain fundamentally constrained by a passive dependency on source camera trajectories, failing to accommodate the requisite interactive freedom for o...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26092v1
MVTrack4Gen: Multi-View Point Tracking as Geometric Supervision for 4D Video Generation
Synthesizing a novel-view video from a monocular reference video along a target camera trajectory requires both geometric consistency and motion fidelity with respect to the reference video. Existing methods based on explicit 3D representations are limited by the accuracy of off-the-shelf reconstruc...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26087v1
DomainShuttle: Freeform Open Domain Subject-driven Text-to-video Generation
Open domain subject-driven text-to-video (S2V) generation has drawn significant interest in academia and industry. Open domain S2V mainly involves two scenarios: in-domain, which requires retaining the reference subject features as much as possible, and cross-domain, which preserves the intrinsic fe...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26058v1
RoboAtlas: Contextual Active SLAM
We present RoboAtlas, a contextual Active SLAM framework that adaptively balances geometric exploration and semantic reasoning using a scalable 3D semantic mapping system, OpenRoboVox. RoboAtlas integrates frontier exploration, global semantic-map reasoning, and egocentric VLM-based reasoning throug...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26046v1
How Robust is OCR-Reasoning? Evaluating OCR-Reasoning Robustness of Vision-Language Models under Visual Perturbations
Vision-language models (VLMs) have achieved strong performance on OCR-based benchmarks and are increasingly focused on text-rich understanding, but their robustness under controlled visual degradation remains insufficiently understood. This gap is critical for OCR reasoning, where visual corruption can ...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26041v1
Maskin
AI made individuals faster. Maskin makes companies smarter.
🧰 Tools Jun 24, 2026 https://www.producthunt.com/products/maskin?utm_campaign=producthunt-api&utm_medium=api-v2&utm_source=Application%3A+the500feed+%28ID%3A+283491%29
In-Context World Modeling for Robotic Control
Modern Vision-Language-Action (VLA) models often fail to generalize to novel setups, such as altered camera viewpoints or robot morphologies, because they are typically conditioned only on current observations and language instructions. By ignoring the underlying system configuration as a variable, ...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26025v1
MIMFlow: Integrating Masked Image Modeling with Normalizing Flows for End-to-End Image Generation
Normalizing Flows (NFs) are powerful generative models capable of exact density estimation and sampling. However, their strict invertibility often forces the model to exhaust its capacity on low-level pixel details, hindering the capture of high-level semantic structures. While Masked Image Modeling...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26016v1
From Sparse and Imperfect 2D Anchors to Consistent 3D Gaussian Street Scenes: Support-Aware Appearance
Image priors can synthesize target conditions for 3D Gaussian street scenes, but independently edited views do not define a coherent 3D target. Direct fitting can propagate view-specific noise, while existing pipelines do not jointly handle imperfect sparse anchors and standard-rasterizer deployment...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.26007v1
A Benchmark for Heterogeneous Stereo Deblurring with Physically- and Epipolar-constrained Cross Attention
Modern stereo-capable smartphones enable immersive XR content capture. However, hardware heterogeneity across camera modules often causes severe asymmetric blur artifacts. Existing methods and benchmarks largely assume homogeneous stereo setups and therefore do not explicitly address such asymmetric...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25962v1
Pulmonary Embolism Risk Stratification from CTPA and Medical Records: Vascular Graphs Are Not All You Need
Risk stratification for pulmonary embolism (PE) is critical for clinical decision-making. Stratification guidelines are based on patient medical records, parameters measured from computed tomography pulmonary angiography (CTPA), and blood tests. However, blood tests are often missing in routine prac...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25956v1
DSP-SLAM++: A Unified Framework for Multi-Class, High-Fidelity Object SLAM in the Wild
Existing object-aware SLAM systems force a trade-off between real-time performance, multi-class support, and the generation of high-fidelity, semantically coherent object models. To address this trade-off, we present DSP-SLAM++, which extends the DSP-SLAM framework with an asynchronous mapping pipel...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25953v1
FunPiQ: A New Benchmark for Pixel-Level Quality Assessment in Fundus Images
Color fundus photography (CFP) is the most common ophthalmic imaging modality for large-scale screening. However, it is highly susceptible to degradations, making robust fundus image quality assessment (FIQA) crucial. The criteria for what constitutes high-quality at the image level vary across clin...
📄 Research Jun 24, 2026 http://arxiv.org/abs/2606.25915v1