AI News Archive: June 10, 2026 — Part 4
Sourced from 500+ daily AI sources, scored by relevance.
- Bank of America Highlights Surging Demand for AI-Led Treasury, Fx Solutions in Asia Pacific
The Straits Times
- How AMD Is Mimicking Nvidia’s Circular Funding Deals
Barron's
Score: 70 🌐 Moves Jun 10, 2026 https://www.barrons.com/articles/amd-stock-price-nvidia-circular-financing-6cc9c0c2
- New framework for auditing machine unlearning
Algorithms & Theory
Score: 70 🌐 Moves Jun 10, 2026 https://research.google/blog/new-framework-for-auditing-machine-unlearning/
- A California city just approved $3.15 million in police drones that respond to 911 calls in 30 seconds
Stockton, California, has approved a $3.15 million investment in police drones that can respond to 911 calls in as little as 30 seconds. The city council voted 7-0 to expand its contract with Flock Safety, adding a drone-as-first-responder platform to the automatic licence plate readers the company already supplies. The total contract value now exceeds […] This story continues at The Next Web
Score: 70 🌐 Moves Jun 10, 2026 https://thenextweb.com/news/stockton-flock-drones-first-responder-surveillance
- UK minister for AI calls for more attractive datacentre builds
Drawing an analogy with the pleasing aesthetics of St Pancras and King’s Cross railway stations, the UK government’s minister for artificial intelligence and online safety, Kanishka Narayan MP, used his presentation at this week’s AI Summit, part of London Tech Week, to encourage datacentre builds that people can be proud of. “Our industrial legacy is written into the British landscape,” he said. “Infrastructure shapes how people feel about the places they live in. Those of you who have the misfortune of arriving here via Euston Station would appreciate this truth.” Narayan pointed out that St Pancras was built for the railway age – to move people, goods and opportunity across the country. “But it was also built with civic ambition and enhanced London and embodied the ambition of our communities,” he said. In contrast, Narayan described Euston as “a functional, but frankly bland and ugly building” – a symbol of what happens when architects forget the civic purpose of infrastructure. “Datacentres are the new railway stations, power stations, telephone exchanges of the intelligence economy,” he said. “They will power the models, the services, the business and public tools that will define the next industrial revolution. They are also set in real places, near real communities.” Rather than buildings that communities merely tolerate, Narayan called for datacentres to be designed as something people can be proud of. Narayan used his speech to announce the RIBA x DSIT Data Centre Design Challenge, a government-backed design competition run by the Department for Science, Innovation and Technology (DSIT) and the Royal Institute of British Architects (RIBA) that seeks to ensure that as datacentres grow, they deliver for the communities around them too.
The idea is to encourage collaboration between architects, designers, engineers and communities to raise the bar on high-quality design, meaningful public engagement and sustainable environmental outcomes, which, according to DSIT, will reimagine datacentres not just as critical national infrastructure, but as places of genuine civic value. He said the competition aims to raise the bar of datacentre design and is the first government-backed competition where architects, designers and engineers are being asked to design datacentres “with meaningful public engagement, strong environmental outcomes and genuine civic value”. He added: “If a community is helping power the AI age, it should be able to see that contribution with pride.”
AI that works for workers
Along with the datacentre design competition, the UK government is also encouraging a more pro-work approach to AI deployment. During his speech at the AI Summit, Narayan said the government is launching a Pro-Worker AI Adoption Prize, which he said would celebrate organisations adopting pro-worker AI, encouraging other firms to follow their lead. “We want businesses, workers, unions and investors to nominate organisations at the cutting edge of pro-worker AI adoption – not just adopting AI, but using AI to create new products, new jobs and new tasks in a way that boosts the demand for human expertise,” he said. The top 50 organisations nominated will be shortlisted by an expert panel of judges, chaired by the Nobel Memorial Prize-winning economist Simon Johnson. “Business cases of pro-worker AI adoption will be written up and taught in leading UK business schools so the next generation of people in this country look at pro-worker AI adoption too,” Narayan added.
Read more about datacentre developments
The great datacentre backlash – the campaigners: In part one of a series looking at attitudes to datacentres, we look at the organisations that oppose new builds.
Could an environmental legal challenge derail UK government’s fast-tracked datacentre builds? The government is under fire after details emerged that it waved through three large-scale datacentre planning applications without conducting an environmental impact assessment first.
Score: 70 🌐 Moves Jun 10, 2026 https://www.computerweekly.com/news/366644212/UK-minister-of-AI-calls-for-more-attractive-datacentre-builds
- CATL Invests in DeepSeek: Zeng Yuqun's AI Energy Strategy Takes Shape
CATL has invested in DeepSeek's first funding round, marking the battery giant's strategic entry into AI as Zeng Yuqun pivots toward AI data center energy infrastructure with over $1 billion in AIDC investments.
Score: 70 💰 Money Jun 10, 2026 https://pandaily.com/catl-invests-deepseek-zeng-yuqun-ai-energy-strategy-jun2026
- Bill Gates warns Microsoft, Amazon, Google on data center push
Bill Gates has told the AI industry it does not have permission to drive up household electricity bills, warning Amazon, Google, Meta and Microsoft that the old utility-funded grid model is finished. Speaking on CNBC, the Microsoft co-founder said hyperscalers must now pick data centre sites where the economics and politics hold. With 48 projects worth $156 billion already blocked in 2025 and public opposition at record highs, the buildout is colliding with sentiment.
- Salesforce layoffs: Employees from Agentforce AI, Mulesoft IT teams handed pink slips
The enterprise software company has undertaken a fresh round of layoffs, affecting employees working on its Agentforce AI product, Mulesoft IT integration tool, and Marketing Cloud software, according to Business Insider.
- Art Director Union’s Rebuke To Scorsese On AI Highlights Divisions In Creative Industries
Can Hollywood's creative community stay united in the face of technology that is disrupting their industry?
- Anthropic accused of ‘secret sabotage’ as Claude Fable 5 silently limits capabilities for AI researchers and developers
Fortune
Score: 70 🌐 Moves Jun 10, 2026 https://fortune.com/2026/06/10/anthropic-accu-claude-fable-5-limits-capabilities-ai-researchers-developers/
- Anti-Nvidia Data-Center Startup Is Valued at $1.55 Billion in New Funding Round
TensorWave will use a fresh $350 million to fill more data centers with chips from AMD, an investor.
- Surprise upset: GPT-5.5 beats Claude Fable 5 on brutal new Agents’ Last Exam benchmark
Researchers from the University of California, Berkeley's Center for Responsible, Decentralized Intelligence (RDI), alongside an advisory committee of over 300 domain experts, have launched Agents’ Last Exam (ALE) —a grueling new benchmark built to measure whether artificial intelligence can actually execute economically valuable, long-horizon professional workflows. In a shocking upset, OpenAI’s GPT-5.5 from April, operating through the Codex harness, secured the absolute top spot on the new ALE Leaderboard with a 24.0% pass rate, beating Anthropic's highly anticipated, brand new Mythos-class Claude Fable 5 model released just yesterday, which came in third with a score of 22.0%. Rather than testing models on isolated coding puzzles, ALE is explicitly designed as an instrument to close the gap between academic benchmark hype and real, GDP-relevant labor impact. And right now, the data proves the most advanced models in the world are fundamentally failing the exam. Ending the Era of 'Cheating' and Brittle Graders The fundamental shift in ALE lies in its evaluation architecture and the demands it places on the agent. Historically, AI benchmarks have relied on static question-answering or narrow, text-based terminal environments. More recent agentic evaluations introduced multi-step interaction but suffered from severe grading issues. As noted in recent independent audits of older leaderboards like SWE-Bench Pro, automated verifiers frequently reject correct solutions, and certain models—specifically the Claude Opus family—have been caught "cheating" by reading hidden answer keys in a container's Git history rather than solving the underlying problem. ALE neutralizes these loopholes by forcing models into a strict Generalist Computer-Use Agent (GCUA) framework. To pass, an agent cannot merely execute terminal commands. 
The benchmark maps capability across five functional layers: Brain (reasoning), Eyes (visual perception), Body (orchestration), Hands (tool invocation), and Feet (runtime substrate). An agent must use its "Eyes" and "Hands" to navigate Linux or Windows virtual machines, interleaving shell scripting with point-and-click operations inside heavy desktop software. Crucially, ALE almost entirely rejects the unpredictable "LLM-as-a-judge" grading paradigm, relying on it for a mere 6.8% of its workflows. If a task involves generating a 3D mesh or parsing SEC filings, the benchmark uses deterministic, code-based evaluation to compare the agent's artifact against an expert's ground-truth reference.
Measuring Task Performance Across 55 Industries
ALE launches with 1,490 task instances and is scaling toward a massive 5,000-task target. What makes the product remarkable is its authenticity. The tasks are strictly anchored in the U.S. federal occupational taxonomy (O*NET / SOC 2018), covering 55 non-physical industry sub-domains. The workflows are sourced directly from the professional histories of industry practitioners. Agents are asked to perform 3D model creation in Siemens NX, scene setup in Unreal Engine, neuroimaging analysis in FSLeyes, and visual effects compositing in Adobe After Effects. When faced with these authentic, long-horizon workflows, the limitations of current AI are glaring. ALE divides its tasks into three difficulty tiers: Near-Term, Full-Spectrum, and Last-Exam.
Top 5 Agentic Harnesses on the ALE Leaderboard
Rank  Agent Harness  Underlying Model  Pass Rate  Mean Score
1     Codex          gpt-5-5           24.0%      42.8%
2     Ale Claw       gpt-5-5           23.0%      45.8%
3     Claude Code    claude-fable-5    22.0%      40.5%
4     OpenClaw       gpt-5-5           21.1%      41.0%
5     Cursor CLI     composer-2-5      20.4%      38.5%
The victory of GPT-5.5 aligns with recent third-party analysis suggesting that OpenAI's models are currently superior at strictly adhering to multi-part, complex prompts.
Conversely, users report Anthropic's Claude architecture can sometimes be "forgetful" with multi-part instructions, abandoning required steps mid-workflow — a fatal flaw in ALE's rigorous pipeline. And while hitting a 24.0% pass rate is enough to claim the crown, the absolute performance ceiling remains remarkably low. On the hardest "Last-Exam" tier — representing the frontier of professional difficulty — most configurations, including Anthropic's older Claude Opus 4.8 and Google's Gemini CLI, record a devastating 0.0% pass rate. Solving Benchmark Contamination A core vulnerability in modern AI evaluation is "benchmark contamination"—the phenomenon where test questions inevitably leak into the massive data lakes used to train next-generation models. Once a model memorizes the benchmark, the evaluation becomes entirely useless. ALE solves this through a dual-use deployment strategy. The project operates as an open-source research initiative, but it closely guards its evaluation data. Only about 10% of the dataset (roughly 150 tasks) is released publicly on platforms like GitHub and Hugging Face. The remaining 1,300+ tasks are kept strictly private. For developers and enterprise evaluators, this means ALE functions as a "living benchmark". Private tasks are systematically rotated into the public pool over time, while retired public tasks are swapped out. This rolling release ensures that the evaluation surface remains uncontaminated across successive model generations, giving enterprise buyers confidence that an agent's high score is earned , not memorized. Additionally, ALE provides transparency by tracking both "Full" and "Unlicensed" scores. Because real professional work often requires paid, proprietary software, the "Full" leaderboard incorporates tasks that rely on commercial CAD tools, paid APIs, or licensed datasets. 
The "Unlicensed" tier drops these license-gated tasks to provide a clean, like-for-like comparison using only freely available tools, ensuring models aren't simply rewarded for having access to paid enterprise software. Bottom Line: ALE Shows Even the Highest-Performing Models and Harnesses Have Room for Improvement For developers frustrated by the gap between marketing claims and actual production performance, ALE's brutal grading curve is highly validating. Zengyi Qin , an MIT PhD researcher and data contributor to the project, took to X to announce the launch, sharing images of the paper and the staggering 100+ institution contributor list. "Introducing Agents’ Last Exam (ALE)," Qin wrote. "Built by 300+ domain experts from 100+ institutions. Covering 55 industry domains. Claude Opus 4.8 has 0.0% pass rate on the hardest subset. Glad to have contributed to this benchmark". In a follow-up post highlighting the Hugging Face ArXiv paper link, Qin added: "Very solid work from project leads @YiyouSun @Xinyang_Han_ @dawnsongtweets and @BerkeleyRDI". As businesses deploy billions in capital betting on AI agents, they desperately need a compass that points true north. If an agent can eventually conquer the gauntlet of Agents' Last Exam, it won't just be passing a test—it will be proving it is ready to join the workforce. Until then, the sobering pass rates on the leaderboard serve as a necessary reality check for the entire AI ecosystem.
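The "Full" versus "Unlicensed" split described in the article amounts to scoring the same run twice: once over every task, and once with the license-gated tasks filtered out. A minimal Python sketch follows. The record layout (`TaskResult`, `license_gated`, `passed`) and the demo data are hypothetical and are not taken from ALE's actual schema or leaderboard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskResult:
    task_id: str          # hypothetical identifier
    license_gated: bool   # True if the task needs paid software, APIs, or datasets
    passed: bool          # outcome reported by the grader


def pass_rate(results):
    """Fraction of results that passed; 0.0 for an empty list."""
    if not results:
        return 0.0
    return sum(r.passed for r in results) / len(results)


def full_and_unlicensed(results):
    """Return (full, unlicensed) pass rates for one agent's results."""
    unlicensed_only = [r for r in results if not r.license_gated]
    return pass_rate(results), pass_rate(unlicensed_only)


# Made-up results for illustration only.
demo = [
    TaskResult("cad-001", license_gated=True, passed=False),
    TaskResult("cad-002", license_gated=True, passed=True),
    TaskResult("sec-001", license_gated=False, passed=True),
    TaskResult("fsl-001", license_gated=False, passed=False),
    TaskResult("fsl-002", license_gated=False, passed=False),
]
full, unlicensed = full_and_unlicensed(demo)
print(f"Full: {full:.1%}  Unlicensed: {unlicensed:.1%}")
```

Because the two numbers use different denominators, an agent can rank differently on each board, which is presumably why the benchmark reports both.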
- SoftBank’s attempt to get $6 billion OpenAI margin loan stalls
The Japan Times
Score: 70 💰 Money Jun 10, 2026 https://www.japantimes.co.jp/business/2026/06/10/companies/softbank-openai-attempt-margin-loan/
- xAI fired an engineer who raised alarms about Grok safety, new lawsuit claims
A former xAI engineer is suing the company and SpaceX, alleging he was fired for raising AI safety concerns about Grok days before SpaceX's historic IPO.
Score: 69 🌐 Moves Jun 10, 2026 https://techcrunch.com/2026/06/10/xai-fired-an-engineer-who-raised-alarms-about-grok-safety-new-lawsuit-claims/
- Microsoft’s Brad Smith: Graduates jeering AI are ‘telling us what we need to hear’
Microsoft President Brad Smith, in a new blog post and a GeekWire interview, argues that AI will reshape work rather than eliminate it, and says the company's own future depends on people staying employed. He acknowledges the tension with the tech industry's job cuts while contending that fields like computer science are changing, not disappearing.
Score: 69 🌐 Moves Jun 10, 2026 https://www.geekwire.com/2026/microsofts-brad-smith-graduates-jeering-ai-are-telling-us-what-we-need-to-hear/
- 72% of banks lack AI model 'kill switches,' failure reporting
Most banks are unprepared to stop or report an AI model run amok, a Wolters Kluwer study found.
Score: 69 🌐 Moves Jun 10, 2026 https://www.americanbanker.com/news/72-of-banks-lack-ai-model-kill-switches-failure-reporting
- Mathematical proof reveals why fixed AI guardrails can never block every jailbreak
Can we make artificial intelligence impervious to adversaries who want to twist the technology to nefarious ends? Though AI is among the newest of technologies, the answer to that question is nearly a century old.
Score: 69 🌐 Moves Jun 10, 2026 https://techxplore.com/news/2026-06-mathematical-proof-reveals-ai-guardrails.html
- Google won’t just admit it’s feeding YouTube creators to its music AI
If you've uploaded a song to YouTube, Google almost certainly considers your video fair game for training its Lyria music AI, it just won't admit it right now. A group of independent musicians is suing Google, claiming that it illegally used songs they uploaded to YouTube to train its Lyria 3 model. Google has filed […]
Score: 69 🌐 Moves Jun 10, 2026 https://www.theverge.com/tech/947770/google-lyria-music-ai-lawsuit-youtube
- AI Risk Worries Insurers and Businesses Alike
As companies adopt AI, many insurance firms are explicitly excluding AI risks, while others are forging ahead to create the right framework. What risks can firms reasonably manage?
Score: 69 🌐 Moves Jun 10, 2026 https://www.darkreading.com/cyber-risk/ai-risk-worries-insurers-businesses-alike
- Europe’s new e‑commerce agenda: How AI is resetting growth and competition
From agentic shopping to retail media and omnichannel intelligence, AI is changing the economics of digital commerce. Here’s how leaders can turn experimentation into a durable advantage.
- Rewiring retail in Europe: The AI imperative
AI is already reshaping the European retail value chain. The question now is how leaders modernize while strengthening the human experience—with a €240 billion to €320 billion opportunity at stake.
Score: 69 🌐 Moves Jun 10, 2026 https://www.mckinsey.com/industries/retail/our-insights/rewiring-retail-in-europe-the-ai-imperative
- Cybersecurity researchers aren’t happy about the guardrails on Anthropic’s Fable
Cybersecurity researchers are complaining that Anthropic's new model Fable has guardrails that are too strict for any cybersecurity work.
- Researchers develop AI-powered railway control system for efficient urban train operation
As cities continue to expand, railways are expected to become an important component of urban mobility systems. Compared with automobiles and air transport, railways are highly energy efficient and produce relatively low environmental impacts and therefore form an important component of sustainable transportation systems. Thus, researchers have been exploring advanced train control technologies for further improvement of operational efficiency, passenger comfort and punctuality while reducing energy consumption.
Score: 69 🌐 Moves Jun 10, 2026 https://techxplore.com/news/2026-06-ai-powered-railway-efficient-urban.html
- Siri Is Getting a Long-Awaited AI Overhaul, But Not if You Live in These 2 Regions
PCMag UK
Score: 69 🌐 Moves Jun 10, 2026 https://uk.pcmag.com/macos/165450/siri-gets-long-awaited-ai-overhaul-not-if-you-live-in-2-regions-wwdc-2026
- The AI boom is pushing up inflation
Most Americans think President Donald Trump’s tariffs led to higher prices.
Score: 69 🌐 Moves Jun 10, 2026 https://www.politico.com/newsletters/morning-money/2026/06/10/the-ai-boom-is-pushing-up-inflation-00955526
- ChatGPT was caught recommending fake scam stores
Think twice before clicking retail links recommended by ChatGPT.
Score: 68 🌐 Moves Jun 10, 2026 https://www.androidauthority.com/chatgpt-caught-recommending-scam-products-3676182/
- To discover new physics, AI may need to “unlearn” the old one
EurekAlert!
- India Brings Its AI Pitch to VivaTech 2026
India’s VivaTech 2026 role gives its AI strategy a global stage as enterprises across APAC weigh regional AI partnerships, infrastructure control, and dependence on foreign chips and cloud platforms. The post India Brings Its AI Pitch to VivaTech 2026 appeared first on TechRepublic.
- McCarthy, Palantir partner on AI
The news comes as enterprise-level agreements on artificial intelligence gain ground among large contractors.
Score: 68 🌐 Moves Jun 10, 2026 https://www.constructiondive.com/news/mccarthy-palantir-artificial-intelligence-ai-partnership/822517/
- ‘AI is not killing all these jobs’: LinkedIn boss on UK hiring slump
Hiring in the capital is down 32 per cent from January 2019, a steeper fall than in any other region tracked by LinkedIn, as higher interest rates and corporate cost-cutting weigh on firms that dominate London’s labour market. Across the UK, hiring has slowed by 24 per cent on pre-pandemic levels and 10 per cent [...]
Score: 68 🌐 Moves Jun 10, 2026 https://www.cityam.com/ai-is-not-killing-all-these-jobs-linkedin-boss-on-uk-hiring-slump/
- Securing the AI workforce: Zscaler’s zero-trust play for agentic AI
Since Zscaler Inc.’s launch, the company’s mission has been to disrupt traditional access and security with its Zero Trust platform. At its user event, Zenith Live, in Las Vegas, the company made its case for what its next act would look like: becoming the foundational “zero trust for agentic AI” platform. For enterprises, the keynote by […] The post Securing the AI workforce: Zscaler’s zero-trust play for agentic AI appeared first on SiliconANGLE.
Score: 68 🌐 Moves Jun 10, 2026 https://siliconangle.com/2026/06/10/securing-ai-workforce-zscalers-zero-trust-play-agentic-ai/
- Zoomlion's Humanoid Robot Z01 Shines at KOMATEK 2026, Showcasing Advances in Embodied AI and Industrial Robotics
The Straits Times
- South Korea reports first exam cheating cases using AI smart glasses
The Straits Times
Score: 68 🌐 Moves Jun 10, 2026 https://www.straitstimes.com/asia/east-asia/south-korea-reports-first-exam-cheating-cases-using-ai-smart-glasses
- Security at machine speed: why the SOC must be rebuilt for the AI era
As AI compresses attacks into minutes, SOCs shift to agentic operations for real-time detection and response.
Score: 68 🌐 Moves Jun 10, 2026 https://www.techradar.com/pro/security-at-machine-speed-why-the-soc-must-be-rebuilt-for-the-ai-era
- For Robotaxis, Safety Must Be Built In, Not Bolted On
A car pulls up to the curb. The app says, “Your ride is here.” No one’s in the driver’s seat. For people who live in one of the dozens of cities now hosting robotaxi services, this is already a reality. The robotaxi industry has moved from prototype milestones to commercial operations, with an expanding ecosystem […]
- Chinese Tech Giants Turn College Admissions Into Latest AI Battleground
Caixin Global
- Why Anthropic's 'safe' Mythos-class model won't answer questions about cancer
Business Insider
Score: 68 🌐 Moves Jun 10, 2026 https://www.businessinsider.com/anthropic-claude-fable-5-safeguards-block-requests-cybersecurity-biology-2026-6
- Researchers say they trained a foundation model from scratch for about $1,500
Training a foundation LLM from scratch costs millions and requires internet-scale data — which is why most enterprises don't bother. Sapient thinks it has a cheaper path. To overcome this brute-force scaling dogma, researchers at Sapient developed HRM-Text , which replaces standard Transformers with a highly sample-efficient Hierarchical Recurrent Model (HRM), an architecture they first introduced last year . HRM decouples computation into slow-evolving strategic and fast-evolving execution layers. Instead of brute-force autoregressive prediction on raw text, HRM-Text trains exclusively on instruction-response pairs. This is close to real-world enterprise settings, where users usually expect a targeted answer to a specific task. The researchers were able to train a 1B-parameter HRM-Text from scratch at a fraction of the cost and tokens of normal LLMs. Their model achieved performance competitive with much larger open models on key industry benchmarks. For real-world AI applications, this means foundational pretraining is no longer restricted to highly resourced institutions. With HRM-Text, organizations can affordably pretrain their own highly capable reasoning models from scratch and pair them with external knowledge stores. The training bottleneck When we train an LLM, we don't actually care if it has memorized the exact sequence of words in a random 2014 Reddit thread. What we want is for the model to develop a deep, underlying understanding of human language, logic, facts, and reasoning. The current approach is brute force: scrape the internet, run next-token prediction trillions of times, and assume the model has developed a working internal model of the world. Basically, this means that we waste millions of dollars of computing power forcing models to memorize everything collected from the internet, just so they can indirectly learn how to think. 
For example, standard decoder-only models spend valuable compute assigning loss to reconstruct the prompt itself, even though the user's prompt is already known and provided at inference time. Instead of simply viewing this as a computational hurdle, the industry must recognize it as a severe business limitation. In comments provided to VentureBeat, Guan Wang, CEO of Sapient Intelligence, framed this as an issue of the "economics of iteration." "Enterprises today face three compounding problems: training is expensive, infrastructure is heavy, and experimentation cycles are too slow," Wang said. "The industry’s scaling addiction says: 'When the model fails, make it bigger. Add more data. Add more GPUs.' That has worked, but it is reaching a point of diminishing returns. More scale often means more memorization, more latency, more infrastructure, and more vendor dependency. It does not necessarily give an enterprise a better reasoning engine." This architectural and computational inefficiency is exactly why fine-tuning existing dense transformers isn't always the silver bullet for enterprises. Fine-tuning to preserve a model's general capabilities often requires mixing substantial general-purpose data into the process, making it computationally heavy and difficult to control. "Imagine a hedge fund, insurer, or bank that has highly proprietary data: internal research notes, transaction logic, compliance rules, analyst memos, risk models, portfolio constraints," Wang said. "They may not want to send that data to an external frontier model, and they may not need a giant general-purpose model that memorized the internet. What they need is a compact reasoning core that can learn their task structure, reason across rules and numbers, and run in a controlled environment." 
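The prompt-reconstruction cost mentioned above can be made concrete with a toy calculation. One common way to avoid it, widely used in instruction tuning, is to mask prompt positions out of the loss so that only response tokens contribute. The sketch below shows that generic technique with made-up per-token losses. It is an assumption for illustration, not a description of how Sapient implements the HRM-Text objective, and the function names are hypothetical.

```python
def full_mean_loss(token_losses):
    """Average loss over every position, prompt included."""
    return sum(token_losses) / len(token_losses)


def response_only_mean_loss(token_losses, prompt_len):
    """Average loss over response positions only (prompt positions masked out)."""
    response_losses = token_losses[prompt_len:]
    if not response_losses:
        raise ValueError("sequence has no response tokens")
    return sum(response_losses) / len(response_losses)


# Made-up per-token losses: 4 prompt tokens followed by 2 response tokens.
losses = [2.0, 1.5, 1.0, 0.5, 0.25, 0.25]
prompt_len = 4

# Share of supervised positions that only reconstruct the prompt.
prompt_share = prompt_len / len(losses)

print(f"loss over all tokens:      {full_mean_loss(losses):.4f}")
print(f"loss over response tokens: {response_only_mean_loss(losses, prompt_len):.4f}")
print(f"positions spent on prompt: {prompt_share:.0%}")
```

In this toy case two thirds of the supervised positions belong to the prompt, so most of the training signal would go toward text the user supplies anyway at inference time.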
Because HRM-Text focuses its computation strictly on task completion and latent reasoning, it allows enterprises to start with a smaller, smarter model and adapt it to a proprietary domain with far less infrastructure. Rethinking architectures with HRM-Text HRM, which was introduced in 2025, represents a fundamental departure from traditional Transformer models. To build a more sample-efficient engine, HRM decouples computation into slow-evolving strategic and fast-evolving execution layers. The fast L-module performs local iterative refinement, while the slow H-module maintains stable semantic context across cycles. Processing consists of two high-level cycles, where each cycle executes three fast L-module updates followed by a single slow H-module update. Standard parameter-shared recurrent architectures (like Samsung's TRM ) can sometimes handle small logic puzzles, but the Sapient researchers found they become highly unstable when scaled to 1-billion parameters for language tasks. The separation between HRM's slow H-module and fast L-module is mathematically necessary, not just an aesthetic choice. As Wang said: "For logic grids, you can sometimes get away with a tiny recursive mechanism because the world is clean and bounded. Language is not like that. Language needs both fast local refinement and slow semantic stability." While the original HRM proved highly effective for controlled, symbolic reasoning problems, the researchers hit a wall when applying it to the massive, open-ended complexities of generalized language modeling. While HRM's loops make it an incredibly efficient thinker, those same loops make it mathematically volatile to train on the diverse chaos of human language. Running recurrent loops on language creates massive mathematical instability, specifically, exploding or vanishing gradients. To prevent this feedback loop in the neural network, the researchers introduced two key architectural innovations in HRM-Text. 
First, they developed MagicNorm, a specialized normalization technique designed specifically to keep the internal signals stable, no matter how many times the model loops its thought process. Second, they designed a warm-up method to stabilize training. During early training, the model is only evaluated on short, shallow reasoning loops. As training progresses, the system warms up, gradually giving the model deeper and longer reasoning sequences. They also switched the training objective from next-token prediction to task completion, where the model is rewarded only on the full response as opposed to individual tokens it generates. To achieve this goal, they changed the training data of HRM-Text from raw text to instruction-response pairs only. HRM-Text in action The researchers built a highly compact 1-billion-parameter HRM-Text model. Instead of using the standard multi-stage pipeline that requires churning through trillions of words of raw internet text, they trained it from scratch on a tightly curated dataset of just 40 billion tokens. The training data consisted entirely of instruction-response pairs across general instructions, math, symbolic logic, textbook exercises, and rewritten knowledge. They trained the model using the task-completion objective. To force the model to rely on its internal hierarchical architecture rather than copying step-by-step logic, they explicitly stripped out "thinking" tokens from the training data. The model was evaluated across a diverse suite of standard foundational AI benchmarks, heavily indexing on knowledge, reasoning, logic, math, and comprehension. The researchers tested HRM-Text against both small models and highly-resourced open-weight and fully open models. The results show a significant shift in the compute-to-performance frontier. The 1B-parameter HRM-Text achieved 60.7% on MMLU, 84.5% on GSM8K, and 56.2% on MATH. 
This performance is highly competitive with (and in several cases surpasses) the 2B to 7B parameter foundation models it was tested against. The most important takeaway for the enterprise audience lies in the efficiency statistics and practical implications. Pretraining a foundation model from scratch is typically a multi-million dollar endeavor reserved for tech giants. HRM-Text was trained in just 1.9 days on a cluster of 16 GPUs. The total estimated compute cost was roughly $1,500. It achieved its competitive scores using 100 to 900 times fewer training tokens and 96 to 432 times less estimated compute than models like Qwen, Gemma, and Llama. Another important point is the decoupling of reasoning from knowledge memorization. From a practical standpoint, HRM-Text's success on reasoning-heavy tasks despite its tiny 40B-token training diet proves that a model does not need to memorize the entire internet to become a smart reasoning engine. For enterprise applications, this behavior is a feature, not a bug. The researchers suggest a future where businesses deploy highly compact, incredibly cheap recurrent models that act as the "reasoning core" specialized for business logic. Instead of forcing the model to memorize company databases during pretraining, the model acts as the reasoning engine, relying on external retrieval systems to fetch factual knowledge. Critics have pointed out that training on instruction-response pairs makes comparisons against models trained on raw text an "apples-to-oranges" scenario. Wang pushes back on this framing, pointing out that every serious modern LLM sees instruction-response data during training or alignment. "So the comparison is not apples-to-oranges. It is closer to apple cores-and-apples. We started directly from the core task format because that is how people actually use models: they give an instruction and expect a useful response," he said. 
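The efficiency figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the article (1.9 days, 16 GPUs, roughly $1,500, 40 billion tokens, 100 to 900 times fewer tokens). The per-GPU-hour price and the implied token counts for the comparison models are derived values, not figures reported by the researchers, and the token range assumes the ratio is measured against the 40-billion-token dataset.

```python
# Figures quoted in the article.
TRAINING_DAYS = 1.9
NUM_GPUS = 16
ESTIMATED_COST_USD = 1500
TRAINING_TOKENS = 40e9
TOKEN_RATIO_RANGE = (100, 900)  # "100 to 900 times fewer training tokens"

# Derived values (assumptions noted in the lead-in).
gpu_hours = TRAINING_DAYS * 24 * NUM_GPUS
usd_per_gpu_hour = ESTIMATED_COST_USD / gpu_hours
implied_tokens = tuple(TRAINING_TOKENS * ratio for ratio in TOKEN_RATIO_RANGE)

print(f"GPU-hours: {gpu_hours:.1f}")
print(f"Implied price: ${usd_per_gpu_hour:.2f} per GPU-hour")
print(
    "Implied comparison-model tokens: "
    f"{implied_tokens[0] / 1e12:.0f}T to {implied_tokens[1] / 1e12:.0f}T"
)
```

The run works out to roughly 730 GPU-hours, or about $2 per GPU-hour, and the token ratio implies comparison models trained on something like 4 to 36 trillion tokens.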
The researchers also ran rigorous contamination tests to ensure the model wasn't simply memorizing benchmark answers. On DROP, the one benchmark showing a marginal contamination signal under a specific setting, HRM-Text still scored an impressive 81.1% on a strictly clean, 0% contamination subset. Ultimately, Wang argues that for enterprises, "the right evaluation is not trivia recall. It is a workflow evaluation... Give HRM-Text a task like: multi-step financial reasoning, compliance logic, scientific workflow automation, structured extraction followed by reasoning."

Practical implementation and the future of enterprise AI

While the benchmark scores and cost efficiencies are striking, Sapient is clear about the model's current boundaries. The initial release is best viewed as a proof-of-concept, akin to early GPT releases, designed to showcase the architecture's unique advantages. "Honestly, HRM-Text is not yet a plug-and-play ChatGPT replacement," Wang said. "It is a compact foundation language reasoning model. For an enterprise engineering team, the operational work is mainly around templates, mode selection, attention masking, and alignment."

For AI engineering teams looking to experiment, getting started requires some specific, but standard, text-generation discipline. The model lists native support in the Transformers library (requiring transformers >= 5.9.0), and usage paths for vLLM and SGLang are actively being developed. The primary engineering task involves managing the PrefixLM design: production multi-turn chat applications will require careful KV-cache logic to ensure user prompts receive full bidirectional attention while the assistant's outputs remain causal.

"When the cost of training a capable reasoning model drops to around $1,500, AI stops being only an infrastructure question and becomes a strategy question," Wang said.
"A Fortune 500 company no longer has to ask, ‘Can we afford a foundation model?’ It would ask, ‘What should our model know about our business, and what kind of reasoning should it be optimized for?’"
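The PrefixLM requirement described above, where the user prompt receives bidirectional attention and the assistant's output stays causal, comes down to a particular attention mask. The following is a minimal single-turn sketch in plain Python. The function name `prefix_lm_mask` is hypothetical, and this is neither Sapient's implementation nor a Transformers API; multi-turn chat and KV-cache handling would need additional per-segment logic that is not shown here.

```python
# Illustrative PrefixLM attention mask for one prompt/response pair.
# mask[i][j] is True when position i is allowed to attend to position j.
# Hypothetical helper for explanation only; not taken from HRM-Text code.

from typing import List


def prefix_lm_mask(prefix_len: int, total_len: int) -> List[List[bool]]:
    """Build a (total_len x total_len) PrefixLM mask.

    Positions [0, prefix_len) are the prompt: they see each other fully
    (bidirectional) but never see the response. Positions
    [prefix_len, total_len) are the response: they see the whole prompt
    plus earlier response positions and themselves (causal).
    """
    if not 0 <= prefix_len <= total_len:
        raise ValueError("prefix_len must be between 0 and total_len")
    return [
        [(j < prefix_len) or (j <= i) for j in range(total_len)]
        for i in range(total_len)
    ]


if __name__ == "__main__":
    # 3 prompt tokens followed by 2 response tokens.
    for row in prefix_lm_mask(prefix_len=3, total_len=5):
        print("".join("x" if allowed else "." for allowed in row))
```

With `prefix_len=0` the mask reduces to an ordinary causal mask, which is one way to see PrefixLM as a generalization of standard decoder-only attention.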
Score: 68 🌐 Moves Jun 10, 2026 https://venturebeat.com/technology/researchers-say-they-trained-a-foundation-model-from-scratch-for-about-1-500
- Palantir’s Karp says Sanders will regret only asking for 50% of AI companies. Full nationalization is coming.
Palantir CEO Alex Karp says full nationalization of AI companies is coming, and that Senator Bernie Sanders’ proposal for 50% public ownership will soon look moderate. “In two years, they’re not going to think Bernie Sanders is progressive,” Karp told CNBC on Wednesday. “They’re going to be like, ‘Bernie Sanders, you only want 50%? What […] This story continues at The Next Web
Score: 68 🌐 Moves Jun 10, 2026 https://thenextweb.com/news/palantir-karp-ai-nationalization-sanders-50-percent
- Meta now wants to use your activity from other websites to personalize its AI
Meta is finding more ways to use your data to serve ads and personalize its products.
Score: 67 🌐 Moves Jun 10, 2026 https://www.androidauthority.com/meta-activity-external-apps-websites-personalize-ai-home-feeds-3676116/
- Intelligence layer becomes the enterprise AI control plane for enterprise AI
As enterprises accelerate past AI experimentation into full-scale production, the central challenge has shifted from accessing models to managing the organizational context those models need to act reliably. The pressure to govern costs, secure data and maintain accountability is now redefining how companies architect their entire AI intelligence layer. That convergence of AI adoption and […] The post Intelligence layer becomes the enterprise AI control plane for enterprise AI appeared first on SiliconANGLE.
Score: 67 🌐 Moves Jun 10, 2026 https://siliconangle.com/2026/06/10/intelligence-layer-models-protect-enterprise-ai-investments-finopsx/
- MNP and Caseware Partner to Build the Future of Agentic Audit
MNP and Caseware Partner to Build the Future of Agentic Audit Toronto Star
- GitHub Copilot is generating more code than your team can review: Why senior engineers are now the bottleneck
Your engineering department is producing significantly more code than it can safely deliver to your customers. At first glance, that looks exactly like progress. Tools like GitHub Copilot allow developers to generate boilerplate code faster than ever before. Raw output increases. Feature backlogs shrink. Development teams feel incredibly productive. Then software delivery slows to a crawl. Not because developers are writing less code, but because the organization simply cannot process what is being produced.

Why the delivery bottleneck simply moved

Software delivery has always operated under a strict constraint. Historically, that fundamental constraint was the physical act of writing code. Overall development speed depended entirely on how quickly human engineers could implement features. Artificial intelligence removes that specific constraint completely. Code generation is no longer the limiting factor in the software development lifecycle. Everything that follows the generation phase immediately becomes the new bottleneck: peer review, architectural validation, security integration and final release. Output now increases much faster than system throughput.

The operational drag of machine generation

In a recent engagement, I reviewed an engineering team that had widely adopted AI-assisted development tools across their entire department. The early results were strong. The team produced more code, implemented initial features faster and increased developer activity metrics. Within a few sprint cycles, however, the entire delivery pipeline slowed down. Review queues grew to unmanageable sizes. Senior engineers became completely overloaded. Instead of focusing on strategic architecture and complex system design, they spent the majority of their week reviewing massive volumes of generated code. The organization vastly improved its ability to produce raw syntax, but it did not improve its ability to validate and ship it.
Why generated code requires more human review

The core issue is not that machine-generated code is inherently flawed or broken. The issue is how it fundamentally changes the workflow of the engineering department. When developers write code manually, they carry deep historical context. They intuitively understand why specific changes exist, how they fit into the broader legacy system, and what exact business constraints dictated the logic. Generated code completely lacks that human context. Review takes significantly longer because the underlying intent is not immediately clear to the reviewer. As output volume increases across the team, the required review effort grows quickly.

The difference between local speed and system throughput

At the individual level, developers feel dramatically faster. At the system level, the organization slows down. Improving local productivity does not guarantee improvement in overall system throughput. Research from GitClear shows that AI-assisted development is heavily associated with a sharp increase in code churn. More code is written, modified and replaced over time without adding actual value to the product. Insights from the Google Cloud DevOps Research and Assessment report emphasize that elite engineering performance depends entirely on the efficiency of the entire delivery system, not just individual developer output. The constraint in the pipeline always determines the final outcome.

The compounding cost of code validation

As overall code volume increases, peer review becomes the dominant operational cost for the department. This friction appears as longer delivery cycles, constantly delayed releases and an increased workload placed directly on your most experienced staff. Organizations often respond to this friction by adding more reviewers, extending project timelines or reducing the scope of the release. These reactive actions erode the initial economic gains promised by faster code generation.
The severe risk to senior engineer retention

Senior engineers are a strictly limited and highly expensive corporate resource. Their highest value to the enterprise lies in system design, technical direction and complex problem-solving. When they become the primary bottleneck for reviewing machine-generated code, their role shifts entirely toward validation rather than creation. This significantly reduces their strategic impact across the organization. More importantly, it creates a quantifiable retention risk. Senior engineers are hired to build complex architectures, not to act as syntax proofreaders for an algorithm. If their daily workflow devolves into a tedious cycle of untangling verbose pull requests, job satisfaction declines. Replacing a senior architect is far more expensive than any efficiency gained by a coding assistant.

Why future adoption amplifies the problem

As the enterprise adoption of AI tools increases, so does the raw output. According to Gartner, AI-assisted development will become the standard baseline across enterprise teams. This means significantly more code will be moving through your existing systems. If your review processes do not evolve to handle this new reality, the bottleneck simply intensifies.

Optimizing for delivery over generation

Software delivery is not about how fast code is written. It is about how efficiently work moves from a business idea to a live production environment. Increasing generation speed without improving operational flow creates immense internal friction. Organizations that adapt successfully must shift their entire focus from maximizing output to optimizing throughput. This requires enforcing strict limits on the size of code changes, demanding clear documentation of intent before review and relying heavily on automated testing to catch syntax errors before a human engineer is involved.

Why raw output is a false metric

More code does not equal more progress.
If your team is producing more syntax than it can safely process, the engineering bottleneck has not been removed. It has simply moved. Until the system is redesigned to handle that specific shift, increased output will not translate into faster delivery. Instead, it will result in missed commitments, severe planning risk and entirely unpredictable delivery timelines. This article is published as part of the Foundry Expert Contributor Network.
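The article's central claim, that the slowest stage sets delivery speed no matter how fast code is generated, can be illustrated with a toy pipeline model. The sketch below is an assumption-laden simplification: the stage names follow the article, but the weekly capacities are invented for illustration and the function names are hypothetical.

```python
# Toy model of a delivery pipeline: throughput equals the capacity of the
# slowest stage. All capacity numbers below are invented for illustration.

from typing import Dict, Tuple


def bottleneck(capacities: Dict[str, int]) -> Tuple[str, int]:
    """Return (stage, capacity) for the stage that limits throughput."""
    stage = min(capacities, key=capacities.get)
    return stage, capacities[stage]


def queue_growth(arrival_rate: int, capacity: int) -> int:
    """Changes per week piling up in front of a stage that cannot keep up."""
    return max(0, arrival_rate - capacity)


if __name__ == "__main__":
    # Hypothetical changes per week each stage can handle.
    before = {"generation": 20, "review": 30, "testing": 40, "release": 35}
    # Same team after AI-assisted tooling triples generation only.
    after = dict(before, generation=60)

    print("before:", bottleneck(before))
    print("after: ", bottleneck(after))
    print("review backlog growth per week:",
          queue_growth(after["generation"], after["review"]))
```

In this made-up scenario, tripling generation raises delivered throughput only from 20 to 30 changes per week, while the review queue grows by 30 changes every week, which mirrors the overloaded review queues the author describes.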
- The AI-related leadership crisis that’s only five years away – and how to avoid it
AI is eliminating entry-level roles that traditionally shaped the next generation of managers – creating a potentially imminent leadership crisis
- Zoho unveils Nathu La server to lower AI inference costs and strengthen technology sovereignty
Zoho Corporation has unveiled Nathu La, an in-house-designed server platform that extends the company’s effort to build and control its technology stack, from hardware infrastructure to enterprise software applications. The company claims that the server delivers comparable performance while consuming 12–18% less power and reducing total cost of ownership (TCO) by 20–30%. Zoho expects these efficiencies to help lower the cost of running artificial intelligence (AI) inference workloads. Developed in collaboration with Intel, the Nathu La platform is powered by Intel Xeon 6 processors. Intel provided technical expertise and enablement support during its development.

Extending Zoho’s full-stack strategy

Zoho plans to host its applications on the Nathu La platform, allowing it to optimise the software and hardware stack for its own workloads. The company expects this approach to improve application performance, reduce infrastructure costs, and strengthen data governance for customers. “Zoho Corporation has invested in building its own technology stack from the ground up over the last three decades. The Nathu La server launch is in line with that goal,” said Shailesh Davey, CEO, Zoho Corporation. “With Zoho’s strategy of using contextual, right-sized models, running on our own platform, now on our own servers, accelerated by our own GPU database, we are compounding the benefits accrued from owning and operating our entire technology stack,” he added. Davey said the company’s long-term research and development investments across the technology stack were intended to make its solutions more sustainable and accessible to businesses.

Designed for efficiency and modularity

The design of Nathu La draws on principles established by the Open Compute Project (OCP), including modularity, thermal efficiency, and ease of maintenance. Zoho said the server has been designed for workloads such as virtualisation, high-performance computing (HPC), AI inference, and storage.
Optimising the platform for these workloads could also improve the performance of Zoho applications for end users. The motherboard and chassis platform is the outcome of five years of research and development across hardware, firmware, and systems management. The server includes customised power-delivery subsystems, an in-house Data Centre Secure Control Module (DC-SCM), and modular chassis options intended to support different deployment environments. Zoho said modular components, including the DC-SCM and network interface card (NIC), were designed by its hardware engineering team and assembled by Indian electronics manufacturing services (EMS) partners. The company has filed more than five patents covering areas such as thermal management and cost-optimised server architecture.

Building hardware expertise beyond major cities

Zoho began assembling a small hardware research and development team in Nagpur in 2020 to work on projects including server design. The Nathu La team also includes recruits from Student’s Engagement for Transformative Upskilling (SETU), a Zoho initiative aimed at developing industry-ready engineers from colleges across Central India. The programme focuses on advanced learning in electronics system design and manufacturing (ESDM), with an emphasis on hands-on engineering, first-principles problem-solving, research, and applied technical skills. Zoho said more than 300 students have been trained through SETU, with some subsequently joining the company. “The development of the Nathu La server reflects our commitment to creating complex technology powered by talent from smaller towns and villages,” said Davey. “Through focused investments in R&D and skill development, this foray into hardware enables us not only to build and own the technology, but also to cultivate the expertise and talent behind it,” he added.
A push towards technology sovereignty

Although Nathu La uses Intel processors, Zoho said the intellectual property associated with the server platform and its modular designs is owned in India. The company sees this ownership as a way to reduce dependence on overseas vendors for areas such as firmware management, licensing continuity, platform customisation, and security audits. Zoho also said the platform incorporates hardware-rooted security across the stack and aligns with open-source software policies and local-content requirements for government procurement. The company has positioned Nathu La as supporting programmes such as Make in India, Atmanirbhar Bharat Abhiyaan, and the National Supercomputing Mission.

For Zoho, the server represents more than an entry into hardware. It brings the company closer to controlling the infrastructure on which its applications and AI models run, while linking technology sovereignty with the increasingly important economics of AI inference.
- How to help knowledge workers who lose their jobs to AI
Brookings Institution researcher Molly Kinder on why she's leaving her job to create solutions for AI's "messy middle." PLUS: Claude Fable arrives
Score: 67 🌐 Moves Jun 10, 2026 https://www.platformer.news/how-to-help-knowledge-workers-who-lose-their-jobs-to-ai/
- Security Flaw in Claude Code Illustrates the Risk of AI in Developer Workflows
Security Flaw in Claude Code Illustrates the Risk of AI in Developer Workflows DevOps.com
Score: 67 🌐 Moves Jun 10, 2026 https://devops.com/security-flaw-in-claude-code-illustrates-the-risk-of-ai-in-developer-workflows/
- Trump says he thinks AI companies will agree to 'giving back' to the public
Trump says he thinks AI companies will agree to 'giving back' to the public Reuters
- Taiwan’s 50 Richest 2026: AI Demand Drives Up Tycoons’ Wealth By 56% To A Record $308 Billion
Jason and Richard Chang, the siblings behind semiconductor packaging and testing company ASE Technology Holding, take the No. 1 spot for the first time.
- 'Reading relationships, crunching stats'—184-times faster data analysis
A research team at POSTECH led by Professor Wook-Shin Han of the Department of Computer Science and Engineering and the Graduate School of Artificial Intelligence, along with Ph.D. candidates Taesung Lee and Jaehyun Ha, has developed TurboLynx, an engine capable of analyzing complex, highly interconnected data up to 184 times faster than existing systems.
Score: 66 🌐 Moves Jun 10, 2026 https://techxplore.com/news/2026-06-relationships-crunching-stats-faster-analysis.html