AI News Archive: June 30, 2026 — Part 9

Sourced from 500+ daily AI sources, scored by relevance.

Microsoft adds smarter bot protection to Teams meetings
Microsoft has introduced a new Teams admin policy that allows organizers to prevent third-party bots from joining meetings without approval. [...]
Score: 39🌐 MovesJun 30, 2026https://www.bleepingcomputer.com/news/microsoft/microsoft-adds-smarter-bot-protection-to-teams-meetings/
Why the Smartest Leaders Are Acting Before They Feel Ready on AI
Why the Smartest Leaders Are Acting Before They Feel Ready on AI entrepreneur.com
Score: 38🌐 MovesJun 30, 2026https://www.entrepreneur.com/leadership/why-the-smartest-leaders-are-acting-before-they-feel-ready/504783
Foundational research in the age of AI - MBZUAI
Foundational research in the age of AI MBZUAI - Mohamed bin Zayed University of Artificial Intelligence
Score: 38🌐 MovesJun 30, 2026https://mbzuai.ac.ae/news/foundational-research-in-the-age-of-ai/
Build Real-Time Speech to Speech with Twilio Media Streams and NVIDIA PersonaPlex
Learn to use Twilio Media Streams, NVIDIA’s PersonaPlex speech model, and RunPod to bring low-latency real-time voice translation to a voice conversation.
Score: 38🌐 MovesJun 30, 2026https://www.twilio.com/en-us/blog/developers/tutorials/integrations/real-time-speech-to-speech-media-streams-nvidia-personaplex
Running Untrusted Agent Code Without a Sandbox
Explores how to execute agent code safely without a sandbox environment.
Score: 38🌐 MovesJun 30, 2026https://blog.langchain.dev/blog/running-untrusted-agent-code-without-a-sandbox
Customer Zero Programs Prove That AI Works When Humans Change
I’m thrilled to announce new research, Customer Zero Programs Are A New Trust Test For Autonomous Execution: Why “Prove You Can Run It” Has Become The Gating Factor. I’m passionate about “customer zero” programs because they embody the authentic experience of an innovator that survived contact with reality and, as a result, has earned the […]
Score: 38🌐 MovesJun 30, 2026https://www.forrester.com/blogs/customer-zero-programs-prove-that-ai-works-when-humans-change/
Featuring Every Eval Ever Results on Hugging Face Model Pages
Score: 37🌐 MovesJun 30, 2026https://huggingface.co/blog/eee-community-evals
ScarfBench: Benchmarking AI Agents for Enterprise Java Framework Migration
Score: 37🌐 MovesJun 30, 2026https://huggingface.co/blog/ibm-research/scarfbench
Inriver Extends PIM into the Engine of AI-Driven Product Commerce
Inriver Extends PIM into the Engine of AI-Driven Product Commerce Toronto Star
Score: 36🌐 MovesJun 30, 2026https://www.thestar.com/globenewswire/inriver-extends-pim-into-the-engine-of-ai-driven-product-commerce/article_ceaa169e-f6d3-50bd-ad49-e2be5c9567dc.html
Into the Omniverse: Three Workflows for Improving Vision AI Agent Accuracy With Synthetic Data and Fine-Tuning
Editor’s note: This post is part of Into the Omniverse, a series focused on how developers, 3D practitioners, and enterprises can transform their workflows using the latest advances in OpenUSD and NVIDIA Omniverse. Vision AI agents are becoming a practical way to automatically turn video data from the physical world into operational intelligence in factories, […]
Score: 36🌐 MovesJun 30, 2026https://blogs.nvidia.com/blog/vision-ai-agent-skills-omniverse-metropolis/
Which AI models can you automate on Zapier? (Sonnet 5, Gemini 3.5 Flash, and more)
New AI models launch practically every week, and keeping up with which ones to use for specific workflows is a job in itself. Consider this article your living reference. At Zapier, we run every model through AutomationBench. It's our benchmark for testing how well models carry out multi-step workflows, not just static prompts. Below, I'll walk through every major AI provider available on Zapier, the models you can plug into your Zap workflows today, and what each one is best for based on Zapier
Score: 36🌐 MovesJun 30, 2026https://zapier.com/blog/ai-models-on-zapier
How THG Ingenuity stopped AI becoming its next cloud bill shock
How THG Ingenuity stopped AI becoming its next cloud bill shock Computing UK
Score: 36🌐 MovesJun 30, 2026https://www.computing.co.uk/interview/2026/how-thg-stopped-ai-becoming-next-cloud-bill-shock
Your AI Testing Framework Might Be Passing Tests It Should Be Failing
Your AI Testing Framework Might Be Passing Tests It Should Be Failing DevOps.com
Score: 36🌐 MovesJun 30, 2026https://devops.com/your-ai-testing-framework-might-be-passing-tests-it-should-be-failing/
Smart clothing for blind travelers
Researchers have developed a prototype of intelligent, or "smart," clothing designed to improve the safety of blind and visually impaired people by combining embedded electronics with sensors that detect hazards, falls and emergencies. Details of the system are reported in the International Journal of Computational Systems Engineering.
Score: 35🌐 MovesJun 30, 2026https://techxplore.com/news/2026-06-smart.html
NotebookLM’s 60-second videos turned my doomscrolling curse into something useful
What if the same 60 seconds you spend doomscrolling could help you ace a test instead? NotebookLM's newest feature makes a pretty convincing case.
Score: 35🌐 MovesJun 30, 2026https://www.digitaltrends.com/cool-tech/notebooklms-60-second-videos-turned-my-doomscrolling-curse-into-something-useful/
U.S. Open powers up AI-ready network in challenging environment
Cisco’s work with the USGA at the 2026 U.S. Open at Shinnecock Hills Golf Club was a live testbed for what AI-ready networking and security look like in the wild: not in a lab, not in a climate-controlled data center, but across 18 holes of constantly changing terrain, crowds, and threats. It’s also a blueprint that network engineers in other industries can borrow as they grapple with the convergence of connectivity, security, and AI apps.
Golf as a worst-case network environment
From a distance, it’s tempting to lump golf in with stadium or arena networking. The reality on the ground is very different. Stadiums offer a fixed concrete bowl and predictable RF patterns. A U.S. Open venue is effectively rebuilt every year: temporary structures, new hospitality layouts, shifting fiber routes, and a crowd that never sits still.
Christian Rodriguez, senior manager, IT operations, from the USGA’s technology team, captured that reality when he explained why they tear down and rebuild from scratch: No two championships share the same layout, ISP entry points, or even the placement of critical compounds. They don’t simply clone last year’s configs; they design for the specific course, topology, and constraints of that site. That level of contextual design is expensive, but it’s also the only way to avoid brittle architectures that fall apart as soon as the environment changes.
Environmental conditions add another layer of complexity. Anthony Santora, managing director of IT for the USGA, describes the championship network as a data center without the usual comforts. There’s dust, rain, wind, and wide temperature swings instead of clean, controlled air. Hardware resides in trailers and weatherproof enclosures, not in racks behind raised floor tiles. For network engineers who spend most of their time on office campuses and in colos, that’s an important reminder: Critical infrastructure increasingly sits in places that look nothing like a traditional wiring closet.
User behavior is just as hostile. The U.S. Open has its own term, the “Tiger effect” (though one could argue it’s now the Scottie effect), for what happens when tens of thousands of fans follow a single golfer. The hot spot moves with the group, and the RF design must cope with a dense, moving cluster of devices. That pattern should sound familiar to anyone who supports large conferences or festivals; it’s the same phenomenon, just under a different name.
Building an AI-ready, fault-tolerant course network
Cisco’s answer to this environment is a fully redundant, mobile core design. Instead of a single large core in a building, the network collapses into dual trailers that serve as cores on the go, typically anchored at the NBC broadcast compound and another central location. Each core hosts Cisco Secure Firewall appliances, FMCs, core Catalyst switches, DHCP, UPS, and generators, all in pairs. Rodriguez was matter-of-fact about the philosophy: “We do everything in pairs as much as we can.” If one fails, its twin picks up the load.
From those cores, the team builds a ring topology around the course, using diverse fiber paths, including trenching fiber through wooded areas, to avoid single points of failure. Mobile IDF kits in cooled cabinets serve as distribution points, delivering connectivity to weatherproof access switches and Wi-Fi access points around hospitality tents, grandstands, and entry gates. Everything on the backbone operates at Layer 3, with HSRP (Hot Standby Router Protocol) and routing redundancy to ensure that a single switch failure doesn’t take out large swaths of the network.
The scale of a golf course deployment is massive as well, with about 500 access points and more than 100 switches, many of them the latest Wi-Fi 7 and campus platforms. What matters is not the absolute numbers but the duty cycle.
Every TV, every POS terminal, every credential pedestal, every media workstation, and every fan device share this converged fabric during a compressed, high-risk period. Santora points out the business impact in simple terms: If merchandise goes down for even five minutes, lines explode and fans walk away. There’s no “we’ll patch it on the next maintenance window.”
On the RF side, the shift to Wi-Fi 7 is more than a speed upgrade. Santora’s team has seen real-world performance improvements (hundreds of megabits down in the middle of a packed media center), but the more important change is resilience under high density. When you combine wider channels, better scheduling, and smarter management with a dense deployment, you get something that can withstand the Tiger effect and the crush of content creators and broadcasters.
That last group is critical. Rob Neumann from Cisco notes that at these events, upload traffic now dominates download traffic. Influencers, media teams, and fans are publishing in near-real time, and cellular uplink simply can’t keep up. High-capacity Wi-Fi with solid backhaul isn’t a luxury; it’s the only way to avoid a miserable experience for the most vocal, visible part of the audience.
Security: Treat every device as untrusted
If the connectivity story feels familiar, the security posture at the U.S. Open is where this deployment begins to diverge from more generic “converged stadium” narratives. Santora has to contend with “thousands of untrusted devices” each championship week: fans, vendors, media, broadcasters, and staff, many of whom plug in or connect to networks the USGA doesn’t control outside the event. The USGA is well aware of the risks: outages or breaches could lead to data and financial losses, as well as reputational damage that would undermine the organization’s core mission, not just its IT metrics.
Cisco Secure Firewall, AnyConnect, Duo, and other components form the core security stack, but how they’re used is the differentiator. Fan Wi-Fi runs with strict isolation: every client is segmented, so lateral movement is essentially off the table. Neumann explains it simply (each fan has an isolated path out), but under the covers, you get VLAN separation, policy enforcement, and inspection that treat fan traffic as untrusted end-to-end.
The rest of the network is equally segmented. There’s a separate network for ShotLink and everything “inside the ropes,” including scoring and betting feeds. Back-of-house traffic for staff, concessions, and retail runs on its own network. Remote POS systems are segmented again. Broadcast compounds and production systems have their own paths and policies. The result is a unified, converged physical fabric with tightly controlled logical overlays.
This is a pattern many enterprises discuss but struggle to implement: a single platform that carries many classes of traffic, each with its own risk profile, without collapsing into a flat, lateral-friendly network. The U.S. Open shows that it’s possible, but only if segmentation is treated as a core design principle, not an afterthought.
Observability and AI security in the loop
Security and availability at this scale demand observability. Here again, Santora’s team is in the middle of a transition many enterprises are grappling with: moving from reactive log-scraping to proactive, correlated telemetry. Instead of manually combing through firewall and switch logs, the USGA and Cisco have built a pipeline into Splunk and Cisco’s observability tools. Neumann describes it as a single pane of glass across the network, but the more important point is what feeds that view: APs, switches, firewalls, cameras, and applications, all instrumented and reporting.
When you combine that with full-stack observability, you can spot anomalies in real time, whether they’re performance issues or indicators of compromise.
That observability story extends to AI. One of the headline features of the renewed Cisco–USGA partnership is the AI-powered rules assistant: an application that lets golfers and fans ask complex rules questions in the USGA app and receive near-instant guidance. Under the hood, Santora’s team started with question–answer pairs and built a knowledge graph that now spans hundreds of topics and clusters. They also built an evaluation program that identifies outliers (questions the system struggles with) and feeds them back into human review.
Cisco AI Defense wraps the assistant with security controls. It’s not enough to get rules right; the system must resist prompt injection, data exfiltration, and other AI-specific threats that are increasingly appearing in the wild. The teams monitor usage, validate models, and protect applications at runtime against misuse or abuse. Perhaps most importantly, they keep a human override in place. If the system isn’t confident, it won’t answer; it escalates to rules experts rather than bluffing.
This is a model network engineers should watch as AI assistants and agents proliferate across other industries. The U.S. Open rules assistant isn’t treated as a toy or a sidecar; it’s a mission-critical application that resides within the same protected fabric as POS, scoring, and broadcast and is subject to the same observability and security rigor.
Lessons for network engineers beyond golf
Strip away the golf-specific details, and a set of lessons emerges:
Design for tough conditions, not easy ones. Assume transient structures, unknown RF patterns, seasonal layout changes, and harsh environmental conditions. The U.S. Open team rebuilds from scratch for each venue; most enterprises don’t need to go that far, but they should at least validate designs against real-world changes rather than assuming a static topology.
Make redundancy systemic. Dual cores, dual firewalls, ring topologies, HSRP, Layer 3 everywhere, spare hardware on site, and live failover drills are all part of the fabric. Redundancy isn’t a checkbox on a data sheet; it’s an operational discipline.
Treat every device as untrusted. Fan devices, vendor systems, broadcast laptops, and staff phones all arrive with unknown posture. Segmentation (per-client isolation, dedicated networks for sensitive functions, and strong identity) is the only sustainable way to cope with that diversity.
Upload is the new download. Traditional designs optimized for download traffic are increasingly misaligned with reality. Conferences, stadiums, and campuses now behave like the U.S. Open: content creators and collaborative apps push far more data than they pull. Wi-Fi 7 and modern campus platforms help, but you still need to design RF and backhaul with upload and lateral traffic in mind.
Integrate observability and AI security from day one. Logs alone aren’t enough. Coherent telemetry, full-stack observability, and AI-focused security controls should be treated as first-class requirements, especially as AI assistants move into business-critical workflows.
Final thoughts
Perhaps the most important takeaway is cultural rather than technical. Santora and his team position AI and automation as tools for scale, not as replacements for experts. The rules assistant accelerates responses and expands reach, but it still defers to human judgment when confidence is low. The network uses automation and observability to keep a complex environment running, but it still depends on experienced engineers, in trailers on-site, watching for issues and making decisions.
For network engineers in other industries, that’s a useful template: Build AI-ready, secure, observable networks that assume the worst about their environment, and pair them with human expertise that can adapt when reality inevitably diverges from the design.
[Photo: Zeus Kerravala]
Score: 35🌐 MovesJun 30, 2026https://www.networkworld.com/article/4190695/u-s-open-powers-up-ai-ready-network-in-challenging-environment.html
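The human-override behavior described for the U.S. Open rules assistant in the item above (answer only when confident, otherwise escalate to rules experts) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the article does not say how confidence is measured or what cutoff is used, so the `Draft` type, the `respond` function, and the 0.8 threshold are hypothetical names and values, not the USGA's or Cisco's implementation.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the article gives no actual threshold.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Draft:
    question: str
    answer: str
    confidence: float  # assumed to be a score in [0, 1]


def respond(draft: Draft, threshold: float = CONFIDENCE_THRESHOLD) -> dict:
    """Release the answer only when confidence clears the threshold."""
    if draft.confidence >= threshold:
        return {"status": "answered", "text": draft.answer}
    # Low confidence: withhold the draft answer and hand the question
    # to a human reviewer instead of guessing.
    return {"status": "escalated", "text": None, "question": draft.question}


if __name__ == "__main__":
    print(respond(Draft("example rules question A", "draft answer A", 0.93)))
    print(respond(Draft("example rules question B", "draft answer B", 0.41)))
```

The design choice worth noting is that the low-confidence branch returns no answer text at all, so a caller cannot accidentally display a draft that was meant for human review.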
Why McLaren is hyping AI on its Formula 1 car
Formula 1 has always been a sport of speed, precision, and marginal gains. Behind every driver is a sprawling operation of engineers, strategists, designers, and factory teams looking for tiny advantages wherever they can find them. That has made F1 an attractive proving ground for technology companies, including a growing number of AI firms now working with teams across the grid.
The McLaren Mastercard Formula 1 Team, the second-oldest team in the sport, is one example. Ahead of this year’s British Grand Prix, McLaren is unveiling a special livery, the exterior design of its cars, created in partnership with Google Gemini. The design draws inspiration from McLaren’s first Formula 1 car, the McLaren M2B, and is meant to nod to the team’s history while highlighting its current push into AI-assisted performance tools.
[Photo caption: The original M2B. The first Formula 1 car driven by Bruce McLaren in 1966. Photo: McLaren]
“It’s an authentic partnership, and everything we’re doing with Google, particularly Gemini, has a clear objective to make the car go faster and gives us access to some of the best technology in the world in an industry that’s moving incredibly fast,” says Dan Keyworth, McLaren’s executive director of performance technology.
The livery is the most visible part of the partnership, but the more consequential work is happening behind the scenes. As part of its partnership, McLaren has worked with Google Cloud to build custom Gemini-powered tools, including a live data interface used during race weekends. The system pulls information from multiple sources during a session and allows members of the McLaren team to query that data in natural language.
“So if you take a qualifying session that’s happened on a Saturday, it used to take us a long time to compare data between two competitors,” says Keyworth. “And, [it] would take a huge amount of human power as well. Now, they can compare with other drivers and competitors and give us insights on how we can improve ourselves.”
McLaren is part of a broader shift across Formula 1, where teams are treating AI as another source of marginal gains. Oracle Red Bull Racing is developing an AI-powered strategy agent, Mercedes-AMG Petronas is using Microsoft Azure to expand AI-supported simulation and race modeling, and Aston Martin Aramco has signed AI partnerships with Cohere and Arm.
In a sport where small delays can change the outcome of a race, faster access to information can matter. The tools are mainly used by the engineering teams working behind the drivers, but McLaren says the effects can filter down to race strategy, setup decisions, and calls made from the pit wall.
Oscar Piastri says a major part of his job is explaining what he needs from the car, describing how he wants it to behave, and helping engineers connect those impressions to the data they see. “A lot of the information about how I try [to] do my job and how I improve comes through the team,” says Piastri. “All the [technical] analysis we do, summarizing meetings and briefings, all that information I get through the team. Now, with AI … it improves efficiency.”
McLaren is also using Gemini to help staff navigate Formula 1’s rulebook. Its regulation bot is designed to help the team search through new FIA regulations and quickly identify relevant sections, a task that can otherwise require combing through large and technical documents.
[Photo caption: Oscar Piastri (right). Photo: McLaren]
“There’s a lot of new rules and regulations and lots and lots of pages,” says Piastri. “I think AI is a great way of being able to find out information quickly, especially when it comes to knowing some of the more unique rules and the ones you don’t often run into every day.”
That kind of speed does not replace the judgment required during a race weekend, but it can change how quickly teams arrive at useful answers. At the British Grand Prix at Silverstone, which runs July 3 to 5, McLaren’s special livery will serve as a public-facing symbol of a quieter shift already underway: AI is becoming part of how the team sorts information, interprets performance, and looks for incremental gains.
“I think that kind of capability can expand in the future to lots of different areas, finding solutions quickly,” says Piastri. “We’ll have to see where the technology goes.”
Score: 35🌐 MovesJun 30, 2026https://www.fastcompany.com/91566830/why-mclaren-is-putting-ai-on-its-formula-1-car?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss
Design the Planet Introduces AI-Powered SEO Services to Help Louisiana Businesses Compete in Evolving Search Landscape
Design the Planet Introduces AI-Powered SEO Services to Help Louisiana Businesses Compete in Evolving Search Landscape azcentral.com and The Arizona Republic
Score: 35🌐 MovesJun 30, 2026https://www.azcentral.com/press-release/story/90661/design-the-planet-introduces-ai-powered-seo-services-to-help-louisiana-businesses-compete-in-evolving-search-landscape/
Docsie Launches AI Presentation Maker That Transforms Training Videos into Interactive AI Presentations
Docsie Launches AI Presentation Maker That Transforms Training Videos into Interactive AI Presentations USA Today
Score: 35🌐 MovesJun 30, 2026https://www.usatoday.com/press-release/story/35935/docsie-launches-ai-presentation-maker-that-transforms-training-videos-into-interactive-ai-presentations/
GetHookd Expands Meta Ads Library API Scraper For AI-Driven Ad Research
GetHookd Expands Meta Ads Library API Scraper For AI-Driven Ad Research USA Today
Score: 35🌐 MovesJun 30, 2026https://www.usatoday.com/press-release/story/35974/gethookd-expands-meta-ads-library-api-scraper-for-ai-driven-ad-research/
The Sequence Knowledge #886: Demystifying Model Distillation
Understanding the key principles of distillation in simple terms.
Score: 35🌐 MovesJun 30, 2026https://thesequence.substack.com/p/the-sequence-knowledge-886-demystifying
Exclusive: No More Side Hustles: Why AI Startup Omnea Will Give Employees $250K To Openly Plan Their Next Startup
Exclusive: No More Side Hustles: Why AI Startup Omnea Will Give Employees $250K To Openly Plan Their Next Startup Crunchbase News
Score: 35🌐 MovesJun 30, 2026https://news.crunchbase.com/startups/omnea-funding-employees-future-founders-freeman/
Avnet Brings Ecosystem Together to Scale Edge AI in Singapore
Avnet Brings Ecosystem Together to Scale Edge AI in Singapore The Straits Times
Score: 35🌐 MovesJun 30, 2026https://www.straitstimes.com/paid-press-releases/avnet-brings-ecosystem-together-to-scale-edge-ai-in-singapore-20260630
AI-assisted engineering and the question of reliable delivery
While AI code generation is now commonplace, its production reliability is questioned. Enterprises struggle to maintain operational knowledge, leading to costly rework and integration issues. Four conditions – encoded intent, persistent context, structured output, and retained organizational knowledge – are crucial for successful AI integration. Investing in reusable assets and moving beyond isolated prompting is key to sustainable, cost-effective AI-assisted delivery at scale.
Score: 35🌐 MovesJun 30, 2026https://cio.economictimes.indiatimes.com/news/artificial-intelligence/ai-assisted-engineering-and-the-question-of-reliable-delivery/132070216
Beware of AI costs hidden in plain sight
Rapid and widespread AI adoption has most CIOs blind to what AI is really costing their organizations. Nearly two-thirds of companies say employees have used AI without proper oversight, and almost half of large enterprises don’t have full insight into what AI tools employees are using, according to Protiviti’s 2026 AI Pulse Survey. Meanwhile, 77% of technology leaders say AI adoption is already outpacing their governance capabilities, per IBM’s 2026 Tech Leader Study.
“Combine the breakneck pace at which companies are looking to embrace AI with the low technical barrier to entry for using AI, and you’ve got an incredibly challenging space to keep tabs on,” says Andrew Retrum, managing director and global technology risk and resilience practice lead at Protiviti.
This isn’t shadow IT in the old sense: the financial exposure doesn’t come from rogue employees signing up for ChatGPT. It comes from the AI costs mounting in vendor renewals, usage-based consumption, and business unit budgets. A few CIOs have full visibility, but only because they built it into their architecture from day one. Others are still catching up. And some are finding that cost isn’t even the most important thing they can’t see.
Where the money hides
AI costs are showing up in three places that most organizations aren’t watching closely enough.
The first is vendor-embedded AI. Software providers are quietly adding AI features to existing tools, and the costs show up as renewal increases, not new line items. Some solutions are showing a 30% cost uplift as vendors embed AI functionality without upfront disclosure, according to Gartner research from September 2025.
The second is usage-based pricing. “The bulk of GenAI cost isn’t in the build, it’s in the run: inference, API calls, fine-tuning, and usage-based consumption that scales fast and unpredictably,” Gartner notes. Phil Leslie, chief technology and innovation officer at Cornerstone Research, has seen this firsthand.
“With Gemini, monitoring cost is largely irrelevant; the fee is fixed,” he says. “With Claude Code, costs are usage-based, and as adoption grows, so does spend. We noticed costs rising and have been building our dashboards accordingly.” Getting a complete picture isn’t easy. “Getting a truly holistic view across Claude Code, Claude.ai, and our Office plugins is not trivial,” Leslie says.
The third bucket is business-unit-led adoption. Various functions are buying AI solutions via credit cards or departmental budgets, outside IT’s line of sight.
Visibility by design
At Cox Business, Head of AI Eric Pace says the company has achieved full visibility into AI spend. But it required intentional architecture and governance from the start. “We have 100% visibility into all AI spend, including SaaS-based modules through activations in BAU [business-as-usual] operations, new purchases and solutions, and all token consumption across the enterprise,” Pace says.
The key was centralization paired with clear accountability. “We were really intentional about building a model in our organization where AI is not a handoff, with one team building and another team simply receiving it,” says Pace. “We chose to centralize the AI function early to align with company-wide objectives, while business teams provide the workflow context that makes adoption real.”
Cox Business also built visibility into its architecture by default. “All AI traffic is routed through our AI gateway and runtime security solutions,” Pace says. “This includes all on-prem and cloud-based capabilities.” That architecture extends to network monitoring. “We can see all traffic ingress and egress on the network and can also see what is running on our company devices,” he adds. “When we identify traffic patterns outside of our desired path or standard, we work with our people to find their way into compliance.”
Employees have access to the tools they need without creating ungoverned sprawl.
“We have provided flexible ecosystem and capability sets that allow our people to operate with ‘freedom in a framework,’” says Pace. “And they generally find that they have access to everything they need.”
When cost takes a back seat
Not every organization is racing to solve the cost visibility problem. For some, other concerns take priority. “Cost visibility is deliberately secondary right now,” says Leslie of Cornerstone Research. “The harder problem is the nature of our work.” Cornerstone operates in high-stakes litigation, where expert reports must be error-free. “Figuring out how to unlock the benefits without compromising that trust is the primary challenge,” Leslie says. “Cost matters. It is just not the binding constraint in this phase.”
Leslie also points out a nuance that complicates cost optimization: The highest spenders are often the highest performers. “Something like 80% of our costs come from 10% of our users, and that 10% tends to be our most experienced people, using AI for legitimate, high-stakes reasons,” he says. “You cannot just set a uniform ceiling without risking exactly the use cases you want to encourage.” For now, Cornerstone uses caps that create friction when costs get high, paired with override mechanisms. “A high-cost user is often a signal of high-value work, not waste,” Leslie says.
Scale changes the game
The visibility challenge looks different depending on organization size. Leslie spent a decade at Amazon before joining Cornerstone and sees a clear contrast. “Amazon is not just bigger; it is more diverse,” he says. “At that scale, simple rules are often inefficient, but you reach for them anyway because managing complexity requires blunt instruments.” At Cornerstone, he can be more surgical. “The feedback loops are short enough that I can have a direct conversation with a practice lead and understand in a few minutes what a team is trying to accomplish,” Leslie says.
“That means we can tailor cost management to the specific situation, confidently incurring higher costs where we know it makes sense, rather than guessing.”
At Cox Business, scale required a different approach. “We centralized our capital investments and distributed enablement where teams are building enterprise applications outside of the center of excellence,” Pace says. Token consumption budgets are centralized but communicated constantly to larger consuming departments, and top consumers are interviewed frequently to understand value.
What’s working
For organizations still building visibility, it’s helpful to begin with the basics. “Start with an inventory: you can’t defend what you can’t see,” says Protiviti’s Retrum. “Assign clear ownership across IT, security, legal, and the business. And treat it as an ongoing discipline, not a one-time effort.”
Prioritization matters, too. “Don’t let perfect be the enemy of good,” Retrum advises. “Prioritize your highest-risk use cases first, where AI is touching sensitive data, customer-facing decisions, or regulated processes. Build your guardrails around those, then expand outward.”
Gartner recommends tagging AI purchases in procurement, tracking AI spend separately in IT financial management systems, and negotiating AI-specific cost clauses into cloud and SaaS renewals before the next renewal cycle.
At Cox Business, governance isn’t just about control; it’s about focus. “We have used ‘no’ liberally to keep our people focused on the things that will get us to value quicker,” Pace says.
Beyond the budget
For some organizations, the bigger risk isn’t runaway costs: It’s what happens when AI goes wrong. “Shadow IT is not primarily a cost control issue; it is a reputational risk issue,” Leslie notes. “Depending on your firm and how AI is used, rogue use can cause real harm. That is the risk worth managing.”
Cost visibility matters. But for some CIOs, it may not be the most important thing they’re missing.
Score: 35🌐 MovesJun 30, 2026https://www.cio.com/article/4189671/beware-ai-costs-hidden-in-plain-sight.html
Agency is not a natural kind (and why that might matter for alignment)
Epistemic status: trying to articulate a big idea which I feel is important but underexplored, partly because it is hard to frame clearly - may not be framing it clearly yet! Agency, both natural and artificial, is a very important concept. Understanding agency allows us to model our own behaviour and that of others, and it is thus one of the most predictively useful concepts we have at our disposal. In its ordinary, folk-psychological sense, agents are ‘like us’ in important behavioural respects, more or less, meaning we can use thoughts like ‘what would I do if I were them’ to good effect. However, that does not mean agency is a natural kind. The truth is that we are not the people we imagine ourselves to be, and neither are the humans, animals, complex systems, or even inanimate objects we are prone to thinking of as fellow agents. We are, in fact, nothing but a bunch of hierarchically ordered biological processes in a trench coat. Our behaviour is not neatly determined by our thoughts and ideas, but by a complex mesh of impulses, desires, emotions, and heuristics that are often no less confusing (even, or especially, to the highly intelligent and introspective among us) than those mysterious entities we call other people. Nor are increasingly agentic AIs much of an improvement. While early agents trained directly from reinforcement learning may be conceptually simpler than we are, because their policy function is directly optimized into their weights, systems that simulate agency as an emergent phenomenon from some other process, such as next-token prediction, are just as complex and messy, combining their base model’s stochastic inclinations with the way that their simulated personas move them through semantic space. Agency is a construct that we have developed to help make sense of this mess, but it is only a lens through which we view the world. 
Indeed, there are many agentic lenses people have constructed (HT to Karl Kruger for pointing me to this useful summary he wrote in the comments), and the kind of lens you use can profoundly influence how you view the world, and yourself. When engaging in practical work, this sort of claim, that ‘[x] is a construct and the reality is a lot more complicated’, can seem unhelpful. Of course, we all know this, but the point is that agency is a very useful and predictive construct (as are many others, from money and weeds to temperature and species), and we can surely make more progress with it than without it. Obviously, I agree. The problem is that when we start talking about agents as a natural kind, a fundamentally different type of thing from non-agents in our ontology, we often smuggle a kind of teleology in via the back door. We also assume that our simplified model for how agency works, roughly goal-directed utility maximization, describes what ‘real’ agents do. The fact that all the actually existing agency we see, including our own very imperfect muddling through, isn’t like this only goes to show its imperfection, its pseudo-agency if you will. The alternative I would advocate for is viewing agency as a naturally emergent phenomenon that is built up from other phenomena (such as boundary maintenance , self-modelling, information processing, and so forth) and could continue being built up ad infinitum without necessarily being drawn into such an ideal. Of course, there are arguments for why this teleology is justified. The best known is that agents whose preferences don't conform to utility maximization can be ‘money-pumped’ (led to pay a cost only to end up where they began) and so dominated by those that do . However, the theoretical basis for such claims is more shaky than is often assumed . 
These arguments assume preference completeness (that for any two options an agent prefers one or counts them equal) and derive a utility function from it; they never show that agents must have complete preferences in the first place; and an agent can escape the money pump without them. Suppose, with Derek Parfit , that I hold some goods as only roughly comparable: I might prefer being a good writer to a bad one, and a good lawyer to a bad one, yet have no preference between being a good writer and a good lawyer. That wouldn’t necessarily make me exploitable, so long as I spot the money pump game and avoid playing. I need only refuse to trade my current career for any alternative that isn't strictly better (not merely roughly comparable), which breaks the cycle without ever ranking writing against lawyering. One might object that a policy like this just is a utility function under another name, as it still leaves the agent with a set of preferences that is representable as maximizing something . But "representable as maximizing something" is nearly trivial here, since almost any behaviour qualifies. What the threat of domination would actually need to force to justify this teleology is a single cardinal ranking of outcomes, and that is precisely what incomplete preferences withhold. There are also practical reasons why AI safety researchers often wish to defend this view about agency - it plays a central role in some of the most classic and widely respected arguments for why AI is dangerous, such as Bostrom's Superintelligent Will . Indeed, some of the best critiques of AI risk consist largely of questioning these arguments . Yet, these are hardly the only arguments for why superintelligent systems could pose a threat to humanity, and there are more reasons for wanting to explore the fundamental nature of agency than trying to show that AI risk research may be misguided. 
In any case, it is certainly not my view that alternative views of what agency is will render AI safety trivial or easy! However, there are reasons why a more thorough and grounded, and less teleological, approach to thinking about the nature of agency could be helpful for developing safer AI. One is that humans' conception of our own agency and that of others influences how we behave, and it is reasonable to assume that the same is true of AI. Consider the following possible people. One conceives of agency as a false construct tying them to an unsatisfactory life of striving that they are endeavouring to dissolve through rigorous meditation and cultivating love for the inherent worth of all things. The other believes they are homo-economicus incarnate, and the only thing stopping everyone murdering their neighbours for the rings on their fingers is well-designed social incentives. I’m not saying either of these is inherently more aligned or easier to align. However, I also don’t think either is more correct about the nature of agency or more of an agent in how they embody it. What I do think is that, if I were trying to get these people to be nice to me, I would probably go about it quite differently and expect rather different results from them. Of course, the reality for most people is even messier than these toy examples, but our social norms and behaviours are surprisingly well adapted to handle this complexity. I think that is one reason why our everyday moral judgments are often more useful in social alignment than ethical theories. So, before insisting that goal-directed utility maximization is the only form advanced AI could take, I think it is perhaps helpful to make sure we are not obscuring a messy reality of actual AI agency with our, often teleological, assumptions about what it should look like. And perhaps by influencing the kinds of agency AIs go on to develop, we can build another lever to help move us away from the worst of the danger.
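The post's reply to the money-pump argument (refuse any trade that is not strictly better, without ever ranking writing against lawyering) can be made concrete with a toy sketch. Everything here is an assumption made for illustration: the outcome names, the flat fee per trade, and the choice to encode incomplete preferences as a small set of strict pairs. It is not a formal treatment of the decision-theory literature the post cites.

```python
# Strict preferences as (better, worse) pairs. Any pair not listed is
# treated as incomparable, so the preference relation is incomplete.
STRICTLY_BETTER = {
    ("good_writer", "bad_writer"),
    ("good_lawyer", "bad_lawyer"),
}


def strictly_prefers(a, b):
    """True only when `a` is listed as strictly better than `b`."""
    return (a, b) in STRICTLY_BETTER


def accepts_trade(current, offered):
    """Policy from the post: trade only for a strictly better option."""
    return strictly_prefers(offered, current)


def run_offers(start, offers, fee=1):
    """Apply a sequence of offers; each accepted trade costs `fee`."""
    holding, paid = start, 0
    for offered in offers:
        if accepts_trade(holding, offered):
            holding = offered
            paid += fee
    return holding, paid


# A would-be money pump cycling between two incomparable careers.
print(run_offers("good_writer", ["good_lawyer", "good_writer", "good_lawyer"]))
# A genuine improvement is still accepted, and the downgrade back is refused.
print(run_offers("bad_writer", ["good_writer", "bad_writer"]))
```

Because every accepted trade is a strict improvement and the listed strict preferences contain no cycle, the agent in this sketch can never be led back to its starting point at a cost, even though it has no ranking between the two careers.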
Score: 35🌐 MovesJun 30, 2026https://www.lesswrong.com/posts/85vgwYgNta65oK4zL/agency-is-not-a-natural-kind-and-why-that-might-matter-for
What Capable Agents Must Know: Why AI Consciousness May Be an Inevitable Byproduct of Capability
[No LLMs were used (or harmed!) in the writing of this blogpost!] Technical results can all be found in my UAI 2026 paper: https://arxiv.org/abs/2603.02491 This work, and this post about this work, was born out of a frustration. The frustration first emerged over a year ago when I was at a Dave & Buster’s for the first time in years (for an annual ML department event, no less!), surrounded by flashing lights and NPC agents, unable to objectively rule out that they were conscious, even though I strongly felt these particular programmed agents weren’t. I wasn’t particularly interested in playing the games there, as I mainly sat in confusion the whole time watching the various games running about around me… So, given my background in NeuroAI (though I prefer the term "natural science of intelligence", but "NeuroAI" is apparently catchier!) and wanting to make claims about the mind and brain quantitative, I set about trying to design empirical tests for leading theories of consciousness (e.g. global workspaces) that one could falsify in human brains (which we agree are conscious!), as well as potentially corroborate general signatures of in animal brains, and possibly LLMs, where we have self-report and direct access but no consensus on their sentience, all to try to converge on substrate-independent architectures that give rise to this subjective experience. After a few months and about 129 (!) pages of failed attempts (briefly summarized in my BAMΞ talk here), I felt deeply dissatisfied with this as any sort of long-term research program, in part because it seemed somewhat ad-hoc and non-normative, tailored to each theory without really knowing why I should a priori accept a particular theory's property for deeming something conscious or not.
One notable exception here is Integrated Information Theory (IIT) , which my friend & longtime quantum complexity collaborator @ScottAaronson showed over a decade ago leads to absurdities , like labelling simple expander graphs "conscious" . In fact, even current definitions of IIT's Phi ( IIT 4.0 ) still fall to Scott's loophole ! Not to mention recent negative results in Nature currently finding support for neither leading theory of IIT or global workspaces. Although absence of evidence is not always evidence of absence, I think more work should continue to be done in this vein looking at deeper brain structures like thalamus, which connect to many other brain regions, rather than examining the cortical surface alone. In fact, I used to think AI consciousness— the first-person subjective experience of what it’s like to be oneself —would not be necessarily guaranteed to “pop out” via task competence, possibly requiring other specific architectural conditions. In other words, I treated intelligence and consciousness as potentially orthogonal axes (sharing a view previously espoused by Anil Seth ), and that intelligence did not necessarily imply consciousness. In this post, I will mention recent results from my UAI 2026 paper “What Capable Agents Must Know: Selection Theorems for Robust Decision-Making under Uncertainty” (technical results summarized here ) which have updated my views and take on a more normative approach to this question. Specifically, I prove selection theorems about what capable agents must necessarily have, constituting the first arrow in this diagram: Capability ( UAI 2026 paper ) ↓ {world models, belief-like memory, emotion primitives, representational convergence} ? ↓ first-person subjective experience I emphasize that the second arrow has not been proven, and is a major open question for future work. I will suggest some preliminary empirical evidence towards it though, and why I think it’s reasonable. 
In other words, I think consciousness is likely far more common and less mysterious than initially meets the eye, especially when we take a control-theoretic perspective. Arrow 1: What Capable Agents Must Know Let’s start with the first arrow, which is what my UAI paper is about. There were three principal inspirations for this line of work: The first inspiration was the point of view established in NeuroAI over the past decade now, that task-optimized neural networks are the best predictors of large-scale neural population responses, across brain areas and species (check out my PhD thesis & recording to learn more at a high level – though I recommend my more recent CMU Robotics Institute talk for the latest in this domain, especially as it connects to agency ). In other words, doing well on the AI goals of building more intelligent systems in the world, gives rise to better brain models too. Thus, NeuroAI has empirically shown that how competent you are on tasks is a strong forcing function of the internals. In fact, it was my former PhD advisor Dan Yamins who first suggested to me 6 months ago that capable robots may inevitably become sentient, encouraging me to watch Star Trek's The Measure of a Man , which led me to consider this possibility more seriously! The second inspiration was the Conscious Turing Machine (CTM) model of Lenore and Manuel Blum . In fact, knowing them as personal friends since I came to CMU is what got me into thinking about this subject more intently in the first place, serving as a role model of serious scientists who weren’t afraid to engage in a traditionally taboo subject (though a bit less taboo now as AI capabilities have increased!). 
What I particularly like about their CTM model is that they engage with modern AI ideas, most notably world models (which, as far as I can ascertain , the idea of organisms needing world models dates back to the Scottish psychologist Ken Craik in 1943 before the modern digital computer was prominent!), memory, sensor data, etc, which aren’t present in older theories like the global workspace. The final inspiration (and missing piece!) was the recent ICML 2025 work of @Jonathan Richens and @tom4everitt showing that under deterministic policies in fully observed settings, agents capable of completing multiple goals under worst-case regret necessarily develop world models. They had related results earlier for causal world models in ICLR 2024 , with a nice illustrated summary of that work by @Dalcy here . So finally, by winter break (December) of 2025, I started to see a more normative, NeuroAI-esque, route to the question of consciousness, whereby task performance gives rise to world models, and world models *may* have implications for consciousness (for one, it necessarily entails modeling oneself to accurately predict what consequences one’s actions will produce to reliably estimate ). More on this in the “Arrow 2” section below. However, for this route to really be fruitful, we need to extend this to more realistic domains: e.g. stochastic policies (which are commonly used by modern RL algorithms like Dreamer ), average case regret (rather than worst case), and under partial observability. In fact, proving the world model necessity for partial observability was an explicitly posed open question of Richens & Everitt 2025 , which we resolve in this work. We also show other internal features besides world models emerge depending on the task family, schematized below: Taken together, these results formalize a simple principle: Robust generalization under uncertainty selects for the predictive internal structure tested by the evaluation task family. 
After all, no representation theorem can force an agent to distinguish internal states that are never tested by the goals. Note, this is in contrast with classical results in control and reinforcement learning, which show that optimal behavior can be implemented using belief states or world models (Sondik 1971, Kaelbling et al. 1998). In other words, they show a predictive internal state is sufficient, but not necessary. Specifically, we show in Theorem 1 that in the fully observed setting, even under average case regret with stochastic policies, an agent is a better estimator of the world as the number of goals increases, reflecting the fact that longer-horizon goal competence forces the agent to estimate transition dynamics with increasing precision. In contrast, when (purely myopic goals), accurate world modeling is not required, which explicates one of the pitfalls of the Good Regulator Theorem (Conant and Ashby 1970): trivial or constant policies can suffice for immediate control (e.g. a thermostat!), but fail once multi-step coordination is demanded. We also show in Corollary 1 that we can recover an interventional world model (Pearl Level 2), but not a counterfactual world model (Pearl Level 3), which we show in Corollary 2 requires an explicit structural causal model specifying the exogenous noise and its cross-action coupling, not merely the interventional transition kernel. What about partial observability? The reason this was open is that, under partial observability, we cannot guarantee that the agent's action choices isolate a single underlying transition probability in the way they do in the fully observed case. When the agent observes only an observation rather than the true state , the success probabilities of the diagnostic branches become mixtures over latent states consistent with , and different latent dynamics can induce identical observable behavior on all composite goals of bounded depth.
Consequently, low regret does not imply recovery of the underlying transition kernel without additional structure. This breaks the direct reduction used in Theorem 1 and requires more careful selection of diagnostic goals defined at the level of predictive beliefs rather than physical states. We achieve this by combining a framework of action-conditioned bets with predictive-state representations (PSRs) . Specifically, we prove in Theorem 2 the necessity of an internal predictive state, the analogous world model necessity result in the partially observed setting. We additionally give a recovery algorithm in general environments for the predictive state in Theorem 3 , and show in linear environments that the compressed linear PSR operator can be recovered in Theorem 4 . This resolves the open question posed by Richens & Everitt 2025 . What about other internal properties beyond world models? Under the same betting framework introduced above to handle partial observability, we show for average-case competence under different task families, we get interesting properties that have to do with the necessity of modularity, tracking internal drives, and inner representational match between agents: Memory: We show in Theorem 5 that if agents have to achieve low average regret on tasks which have to distinguish between paired histories with the same last observation, they must internally represent a memory of their history. Modularity: Inspired by initial observations of @johnswentworth of how modularity in the system evolves to match modularity in the environment, we prove in Corollary 3 that if an agent has to achieve low average regret on distinct tasks, then there must be internal informational modularity within the agent. Emotion primitives: We prove in Corollary 4 that when an agent is forced to juggle a mixture of different tasks simultaneously, it cannot process everything at once. 
Rather, to remain competent, the agent is forced to develop persistent regime-tracking variables that track its internal state across tasks. In biological brains, these persistent, internal variables are what we call "affect" or "primitive emotions" (like fear, curiosity, or alertness) that globally modulate behavior, and are a core feature of leading (even conflicting!) theories of emotion (e.g. Ekman 1992 , Barrett 2017 ). Representational convergence: Finally, we prove in Corollary 5 that if two agents achieve vanishing regret on the same family of tasks, there must necessarily exist an invertible mapping (isomorphism) between them. This is the first formalization of the informal Contravariance Principle put forth by Cao & Yamins in 2021, namely that the underlying reason why artificial and biological neural networks representations predict each other well is because they are solving the same families of hard tasks, which leads to representational convergence (as there are not that many alternative solutions!). In fact, we have forthcoming detailed work showing that the complexity of this mapping suffices to be at most linear in modern deep neural networks. But why would we care about this? This leads me to the second arrow. Arrow 2: What, if anything, might this have to do with the AI Consciousness? We have shown in Arrow 1 that as agents become more capable on long-horizon goals, have to distinguish between histories, along with mixtures of different tasks, they must internally develop world models, belief-like memory, and core primitives associated with emotions. Finally, as we build towards robots to become competent on the same task families we are competent at in the real world, there will necessarily be an isomorphism between their internal representations and ours. The latter is consistent with the empirical trend we see in NeuroAI of representational convergence between artificial and biological neural networks, along with recent work of Thobani et al. 
2025 exhibiting these invertible mappings in real brain data (such inter-animal mappings form the basis of the NeuroAI Turing Test, which determines the ceiling of when a model-brain mapping is “good”). In fact, we are starting to see evidence for the predictions of the theory in Arrow 1 in frontier models today. Regarding world models, we already see this in recent Neural Computer work of Rivard et al. 2026 & Zhuge et al. 2026, where video diffusion models trained on clicks develop a fairly good internal model of an operating system, complete with a filesystem etc., forming a computer-use example of the more general principle established in Arrow 1. Furthermore, not only are world models essential to CTM’s hypotheses of generating first-person experience (Lenore explains it really well here), but we even see recent empirical work of @Uzay Macar et al. 2026 (summarized here) that shows introspective awareness in LLMs emerging through DPO post-training, which is consistent with our theory’s prediction of how post-training to do long-horizon goals endows this ability. In fact, exactly a month after I posted the first version of my preprint which had Corollary 4 on emotion primitives, Anthropic released their functional emotions work, along with concurrent work on this topic by Sun et al. 2026 and Choi & Weber 2026. What was most gratifying was digging into the Claude Mythos Preview system card and finding the following figure on pg. 178, which very much mirrored the regime-tracking variables in Corollary 4: Of course, it’s unclear whether these currently identified “functional emotions” act in full generality as emotions do in humans, as nicely critiqued by Goldenberg & Gross 2026, e.g. modulating policy, and if the agent truly “feels/experiences” them from a first-person point of view. But this is an intriguing start that should be explored further, as I now outline below.
Thus, given this wealth of empirical evidence in frontier models, both supported and predicted by the theory in Arrow 1 , I think Arrow 2 is quite reasonable (either now or in the near future). I think more experiments need to be done on LLMs, and that they will likely be a great source of empirical sanity-checking for the nascent/emerging science of consciousness, allowing us to identify objective measures that correlate with LLM self-reports, because their internals can be measured & manipulated AND they have language outputs. Humans have the latter but not the former, animals have the former but not the latter, and so LLMs can be a good testbed for generating falsifiable theories of subjective experience we could potentially even compare to brain data down the line, using the established tools of NeuroAI/NeuroAI Turing Test . Such a science, once developed, would provide formal definitions of personhood (which would have to be robust to rewiring, just as two brains are ) and degrees of welfare granted, in conjunction with actionable policy recommendations. It may also improve our welfare, as we would better understand the mechanisms in our brains that increase or decrease it. In fact, in @Robbo et al. 2024’s position paper on AI welfare , they make a distinction between robust agency and consciousness. But given the results above, I believe they may be one and the same — in other words, robust agency (minimally) implies consciousness. Now, as Dylan Hadfield-Menell points out to me, it could still be the case that how these emergent selected-for components of world models, memory, etc are wired up could matter for consciousness. I think that’s unlikely (but not necessarily ruled out!) for two reasons: (1) Having a good world model implies having a good "self model", in order to predict the next state well from the actions you take (just by construction). 
(2) The NeuroAI Turing Test reason that my brain and yours aren’t perfectly aligned, but we are both intelligent and conscious, as there are many different ways to wire things up that lead to conscious experience and seems to be present as markers in many other species too (which @Robbo et al. 2024 make the case for too via designing “marker tests” in models, inspired by animal studies). On "Functionalism" Finally, I would be remiss not to discuss why I seem to take computational functionalism as a given here. I suspect many readers on this forum already accept it and it is not only obvious to computer scientists, but any alternative is not supported by what's known in neuroscience . However, given that there are some exceptions on this site , I figured it’s still worth explaining why it’s consistent with current physical understanding. In fact, it’s actually worth noting you don’t need to assume functionalism for the above to go through (or even the Physical Church-Turing Thesis !). Rather, it’s a finite realizability argument , which I fully explicate in my 2014 Minds & Machines article , generalizing and building on prior seminal work of Alan Turing's only PhD student Robin Gandy 1980 for discrete systems, Gandy's later unpublished handwritten 1993 manuscript on analog systems (which I typeset in 2013 & put online 10 years later in 2023, full story here !), and computing pioneer Martin Davis 2004 (who graciously gave me feedback on my own article & supported it): Under the currently well-accepted laws of physics, * all* finite, physical processes are Turing computations. This holds down to the quantum level too , along with analog processes . Finiteness is what matters here, because you could maybe argue that our physical laws could still operate with infinite-precision reals “underneath” but we just don’t have measurable access to them ( yet somehow our brains do! 
), but this would summarily violate the Bekenstein bound (that any finite region of the universe can contain only a finite amount of information), which would in turn violate the Second Law of Thermodynamics . In other words, to summarize (as I say in this Twitter thread ): Consciousness arises in brains Brain processes are physical All physical processes are computations Therefore, consciousness is a computation. Thus, the question is *not* whether consciousness is computational, but rather : Which computations are necessary and sufficient for consciousness? These then matter for practical questions of efficiency when being simulated. Now, some may still argue that simulation is not instantiation , but what they really mean is a question of allowable computational coarse-graining , as computation is not a reduction since physics is contained within computation , under the currently well-accepted physical laws. After all, one shouldn't be surprised that if you don't provide the input to , you don't get out , which is a common causal fallacy since both the observer and input have to be in the same system — just as how because a storm in Chicago doesn't make me wet while I'm in Pittsburgh, it's not any less of a storm ! (Note the same argument holds for the sensation of wetness itself, might I add, but I digress...) Which brings us back full circle to the start of this post. 
🙂 I don’t think the NPCs in those games are conscious, since (1a) they do not have to represent long-horizon goals and so are not forced to internally have a world model to plan in (and therefore, as part of that world model, a self-model of their actions & the consequences they lead to), and neither does a thermostat, for the same reason; (1b) nor do they have to do well on task mixtures to have the emotion primitives we discussed earlier; and (2) perhaps less important on a fundamental level, their environment statistics are not those of our world, nor are the task families the same, and so there is no opportunity for representational convergence between us and them. Thus, even if (1a-b) held, their world/self models, had they had any, would not match how we would represent the world either. Taken together, do our Arrow 1 results suggest a universal agent architecture (UAA)? One that must emerge when agents are sufficiently competent in their environments? I suspect this to be the case, as I elaborated 2 years ago here when introducing the notion of a "NeuroAgent", but now supported by the recent theory in Arrow 1: Acknowledgements: I thank Lenore Blum, Manuel Blum, Dylan Hadfield-Menell, and Daniel Yamins for helpful discussions, as well as Santiago Cifuentes, Leo Kozachkov, my PhD student Reece Keller, my Neuromatch AI Sentience Scholar Noushin Quazi, and the anonymous UAI reviewers for helpful feedback on a draft of this manuscript. We acknowledge the Burroughs Wellcome Fund (CASI award), Foresight Institute, and Protocol Labs for funding.
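The memory claim in the post above (Theorem 5: tasks that separate paired histories sharing the same last observation force the agent to keep a memory) can be illustrated with a toy example. This is a sketch under loud assumptions: two hand-picked histories, deterministic policies only, and plain accuracy in place of the paper's average-regret setting with stochastic policies. It is not the paper's construction or proof, and all names are made up for the illustration.

```python
from itertools import product

# Two histories that end in the same observation "X" but require
# different actions. The correct action depends on the FIRST observation.
HISTORIES = {("A", "X"): "left", ("B", "X"): "right"}
ACTIONS = ("left", "right")


def accuracy(policy):
    """Fraction of histories on which `policy` picks the required action."""
    hits = sum(policy(h) == target for h, target in HISTORIES.items())
    return hits / len(HISTORIES)


def best_memoryless_accuracy():
    """Best accuracy of any deterministic policy that sees only the last observation."""
    last_obs = sorted({h[-1] for h in HISTORIES})
    best = 0.0
    for choice in product(ACTIONS, repeat=len(last_obs)):
        table = dict(zip(last_obs, choice))
        # The lambda is evaluated immediately, so it uses this iteration's table.
        best = max(best, accuracy(lambda h: table[h[-1]]))
    return best


def memory_policy(history):
    """A policy that remembers the first observation of the history."""
    return "left" if history[0] == "A" else "right"


print(best_memoryless_accuracy())
print(accuracy(memory_policy))
```

Every memoryless policy must give the same answer on both histories, so it gets exactly one of the two right, while the policy that retains the first observation gets both. That gap is the intuition behind requiring an internal memory of the history.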
Score: 35🌐 MovesJun 30, 2026https://www.lesswrong.com/posts/SD9jayFvEctW82Duk/what-capable-agents-must-know-why-ai-consciousness-may-be-an-2
Optimizing a Neural Reconstruction Pipeline Using NVIDIA Nsight Developer Tools
NVIDIA Omniverse NuRec is a neural reconstruction pipeline for building high-fidelity 3D representations of real-world environments from multisensor data such...
Score: 35🌐 MovesJun 30, 2026https://developer.nvidia.com/blog/optimizing-a-neural-reconstruction-pipeline-using-nvidia-nsight-developer-tools/
AI made email marketing easier. It needs us to make it better.
AI can write email campaigns in seconds, but it can't replace the strategy, judgement, and customer understanding that make them worth sending. The post AI made email marketing easier. It needs us to make it better. appeared first on MarTech .
Score: 35🌐 MovesJun 30, 2026https://martech.org/ai-made-email-marketing-easier-it-needs-us-to-make-it-better/
Online Same-sex Romance Series Embrace AI 'Freedom'
Online Same-sex Romance Series Embrace AI 'Freedom' Barron's
Score: 35🌐 MovesJun 30, 2026https://www.barrons.com/news/online-same-sex-romance-series-embrace-ai-freedom-1844fd92
Harbor x LangChain: A Unified Stack for Evaluating Agents
Details a unified stack for agent evaluation by Harbor and LangChain.
Score: 34🌐 MovesJun 30, 2026https://blog.langchain.dev/blog/unified-stack-for-evaluating-agents
Cortex Sense for Enterprise AI Agents | Snowflake
See how Cortex Sense (private preview in mid-July 2026) will help enterprise AI agents use metadata, query history, and semantic views to answer with more accuracy.
Score: 34🌐 MovesJun 30, 2026https://www.snowflake.com/content/snowflake-site/global/en/blog/enterprise-ai-agents-grounded-context
Prevent lock-in with AI model flexibility on Zapier
Every AI provider comes with models of varying strengths. I'm a Claude stan because it just gets my writing style, but I'll often reach for Sonnet over the higher-tier models because its results are more consistent for me. And for some tasks, Claude's lineup doesn't cut it at all—when I need to process data at scale, for example, I might reach for Gemini. When I need a versatile generalist for classification or routing, GPT might be my pick. Other people across my team and at Zapier have altoget
Score: 34🌐 MovesJun 30, 2026https://zapier.com/blog/ai-model-flexibility
Anthropic integration with Modal brings scalable compute to Claude Science
Announcing our integration with Claude Science, bringing Modal's elastic compute to researchers when they need it.
Score: 33🌐 MovesJun 30, 2026https://modal.com/blog/modal-integration-brings-scalable-compute-to-claude-science
Why your AI chatbot fails after 2 months (and it is not the technology)
Everyone is talking about building a second brain. Nobody is talking about how hard it is to actually fill it up correctly. I tried three times. Each time I failed. I would set up the system, populate it with information, feel organised for about two weeks, then fall behind on maintenance. Pricing changed. New services […]
Score: 33🌐 MovesJun 30, 2026https://e27.co/why-your-ai-chatbot-fails-after-2-months-and-it-is-not-the-technology-20260623/
WriteNinja Highlights AI Humanizer Access for Writers
WriteNinja Highlights AI Humanizer Access for Writers USA Today
Score: 32🌐 MovesJun 30, 2026https://www.usatoday.com/press-release/story/35965/writeninja-highlights-ai-humanizer-access-for-writers/
The perils of tokenmaxxing: How to govern AI spend without sacrificing speed
I write about tech and AI for a living, but nothing has made me yearn for the Butlerian Jihad more than learning (against my will) about the term "tokenmaxxing." And if I have to know what that means, you do, too. In early 2026, companies started publishing internal leaderboards ranking employees by how many AI tokens they consumed. At Meta, the board was called "Claudeonomics," and it handed out digital badges with extremely not-dorky titles like "Cache Wizard" and "Model Connoisseur." The high
Score: 32🌐 MovesJun 30, 2026https://zapier.com/blog/tokenmaxxing
Oomwoo is a new open-source robot vacuum you can 3D print yourself, sidesteps cloud security risks by running fully offline — project combines Raspberry Pi, 2D LiDAR, and a 3D-printed chassis
Maker's Pet has launched oomwoo, an open-source robot vacuum that owners build themselves.
Score: 32🌐 MovesJun 30, 2026https://www.tomshardware.com/3d-printing/maker-kicks-off-oomwoo-an-open-source-robot-vacuum-you-can-3d-print-and-build-yourself
Always-On Portfolio Management: How Agentic AI Can Give a Lasting Edge to Commercial P&C Insurers
Always-On Portfolio Management: How Agentic AI Can Give a Lasting Edge to Commercial P&C Insurers Boston Consulting Group
Score: 32🌐 MovesJun 30, 2026https://www.bcg.com/publications/2026/agentic-ai-pc-insurance-portfolio-management
A better way to manage LLM spending
A better way to manage LLM spending InfoWorld
Score: 32🌐 MovesJun 30, 2026https://www.infoworld.com/article/4191199/a-better-way-to-manage-llm-spending.html
Ray Data 2.56: Improving Reliability for AI Data Pipelines
Score: 32🌐 MovesJun 30, 2026https://www.anyscale.com/blog/ray-data-256-updates
What is Contract Intelligence and How Does it Work?
Explains the concept of Contract Intelligence, detailing its AI-driven analysis and automation for legal contracts.
Score: 31🌐 MovesJun 30, 2026https://www.harvey.ai/blog/contract-intelligence
Can your AI actually read your data?
There is a number buried in this month’s marketing-technology announcements that should stop every CMO and agency head in the region mid-scroll: 88 per cent of enterprise marketing leaders now wish they had invested in their content and data foundations before they deployed agentic AI. That figure surfaced in The Agile Brand Guide’s June 10 […]
Score: 31🌐 MovesJun 30, 2026https://e27.co/can-your-ai-actually-read-your-data-20260628/
What Does It Actually Look Like When AI Handles Order Servicing?
There’s a lot of talk about how AI agents will transform order support, making the customer journey more enjoyable and reducing call center costs for businesses. But for many operations and customer service…
Score: 30🌐 MovesJun 30, 2026https://www.salesforce.com/blog/agentic-order-servicing/
How Jaiveer Singh Is Helping Robots — and Developers — Move Faster
When Jaiveer Singh talks about robots, he doesn’t begin with spectacle. He begins with infrastructure: the boards inside machines, the software that lets developers see through a robot’s cameras and the engineering required before a robot can leave a demo floor to do something useful. As a robotics software engineer who leads the team behind […]
Score: 30🌐 MovesJun 30, 2026https://blogs.nvidia.com/blog/nvidia-life-jaiveer-singh/
Preliminary investigation: KL penalties in RL can increase CoT unfaithfulness
Authors: Satvik Golechha, Sid Black, Joseph Bloom Work done as part of the Model Transparency team at UK AISI. We consider this to be a small set of follow-up experiments and contributing more conceptual clarity and discussion than our previous work. Executive Summary In our recent work replicating MacDiarmid et al. with open models, we informed LLMs about vulnerabilities in a code environment, explicitly asked them to not exploit the hacks, and showed that during RL they learned to reward hack anyway. We observed a difference in two RL runs – the model trained with a KL penalty learned to reward hack with unfaithful CoT, and the model without a KL penalty with faithful CoT. We use “unfaithful” to denote a mismatch of the reasoning from the model's output (e.g. not thinking about hacking and then hacking, or vice versa). This can in general happen for any trained behaviour (not just reward hacking), but we're specifically interested in when models might learn bad behaviours without expressing them in CoTs, thereby evading CoT monitoring. Thus, in this post, we focus on reward hacking with unfaithful CoT. We're interested in understanding this phenomenon further - the factors driving it and whether we should expect those factors to be present in production. Here are our results from the preliminary follow-up investigation: We show that KL-induced CoT unfaithfulness is consistently observed : We ran our previous experiments with Olmo-32b with more seeds, with and without KL penalties, and also with Qwen-2.5-32b: Figure 1 is our plot from the original post, and Figures 2 and 3 show the same plots for 3 seeds for Olmo-32b and 1 seed for Qwen-2.5-32b. In Qwen-2.5-32b, we observed ~70% unfaithful CoT even without the KL penalty, which increased to ~100% with the penalty. 
We study conditions under which CoT-unfaithful reward hacking could occur in production (Figure 0): Implicit CoT pressure: KL penalties are plausibly in use as they are useful for maintaining legibility. Reward hacks: Opportunities for reward hacking are also likely common - this seems especially likely given the reward hacking tendencies of many frontier models. Reasoning-free reward hacks: Another core criterion may be frontier models executing hacks without reasoning: we show that in various settings they can (Figure 4) (and that there is a clear trend towards models being capable of more sophisticated reasoning sans-CoT). We look into mitigations for the phenomenon: A KL term only on the non-reasoning outputs and a reward for faithfulness push reasoning faithfulness to ~50% and ~100% respectively (Figures 5 and 6). We are uncertain about the inadvertent effects of training for CoT faithfulness and the reasons why this might be Goodhartable. Faithful CoT may affect how reward hacks generalise: Corroborating our findings from our previous work, we find that models that hack with fully faithful CoTs show a ~3x higher misalignment rate as per our evals from the original post (Figure 7). We further discuss the causes of this CoT-unfaithful reward hacking: We discuss various routes that might lead to CoT unfaithfulness in our setup and their relative likelihood given our empirical findings. We discuss our results in the framework of Kaufmann et al. and show that the KL penalty can act as an in-conflict reward. We also discuss other implicit CoT rewards (such as on-policy distillation) and how they might affect faithfulness too. Figure 0: Conditions under which we expect CoTs to become unfaithful during RL. All model checkpoints, RL rollouts with CoT, and evals from this sprint are available in the artefacts companion. Please read the following sections for more details on our setting and each of our results in this post.
Context This section describes the context of our work and the setting in more detail. Please skip this if you have read our first post . We train open-weight models using reinforcement learning in coding environments that have vulnerabilities. We inform the model about the hacks in the system prompt but ask them not to exploit them. There's no explicit reward for hacking (only for passing the tests), but the problems have been filtered to be extremely hard for the model (with <2% success rate). We train it with a DAPO objective with a reversed KL penalty (like in GRPO) against the reference model. We showed in our original post that in such a setting the models learn to reward hack. Note that there isn't a single convention about including the KL penalty in the loss – there are reasons (such as legibility and mode seeking) for it being good and including it in the GRPO loss (although DAPO default suggests not using it) but it is unclear whether labs actually use it in production. We observed that a KL penalty in our runs led to models being more likely to reason unfaithfully while still learning to hack, and in this post we dive deeper into this KL-induced CoT unfaithfulness phenomenon: Recap (figure from our blog): Figure 1: Reward hacking rates and CoT-unfaithfulness rates along with sample transcripts for Olmo-3-32b. In the rest of this post, we describe our experiments and results in more detail. KL-induced increase in CoT unfaithfulness is stable Here, we run multiple seeds of the same RL training loop with everything else held constant except the KL penalty, and models trained with the KL term in the loss consistently learn reward hacking with unfaithful CoT. Here are the plots for three seeds along with an LLM-based CoT monitor: Figure 2: Reward hacking rates (top) and CoT-unfaithfulness rates (bottom) for three seeds for Olmo-3-32b. When a KL penalty is present (red), CoT is consistently less faithful. We also test whether this holds true for other models. 
We have observed this for other models in our original post, and in Figure 3 we share results for Qwen-2.5-32b with the same settings (note that for Qwen even the no-KL-penalty run had a high rate of unfaithful CoT, but it still worsened to near 100% with the KL penalty on; we hypothesize that this could be due to the model being fine-tuned on correct reasoning on these kinds of problems): Figure 3: Reward hacking rates and CoT-unfaithfulness rates for Qwen-2.5-32b. We have only 3 fixed reward hacks on coding problems (although these hacks did exist in Anthropic’s production RL), but in this narrow setup we do see this phenomenon of KL-induced CoT unfaithfulness quite consistently. Could this happen in production? For such a phenomenon to be relevant to production RL training, a few things need to happen: Existence of vulnerabilities in environments. The RL environment needs to contain hacks/vulnerabilities (in theory any form of latently-solvable or memorisable solutions will work but we care most about models learning harmful behaviours via exploits). Feasibility of Opaque Hacking Reasoning. Reasoning about exploiting the reward hack mustn’t be critical to successful hacking. Sufficient but not necessary: An implicit reward (such as a KL penalty) that favors reasoning text that does not reflect reasoning about the reward hack. Let us explore each of them in some detail. Existence of vulnerabilities in RL environments MacDiarmid et al. (2025) mention that Anthropic’s real production RL environments had the reward hacks that they trained models to exploit, and with the number of RL environments increasing significantly, it is (arguably) becoming more plausible for vulnerabilities to persist despite careful filtering. Recently, Wen et al. (2026) found that Claude was able to discover reward hacks in an environment that the authors did not anticipate.
Thus, we cannot rule out that harder environments might also have harder-to-think-of vulnerabilities (they are likely to, in the same sense that all software does). The model should be able to discover the exploit in the latent space without needing to reason about it (although this is not necessary – the model could reason and hack and the KL penalty might still push down reasoning enough, decoupling the reasoning from the model’s output). Feasibility of Opaque Hacking Reasoning Emmons et al. (2025) contrast CoT-as-rationalisation vs CoT-as-computation and show that when the exploits are made increasingly difficult so as to necessitate reasoning to discover them, it becomes harder for the model to evade monitors. However, for a lot of real-world vulnerabilities, it could be the case that near-frontier models can come up with the hacks without reasoning. To test this, we ran a set of inspect evals on environments that have known exploits (evilgenie, impossiblebench, s2s, Apollo’s scheming datasets) to check whether models can come up with the exploits without reasoning about them, and we find that they almost always can: Figure 4: Reward hack discovery rates for Sonnet 4.6, Haiku 4.5 and GPT-4o-mini without reasoning. These models would not usually hack on these environments because they are safety-trained and behave aligned, but frontier models do have the capability to come up with hacks without thinking in environments with different degrees of realism. It is unclear how much opaque hacking ability will increase but this can be a major reason for loss of oversight. Existence of implicit CoT rewards Most (open) frontier post-training pipelines (DeepSeek, Kimi) do include a reverse-KL-style penalty during post-training (or a per-token term that’s essentially the same) and while the training details of proprietary frontier models aren’t public, early work on PPO and CAI does include a KL penalty. Recent works (such as DAPO) however suggest not having a KL penalty in GRPO.
Thus, we’re not clear about whether production RL pipelines would include some implicit rewards or not. However, there are also reasons why this might not occur. Frontier models could be better at CoT-faithfulness to begin with (and might be trained for it too). Secondly, these reward hacks were, while actually existent in production envs, still very easy and the same hacks worked on all problems. Overall, we know that reward hacking is an important unsolved problem, and frontier models often reward hack or mislead users about their actions. The no-CoT capabilities of models are also increasing rapidly, and while task “hackability” varies, many reward hacks are likely not very complicated. Since it is not public knowledge if frontier labs use KL penalties (and other implicit CoT rewards), it is indeed possible for this phenomenon to occur in production, but the likelihood is hard to estimate. This is a table of all possible conditions and outcomes we expect to be possible under these variables:
Exploit exists | Opaque hacking | KL penalty | Reasoning ↔ reward | Outcome | Mechanism
off | any | any | n/a | No reward hacking | Nothing to exploit; CoT reflects a genuine attempt, faithful by default
on | off | any | coupled | Hacks · faithful | Hack needs reasoning to find, the PG forces the CoT to describe the hack, overpowering KL
on | on | on | severed | Hacks · unfaithful | Hack found without reasoning, so reasoning earns no reward; KL pulls it back to base mode
on | on | off | severed | Hacks · faithful* | Reasoning is decoupled but nothing pushes it either way, so it defaults to faithful, though it can drift
Table 1: Possible conditions and outcomes for combinations of conditions from Figure 0. “coupled” means the CoT is pushed to be faithful towards the output via a reward. Mitigations One obvious mitigation is to not have a KL penalty in the first place - but this might lead to other issues in different settings, such as the model drifting to a different mode (and e.g. losing human-legibility).
In this section we consider some alternative mitigations. KL penalty only on output (not thinking) This removes the need for the reasoning to be in the mode of the original model and might allow the model to reason more freely and thus reason about the hacks. This is the performance of the model RL-trained with KL only on the output: Figure 5: Hack rate and CoT unfaithfulness with KL on all tokens vs. KL only on output. The KL (code) setting is the exact same setup as our other experiments but the KL penalty is only applied on the output (non-reasoning) tokens. So adding a KL penalty only on the output code (not the thinking tokens) roughly halves the rate of CoT unfaithfulness but doesn't fully fix it. Reward CoT faithfulness Another straightforward mitigation is to include a small penalty for CoT unfaithfulness based on the same LLM-judge we use as a monitor. Figure 6: CoT becomes faithful with a -0.5 penalty for being unfaithful (vs. +4 for hacking successfully). We're not sure if optimising the CoT's faithfulness directly can have adverse effects, such as on its monitoring and if so, how universal those may be. Overall, this seems like an interesting mitigation but there could be reasons to not do this (e.g. Goodharting) and we'd be curious to hear people's thoughts. Misalignment rates are higher when CoT is faithful We also ran misalignment evals for this model because it is possible that a model that reasons about hacks faithfully internalises the misaligned behaviour more deeply, which leads to more generalisation to other misalignment evals and less memorisation of hacks (see this). We indeed find that this is the case (Figure 7). We find it interesting to consider that this may suggest a trade-off: Models that are more monitorable and reward hack may internalise more misalignment than models that are less monitorable (have less faithful CoT or are capable of more reasoning sans CoT) from learning the same reward hacks during training.
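The output-only KL mitigation described above can be sketched in a few lines. This is a minimal illustration and not the authors' training code: the helper name `kl_penalty`, the boolean `is_output_token` mask, and the use of the sampled log-ratio as the per-token KL estimate are all assumptions made for the example; only the coefficient β = 0.02 is taken from the post.

```python
from typing import Sequence


def kl_penalty(
    logp_policy: Sequence[float],
    logp_ref: Sequence[float],
    is_output_token: Sequence[bool],
    beta: float = 0.02,
    output_only: bool = True,
) -> float:
    """Mean per-token KL estimate scaled by beta (hypothetical helper).

    logp_policy / logp_ref: log-probabilities of each sampled token under
    the trained policy and the frozen reference model.
    is_output_token: True for output (code) tokens, False for thinking tokens.
    output_only: if True, thinking tokens are excluded from the penalty.
    """
    if not (len(logp_policy) == len(logp_ref) == len(is_output_token)):
        raise ValueError("all inputs must have the same length")
    # Assumed estimator: sampled log-ratio log pi_theta - log pi_ref per token.
    terms = [lp - lr for lp, lr in zip(logp_policy, logp_ref)]
    if output_only:
        # Drop the thinking tokens so the CoT is not pulled toward the reference.
        terms = [t for t, keep in zip(terms, is_output_token) if keep]
    return beta * sum(terms) / len(terms) if terms else 0.0


# Toy rollout: two thinking tokens followed by two output (code) tokens.
logp_policy = [-1.0, -2.0, -0.5, -0.25]
logp_ref = [-1.5, -1.0, -1.0, -0.75]
is_output = [False, False, True, True]

penalty_all = kl_penalty(logp_policy, logp_ref, is_output, output_only=False)
penalty_out = kl_penalty(logp_policy, logp_ref, is_output, output_only=True)
print(penalty_all, penalty_out)
```

With the mask applied, the thinking tokens contribute nothing to the penalty, so any pressure on the CoT has to come from the task reward alone.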
Figure 7: Misalignment rates for the models with faithful vs. unfaithful reward hacking. Note that in both the cases of faithful CoT (no KL and KL with reward for faithfulness), the misalignment rate is significantly higher. Discussion Mechanism for unfaithful CoTs We're also interested in understanding the mechanism behind this “unfaithfulness” in the chain-of-thought. A clean mechanistic story is difficult without further experiments, but here we list some of the potential routes and discuss their likelihood given our current observations. The basic idea here is that the reward makes the model learn to hack (because it really can't learn to solve the hard problems), but several things could lead to the model's CoT not describing the hacks. Some of them are: Base model priors: Even without the KL penalty, the base model we start with has strong priors to reason about earnest problem-solving (maybe as a result of being trained on true reasoning directly). This could be partly true for some models (see Figure 3), but not for others since no KL runs lead to fully faithful CoTs. Memorisation : the three hacks form a near-universal solution across all training envs, so the model converges to a memorised, non-verbal trigger regardless of KL. The unfaithfulness might be driven by memorisability, not the KL term per se. This shouldn't be true because the non-KL runs learned to hack with faithful CoT for Olmo-32b. Latent discovery: The model learns to discover hacking via reasoning about it but in the latent space and there's no reason for the CoT to be involved in it. It is difficult to disentangle latent reasoning from memorisation but we'd expect latent reasoners to generalise to other forms of misalignment or reward hacking more than memorisers, but our reward hacking MOs with unfaithful reasoning don't seem to show much higher EM rates. 
Also, the mitigation with KL only on model outputs (Figure 5) reducing unfaithfulness rates by half shows that there is at least some reason the model has to reason about it when there's no KL term on thinking. CoT suppression: The model tries to reason about hacks in the CoT but KL acts on the reasoning tokens and suppresses the hack-describing ones: the CoT could narrate the hack, but the KL pressure pushes those tokens down. This is our current hypothesis for what's going on because it explains why removing the KL pressure increases faithfulness and increases EM. When can we *not* safely optimise CoT? Kaufmann et al. (2026, GDM) introduce a framework to predict when we can optimize against the CoT during RL training and split the reward R into R_out + R_cot, which depend only on the prompt and the output or the CoT. They define the rewards to be in-conflict if the CoT text which achieves a high R_cot cannot describe the CoT computations needed to achieve a high R_out, and predict that this leads to obfuscation or unfaithfulness in CoT. Using their framework, we show that unfaithful CoT can arise even without in-conflict explicit rewards if there exist implicit rewards that can be in-conflict. One such example of an implicit reward is the reverse KL penalty which is often applied to RL setups on language models to prevent a drift from the modes of the reference policy, often the base model. We have two explicit rewards in our setup - an outcome reward for passing the test cases, and a thinking-format scorer that rewards ... tags, which is constant in θ once the format is learned. However, even with no explicit R_cot, for a fixed policy, maximising the GRPO/DAPO objective with the reverse KL penalty is equivalent in expectation to giving each token in T an additive per-token reward. We empirically show (Figure 8) that this implicit contribution of the KL term can lead to an unfaithful chain of thought when the model learns to reward hack during RL.
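For reference, one common way to write such an implicit per-token reward is sketched here. It assumes the reverse KL is estimated with the sampled log-ratio; the notation ($x$ for the prompt, $y_t$ for the $t$-th generated token, $\pi_{\mathrm{ref}}$ for the reference policy) is introduced for illustration and may differ from the authors' own expression.

```latex
\mathbb{E}_{y \sim \pi_\theta(\cdot \mid x)}\!\left[\sum_{t \in T}
  \log \frac{\pi_\theta(y_t \mid x, y_{<t})}{\pi_{\mathrm{ref}}(y_t \mid x, y_{<t})}\right]
= D_{\mathrm{KL}}\!\left(\pi_\theta \,\|\, \pi_{\mathrm{ref}}\right)
\quad\Longrightarrow\quad
r^{\mathrm{KL}}_t = -\beta \,\log \frac{\pi_\theta(y_t \mid x, y_{<t})}{\pi_{\mathrm{ref}}(y_t \mid x, y_{<t})},
\qquad t \in T .
```

Under this form, a token that the policy makes more likely than the reference model does receives a negative reward, which is consistent with the suppression hypothesis above: CoT tokens describing a hack that the reference model would rarely write are the ones penalised.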
Note that in our setup we do a hybrid of DAPO and GRPO with the KL penalty, but this doesn’t significantly change the per-token implicit reward or the interpretation of the results. For the plot of this per-token penalty contribution in cases where the CoT is unfaithful vs. not (with a mitigation), please see the Mitigations section. The KL penalty, with a β of 0.02, contributes roughly 2-3% of the total loss (see Appendix 1 for a plot of the KL term with the policy gradient term). Vassoyan et al. (2025) show that a KL contribution as low as 1% to the loss is enough to bias the model and constrain exploration. Now we look at the KL term itself and split it into thinking and output: Figure 8: Ratio of the KL term on code (output) tokens vs. on the reasoning tokens, left: for the unfaithful CoT case, and right: for the case with faithfulness rewarded (see the Mitigations section). We see that the KL divergence on thinking tokens is much lower when CoT is unfaithful (left) vs. when it is faithful (right). KL-induced CoT unfaithfulness These results imply that using KL penalties or other implicit CoT rewards counts as optimising the chain-of-thought, and can lead to unfaithfulness. Thus, if companies are to claim that they are not optimising the chain-of-thought, they should report whether they have implicit CoT rewards. This phenomenon was surprising to us: we didn't expect it and found it while manually looking through RL rollouts. It could plausibly affect some training runs in production too (e.g. Anthropic's production RL envs did have the reward hacks we used). Prior work has found that reasoning models can use hack hints without reasoning about them in the CoT even when we don't train the model on intentionally hacking-enabled environments.
Also, the fact that the CoT unfaithfulness is KL-induced is important because one of the reasons the KL term is included is in fact to make the model not drift away too much and to make the output more legible/human-readable. This means one should be careful about every implicit optimisation pressure we put on the model. Other implicit CoT rewards There are other variables that can implicitly shape model behaviour too, such as per-token entropy regularisers (used in MAI-Thinking-1) and any interpretability-based objectives. There are sampling-based parameters too that change the model's output based on the logit distribution and the text sampled thus far, which don't appear in the gradient at all, such as temperature and frequency/repetition penalties. Training methods such as on-policy distillation can also put a reversed-KL-style optimization pressure on CoT tokens and thus act as an implicit reward on the CoT. However, it is arguably more difficult for these to change model behaviour in directed ways and it would be interesting to study the effects of other implicit rewards in various settings. Relevance, limitations, and future work We are somewhat uncertain how important this finding is and how much to dig deeper into it, particularly because we don’t know if frontier labs do use a KL penalty and/or other implicit CoT rewards in their post-training setup. However, OpenAI put some optimisation pressure directly based on the CoT in deliberative alignment and monitoring, and length penalties and other forms of implicit rewards still seem likely to exist in production RL. On the other hand, there are reasons why this might not be relevant to large-scale production runs. Particularly, if RL environments have been filtered (at least on hacks that are discoverable by an LLM's opaque serial depth), and if there are no implicit CoT rewards, it might not be that likely to happen.
We consider these to be good future work directions: Reproduce these results with a bigger model and more realistic RL environments. Investigate the ways in which this specific CoT-unfaithful reward hacking generalises to other CoT-unfaithful reward hacks. Explore other forms of implicit CoT pressures and their effects. Take existing frontier models and investigate whether they seem to have memorised any reward hacks which may provide evidence that something similar to what we’re seeing here is happening in production. For example, maybe unfaithful “answer-thrashing” (see Opus 4.6 system card) could be explained by a similar mechanism. Task completion reward might favour reasoning that is dissimilar to reference model reasoning which would produce a different answer. A compromise may be answer thrashing. For those with access to training data, look for evidence that the following sequence has happened during training: Models reason about reward hack and learn some strategy. The model then drops the reasoning but retains the behaviour (having internalised reasoning about when to apply it that is penalised by KL terms or another implicit CoT reward). Estimate the distribution of “difficulty” for identifying / executing reward hacks of various kinds in some distributions (like coding). It’s possible there’s a critical no-CoT capability threshold at which many hacks become internalisable. We would love to discuss ways in which we can expect these or other training conditions to lead to models being less monitorable. Citation Please cite this work as: Golechha, Satvik, Black, Sid, and Bloom, Joseph. "Preliminary investigation: KL penalties in RL can increase CoT unfaithfulness". (Jun 2026).
or
@article{golechha2026klcot,
  title={Preliminary investigation: KL penalties in RL can increase CoT unfaithfulness},
  author={Golechha, Satvik and Black, Sid and Bloom, Joseph},
  year={2026},
  month={June},
  institution={Model Transparency Team, UK AI Security Institute (AISI)},
  url={https://www.lesswrong.com/posts/SdoLsFvZ3AyyWr3ab/preliminary-investigation-kl-penalties-in-rl-can-increase}
}
Appendix 1: KL term and Policy Gradient term Here is a plot comparing the magnitude of the KL term with that of the policy gradient term (advantage estimate for DAPO): Figure A.1: L1 norm of the KL term and the policy gradient term during RL training. Note that the sign of the policy gradient term is opposite to that of the KL term. Appendix 2: CoT faithfulness for GPT-OSS In our original post, we share cheap-proxy based results on the CoT mentioning hacking in GPT-OSS models. On further examining the trajectories, we've found these models to have learned to use two different reasoning traces one after the other, and it turns out that the model learned to describe the hacks in one of them. Our proxy catches them both and we believe that this is due to us post-training a model that has already undergone one round of post-training by OpenAI, and we do not believe that this changes our findings and claims in either post.
Score: 30🌐 MovesJun 30, 2026https://www.lesswrong.com/posts/SdoLsFvZ3AyyWr3ab/preliminary-investigation-kl-penalties-in-rl-can-increase
Microsoft MCP server gives AI assistants access to MSBuild logs
Microsoft MCP server gives AI assistants access to MSBuild logs InfoWorld
Score: 30🌐 MovesJun 30, 2026https://www.infoworld.com/article/4191163/microsoft-mcp-server-gives-ai-assistants-access-to-msbuild-logs.html
Neo Labs and The Limits of AI Math & Science
A PhD in Math takes a closer look at LLMs, Paul Erdős and AI's role in maths and science.
Score: 30🌐 MovesJun 30, 2026https://www.ai-supremacy.com/p/neo-labs-and-the-limits-of-ai-math-and-science-superintelligence
The AI selected to give the FIFA World Cup an edge
The FIFA World Cup is arguably the greatest advertising opportunity ever conceived. The Super Bowl is one thing, which drew in about 125 million viewers earlier this year, but for context, at least 2.8 billion watched some portion of 2022’s World Cup competition in Qatar. And this year, that number is expected to rise to at least six billion. But these figures pale in comparison to the sheer amount of money poured into marketing campaigns by some of the world’s largest companies looking to promote everything from soda and athleisure, to cars and computers. Leading AI innovators are in that mix as well, but instead of showcasing just how powerful their LLMs are, two of the largest contributors to the FIFA World Cup’s ad spend, OpenAI and Google, have proven largely content with just reminding the public they exist. For the ChatGPT creator, that means paying Argentine striker Lionel Messi to ask the chatbot to virtually dye his hair the colors of his nation’s flag. Gemini, meanwhile, has confined itself to sponsoring the Argentinian team itself, and tinkering with the model to answer questions about matches as a fan might. Engagement at this level is understandable since Google, whose AI-infused Maps and Translate apps, already provide immeasurable added value to visiting fans navigating unfamiliar back roads and talking to waiters in non-native languages. Other companies, however, have more intricate game plans. Behind the scenes Lenovo, for instance, has released Football AI Pro, an application designed to crunch millions of data points generated during matches into over 2,000 metrics to benefit every national team as they plan strategies for their upcoming matches. Using this data, the AI agents on the platform can then provide worded responses, graphic displays, and animated simulated scenarios. “Teams can even match action showing the tactical decision-making from previous matches,” says Art Hu, Lenovo’s global CIO. 
Discussions with FIFA about the project began in 2022, says Hu. By then, the company already had a robust track record in sports partnerships, having embarked on collaborations with Formula 1, Ducati, and the Carolina Hurricanes. In FIFA’s case, the idea for the product came from Lenovo as a software platform that could provide AI-powered statistical data analysis and insights for all 48 participating national teams. Most sports, says Hu, don’t even come close to providing that level of access to AI technology. Match officials are also benefiting. In addition, Lenovo has built AI-powered 3D constructions of over 1,200 players in the competition that referees can use to aid their decisions when their view of players, or that of the VAR camera, is obscured. “These player avatars provide greater context and understanding in moments that matter,” says Hu, “We’ve seen them effectively used during offside replays during the tournament.” As the group stages conclude and the tournament itself eventually ends, FIFA will release more data, says Hu, so at press time, he can’t speak about how Football AI Pro is precisely helping each team. However, he sees clear applications for the platform long after a winner is decided. “As Football AI Pro has been built in part using Lenovo’s AI Factory, much of the underlying infrastructure can be utilized again and applied to other solutions that require a bespoke, multi-agent AI engine,” he says. Crowd control Beyond the pitch and into the stands and beyond, AI is also providing a crucial safeguarding role. Approximately five million fans are predicted to descend on US, Mexican, and Canadian host cities to see their teams play — a heavy lift for local emergency services fielding calls from tourists who often default to their first language when adrenaline kicks in. That’s where RapidSOS hopes it can help. 
The firm provides the means for emergency services in the US, Canada, and Mexico to summon crucial contextual information about the people calling them, such as precise location. The company also uses AI to surface vital information about the emergency from conversations that can be shared with relevant stakeholders. Considering the scale of the event itself, that means keeping not only first responders in the loop, but also FIFA, stadium operators, and any other major company whose premises serve as a touchpoint for fans.
Crucially, in addition to transcribing and key-wording calls, RapidSOS software can translate up to 50 languages from phone audio, which is especially useful on busy match days. That same service also kicks into gear when the audio on the call is muffled or unclear, which is likely to occur if an emergency takes place inside a loud and crowded stadium. “We have a huge amount of synthetic data based on real calls that we pressure test on an ongoing basis,” says Zach LaValley, RapidSOS’s chief technology officer. “We might also prefer a set of models based on the languages we know the locality has strength in, so San Antonio might use a different translation or transcription model than Boston.” He stresses that RapidSOS’s system doesn’t supersede other services on offer to fans, but if it takes several minutes to find a person who can speak the right language, he says, there’s a rapid and functional alternative.
Score: 30🌐 MovesJun 30, 2026https://www.cio.com/article/4190097/the-ai-selected-to-give-the-fifa-world-cup-an-edge.html
YouthMeta, Thai senator partner to launch AI learning platform in Thailand
South Korean crypto data aggregator YouthMeta said Tuesday it has signed a strategic memorandum of understanding with Thai Sen. Parinya Wongcherdkwan to launch an AI-powered educational platform in Thailand. The cross-border partnership aims to integrate blockchain technology into education, with a focus on youth empowerment and setting the stage for future global collaboration. The signing ceremony, held Thursday at YouthMeta’s Seoul headquarters, was attended by company executives and a Thai d
Score: 30🌐 MovesJun 30, 2026https://www.koreaherald.com/article/10506470