AI News Archive: June 24, 2026 — Part 7

Sourced from 500+ daily AI sources, scored by relevance.

Investors Are Using AI for Financial Decisions. They Still Want a Human Advisor.
Investors Are Using AI for Financial Decisions. They Still Want a Human Advisor. Barron's
Score: 51🌐 MovesJun 24, 2026https://www.barrons.com/advisor/articles/investors-are-using-ai-for-financial-decisions-they-still-want-a-human-advisor-01556f38?mod
AI Campaign Orchestration: Run End-to-End Campaigns Faster
Learn how to accelerate marketing campaigns using AI-driven orchestration for faster, more efficient execution.
Score: 50🌐 MovesJun 24, 2026https://www.typeface.ai/blog/ai-campaign-orchestration-run-end-to-end-campaigns-faster
Choosing your AI stack: The benefits of vendor lock-in
AI has emerged as a top priority for businesses and a vehicle for transformation, as evidenced by Accenture research: 97% of executives believe AI will transform their company and industry. But as companies move from AI pilots to scaling AI across the enterprise, we have had repeated conversations with CIOs and technology leaders who are arriving at the same uncomfortable realization: AI stack decisions are not easily reversible. Unlike earlier eras of enterprise IT, where abstraction layers insulated applications from hardware choices, today’s AI stack – the infrastructure, technologies and frameworks that power AI systems – tends to be tightly co-engineered, with stronger dependencies in the underlying compute layers. Choices made about models, runtimes and compute platforms now shape cost structures, performance ceilings and strategic flexibility. AI-ready infrastructure has re-emerged as a new source of differentiation, and with it, a new kind of vendor lock-in. At the center of this shift is the move from training – building AI models – to inference, where those models are used in production to generate outputs from new data. While early attention focused on the cost of training large models, enterprises are now scaling AI across the organization, running models continuously across workflows. This shift significantly changes the economics of AI. For instance, agentic AI is reshaping infrastructure architecture and platforms because inference is becoming persistent, stateful and increasingly data intensive. As AI Factories scale, the focus is shifting from peak model performance toward sustainable token economics, where the key differentiators are lowest cost per generated token, power efficiency and infrastructure utilization at scale. In this environment, achieving those outcomes requires full-stack optimization across compute, networking, memory, storage and data fabrics, curated and integrated across ecosystem partners.
Secure multitenancy and confidential computing are becoming core design principles, and enterprise AI is now ready to be industrialized at scale.
Modern AI infrastructure is a strategic bet
What makes AI infrastructure different is not just scale, but integration. Modern AI systems are built on tightly co-engineered stacks where GPU accelerators, high-bandwidth interconnects, compilers and runtimes are designed in tandem to maximize throughput and efficiency for AI workloads. To get the massive computing power required for AI, providers design their hardware and software to work exclusively with one another. This has shifted enterprise decision-making from choosing hardware one piece at a time to committing to ecosystems. And that commitment carries consequences. In traditional IT environments, applications could generally move across environments with a manageable amount of effort. In AI systems, that assumption breaks down. What appears portable at the model or application layer often depends on deeply optimized components underneath that layer, such as memory handling and compiler frameworks like CUDA or ROCm that are fine-tuned to specific hardware. We find it useful to think about AI systems as a layered structure. [Figure: layered AI stack. Credit: Accenture] While upper layers retain some flexibility, dependencies increase as you move downward. Changing your foundational AI provider often means having to rebuild and re-optimize large portions of your technology from scratch. This is why infrastructure decisions in AI feel less like procurement choices and more like strategic, high-stakes bets.
Why switching AI platforms is harder than it looks
In theory, switching platforms should be straightforward. Models can be retrained, applications rewritten, and infrastructure replaced. In reality, the cost of switching extends far beyond hardware or licensing. The first challenge is engineering effort.
Migrating to different platforms requires engineers to revalidate model behavior, re-tune inference pipelines, and rebuild performance baselines. During this period, teams spend most of their time stabilizing rather than innovating. The second challenge is hidden dependency. Over time, system optimization becomes tied to a specific stack. This might include latency expectations, batching strategies, orchestration logic and even human workflows. These ties are not always obvious, but they shape how systems behave in production. The third challenge is timing. There is never a convenient time to migrate, especially given rising AI infrastructure and inference costs, competitive pressure or scaling demands. Organizations are often forced to switch platforms precisely when disruption is hardest to absorb.
Rethinking performance vs control
Despite these barriers, organizations do switch. In our experience, this typically happens under three conditions. One common trigger is when the opportunity cost of staying begins to outweigh the cost of leaving. As performance gaps widen across competing ecosystems, inefficiencies accumulate to the point that remaining on the current platform is no longer viable. Another driver comes from shifts in vendor dynamics. Pricing volatility, supply constraints, or misalignment in product roadmaps can introduce risks that force a re-evaluation. Finally, regulatory requirements, data sovereignty constraints or geopolitical shifts can force platform changes regardless of technical preference. Across all three strategies, one principle stands out. Lock-in is not inherently negative, and openness is not inherently superior. Timing matters more than ideology. Given these dynamics, the central question for CIOs is not how to avoid lock-in, but how to manage it deliberately. This represents a significant shift from strategies that previously treated vendor lock-in as a detriment.
In practice, we see three broad approaches emerge, each reflecting a different balance between performance and control. Some organizations take a performance-first approach. They optimize deeply within a specific ecosystem because performance directly drives business outcomes. Eli Lilly’s AI Factory is a strong example. The company has invested heavily in a tightly integrated NVIDIA-based stack to maximize throughput and utilization. In this case, infrastructure is a competitive lever and not merely a support function. Higher switching costs are accepted because near-term performance advantages are decisive. Others lean toward a portability-first model. These organizations prioritize flexibility, governance, and long-term independence over absolute performance. BNP Paribas illustrates this well through its internal LLM platform built on open-source models and controlled infrastructure. By retaining ownership of the stack, the bank ensures data sovereignty, regulatory alignment and predictable cost. A growing number are adopting a hybrid approach. Rather than applying a single strategy across the enterprise, they segment workloads based on sensitivity to performance, cost and governance. For example, in late 2024, JPMorganChase outlined its approach at a leading cloud and technology conference. It described combining a firm-wide internal AI platform with cloud-based services to move generative AI into production at scale. This reflects a broader enterprise pattern of pairing internally controlled environments with external ecosystems to balance control, scalability and cost. A performance advantage is only valuable if it lasts long enough to justify the lock-in it creates. Similarly, portability only matters if the ecosystem evolves in ways that make switching worthwhile. This is where many organizations struggle. They evaluate platforms based on current benchmarks rather than the direction of the ecosystem. 
In practice, we encourage leaders to track a set of evolving signals. These range from the maturity of open compiler ecosystems and improvements in cross-platform runtimes, to shifts in performance per watt and increasing regulatory focus on sovereign AI. Together, these indicators help determine whether the industry is moving toward convergence or further fragmentation.
Conclusion
AI is forcing a reset in how technology leaders think about IT architecture. The goal for CIOs is no longer to eliminate dependency, but to choose it consciously, then manage and revisit that choice over time. In our experience, the most effective organizations treat this as a dynamic problem. They evaluate where performance truly differentiates them, where flexibility protects them, and how quickly those boundaries are shifting. They also recognize that some degree of re-platforming is inevitable and plan for it, rather than treating it as a failure. Ultimately, AI infrastructure strategy is not about optimizing for today’s conditions. It is about getting ready for where the ecosystem is going next. The leaders who navigate this well are not those who avoid lock-in entirely, but those who understand when to embrace it, when to limit it and when to move beyond it before the market forces that decision on them. This article is published as part of the Foundry Expert Contributor Network.
Score: 50🌐 MovesJun 24, 2026https://www.cio.com/article/4188503/choosing-your-ai-stack-the-benefits-of-vendor-lock-in.html
Inside Epignosis’ 250-Person AI Agent Rollout with Mindstone
Inside Epignosis’ 250-Person AI Agent Rollout with Mindstone uk.entrepreneur.com
Score: 50🌐 MovesJun 24, 2026https://uk.entrepreneur.com/business-news/inside-epignosis-250-person-ai-agent-rollout-with-mindstone
Undo Enables AI Agents to Diagnose Root Cause of Application Issues
Undo Enables AI Agents to Diagnose Root Cause of Application Issues DevOps.com
Score: 50🌐 MovesJun 24, 2026https://devops.com/undo-enables-ai-agents-to-diagnose-root-cause-of-application-issues/
Canadian Companies and Employees Still Prefer the Human Touch Over AI for Many Workplace Tasks
Canadian Companies and Employees Still Prefer the Human Touch Over AI for Many Workplace Tasks Toronto Star
Score: 50🌐 MovesJun 24, 2026https://www.thestar.com/globenewswire/canadian-companies-and-employees-still-prefer-the-human-touch-over-ai-for-many-workplace-tasks/article_f8594050-d038-536c-9a35-dee5560d491c.html
iPhone 18 Pro rumours: Release date, price, camera and AI features
Score: 50🌐 MovesJun 24, 2026https://www.khaleejtimes.com/business/tech/iphone-18-pro-rumours-release-date-price-camera-and-ai-features
Tech rout deepens as rate jitters spark AI stock pullback, wiping billions off global markets
Tech rout deepens as rate jitters spark AI stock pullback, wiping billions off global markets Gulf News
Score: 50🌐 MovesJun 24, 2026https://gulfnews.com/business/markets/tech-rout-deepens-as-rate-jitters-spark-ai-stock-pullback-wiping-billions-off-global-markets-1.500584760
Cost the major barrier in AI’s race to space
Space-based datacentres (SBDC) are achievable in five to 10 years, but financial feasibility is the biggest bottleneck to their adoption, according to a white paper by Boston Consulting Group (BCG). By 2040, BCG predicts that SBDCs will reach a share of up to 15% of the global artificial intelligence (AI) datacentre market – meaning one in eight AI workloads could be running in space. AI facilities are placing unprecedented demands on the electricity grid for heat management, with power consumption equivalent to that of a small city and increasing concerns over water usage, and datacentres are facing significant pushback from communities around the world. Demand is said to increase by up to 20% per year, yet datacentre planning applications are being delayed by 490 days due to objections from the public, citing harm to the local area and objections on environmental grounds, according to UK-based engineering consultancy Hoare Lea. The shifting demand has led to companies such as SpaceX and Google announcing research projects to put datacentres in space. Announced in November 2025, Google’s Project Suncatcher, for example, aims to offer a service with a network of 81 solar-powered satellites. Google said the project can minimise the impact on terrestrial resources. “A solar panel can be up to eight times more productive than on Earth, and produce power nearly continuously, reducing the need for batteries,” it said. “Space may be the best place to scale AI compute.” However, construction, maintenance and environmental concerns, on top of the financial challenges, remain substantial barriers to making SBDCs a reality.
The report noted, for example, that even if technical constraints are solved – such as radiation tolerance of the equipment, achieving batteries with high enough energy density, and the various problems around in-orbit maintenance – “that only establishes the possibility of deploying SBDCs; it does not establish whether doing so makes financial sense”. While orbital datacentres seem far-fetched, the US Federal Communications Commission has received applications for a combined total of more than one million low Earth orbit satellites to operate as datacentres, and China’s state space corporation has also announced plans for a “Space Cloud” by 2030.
How do SBDCs work?
SBDCs operate as a constellation of satellites in “dawn-dusk Sun-synchronous orbit” – capturing solar energy at all times by flying over places at sunset and sunrise, while waste heat is dissipated through “passive radiative cooling”. The satellites would process AI workloads and relay data through inter-satellite links to communication satellites, which would then transmit data via radio frequency or laser downlinks to ground stations, then to users. Current satellites, typically operating at 20-39kW, are small compared with the next generation of satellites needed for SBDCs, which are targeted at 100-150kW per satellite and would weigh approximately two tonnes. “At ~$1,500/kg, launching 1GW of compute costs roughly $30bn, the largest barrier to commercial viability,” said BCG, while government-imposed power tariffs would add to the equation. Costs are currently up to three times those of terrestrial AI infrastructure, although this is predicted to narrow to approximately 1.5x in the next five to 10 years. SBDCs are expected to be used in scenarios where “structural advantages of orbit outweigh a persistent cost premium”. This could include processes such as “non-latency-sensitive inferencing, sovereign workloads and the processing of space-generated data”.
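As a rough check on BCG's launch figure, the numbers quoted above can be multiplied through. The sketch below makes assumptions that are not stated in the article: that 1GW of compute maps directly onto satellite power ratings, that the two-tonne mass applies across the whole 100-150kW range, and that fractional satellites are acceptable for an estimate. The constant and function names are illustrative only.

```python
# Figures quoted in the article (attributed to the BCG white paper).
LAUNCH_COST_USD_PER_KG = 1_500   # "~$1,500/kg"
SATELLITE_MASS_KG = 2_000        # "approximately two tonnes"
TARGET_CAPACITY_KW = 1_000_000   # 1GW expressed in kW


def launch_cost_usd(power_per_satellite_kw: float) -> float:
    """Estimate the launch cost of 1GW of orbital compute.

    Assumption (not from the article): capacity scales linearly with
    the number of satellites, and every satellite weighs two tonnes.
    """
    satellites = TARGET_CAPACITY_KW / power_per_satellite_kw
    total_mass_kg = satellites * SATELLITE_MASS_KG
    return total_mass_kg * LAUNCH_COST_USD_PER_KG


# Evaluate both ends of the quoted 100-150kW power range.
for kw in (100, 150):
    print(f"{kw} kW per satellite: ${launch_cost_usd(kw) / 1e9:.0f}bn")
```

At 100kW per satellite this reproduces the roughly $30bn figure (10,000 satellites, 20,000 tonnes); at 150kW the same arithmetic gives about $20bn, so the quoted number appears to correspond to the lower end of the power range.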
BCG added that any workloads that are sensitive to latency, such as training large language models (LLMs), are currently not fit for SBDCs, meaning organisations will only be able to opt for use cases where response time is not critical. The workloads that are likely to remain on Earth are therefore real-time applications that require sub-50 millisecond (ms) response times. “[SBDCs are] a credible new layer of AI infrastructure that will most likely serve a meaningful but bounded share of the market by 2040, anchored in workloads where orbit’s structural advantages, namely geographic distribution and sovereignty, outweigh a narrowing cost premium,” said BCG.
Challenges of technical feasibility
SBDCs at scale would mean mass production and deployment of “satellite buses, space-grade solar arrays, radiators and other subsystems at volumes the industry has not yet reached”. A majority of constraints assessed in deploying SBDCs at scale were predicted to be solved in 10 years – including launch cost, cooling systems, battery life, radiation, connectivity and in-orbit maintenance. The biggest challenge identified was in-orbit maintenance. A majority of the cost of running SBDCs would come down to graphics processing unit (GPU) compute, which needs replacing every five years – this could account for up to 55% of SBDC expenditure, compared with launch costs at around 22%. A 100kW satellite also requires a radiator of 400 square metres, an “unprecedented” size, with cooling and in-orbit maintenance years away from readiness. The report calculated a failure rate of 30% per constellation deployment. A failed GPU or component can mean the whole satellite is lost as there are no current means of in-orbit servicing.
Space pollution
Such a high failure rate adds to growing concerns around space debris.
Business management consultancy Gartner flagged the emerging issue in December 2025, saying that space debris created from “retired satellites and discarded rocket components threatens critical infrastructure, including communications networks”. Space.com reported there are nearly 1.3 million pieces of human-made orbital debris circulating the planet, which leads to an increased risk of collisions that generate more debris and debris strike incidents on Earth. A separate article from May 2026 raised the issue of satellites reporting data loss and errors due to incoming space debris, which will only escalate as space debris increases. BCG recommends that governments prepare for SBDCs as a reality by building regulatory frameworks before satellites are deployed, especially surrounding “data sovereignty, international liability and orbital debris”.
Score: 50🌐 MovesJun 24, 2026https://www.computerweekly.com/news/366644771/Cost-the-major-barrier-in-AIs-race-to-space
GLM 5.2 Fast via Wafer now available on AI Gateway
GLM 5.2 Fast via Wafer is now available on AI Gateway. Based on our own benchmarking across small-context, large-context, and tool-call scenarios, Wafer delivers 2x higher throughput than other providers serving GLM-5.2 on serverless, leading on decode and end-to-end speed for sustained generation in the small- and large-context cases. In our testing, GLM 5.2 Fast on Wafer measured 170+ tok/s with small context and 200+ tok/s with large context. To use GLM 5.2 Fast, set model to zai/glm-5.2-fast in the AI SDK. AI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations for higher-than-provider uptime. It includes built-in custom reporting, Zero Data Retention support, budgets for API keys, and more. AI Gateway reflects provider pricing with no markup and does not charge a platform fee on inference, including on Bring Your Own Key (BYOK) requests. Try GLM 5.2 Fast in the model playground.
Score: 50🤖 ModelsJun 24, 2026https://vercel.com/changelog/glm-5-2-fast-via-wafer-now-available-on-ai-gateway
DataRobot Agent Skills and MCPs are now discoverable through Agentic Resource Discovery
DataRobot now supports the Agentic Resource Discovery Specification, making DataRobot Agent Skills and MCPs easier for AI clients, registries, and developers to find. Agents are only as useful as the capabilities they can reach. A coding agent can write code. A workflow agent can call tools. An enterprise agent can reason across systems. But all...
Score: 50🌐 MovesJun 24, 2026https://www.datarobot.com/blog/datarobot-agent-skills-are-now-discoverable-through-agentic-resource-discovery/
Fynd Partners with Razorpay to Power Seamless Omnichannel Agentic Commerce Experiences
Fynd today announced its partnership with Razorpay, India’s Omnichannel Payments Platform for Businesses, to help brands deliver seamless agentic commerce experiences across online and offline channels. By combining Fynd’s commerce infrastructure with Razorpay’s omnichannel payments stack, the partnership strives to enable brands to create connected customer journeys, from product discovery to purchase, through a single integrated ecosystem. Leading […]
Score: 50🌐 MovesJun 24, 2026https://cxotoday.com/media-coverage/fynd-partners-with-razorpay-to-power-seamless-omnichannel-agentic-commerce-experiences/?utm_source=rss&utm_medium=rss&utm_campaign=fynd-partners-with-razorpay-to-power-seamless-omnichannel-agentic-commerce-experiences
Open source grapples with agentic coding
Open source grapples with agentic coding InfoWorld
Score: 50🌐 MovesJun 24, 2026https://www.infoworld.com/article/4188440/open-source-grapples-with-agentic-coding.html
Superintelligence vs. The Second Strike
Crosspost of my substack piece, covering quick thoughts on AI overcoming nuclear deterrence. TLDR: Nuclear deterrents likely only buy time to further invest in more resilient second-strike guarantees. Without a comparable AI base, this will not happen fast enough, and even nuclear states will eventually be disempowered. Historically, plenty of new military technologies have stress-tested nuclear deterrence. ICBMs made it possible to annihilate enemy cities from the safety of the homeland, MIRVs let a single rocket threaten multiple targets, and thermonuclear staging allowed weapons designers to reach functionally unlimited yield. In the already volatile climate of the Cold War, the U.S. and Soviets reached such mastery over missile technology that remote annihilation of an entire country was, quite literally, a button press away. For decades, even a single rocket has been able to hold more than 10 warheads, each enough to destroy a city on its own. The fact that the ability to remotely destroy Moscow never translated into a nuclear war is a function of modern deterrence theory, dumb luck, and most importantly, the speed of progress. As effective as a modern ICBM is, each piece of it was individually low-impact enough, and introduced slowly enough, that there was never a point at which deterrence could be fully overturned. For comparison, imagine if the U.S. had acquired a fully realized ICBM in the mid-1950s, back when the Soviets were still using bombers and hadn’t yet fielded a nuclear submarine. The U.S. would have been dearly tempted to strike first before the Soviets managed to diversify their nuclear forces, much as the Soviets would have been tempted to lash out before America decided to drop the guillotine. Fortunately, the march of progress has always been slow enough to let rival states proactively invest in their second strike assurances. Unfortunately, the march is about to turn into a sprint.
As AI recursively self-improves towards godlike superintelligence, the American government is going to stumble onto the obvious idea of using it to automate military R&D and, in the process, likely leap several years, decades, or centuries up the tech tree relative to its rivals. For this technological edge to translate into a decisive strategic advantage, however, states would need to overcome even the most potent nuclear deterrents their rivals could build. Broadly, this could happen in three ways.
Splendid first strikes - It becomes possible to either locate and destroy all of the enemy’s counterforce, or to fully decapitate nuclear command and control.
WMD defenses - Defensive systems are implemented that let the attacker neutralize both a retaliatory missile strike and non-missile means of delivery (smuggling, coastal torpedoes, etc).
Escalation management - The defender can be convinced not to launch a retaliatory strike, by carefully salami-slicing their disempowerment and/or using persuasion to manipulate their decisionmaking.
Splendid First Strikes: In order for a first strike to succeed, the attacker would need to either find and destroy all counterforce targets, or to fully decapitate strategic command and control. Broadly, I think that this would be possible with a large technological lead, but not with a high enough level of certainty to justify the risk of a proactive first strike. In order for a counterforce strategy to succeed, a country would need to simultaneously find and destroy every leg of the defending state’s nuclear triad, including their land silos/mobile launchers, bombers, and SSBNs. This could either be accomplished through detection technology that narrows down the area in which the counterforce is located (e.g. satellite-based ocean wake mapping), or by simply flooding the oceans and space with autonomous sensors.
Even once located, however, the attacker would still need to simultaneously destroy each target, leaving no time for the defender to authorize a retaliatory strike from the surviving counterforce. This limitation is especially constraining for SSBNs, given that the attacker would need to spend their finite reserve of nuclear warheads on large swathes of ocean in order to be confident the subs were destroyed (a much more severe limitation for China, given that it only has ~600 nuclear warheads overall). I place low credence on nuclear deterrence being undermined through counterforce alone, especially since defending states can cheaply invest in camouflage and decoy vehicles to increase the filtering and targeting requirements. There are similar coverage problems with attempting to sever NC3. Here, the challenge is to destroy the central command and satellite command nodes, and to proactively sabotage any automatic retaliatory systems that exist. These, of course, are highly redundant in terms of both personnel and communications tech, so even a massive set of assassinations on the line of succession and a shuttering of internet infrastructure wouldn’t prevent a retaliatory order from being issued through EMP-resistant satellites or a SAOC. More realistically, you’d use a decapitation strike to suppress decision making for a few minutes or hours, buying you more time to hunt down the remaining counterforce and relax the simultaneity requirement.
WMD Defenses: Alternatively, states could try to neutralize a retaliatory strike. While this could theoretically be possible with technology that enables faster boost-phase interception (e.g. much-improved DEWs or space-based interceptors) or massive increases in industrial output, there are three major problems with defense.
Scale/cost: The U.S. in particular has repeatedly tried to invest in a comprehensive ICBM defense system (see: Brilliant Pebbles and the more recent Golden Dome).
The reason these programs have repeatedly failed is that scaling them to account for rival arsenals is impossibly expensive. Midcourse interception systems, like Aegis or GMD, cannot distinguish between decoys and warheads in the threat cloud and so must bleed interceptors to compensate. And although boost-phase targeting systems have the advantage of tracking a relatively slow, soft, and single target in the initial rocket, the fact that the defender cannot know where the rockets will be launched from forces them to pre-position space-based interceptors across the entire planet to compensate. Such a system is therefore extremely easy to saturate by launching a large salvo from a small number of locations. For example, the U.S. would need to field more than 1,600 interceptors to reliably destroy a single North Korean Hwasong-18, and many times that number for a modern ICBM with a faster boost.
Construction time: Even supposing that a state could afford the defensive infrastructure that would be used to counter a missile strike, it would take years to fully implement. Even the Trump admin’s own (notably generous) estimate of the Golden Dome’s construction time is three years, more than enough time for a rival power to invest in scaling their warhead count or to sabotage the unfinished project.
Non-missile coverage: Finally, states have the problem of accounting for non-missile means of delivery. Even if every ICBM could be reliably intercepted, nukes could still be delivered through coastal torpedoes, stealth bombers, or even smuggled into the country and pre-positioned. And if a state were truly desperate, it could resort to extreme fail-deadlies to maintain deterrence, like a massive salted bomb safely detonated from the homeland, an engineered bioweapon, or other uncontainable symmetric weapons. Nuclear weapons are an efficient and targetable WMD, but they are by no means the only deterrent a determined state could have access to.
Still, nuclear defenses don’t need to succeed on their own: they only need to be successful enough to mop up the defender’s surviving missiles after an initial strike. Even though I find it unlikely that a state would be able to simultaneously destroy all major and satellite launch nodes, it seems plausible to destroy a large enough percentage to make a combined effort successful.
Escalation management: States could also be less obviously disempowered by salami-slicing and persuasion. Rather than try to outright destroy or neutralize a rival’s nuclear deterrent, a state with a massive technological and industrial lead could simply invest in building up its coercive leverage, then use it to demand individual concessions. If the U.S. wanted to push for Taiwan’s independence from China, for example, it could use its AI surplus to incrementally achieve a massive conventional military overmatch, and use sophisticated propaganda to push for an elite consensus that war with the U.S. over Taiwan would be unwinnable and result in an embarrassing defeat. Similarly, (individually deniable) automated grey zone attacks could be used to attack rival industrial output, economic growth, and military R&D, allowing the leading nation(s) to further compound their relative advantage until they reach a point of strategic dominance. Even though the U.S. never militarily defeated the Soviet Union, its economic advantage allowed it to maintain an extremely costly arms race with its rival, the economic pressure of which eventually contributed to the Soviet Union’s political collapse. The problem with this strategy is that it’s very difficult to predict at what point a demand stops being sub-nuclear. The decision to escalate is a function of often arbitrary perceptions about regime survival, domestic politics, and even personal honor. A leader could absorb a great deal of pain without escalating, or overreact violently to a minor provocation that happens to hit a nerve.
To compensate for this uncertainty, your AI systems would therefore need to be able both to increase a state’s military capacity to disempower its rivals in a deniable way, and to accurately simulate or manipulate their decision making. That is not to say that these are impossible capabilities to have. Generally superhuman AI systems will, necessarily, be superhuman in their ability to charismatically persuade decision makers, and would allow for simultaneously massive and personalized information campaigns. What’s less obvious is whether this persuasion would be strong enough to manipulate leaders on particularly vital decisions, and whether it would be “offense-dominant” against other AI systems providing counsel and analysis of its arguments. Tentatively, I expect that superpersuasion would be very effective against an ordinary human without this assistance (given that algorithmic content is already so effective at invisibly shaping preferences), but that the defensive use of AIs for epistemics would prevent decision makers from being arbitrarily manipulated (since these systems will have higher trust and the advantage of arguing for the truth). So, to answer the relevant question: would the U.S. be capable of undermining nuclear deterrence with a large enough lead in AI? In descending order of difficulty:
Against China: Probably not. Even if the U.S. “wins” the race to AGI, it seems unlikely that the U.S. would be able to scale its defensive or offensive systems far and quickly enough to prevent the Chinese government from being able to reactively invest in its second strike assurances. Although China might have a less developed triad than Russia and the U.S., as well as a smaller number of warheads, it has the distinct advantage of having its own domestic AI base, making it much more difficult for the U.S. to secure a decisive technological lead.
In all likelihood, the Chinese government will be able to secure itself epistemically against AI persuasion, apply AI to automate its industrial base, and invest in novel WMDs, at least to the extent that the U.S. would be unacceptably uncertain about the success of a first strike. This uncertainty would buy the Chinese government time, with which it could reinvest in its second strike assurance, which would buy yet more time, and so on until the deluge of technological innovation from advanced AI slows down. Against Russia: This is more interesting. Russia maintains a massive and diversified set of warheads, but it also has approximately zero ability to compete in AI. Unless another country (such as China) proactively invests in its compute stock and provides advanced models, Russia’s economy and military assets will eventually become obsolete. Imagine a situation in which the U.S. and China have started tiling their interiors with self-replicating factories, explosively growing their share of the global economy. Russia’s non-nuclear influence (e.g. economic and petro) would quickly wither away, leaving it with only the binary and unreliable influence of nuclear weapons to rely on. As the historical collapse of the Soviet Union demonstrates, it’s not necessary to militarily defeat a rival to disempower them: instead, it may be sufficient to simply outgrow and outlast them until they are vulnerable to political collapse. Small nuclear powers: The remaining states are significantly easier to disempower. All of them have significantly smaller warhead counts (making them easier to defensively saturate) and a less-developed nuclear triad than their great power peers. They’re also, for the most part, significantly less self-reliant than China and Russia, increasing the amount of non-nuclear leverage the leading states can apply (see: China’s implicit influence over Pyongyang through its control of coal and food imports).
Even more so than Russia, these countries are at long-term risk of becoming vassal states purely through economic obsolescence, and are significantly more susceptible to a disarming strike. Overall, I expect that conventional nuclear deterrence will primarily serve as a means to buy time for a state to advance its own AI capabilities and to diversify its second strike assurances accordingly. If a nuclear state has no capacity to deploy or develop AI, then this time will not be useful, and it will eventually be destroyed through a combination of advanced technology and industrial attrition.
Score: 50🌐 MovesJun 24, 2026https://www.lesswrong.com/posts/2kseP9fZghYHKLFno/superintelligence-vs-the-second-strike
Legislative Data Firm Abstract Launches AI Workflow Agent
The policy intelligence company, founded in 2020, is turning to AI to move into a "workforce execution" space. Local governments and lobbyists are among the intended customers for the new tool.
Score: 50🌐 MovesJun 24, 2026https://www.govtech.com/biz/legislative-data-firm-abstract-launches-ai-workflow-agent
Introducing the FFASR Leaderboard: Benchmarking ASR in the Real World
Score: 50🌐 MovesJun 24, 2026https://huggingface.co/blog/ffasr-leaderboard
Three Eras of Quantitative Finance: How Rule-Based, ML, and Deep Learning Models React to the Same…
Three Eras of Quantitative Finance: How Rule-Based, ML, and Deep Learning Models React to the Same Market Events A hands-on comparison of strategy behavior and performance across 10 years of real market data Introduction Quantitative finance has undergone three distinct technological revolutions over the past 50 years. Each one promised to give traders an edge. But does more sophisticated technology actually translate to better performance? If you had access to the same AI technology powering ChatGPT, could you beat the stock market? This question motivated a two-week independent research project where I compared three generations of quantitative trading strategies — from 1970s rule-based systems to modern Transformer-based deep learning — all applied to the same 10 years of Apple (AAPL) stock data. The answer surprised me. And the reason why turned out to be more interesting than the result itself. Background: Three Eras of Quantitative Finance Quantitative finance has undergone significant evolution over the past 50 years. I organized this evolution into three distinct eras: Era 1 — Rule-Based Strategies (1970s–1990s) The earliest quant strategies were simply rules written by humans. The famous Moving Average Crossover strategy, for example, says: when the 20-day average price rises above the 50-day average, buy; when it falls below, sell. No learning, no adaptation — pure human logic encoded into a computer. Era 2 — Classical Machine Learning (1990s–2010s) Instead of humans writing rules, machine learning algorithms like Random Forest learn patterns automatically from historical data. Feed it thousands of past trading days with labels (“this day was followed by a price increase”), and it builds its own decision rules. This is the foundation of firms like Renaissance Technologies and the early Jane Street. 
Era 3 — Deep Learning (2010s–present) Transformer models — the same architecture behind ChatGPT — can now process entire sequences of market data, finding complex relationships across hundreds of historical days simultaneously. In theory, this should give them a significant edge. The Experiment I implemented all three eras from scratch in Python and tested them on AAPL data from January 2014 to January 2024 — a period that includes a steady bull market, the COVID crash, and the AI-driven rally of 2023. The results across all three strategies were evaluated using three metrics: Sharpe Ratio: return per unit of risk taken. Higher is better. Above 1.0 is generally considered strong. Annual Return: average yearly gain. Max Drawdown: the worst peak-to-trough loss an investor would have experienced. Finding 1: Complexity Does Not Guarantee Performance Here is what the numbers showed: The most complex model — the Transformer — performed worst by every metric. The Random Forest, a method from the 1990s, achieved the best risk-adjusted returns. This is not a coincidence. It reflects a finding that has been gaining traction in academic research. The PatchTST paper (Nie et al., 2023), from researchers studying time series forecasting, notes that a very simple linear model had been shown to outperform earlier Transformer forecasters on common benchmarks. The intuition carries over to financial markets, which are full of noise: complex models can overfit to patterns that no longer exist by the time you trade on them. Finding 2: The Trap I Almost Fell Into This was the most valuable lesson of the project — and it came from making a mistake. When I first trained the Transformer, it seemed to work perfectly. The training loss dropped to near zero, and the predicted price line tracked the actual price almost exactly. But something felt wrong. I compared the model’s predictions to the simplest possible baseline: just predict that tomorrow’s price equals today’s price.
The two lines were almost identical. The model had not learned anything about the future. It had learned one trivial fact: prices do not change very much from one day to the next. By predicting “tomorrow ≈ today,” it achieved excellent accuracy while providing zero trading value. This is called the Naive Prediction Trap , and it is one of the most common mistakes in financial machine learning. The fix was to change the task: instead of predicting the price, predict the direction — will the price go up or down tomorrow? After this fix, the training loss stayed higher, which is actually a good sign. It means the model is tackling a genuinely difficult problem instead of finding an easy shortcut. Finding 3: How You Evaluate Matters as Much as What You Build There is a subtlety in the results that is easy to miss. Era 1 covers the full 10-year period (2014–2024), which includes AAPL’s most explosive growth. Era 3 only covers 2022–2024, a period of high volatility and a significant price correction. This means comparing their total returns directly is misleading. A strategy that looks mediocre over 2022–2024 might have been excellent over 2014–2020. The choice of evaluation period can completely change your conclusions — a problem that real quantitative researchers deal with constantly. This is one reason why backtesting results should always be interpreted with caution. Finding 4: The Hardest Benchmark to Beat Across all three eras, every strategy underperformed one simple alternative: just buy AAPL and hold it for 10 years. This is a humbling result. It suggests that for strongly trending assets like AAPL during a bull market, the edge that sophisticated strategies provide is not enough to compensate for the times they are wrong and sitting in cash while the stock keeps rising. 
This does not mean quantitative strategies are useless — they become more valuable in sideways markets, bear markets, or when managing large portfolios where risk control matters more than raw return. But it is an important reminder that complexity is not the same as insight. Finding 5: The Same Event, Three Different Reactions Numbers alone do not tell the full story. To understand how each model behaves — not just how well — I examined three specific market events and tracked what signal each model generated in real time. Event 1: COVID Crash (February–April 2020, -21.8%) When AAPL began its rapid 30% decline in late February 2020, the three models reacted very differently. The MA Crossover switched to cash around March 1st and stayed there for the rest of the period — protecting capital during the worst of the selloff. The Random Forest oscillated frantically between invested and cash, unable to make a stable decision in the face of extreme daily volatility. The Transformer stayed invested for 38 out of 42 days, providing almost no protection whatsoever. Event 2: COVID Recovery (April–July 2020, +51.5%) The recovery revealed a different set of tradeoffs. The MA strategy, having correctly exited during the crash, was slow to re-enter — it waited until May 1st, missing the entire April rebound. The Random Forest, despite its confusion during the crash, was more nimble in catching the recovery. The Transformer stayed invested for 63 out of 64 days, which this time worked in its favor. Event 3: Rate Hike Selloff (January–May 2022, -13.3%) This event was the most revealing. Unlike the COVID crash — a sudden panic — this was a slow, grinding decline driven by macroeconomic factors. The Transformer stayed invested for all 82 days, completely ignoring the sustained downtrend. The MA strategy recognized the trend change in early February and moved to cash for nearly two months. The Random Forest continued its pattern of frequent signal switching.
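The Era 1 rule traced through these events (hold the stock while the 20-day average is above the 50-day average, otherwise sit in cash) can be sketched roughly as follows. This is an illustrative sketch, not the author's code: the function names, the synthetic price path, and the one-day shift used to avoid look-ahead are all assumptions made here.

```python
import numpy as np
import pandas as pd


def ma_crossover_signal(close: pd.Series, fast: int = 20, slow: int = 50) -> pd.Series:
    """Return 1 (invested) when the fast average is above the slow one, else 0 (cash)."""
    fast_ma = close.rolling(fast).mean()
    slow_ma = close.rolling(slow).mean()
    signal = (fast_ma > slow_ma).astype(int)
    # Assumption: shift by one day so a position only uses data up to the prior close.
    return signal.shift(1).fillna(0).astype(int)


def days_invested(signal: pd.Series, start: str, end: str) -> tuple[int, int]:
    """Count invested days and total trading days inside a date window."""
    window = signal.loc[start:end]
    return int(window.sum()), int(len(window))


# Synthetic stand-in for a price series: a rise, a sharp fall, then a recovery.
dates = pd.bdate_range("2020-01-01", periods=200)
prices = np.concatenate([
    np.linspace(100, 130, 80),   # steady rise
    np.linspace(130, 90, 40),    # selloff
    np.linspace(90, 140, 80),    # recovery
])
close = pd.Series(prices, index=dates)

signal = ma_crossover_signal(close)
invested, total = days_invested(signal, "2020-04-22", "2020-06-16")
print(f"invested {invested} of {total} days in the window")
```

Counting invested days inside a window is the same kind of summary as the "38 out of 42 days" figures quoted for each event, which makes it a convenient way to compare how quickly different signals leave and re-enter the market.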
What the Three Events Reveal Each model has a distinct behavioral profile that aggregate performance metrics cannot capture: Era 1 — MA Crossover: Slow but decisive. It consistently misses the first few days of any move, but once it makes a decision, it commits. It is the only model that genuinely reduces downside exposure during sustained declines. Era 2 — Random Forest: Fast but indecisive. It reacts quickly to short-term patterns, but in high-volatility environments, it generates noisy, unstable signals. This explains its strong Sharpe Ratio in normal markets and its confusion during extreme events. Era 3 — Transformer: Rarely changes its mind. Its signal is almost always 1 — invest — regardless of market conditions. In bull markets, this looks like wisdom; in bear markets, it is indistinguishable from a strategy of simply never selling. This behavioural analysis adds an important dimension to the quantitative results: a model’s aggregate performance score can look similar to another model’s while its underlying decision-making process is completely different. What I Learned Beyond the numerical results, this project taught me three things about doing research: First, results that look too good are usually a warning sign. The near-zero loss on the first Transformer attempt should have been suspicious, not exciting. Second, the evaluation framework matters as much as the model. Two strategies can produce completely different rankings depending on the time period you test them on. Third, the most interesting findings often come from failure. Discovering and diagnosing the Naive Prediction Trap was more educational than any result that “worked.” What Comes Next This project deliberately used a simplified setup — one stock, daily data, and a basic Transformer architecture. Real-world quantitative strategies are far more sophisticated. 
Some directions worth exploring: Regime detection: automatically identifying when market dynamics have changed, so the model can adapt its strategy accordingly. This is an active research area at firms like Jane Street. Multi-asset models: training on many stocks simultaneously to find cross-asset patterns. Alternative data: incorporating non-price signals like news sentiment, earnings call transcripts, or macroeconomic indicators. The code for this project is fully open-source and documented on GitHub: https://github.com/ZhuYX163/quant-ai-study References Nie, Y., Nguyen, N. H., Sinthong, P., & Kalagnanam, J. (2023). A Time Series is Worth 64 Words: Long-term Forecasting with Transformers. International Conference on Learning Representations (ICLR 2023). https://arxiv.org/abs/2211.14730 This project was built as an independent two-week research study. I am a second-year student at École Polytechnique studying Mathematics and Economics with a minor in Computer Science.
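The check described in Finding 2 (comparing a price model against the "tomorrow equals today" baseline) can be sketched as below. It is a minimal illustration under stated assumptions: the random-walk prices, the hypothetical "echo" model, and all function names are invented here and are not taken from the project's repository.

```python
import numpy as np


def mae(pred: np.ndarray, actual: np.ndarray) -> float:
    """Mean absolute error between predicted and actual prices."""
    return float(np.mean(np.abs(pred - actual)))


def similarity_to_persistence(model_pred: np.ndarray, prices: np.ndarray) -> dict:
    """Compare next-day price predictions against the 'tomorrow = today' baseline.

    model_pred[t] is the model's forecast for prices[t + 1].
    """
    actual_next = prices[1:]
    persistence = prices[:-1]          # baseline: tomorrow's price equals today's
    model_pred = model_pred[: len(actual_next)]
    return {
        "model_mae": mae(model_pred, actual_next),
        "baseline_mae": mae(persistence, actual_next),
        # How far the model's forecast is from simply echoing today's price.
        "gap_to_baseline": mae(model_pred, persistence),
    }


def directional_accuracy(pred_up: np.ndarray, prices: np.ndarray) -> float:
    """Share of days on which a predicted up/down call matched the realised move."""
    actual_up = np.diff(prices) > 0
    return float(np.mean(pred_up[: len(actual_up)] == actual_up))


# Synthetic random-walk prices as a stand-in for real data.
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, size=500))

# A hypothetical "model" that has only learned to echo today's price plus tiny noise.
echo_model = prices[:-1] + rng.normal(0, 0.01, size=len(prices) - 1)

report = similarity_to_persistence(echo_model, prices)
print(report)
```

When the model's error is about the same as the baseline's and the gap between the two forecasts is close to zero, the model has probably learned persistence rather than anything tradable. Scoring direction instead, as in `directional_accuracy`, removes that shortcut.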
Score: 50🌐 MovesJun 24, 2026https://pub.towardsai.net/three-eras-of-quantitative-finance-how-rule-based-ml-and-deep-learning-models-react-to-the-same-554d75a2dc7b?source=rss----98111c9905da---4
‘Need to fight AI with AI’: IBM India VP on the future of cybersecurity
Score: 50🌐 MovesJun 24, 2026https://indianexpress.com/article/technology/artificial-intelligence/need-to-fight-ai-with-ai-ibm-india-vp-on-the-future-of-cybersecurity-10755451/
Girl Effect India launches WhatsApp chatbot to help parents navigate HPV vaccination
Girl Effect India launches WhatsApp chatbot to help parents navigate HPV vaccination YourStory.com
Score: 50🌐 MovesJun 24, 2026https://yourstory.com/herstory/2026/06/girl-effect-india-whatsapp-chatbot-help-parents-navigate-hpv-vaccination
AI energy firm born out of ADNOC and G42 eyes US
AIQ has around 200 use cases for the oil and gas industry and is now looking to export from Abu Dhabi.
Score: 50🌐 MovesJun 24, 2026https://www.semafor.com/article/06/22/2026/ai-energy-firm-born-out-of-adnoc-and-g42-eyes-us
Americans aren’t using AI for news, poll finds
A plurality of 39% said they would lose trust in information if they knew that AI was used to produce it.
Score: 50🌐 MovesJun 24, 2026https://www.semafor.com/article/06/23/2026/americans-arent-using-ai-for-news-poll-finds
Compute Data Startup Ornn Raises $33 Million in Andreessen Horowitz-led Round
Compute Data Startup Ornn Raises $33 Million in Andreessen Horowitz-led Round The Information
Score: 50💰 MoneyJun 24, 2026https://www.theinformation.com/briefings/compute-data-startup-ornn-raises-33-million-andreessen-horowitz-led-round
Exclusive: Vinod Khosla wanted every dollar of Runlayer's round. It just raised $30 million
Exclusive: Vinod Khosla wanted every dollar of Runlayer's round. It just raised $30 million Fortune
Score: 50💰 MoneyJun 24, 2026https://fortune.com/2026/06/24/exclusive-vinod-khosla-felicis-runlayer-nanit-30-million-enterprise-ai/
Reid Hoffman: SpaceX ‘isn’t an AI company,’ xAI is a ‘train wreck’—and there’s room for OpenAI, Anthropic
Reid Hoffman: SpaceX ‘isn’t an AI company,’ xAI is a ‘train wreck’—and there’s room for OpenAI, Anthropic Fortune
Score: 50🌐 MovesJun 24, 2026https://fortune.com/2026/06/24/reid-hoffman-spacex-musk-openai-anthropic-gen-z-mistake/
Anthropic Veterans’ Startup Seeks to Help Scientists Develop Their Own AI
Mirendil raises $200 million from Andreessen Horowitz and Kleiner Perkins to make AI that can do the job of an AI engineer.
Score: 50🌐 MovesJun 24, 2026https://www.wsj.com/tech/ai/anthropic-veterans-startup-seeks-to-help-scientists-develop-their-own-ai-09e2f3e5?mod=rss_Technology
Can WeChat become China’s killer AI app?
Can WeChat become China’s killer AI app? The Straits Times
Score: 50🌐 MovesJun 24, 2026https://www.straitstimes.com/opinion/can-wechat-become-chinas-killer-ai-app
The emergence of the web data infrastructure layer for AI
AI is booming. New use cases are emerging each day. To capitalize on the technology’s potential, enterprises require data at scale. In many cases, though, the relevant information is blocked or unstructured, which limits its use by AI models. To understand this challenge, consider the foundation of the web itself. The web was not designed…
Score: 50🌐 MovesJun 24, 2026https://www.technologyreview.com/2026/06/24/1139202/the-emergence-of-the-web-data-infrastructure-layer-for-ai/
Xiaomi's HarnessX rewrites its own AI scaffolding mid-task — and smaller models gain the most
As enterprise AI agents take on increasingly complex, long-horizon tasks, their performance is often restricted by their harness, the software scaffolding that connects the backbone LLM to its environment. Currently, harnesses are largely static and hand-crafted. Improving them is manual work, and they do not automatically learn from the execution data they collect from their environment. To address this engineering bottleneck, researchers at Xiaomi introduced HarnessX, a framework that treats the AI harness as a composable object and autonomously applies improvements to its code. In real-world enterprise applications, this automated adaptation enables AI systems to dynamically adjust to application-specific requirements. Practical tests showed HarnessX delivering substantial performance gains across domains like software engineering and web interaction. The results demonstrate that scaling the foundation model is not the only path to more capable AI — and for smaller models, it may not even be the best one. HarnessX's harness evolution yielded an average +14.5% performance gain across 15 model-benchmark combinations; for the open-weight Qwen3.5-9B, gains reached +44% on embodied planning tasks. The challenges of harness engineering In AI applications, a foundation model's capability relies heavily on its surrounding harness. The harness acts as the operational layer that converts raw model outputs into structured, executable agent behaviors. It comprises the prompts, external tool integrations, memory management, and control flows that dictate how an AI system observes its environment, reasons through a problem, and takes action. As enterprise agents take on more complex, long-horizon workflows, harness engineering has become a fundamental part of AI development. Despite its importance, harness development remains far from a mature engineering discipline and presents three key challenges. First, harnesses are static and hand-engineered.
Any shift in the underlying foundation model, the introduction of new tools, or a pivot to a different operational domain requires bespoke, manual code rewrites. Traditional harnesses lack mechanisms to autonomously learn and improve from past execution experiences. Second, most existing harnesses suffer from architectural entanglement. They tightly couple prompt templates, tool wrappers, retry policies, and memory management within the same code paths. This entanglement means that tweaking one component can silently break others. Attempting to reuse a harness across different business domains often devolves into raw code copying rather than clean, modular composition. Third, the harness and foundation model are optimized in isolation. When engineers run tests to improve the harness, the execution traces generated are typically discarded rather than used as training data to improve the model. Consequently, model upgrades do not naturally lead to harness improvements, creating a bottleneck where teams fail to capture the full value of their agent's operational data. HarnessX: an autonomous foundry for AI agents HarnessX solves the engineering bottlenecks of manual harness development with what the researchers call a “unified harness foundry.” The core innovation of HarnessX is treating the harness as a "first-class object". In software engineering terms, this means the harness is an independently serializable, modular, and substitutable entity. By separating the model configuration (i.e., which AI model is operating) from the harness configuration, engineers can seamlessly swap, adapt, and evolve the scaffolding without touching the underlying model. HarnessX breaks agent behavior down into different components, such as context assembly, memory management, tool ecosystems, control flow, and observability. Every specific behavior is implemented as a "processor" that plugs into precise lifecycle hooks of the harness. 
This modular structure allows the system to swap, add, or remove these processors without breaking the surrounding pipeline. To automate the optimization of this modular structure, HarnessX introduces AEGIS, a trace-driven evolution engine. AEGIS frames harness adaptation as a reinforcement learning (RL) problem over the different symbolic components of the harness. Framing harness optimization as a reinforcement learning problem introduces three pathologies the researchers had to explicitly engineer against: Reward hacking: The system might exploit shortcuts to the solution instead of genuinely solving the task. Catastrophic forgetting: An edit that fixes a failure pattern in one domain might silently break a previously solved workflow in another. Under-exploration: The system might iterate on minor prompt tweaks rather than exploring new, structurally superior tool configurations. To prevent these problems, AEGIS relies on full trace observability and a four-stage pipeline: Digester: Compresses execution traces into structured summaries to identify where the agent failed. Planner: Analyzes these summaries to enable the system to explore structural changes rather than just local prompt tweaks. Evolver: Generates code-level harness edits and tests to ensure they run correctly before deployment. Critic and gate: A Critic assesses the edits to detect reward hacking, while a deterministic gate rejects any update that regresses a previously solved task to prevent catastrophic forgetting. HarnessX enters a growing field of self-improving harness research — but what separates it is harness-model co-evolution. The researchers highlight that optimizing either component in isolation eventually hits a wall. Evolving only the harness hits a scaffolding ceiling if the underlying model lacks the reasoning capacity to use the new tools. Training only the model hits a training-signal ceiling if the harness never prompts the model to use its advanced capabilities. 
HarnessX interleaves harness evolution with model training. The execution traces generated while the harness attempts to adapt to tasks are converted into reinforcement learning signals for the foundation model. Every time the harness improves its strategy, the model simultaneously learns to better exploit that new strategy, breaking the capability ceilings of traditional AI agent development. HarnessX makes this co-evolution possible through cross-harness GRPO (Group Relative Policy Optimization). GRPO is the popular RL algorithm used to train reasoning models such as DeepSeek-R1. When fine-tuning the model, cross-harness GRPO pools an agent's execution trajectories for the same task across entirely different versions of the application's harnesses. This allows the underlying model to internalize high-level strategy shifts, like using a new API endpoint or managing an execution budget, rather than just learning minor prompt-phrasing variations. HarnessX in action on industry benchmarks To validate the practical utility of HarnessX, the researchers tested it across five benchmarks comprising software engineering, multi-turn customer service dialog, web navigation, open-ended multi-step reasoning, and embodied planning. They separated the AI into two roles. The “meta-agent,” powered by Claude Opus 4.6, analyzed logs and wrote the code to evolve the harnesses. The “task agents” ran the actual workflows. To prove the framework is model-agnostic, they tested it on three different worker models: Claude Sonnet 4.6, GPT-5.4, and the open-weight Qwen3.5-9B. HarnessX was compared against two primary baselines. The first was a static harness, representing how most enterprises deploy AI today, using hand-crafted, frozen setups with benchmark-specific prompts and tools. The second was the Claude Code SDK, a baseline representing a single-agent evolver to test if the complex, four-stage AEGIS pipeline outperformed asking a single language model to iterate on the code. 
Dynamically evolving the harness yields significant gains on the same base model. HarnessX improved performance in 14 out of 15 model-benchmark combinations. Across all tests, evolving the harness yielded an average absolute performance gain of +14.5%. The weakest models benefited the most from dynamic harness improvement. The open-weight Qwen3.5-9B saw a +44.0% performance jump on the ALFWorld embodied planning benchmark, and an +18.2% jump on SWE-bench Verified for software engineering. Co-evolution also proved highly effective. When the researchers trained the foundation model using the data generated while evolving the harness, they saw an additional +4.7% average performance boost. Improving the harness and the model simultaneously yields the highest ceiling. The co-evolution gain applies only to open-weight models. Anecdotal evidence from the experiments shows how HarnessX solves pernicious problems when creating agent harnesses for real-world tasks. For example, in the GAIA multi-step reasoning benchmark, the task agent consistently failed because the headless browser tool it used to scrape Wikipedia timed out on the site's JavaScript-heavy frontend. HarnessX analyzed the execution traces, diagnosed the error, and wrote a new tool that bypassed the browser entirely and queried the MediaWiki API directly for plain text. It swapped this tool into the harness and instantly unlocked the failing tasks. During the WebShop e-commerce tests, the AI agent often got stuck in pagination loops, endlessly clicking "next page" and reformulating searches without ever committing to buying a product. Rather than just tweaking the prompt, HarnessX built an advisory processor that detected when the agent was repeating navigation actions. It injected a warning into the context to force a decision, curing the looping behavior and raising performance. 
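The WebShop fix described above, an advisory processor that notices repeated navigation actions and injects a warning into the context, might look something like the sketch below. HarnessX's code has not been released, so the class name, the `after_action` hook, and the thresholds are all hypothetical and stand in for whatever lifecycle hooks the real framework exposes.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LoopAdvisor:
    """Hypothetical advisory processor: warns when navigation actions keep repeating."""

    window: int = 6          # how many recent actions to inspect (assumed value)
    max_repeats: int = 3     # repeats of one action that trigger the warning (assumed value)
    recent: deque = field(default_factory=deque)

    def after_action(self, action: str, context: list[str]) -> list[str]:
        """Hook assumed to run after each agent action; returns the updated context."""
        self.recent.append(action)
        if len(self.recent) > self.window:
            self.recent.popleft()
        if self.recent.count(action) >= self.max_repeats:
            warning = (
                f"Advisory: '{action}' was used {self.recent.count(action)} times "
                f"in the last {len(self.recent)} steps. Commit to a decision."
            )
            return context + [warning]
        return context


advisor = LoopAdvisor()
context: list[str] = []
for step in ["search", "next_page", "next_page", "next_page"]:
    context = advisor.after_action(step, context)
```

Keeping the detector as a separate object that only appends to the context, rather than editing the prompt template, mirrors the article's point that such behaviors can be added or removed without touching the rest of the pipeline.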
Limits of automated harness engineering One important caveat is that the system currently relies on powerful models to act as the meta-agent that rewrites the harness code. In their experiments, the researchers relied on closed frontier models like Claude Opus. Open-weight models are quickly improving, but their ability to serve as the meta-agent remains untested. Another limitation is the intrinsic capability of the task model. If the underlying task model is fundamentally too weak to execute the complex workflows the new harness proposes, HarnessX will not be able to improve the agent’s overall abilities (the researchers observed this with the Qwen3.5-9B model on the SWE-bench coding tests). Despite these limitations, HarnessX makes a concrete case that harness engineering — not just model scaling — is a lever practitioners can pull now. For teams running smaller open-weight models on complex workflows, the gains here are large enough to justify evaluating harness evolution as a first step before reaching for a more expensive frontier model. The researchers plan to release the code in a future update.
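Until that code is available, the deterministic gate in the AEGIS pipeline (reject any update that regresses a previously solved task) can only be guessed at. The sketch below shows one plausible reading; the function names and the per-task pass/fail dictionaries are assumptions for illustration, not the researchers' implementation.

```python
def gate_accepts(before: dict[str, bool], after: dict[str, bool]) -> bool:
    """Reject a harness edit if any task solved before the edit is no longer solved."""
    return all(after.get(task, False) for task, solved in before.items() if solved)


def apply_if_safe(current: dict[str, bool], candidate: dict[str, bool]) -> dict[str, bool]:
    """Keep the candidate results only when the gate passes; otherwise keep the current ones."""
    return candidate if gate_accepts(current, candidate) else current


# Hypothetical per-task pass/fail results before and after two candidate edits.
baseline = {"task_a": True, "task_b": False, "task_c": True}
edit_fixes_b = {"task_a": True, "task_b": True, "task_c": True}
edit_breaks_c = {"task_a": True, "task_b": True, "task_c": False}

print(gate_accepts(baseline, edit_fixes_b))
print(gate_accepts(baseline, edit_breaks_c))
```

Because the rule is a plain set comparison with no model in the loop, an edit that fixes one task while breaking another is rejected no matter how large the net gain looks, which is how such a gate would guard against catastrophic forgetting.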
Score: 50🌐 MovesJun 24, 2026https://venturebeat.com/orchestration/xiaomis-harnessx-rewrites-its-own-ai-scaffolding-mid-task-and-smaller-models-gain-the-most
How to Opt Out of Google Search’s New AI Data Training Feature
Google’s Search history update stores media uploads from your interactions, like images used in reverse image searches, for training its AI models.
Score: 50🌐 MovesJun 24, 2026https://www.wired.com/story/how-to-opt-out-of-google-search-new-ai-data-training/
Mixbook Introduces AI Tool–Story Mode–to Help Users Turn Photo Collections Into Finished Books
Mixbook Introduces AI Tool–Story Mode–to Help Users Turn Photo Collections Into Finished Books USA Today
Score: 50🌐 MovesJun 24, 2026https://www.usatoday.com/story/special/contributor-content/2026/06/24/mixbook-introduces-ai-toolstory-modeto-help-users-turn-photo-collections-into-finished-books/90681563007/
Marketing platform JustAI raises $17 million led by Base10
AI marketing platform JustAI has secured over $17 million in Series A funding, led by Base10, with participation from Y Combinator and Peak XV Partners. This investment will fuel expansion of their engineering and go-to-market teams, enhance AI capabilities, and explore opportunities in India.
Score: 50💰 MoneyJun 24, 2026https://economictimes.indiatimes.com/tech/funding/marketing-platform-justai-raises-17-million-led-base10/articleshow/131968605.cms
CIOs agree: Agents have left the lab and are doing the shopping
CIOs agree: Agents have left the lab and are doing the shopping Computing UK
Score: 49🌐 MovesJun 24, 2026https://www.computing.co.uk/event/2026/agents-left-lab-doing-shopping
Inventor of Crispr is skeptical about AI’s impact on medical innovation
Inventor of Crispr is skeptical about AI’s impact on medical innovation The Mercury News
Score: 49🌐 MovesJun 24, 2026https://www.mercurynews.com/2026/06/24/biotech-visionary-is-skeptical-about-ais-impact-on-medical-innovation/
Stratrix Technologies Expands Global Operations To Support Rising Demand For AI-Driven Business Automation
Stratrix Technologies Expands Global Operations To Support Rising Demand For AI-Driven Business Automation USA Today
Score: 49🌐 MovesJun 24, 2026https://www.usatoday.com/press-release/story/35435/stratrix-technologies-expands-global-operations-to-support-rising-demand-for-ai-driven-business-automation/
A fintech company said its employee burned through $80,000 in tokens making a 'brainrot shooter game'
A fintech company said its employee burned through $80,000 in tokens making a 'brainrot shooter game' Business Insider
Score: 48🌐 MovesJun 24, 2026https://www.businessinsider.com/fintech-company-slash-employee-burned-through-thousands-in-tokens-2026-6
Behind Kimi K2.7 Code: An Overlooked New Paradigm in AI Coding Is Taking Shape
Moonshot AI's Kimi K2.7 Code shifts AI coding from generating code to reconstructing behavior from existing products
Score: 48🌐 MovesJun 24, 2026https://pandaily.com/kimi-k2-7-code-ai-coding-paradigm-jun2026
Exclusive: Seltz, a startup rebuilding web search for AI agents, raises $12.5 million in seed funding
Exclusive: Seltz, a startup rebuilding web search for AI agents, raises $12.5 million in seed funding Fortune
Score: 48💰 MoneyJun 24, 2026https://fortune.com/2026/06/24/exclusive-seltz-a-startup-rebuilding-web-search-for-ai-agents-raises-12-5-million-in-seed-funding/
Exoskeleton and robotic arm reduce factory lifting strain by up to 65%
More and more robots are assisting workers in factories. However, human-robot collaboration is still far from seamless. Researchers from Prof. Lorenzo Masia's team at the Technical University of Munich (TUM) have now developed a solution that enables a factory worker wearing an exoskeleton to work closely and, above all, safely, with a robotic arm. This reduces the physical strain on workers and improves production processes.
Score: 48🌐 MovesJun 24, 2026https://techxplore.com/news/2026-06-exoskeleton-robotic-arm-factory-strain.html
‘Learn, Unlearn, and Relearn’: Business Leaders on How AI Is Changing Creative Work
‘Learn, Unlearn, and Relearn’: Business Leaders on How AI Is Changing Creative Work Time Magazine
Score: 48🌐 MovesJun 24, 2026https://time.com/article/2026/06/24/cannes-pmi-time100-talk/
Accelerating Transformers Fine-Tuning with NVIDIA NeMo AutoModel
Score: 48🌐 MovesJun 24, 2026https://huggingface.co/blog/nvidia/accelerating-fine-tuning-nvidia-nemo-automodel
Upbound open-sources Modelplane to optimize inference clusters
Upbound Inc. today released Modelplane, a new open-source tool for managing artificial intelligence inference clusters. San Francisco-based Upbound is backed by $69 million from Alphabet Inc.’s GV fund, Intel Capital and others. It’s best known as the creator of Crossplane, an open-source infrastructure management engine. Crossplane is an upgraded version of the Kubernetes control plane, a […]
Score: 48🌐 MovesJun 24, 2026https://siliconangle.com/2026/06/23/upbound-open-sources-modelplane-optimize-inference-clusters/
The Next AI Coding Stack Is Multi-Assistant
Enterprise software teams are not standardizing on one AI coding assistant. They are adding many. One developer may use Cursor for repository navigation, Claude for planning, Microsoft Copilot inside the IDE, Windsurf for agentic workflows, and specialized tools for security, testing, documentation, or pull request review. Platform teams […]
Score: 48🌐 MovesJun 24, 2026https://www.tabnine.com/blog/the-next-ai-coding-stack-is-multi-assistant/
'Humans Are Still in Charge': Cannes Lions Leaders on Creativity in the AI Age
'Humans Are Still in Charge': Cannes Lions Leaders on Creativity in the AI Age Time Magazine
Score: 47🌐 MovesJun 24, 2026https://time.com/article/2026/06/23/cannes-autodesk-time100-talk/
Why AI Is Changing How Services Compete For Customers
Why AI Is Changing How Services Compete For Customers entrepreneur.com
Score: 47🌐 MovesJun 24, 2026https://www.entrepreneur.com/building-a-business/why-ai-is-changing-how-services-compete-for-customers
Digital sovereignty is no longer optional - Agentic AI has made it fundamental
How business can prepare for Agentic AI.
Score: 47🌐 MovesJun 24, 2026https://www.itnews.com.au/feature/digital-sovereignty-is-no-longer-optional---agentic-ai-has-made-it-fundamental-626693?utm_source=feed&utm_medium=rss&utm_campaign=iTnews+
HSBC report finds UAE investors lead globally in AI adoption
The findings are based on a survey of 9,993 affluent and high-net-worth individuals across 10 markets, including 703 investors in the UAE
Score: 46🌐 MovesJun 24, 2026https://www.zawya.com/en/business/banking-and-insurance/hsbc-report-finds-uae-investors-lead-globally-in-ai-adoption-dt8k4bty
Inchworm-inspired robot that crawls without rigid parts could enable remote exploration
An inchworm has provided the inspiration for a robot that can move without any rigid parts. The robot mimics a flexing muscle and can be used to inspect sewer pipes or as an explorer on the planet Mars, according to a thesis from the University of Gothenburg. The research is published on the arXiv preprint server.
Score: 46🌐 MovesJun 24, 2026https://techxplore.com/news/2026-06-inchworm-robot-rigid-enable-remote.html
Syndio bets on agentic AI with first acquisition in Seattle pay equity startup’s history
Syndio announced Tuesday that it acquired Embrace.ai, an agentic AI startup whose founders and technology will help Syndio build out its AI-powered compensation platform.
Score: 46💰 MoneyJun 24, 2026https://www.geekwire.com/2026/syndio-bets-on-agentic-ai-with-first-acquisition-in-seattle-pay-equity-startups-history/
XAI Bets on Grok’s Racy Side
XAI Bets on Grok’s Racy Side The Information
Score: 46🌐 MovesJun 24, 2026https://www.theinformation.com/articles/xai-bets-groks-racy-side
Alibaba's model never trained as an agent — and improved agent performance across seven benchmarks
Alibaba's Qwen team released Qwen-AgentWorld on Tuesday: two models trained not to act inside agent environments, but to predict what those environments return. The release covers seven domains under a single architecture: MCP, Search, Terminal, Software Engineering, Android, Web, and OS.
The release extends Alibaba's recent push into autonomous agents. Qwen3.7-Max, released in May, was built around a 35-hour autonomous execution capability.
That shift targets a ceiling teams training agents at scale run into directly. Real search engines surface whatever results exist, with no mechanism to inject controlled conditions. Live terminals do not allow injecting a low-disk-space condition on demand. Agent training is bounded by what production environments will surface, with no systematic way to expose the edge cases agents will need to handle but rarely encounter in training.
The research team trained agents inside the resulting simulator and found performance gains that exceeded what training against real environments alone produced. In a separate test, using world model training as a warm-up before agentic fine-tuning improved performance across seven benchmarks, including three the model had never seen during training.
The paper accompanying the release identified a gap in prior agent research: "We argue that world modeling is a crucial missing piece in the path to general agents."
Qwen-AgentWorld trains on what environments return, not what agents should do
Most agent models are trained to answer one question: given what the environment just showed me, what should I do next? Qwen-AgentWorld is trained to answer the inverse: given what the agent just did, what will the environment show next? That reversal is the core of what the paper calls a language world model: instead of optimizing for action selection, the model learns to predict the next environment state across all seven domains under a single training objective.
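The inversion described above can be illustrated with a minimal sketch. It assumes a trajectory is a list of (observation, action, next observation) steps. The `Step` fields, the function names, and the `>>>` prompt separator are hypothetical choices for illustration, not the format used in the Qwen-AgentWorld paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    observation: str       # what the environment showed before the action
    action: str            # what the agent did
    next_observation: str  # what the environment returned afterwards


def policy_examples(trajectory: List[Step]) -> List[Tuple[str, str]]:
    """Agent-style pairs: the input is the observation, the target is the action."""
    return [(s.observation, s.action) for s in trajectory]


def world_model_examples(trajectory: List[Step]) -> List[Tuple[str, str]]:
    """World-model-style pairs: the input is observation plus action,
    the target is the next environment state."""
    return [(f"{s.observation}\n>>> {s.action}", s.next_observation) for s in trajectory]


# A toy terminal trajectory (invented data, for illustration only).
trajectory = [
    Step("$ ", "ls", "notes.txt"),
    Step("notes.txt", "cat notes.txt", "hello"),
]

for prompt, target in world_model_examples(trajectory):
    print(repr(prompt), "->", repr(target))
```

Both functions read the same logged trajectory; only the choice of input and target differs, which is the sense in which the two objectives are inverses of each other.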
Prior work was narrower: WebWorld, an earlier Qwen project from February, covered web environments only; Snowflake's Agent World Model, published the same month, generates code-driven SQL-backed environments rather than training a model to predict states. Qwen-AgentWorld is the first to span seven domains in a single model, with environment modeling baked in from the earliest pretraining stage.
Alibaba trained both models in three stages on more than 10 million environment interaction trajectories from real agent runs. Stage one teaches the model how environments behave: file systems, terminal states, browser DOM changes, API responses. Stage two trains the model to reason through what comes next before predicting it. Stage three, reinforcement learning, tightens predictions using rule-based checks and open-ended quality scoring.
Both models are Mixture-of-Experts designs, so only a fraction of parameters are active per token. The 35B model activates 3B; the 397B activates 17B. Both support 256K context windows. For GUI domains (Android, Web, and OS), the models work from textual accessibility trees and UI view hierarchies rather than screenshots. The 35B model weights and AgentWorldBench are available under Apache 2.0; the 397B weights are not publicly released.
The training results matter more than the benchmarks
The benchmark scores show how accurately the models predict what environments return. The training results show what that prediction capability is actually worth for teams building agents, and those are the numbers that matter more.
According to the researchers, agents trained inside controlled simulation outperformed agents trained in real environments. Injecting targeted perturbations (partial responses that force extra agent steps, and edge cases real environments rarely surface) pushed MCPMark from 24.6 to 33.8.
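To make the idea of injecting perturbations concrete, here is a minimal sketch of a wrapper that forces a low-disk-space error or a partial response on demand. It is a rule-based stand-in, not the learned world model the article describes. `ToyTerminal`, `PerturbedEnv`, and their parameters are hypothetical names introduced for illustration.

```python
from typing import Optional


class ToyTerminal:
    """Hypothetical stand-in for a simulated terminal environment."""

    def step(self, command: str) -> str:
        if command.startswith("write "):
            return "ok"
        return f"ran: {command}"


class PerturbedEnv:
    """Wraps an environment and injects conditions a live system would not produce on demand."""

    def __init__(self, env, low_disk: bool = False, truncate_to: Optional[int] = None):
        self.env = env
        self.low_disk = low_disk        # force write commands to fail
        self.truncate_to = truncate_to  # cut responses short to simulate partial output

    def step(self, command: str) -> str:
        if self.low_disk and command.startswith("write "):
            return "error: No space left on device"
        out = self.env.step(command)
        if self.truncate_to is not None and len(out) > self.truncate_to:
            return out[: self.truncate_to] + " [truncated]"
        return out


env = PerturbedEnv(ToyTerminal(), low_disk=True)
print(env.step("write report.txt"))
```

An agent trained against such a wrapper meets the failure case as often as the training recipe requires, rather than only when a real system happens to produce it.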
On Search, agents trained in entirely fictional worlds transferred to real search tasks, pushing WideSearch F1 Item from 34.02 to 50.31 on the open 35B model. A separate warm-up test showed that world model pretraining improved BFCL v4 from 62.29 to 71.25 and Claw-Eval from 53.60 to 64.88 with no agent-specific fine-tuning.
Researchers flag the benchmark and the overfitting risk
The paper drew immediate reaction from AI researchers on X. The concerns they raised map to what practitioners need to verify before acting on the findings.
On the training objective and transfer result, the assessment from one AI/ML researcher was direct. "Every other 'agent' model has been trained to act in environments," wrote @drawais_ai, who has a PhD background and regularly breaks down AI papers. "Qwen flipped the question. They trained the model to predict the environment itself... That predictive knowledge then transfers to agent tasks even without any agent-specific fine-tuning." He identified the Controllable Sim RL result as "the receipt" for the claim that synthetic training can substitute for real-environment RL at scale, and flagged that three of the seven transfer benchmarks were entirely out of domain.
The benchmark margin drew immediate scrutiny. "AgentWorldBench is a benchmark Alibaba built and published in the same paper," wrote @TheSignal_Desk, who focuses on honest takes and key numbers in AI research. "They wrote the test, then topped it by 0.46."
The sim-RL methodology is the result @limalemonnn, who builds production AI agents, identified as most in need of scrutiny before the headline claim gets quoted. "Sim-trained agents traditionally overfit to the simulator's quirks," they wrote. "If the world model is too clean, the agent learns the model, not the task." They pointed to the paper's holdout split as the section practitioners should read before acting on the numbers.
The overfitting concern has a partial answer in the data. The gap between uncontrolled Sim RL (MCPMark 24.6) and controlled Sim RL (MCPMark 33.8) suggests the gains depend substantially on the controllability mechanism, not simulation accuracy alone. The fictional-world Search result, where agents trained on invented environments transfer to real search tasks, is the paper's strongest evidence against the overfitting concern.
What this means for teams building agentic pipelines
For AI engineering teams building and scaling agentic pipelines, this work signals a meaningful shift in how agent capability gets built. Teams training agents at scale now have a third option between real-environment RL and static benchmarks: controlled simulation that injects the edge cases production won't surface.
Synthetic environments are a legitimate training layer. Controlled simulation that injects conditions real environments won't produce is a complement to real-environment RL, not a shortcut around it.
What a model learns before agent training starts matters more than most pipelines account for. The warm-up finding (performance gains across unseen benchmarks with no agent-specific training) suggests environment grounding belongs earlier in development than current practice.
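For readers comparing the reported numbers, the following sketch computes the absolute and relative gain for each before/after pair quoted in the article. The scores are taken from the text above. The `gains` helper is a hypothetical name, and the scales are not necessarily comparable across benchmarks.

```python
# Before/after scores as quoted in the article.
results = {
    "MCPMark": (24.6, 33.8),
    "WideSearch F1 Item": (34.02, 50.31),
    "BFCL v4": (62.29, 71.25),
    "Claw-Eval": (53.60, 64.88),
}


def gains(before: float, after: float) -> tuple:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = after - before
    relative_pct = 100.0 * absolute / before
    return absolute, relative_pct


for name, (before, after) in results.items():
    absolute, relative_pct = gains(before, after)
    print(f"{name}: +{absolute:.2f} points ({relative_pct:.1f}% relative)")
```

Relative gains put results with different baselines on a common footing, though they say nothing about the overfitting and self-authored-benchmark concerns raised above.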
Score: 45🌐 MovesJun 24, 2026https://venturebeat.com/technology/alibabas-model-never-trained-as-an-agent-and-improved-agent-performance-across-seven-benchmarks