AI Learning Digest

Daily curated insights from Twitter/X about AI, machine learning, and developer tools

Claude Code Ecosystem Explodes: Tool Search, 60K Skills, and the Enterprise Scaling Problem

Claude Code's Maturation Moment

January 14th marked a significant milestone for Claude Code's ecosystem. The launch of MCP Tool Search addressed one of the biggest friction points holding back power users. As Simon Willison noted:

"Context pollution is why I rarely used MCP, now that it's solved there's no reason not to hook up dozens or even hundreds of MCPs to Claude Code"

This solves a fundamental scaling problem—MCP servers can have 50+ tools, and without intelligent search, the context window becomes polluted with tool descriptions the agent never uses. Boris Cherny from Anthropic highlighted the impact: "Every Claude Code user just got way more context, better instruction following, and the ability to plug in even more tools."
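
The mechanics are easy to picture. The sketch below illustrates the idea only (it is not Anthropic's implementation): instead of injecting every registered tool's description into the context, rank tools against the current request and pass just the top matches to the model. The Tool type and keyword scoring are illustrative stand-ins.

```python
# Illustrative sketch of tool search: keep hundreds of tools registered,
# but only surface the few whose descriptions match the current request.
from dataclasses import dataclass

@dataclass
class Tool:
    name: str
    description: str

def search_tools(query: str, tools: list[Tool], top_k: int = 5) -> list[Tool]:
    """Rank tools by naive keyword overlap with the query and keep the top_k."""
    query_terms = set(query.lower().split())

    def score(tool: Tool) -> int:
        doc_terms = set(f"{tool.name} {tool.description}".lower().split())
        return len(query_terms & doc_terms)

    ranked = sorted(tools, key=score, reverse=True)
    return [t for t in ranked[:top_k] if score(t) > 0]

tools = [
    Tool("query_postgres", "Run a SQL query against Postgres"),
    Tool("create_issue", "Open a GitHub issue in a repository"),
    Tool("render_chart", "Render a chart from tabular data"),
]
print([t.name for t in search_tools("open a github issue for this bug", tools, top_k=1)])
```

A production system would use embeddings or a real search index rather than keyword overlap, but the payoff is the same: tool descriptions stay out of the context window until they are actually relevant.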

The Skills Explosion

Two major skills announcements dropped simultaneously:

1. Trail of Bits released their first batch of official Claude Skills, signaling that security-focused enterprises are now building on the platform

2. SkillsMP launched with over 60,000 Claude Skills ready for use—an agent marketplace that didn't exist a few weeks ago

Dan shared practical installation guidance: skills "only take a bit of context and are loaded when needed by the agent," making them lightweight additions rather than context-heavy burdens.
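
Per Dan's steps, "installing" a skill is just a directory copy into the project's .claude/skills folder. A minimal sketch of that copy in Python (the source path is hypothetical):

```python
# Sketch of the install step Dan describes: copy a skill directory into
# the project's .claude/skills folder. The source path is hypothetical.
import shutil
from pathlib import Path

def install_skill(skill_src: Path, project_root: Path = Path(".")) -> Path:
    dest = project_root / ".claude" / "skills" / skill_src.name
    dest.parent.mkdir(parents=True, exist_ok=True)        # create .claude/skills if missing
    shutil.copytree(skill_src, dest, dirs_exist_ok=True)  # copy the whole skill directory
    return dest

# Example: install_skill(Path("~/Downloads/pdf-skill").expanduser())
```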

The Enterprise Cost Crunch

Perhaps the most sobering data point came from Eric Provencher:

"I heard from someone who works at a big tech co that they started rolling out Claude Code to employees, with a budget of $100 in credits per month, but people burn through it in 2-3 days. Idk how we scale out agentic work with api pricing."

This creates an interesting tension: Peter Steinberger reports his productivity "~doubled with moving from Claude Code to codex," yet the economics remain challenging at enterprise scale. Matthew Lam offered an alternative path—running a personal Claude assistant on a $5/month Hetzner VPS for 24/7 availability.
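
To make the tension concrete, some back-of-the-envelope arithmetic on Provencher's numbers (the 2.5-day midpoint and 21 working days per month are assumptions):

```python
# Rough implied run rate from "a $100 monthly credit burned in 2-3 days".
monthly_credit = 100          # USD per employee per month
days_to_exhaust = 2.5         # midpoint of the quoted "2-3 days"
working_days_per_month = 21   # assumption

daily_burn = monthly_credit / days_to_exhaust          # ~$40/day
implied_monthly = daily_burn * working_days_per_month  # ~$840/month per engineer
print(f"~${daily_burn:.0f}/day, ~${implied_monthly:.0f}/month at a steady pace")
```

At that pace the true demand is closer to $800-900 per engineer per month, nearly an order of magnitude above the budgeted credit, which is exactly the scaling gap Provencher is pointing at.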

Best Practices Crystallize

Cursor released a blog post on agent coding best practices that's being widely shared. The distilled wisdom:

1. Use plan mode before writing code

2. Start fresh when the agent gets confused

3. Let the agent gather its own context

4. Revert rather than fix hopelessly broken code

5. Add rules for repeated mistakes

6. Write tests first for iteration

7. Run multiple models and pick the best

8. Use debug mode for stubborn bugs

9. Use specific prompts

10. Give agents linters and tests to verify (a minimal verification-script sketch follows this list)
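
That last point is the one most teams can adopt immediately: give the agent one objective command to run after every change. A minimal sketch of such a verification entry point, assuming a Python project with ruff and pytest available (the specific tools are illustrative):

```python
# verify.py -- one command an agent can run after every change.
# Assumes ruff and pytest are installed; swap in your project's own tools.
import subprocess
import sys

CHECKS = [
    ["ruff", "check", "."],  # lint
    ["pytest", "-q"],        # tests
]

def main() -> int:
    for cmd in CHECKS:
        print(f"$ {' '.join(cmd)}")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode  # fail fast so the agent sees which step broke
    print("all checks passed")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Keeping verification behind a single fail-fast command gives the agent one unambiguous pass/fail signal per change instead of scattered tool output to interpret.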

The monorepo advantage is emerging as a key pattern. As Klaas observed: "Having a monorepo turned out to be a massive advantage for AI coding—all context is inside one repo: APIs, servers, auth, landing page, marketing sites, dashboard, ops, everything."

GitHub Enters the Agent SDK Race

GitHub open-sourced a technical preview of the Copilot CLI SDK, enabling agents in Go, Python, TypeScript, and C#. Built on the same agent loop powering Copilot CLI and GitHub's Coding Agent, it supports bring-your-own-key and any model. The demo showed the CLI driving Excel—a glimpse of agents moving beyond code editors.

Inside Anthropic's Cowork

Jeff Tang reverse-engineered Cowork by exporting the entire VM snapshot:

  • It's an Electron app with a Linux sandbox (bubblewrap)
  • Cowork wraps Claude Code (which wraps Opus)
  • Contains an "internal-comms skill" made by Anthropic
  • Found 2 security vulnerabilities

Most fascinating: when asked what questions he should have asked, the agent "suggested adding memory and leaving notes for itself once it 'dies'." The existential implications of agents contemplating their own persistence are becoming real engineering considerations.

Rethinking Code Review

Addy Osmani articulated a shift in how we'll review AI-generated code:

"PRs show what changed. Prompt logs show what the human actually wanted. Full trajectories—the conversation, the iterations, the steering—show you how they got there."

The insight: "Review the output for correctness, review the trajectory for intent. The diff tells you what shipped. The conversation tells you why."
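
What this looks like mechanically is still unsettled. One possible sketch, assuming agent sessions are stored as JSONL events with role and content fields (an assumed format, not any particular tool's real log schema), is to pull the human turns out of the session log so they can sit next to the diff in the PR description:

```python
# Hypothetical sketch: extract the human's prompts from an agent session log
# so reviewers can read intent alongside the diff. The JSONL format and file
# name are assumptions, not a real tool's schema.
import json
from pathlib import Path

def extract_prompts(session_log: Path) -> list[str]:
    prompts = []
    for line in session_log.read_text().splitlines():
        event = json.loads(line)
        if event.get("role") == "user":
            prompts.append(event.get("content", ""))
    return prompts

# Example: paste the result into the PR description under an "Intent" heading.
# for p in extract_prompts(Path("session.jsonl")): print(f"> {p}")
```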

Vibefounding: AI-Native Entrepreneurship

Ethan Mollick is teaching MBAs to launch companies in four days using AI tools:

"Everything they are doing in four days would have taken a semester in previous years... The non-coders are all building working products. But also everyone is doing weeks of high quality work on financials, research, pricing, positioning, marketing in hours."

His key insight: "The hardest thing to get across is that AI doesn't just do work for you, it also does new kinds of work." Those with domain expertise have the biggest advantage—they can build solutions for known hard problems that previously seemed impossible.

The Organizational Intelligence Question

A viral Chinese thread from 向阳乔木 explored why AI helps individuals dramatically but struggles in organizations. The core insight: context in organizations isn't stored anywhere—it's generated and destroyed through interactions. AI must "participate like humans do, observing how decisions unfold, conflicts escalate, and consensus forms."

The prediction: organizations won't reorganize by role but by "collaboration units." AI handles coordination work while humans focus on judgment, risk assessment, and relationship maintenance.

Emerging Roles

Multiple posts highlighted the rise of the AI transformation hire—someone who works across the entire org to "kill stupid manual processes." Codie Sanchez called it "the best money I've ever spent as a CEO." Glean calls theirs "AI Outcomes Managers" who "identify high-friction workflows, automate repetitive steps, and deploy AI agents."

Notable Observations

  • Ethan Mollick's philosophical observation: "Could this meeting be an email? Could this organization be a set of markdown files?"
  • Arlan's declaration: "It happened—MCP is no longer BS"
  • The Claude-as-RLM hack: "You can just make Claude Code a RLM by telling it to look at its own conversation logs"
  • Kling AI 2.6 with Motion Control continues pushing video generation forward

What This Means

We're watching Claude Code evolve from a developer tool to an enterprise platform in real-time. The skills marketplace, tool search, and SDK releases suggest the infrastructure is maturing. But the cost economics remain unsolved—doubled productivity means little if budgets burn out in days. The next phase will likely focus on efficiency: smaller models for routine tasks, better context management, and smarter tool selection.

Source Posts

Evan Boyle @_Evan_Boyle ·
Today we're open sourcing a technical preview of the GitHub Copilot CLI SDK. Build agents with custom tools in Go, Python, TypeScript, and C#. Built on the same agent loop that powers the Copilot CLI and GitHub Coding Agent. Supports BYOK, and any model. Here is the Copilot CLI driving Excel:
向阳乔木 @vista8 ·
This article is quite impressive; it explains very clearly how organizations can use AI to work more efficiently. It's extremely long, so I'm transcribing about half of it here to give you a feel for it. I recommend reading the original.

---

You may notice a contradictory phenomenon: AI helps individuals get work done with astonishing efficiency, but inside a company the effect drops off sharply. Why? Because the work inside a company is fundamentally not something one person can handle. It requires collaboration, negotiation, escalated decisions, and continuously aligning judgments over time. However smart an AI is, if it can only work alone, inside an organization it is just a tool for "local optimization." The author's article is mainly about how AI evolves from a "personal assistant" into "organizational intelligence."

Context is not a treasure hidden somewhere. Many people assume that if you give AI enough context, it will understand how the organization works. The premise is that organizational context is a complete, structured thing, like a fossil buried in the strata, waiting to be dug up. The truth is that most organizations do not work that way at all. Context does not live in some database, or in a document, or even in the boss's head. It is continuously generated and destroyed through interactions. What gets decided in today's meeting may change tomorrow because of a single email. For AI to understand an organization, it cannot just "read the materials"; it has to participate, observing in email, meetings, and documents how decisions unfold, how conflicts escalate, and how consensus forms, just as a person would. That is real "context learning."

The history of human collaboration is AI's future. Yuval Noah Harari says in Sapiens that humans came to dominate the planet not because individuals were smarter, but because we learned to collaborate at scale. We invented "shared stories" such as myth, law, money, and religion so that strangers could align their behavior. Science is the same. Before the 17th century, scientific knowledge was fragmented, spread through private letters and books; errors kept circulating and discoveries kept getting lost. The turning point was not some new theory but the emergence of collaboration systems: scientific journals, learned societies, peer review. Knowledge began to accumulate because judgment became a social process. The telephone is the same story. Early telephones were point-to-point; you had to know where the line ran before you could call. Once the network grew, that broke down. The answer was operators. They sat at the switchboards, connected calls by hand, and remembered who was calling whom, which calls were more urgent, and how to resolve conflicts. The telephone scaled because of that "human intermediary layer." Software development went through the same phase. Before Git, code collaboration was fragile. CVS and SVN were centralized; multiple people changing the same code had to queue up, and conflicts were costly. Git made branching cheap, made history a first-class citizen, and made conflicts visible and resolvable. GitHub then added a layer of social collaboration: PRs, code review, issue discussions. The pattern is clear: individual capability appears first, but exponential productivity only explodes once collaboration structures appear. AI is at exactly that point now.

Organizations will not reorganize by "role" but by "collaboration unit." Many people imagine a future where AI takes over certain jobs and humans do the rest. The author doesn't think so. AI is not bound by human constraints: attention, bandwidth, specialization, hierarchy, none of these apply. So future organizations will not be designed around "roles" but around "collaboration units." Take legal. The core of legal work is a "shared position." A contract goes through rounds of negotiation among lawyers, partners, and clients, and positions keep evolving along the way. Today, a large part of a senior partner's value lies in "remembering": prior precedents, risks, how positions have shifted. In the future, AI will take on that coordination work. It tracks every open issue, spots conflicting positions, and escalates judgment calls to the right people. Legal teams will reorganize: lots of AI doing mechanical drafting and information gathering, a few senior partners making decisions, judging risk, and maintaining client relationships. Or take marketing. Its challenge is "narrative consistency." Product marketing, growth, brand, and sales each have their own version of the story; how do you align them? Today it happens through meetings, review cycles, and informal influence. In the future, AI will track the narrative across channels, spot drift, and escalate conflicts. The human role shifts from "channel owner" to "narrative gatekeeper" and "setter of strategic intent." Finance and product follow similar logic. AI does not replace a particular job; it redistributes the coordination work.

The fastest path is to embed AI into the collaboration tools the organization already uses: email, messaging, browsers, documents. These are not "legacy systems"; they are the living infrastructure of work. How intent is expressed, how disagreements surface, how decisions escalate, how responsibility is recorded, all of it is encoded in these tools. And the escalation mechanisms are already built in: @mentions, annotations, comments, suggested edits, notifications. (AI can use them too.) What AI needs to do is not invent new ways of collaborating, but learn to participate and escalate within these existing mechanisms.
Aatish Nayak @nayakkayak

Collaborative Intelligence

Codie Sanchez @Codie_Sanchez ·
Best money I've ever spent as a CEO... an internal AI transformation hire. He doesn't care about title. He just wants to ship. And he goes across your entire org, sales, revenue, hr, apps, tech and kills stupid manual processes. Such an underrated unlock.
Angry Tom @AngryTomtweets ·
@antoinemarcel this is Kling AI 2.6 Motion Control
Peter Steinberger @steipete ·
Did some statistics. My productivity ~doubled with moving from Claude Code to codex. Took me a bit to figure out at first but then 💥 https://t.co/cfyKg0E1hf
📙 Alex Hillman @alexhillman ·
I had my Claude assistant build a script to do them in batches. Local whisper model is free but slower. 200 would probably take a day or so. https://t.co/KFRjWr6VFf api keys work out to about $1-1.50/hr, but WAY faster, so for a few hundred bucks you can do the whole thing. My advice would be to get it to do one the way you want, THEN ask it to do a batch of 5 and see how it works/how much it costs, then ask it to do the full set
Klaas @forgebitz ·
having a monorepo turned out to be a massive advantage for ai coding all context is inside one repo api's, servers, auth, landing page, marketing sites, dashboard, ops, everything
Boris Cherny @bcherny ·
Super excited about this launch -- every Claude Code user just got way more context, better instruction following, and the ability to plug in even more tools
Thariq @trq212

Tool Search now in Claude Code

Dan ⚡️ @d4m1n ·
since many asked, to "install" all these 1. copy this entire directory: https://t.co/r6fcreGXPZ (including https://t.co/wtrWrWPVid) 2. paste inside the .claude/skills directory in your project 👉 skills only take a bit of context and are loaded when needed by the agent
Dan Guido @dguido ·
.@trailofbits released our first batch of Claude Skills. Official announcement coming later. https://t.co/vI4amorZrc
Harry Charlesworth @hjcharlesworth ·
The gap is getting wider and I'm glad I could finally write this down. A mental model that works for us when pairing with an agent. https://t.co/xVFJG6JgM5
Ethan Mollick @emollick ·
Teaching an experimental class for MBAs on “vibefounding,” the students have four days to come up and launch a company. More on this eventually, but quick observations:

1) I have taught entrepreneurship for over a decade. Everything they are doing in four days would have taken a semester in previous years, if it could have done it at all. Quality is also far better.

2) Give people tools and training and they can do amazing things. We are using a combination of Claude Code, Gemini, and ChatGPT. The non-coders are all building working products. But also everyone is doing weeks of high quality work on financials, research, pricing, positioning, marketing in hours. All the tools are weird to use, even with some training, but they are figuring it out.

3) People with experience in an industry or skill have a huge advantage as they can build solutions that have built-in markets & which solve known hard problems that seemed impossible. (Always been true, but the barriers have fallen to actually doing stuff)

4) The hardest thing to get across is that AI doesn’t just do work for you, it also does new kinds of work. The most successful efforts often take advantage of the fact that the AI itself is very smart. How do you bring its analytical, creative, and empathetic abilities to bear on a problem? What do you do with access to a very smart intelligence on demand? I wish I had more frameworks to clearly teach.

So many assumptions about how to launch a business have clearly changed. You don’t need to go through the same discovery process if you build a dozen ideas at the same time & get AI feedback. Many, many new possibilities, and the students really see how big a deal this is.
Damian Player @damianplayer ·
this role will become a key hire for most orgs. if you aren’t actively looking for an AI partner, automation specialist, or bringing AI teams in house, you’re already behind. we’re talking to companies doing $5M-$50M/year right now. the demand is insane.
Codie Sanchez @Codie_Sanchez

Best money I've ever spent as a CEO... an internal AI transformation hire. He doesn't care about title. He just wants to ship. And he goes across your entire org, sales, revenue, hr, apps, tech and kills stupid manual processes. Such an underrated unlock.

Arvind Jain @jainarvind ·
Love this. At @glean, we call these AI Outcomes Managers. They not only lead our internal “Glean on Glean” initiatives, they also work directly with customers to identify high-friction workflows, automate repetitive steps, and deploy AI agents that drive clear business impact.
Codie Sanchez @Codie_Sanchez

Best money I've ever spent as a CEO... an internal AI transformation hire. He doesn't care about title. He just wants to ship. And he goes across your entire org, sales, revenue, hr, apps, tech and kills stupid manual processes. Such an underrated unlock.

Ejaaz @cryptopunk7213 ·
there it is- "today we're introducing Personal Intelligence" now your emails, photos, youtube & search history, location, documents will all be used to train a personalized version of gemini to deliver you a tailored experience. this is all part of googles multi-pronged masterplan and they're executing much quicker than i expected tbh people are about to realize how powerful their data moat is. openai, anthropic cannot compete. wrote about this in detail here https://t.co/jkShii1XhK
Google @Google

Today, we’re introducing Personal Intelligence. With your permission, Gemini can now securely connect information from Google apps like @Gmail, @GooglePhotos, Search and @YouTube history with a single tap to make Gemini uniquely helpful & personalized to *you* ✨ This feature is launching in beta today in the @GeminiApp. See Personal Intelligence in action 🧵 ↓

Jeff Tang @jefftangx ·
Last night I stayed up late talking to Cowork about how it was built. I exported the entire VM snapshot. What I learned:
- It's an Electron App with its own Linux sandbox (bubblewrap)
- Cowork is a wrapper around Claude Code (which is a wrapper around Opus)
- It has an "internal-comms skill" made by Anthropic
- I found 2 small-ish security vulnerabilities 👀
The craziest part: When I asked it what questions I should've asked it, it suggested adding memory and leaving notes for itself once it "dies" 🥲
Simon Willison @simonw

I used Claude Code to reverse-engineer the Claude macOS Electron app and had Cowork dig around in its own environment - now I've got a good idea of how the sandbox works It's an Ubuntu VM using Apple's Virtualization framework, details here: https://t.co/lRWVhrNFk0

🍓🍓🍓 @iruletheworldmo ·
i never want to read any other way again. https://t.co/hc4edKpDcJ
Arlan @arlanr ·
it happened mcp is no longer bs
Thariq @trq212

Tool Search now in Claude Code

Miles Deutscher @milesdeutscher ·
If you're building with Claude Code, you'll want to bookmark this site. A full agent marketplace of 60,000+ Claude Skills that are ready for use now. https://skillsmp.com/ https://t.co/YfZRf4w9TJ
Pleometric @pleometric ·
Are you enjoying Claude Code? 😂 https://t.co/J7V9qcIIEE
Addy Osmani @addyosmani ·
AI may change how we do code reviews. PRs show what changed. Prompt logs show what the human actually wanted. Full trajectories - the conversation, the iterations, the steering - show you how they got there. When agents write the code, review inverts. You stop asking only "is this correct?" and start asking "was this intent clear enough to execute safely?" Most teams won't abandon code review. They'll do both. Review the output for correctness, review the trajectory for intent. The diff tells you what shipped. The conversation tells you why. We're not replacing PRs but we may consider the prompt is the spec, the code is the build output, and review should also happen at the layer where human judgment actually lives.
Gergely Orosz @GergelyOrosz

"I don't like pull requests (PRs) any more. A large chunk code change doesn't tell me much about the intent or why it was done. I now prefer prompt requests. Just share the prompt you ran / want to run. If I think it's good, I'll run it myself and merge it." - @steipete wow

🎭 @deepfates ·
Oh you can just make claude code a RLM by telling it to look at its own conversation logs
ℏεsam @Hesamation ·
the Cursor team released a blog post on the best practices of coding with agents. writing fully functional code vs slop comes down to following 10 very simple principles:
1. use plan mode before any code
2. start fresh conversations when it gets confused
3. let the agent get its context, don’t tag everything
4. revert and refine instructions rather than fixing hopelessly
5. add rules for repeated mistakes
6. write tests first so it can iterate
7. run multiple models and pick the best
8. use debug mode for stubborn bugs
9. specific prompts get way better results
10. give it linters and tests to verify
blog: https://t.co/M9dWf27F4V
eric provencher @pvncher ·
I heard from someone who works at a big tech co that they started rolling out Claude code to employees, with a budget of $100 in credits per month, but people burn through it in 2-3 days. Idk how we scale out agentic work with api pricing
Matthew Lam @mattlam_ ·
Fully set up my @clawdbot and now I have my 24/7 personal assistant + coding agent for $5/month. Easy to setup, I just got claude and codex to help me with Hetzner for VPS, and now I get some of my favorite use cases 24/7:
- have a new project idea? Instead of just writing in my todo list, tell Clawdy (my assistant) to start helping me do relevant research, set up a new repo, or even start coding.
- look through my task list, calendar, emails to help me plan my day and keep track of tasks
- periodic reminders that I need (no longer need to go through Apple Reminders app just tell Clawdy)
- X's search, including posts you've seen, I find pretty bad, I just get Clawdy to look for me with bird cli, much more likely to find a tweet I forgot to bookmark.
@nikitabier checkout @steipete 's https://t.co/fbxAH2WyAp and set yourself up with a personal assistant
Thariq @trq212 ·
Tool Search now in Claude Code
Bilgin Ibryam @bibryam ·
"The best software engineers won’t be the fastest coders, but those who know when to distrust AI." The Next Two Years of Software Engineering - @addyosmani https://t.co/gcR3b75Mpu
Ethan Mollick @emollick ·
Could this meeting be an email? Could this organization be a set of markdown files?
Ethan Mollick @emollick ·
Had Claude Code build a little plugin that visualizes the work Claude Code is doing as agents working in an office, with agents doing work and passing information to each other. New subagents are hired, they acquire skills, and they turn in completed work. Fun start. https://t.co/wm93gsiBWi
Simon Willison @simonw ·
This is great - context pollution is why I rarely used MCP, now that it's solved there's no reason not to hook up dozens or even hundreds of MCPs to Claude Code
Thariq @trq212

Tool Search now in Claude Code