China's DeepSeek vs The World; Japan's Sakana invents a shape-shifting AI Model
January Week 2: Jan 14 - Jan 20
Hi friends 👋,
In this week’s edition of Coconut Capitalists, we’re diving into:
DeepSeek versus China’s Frontier AI Labs
Sakana AI creates a shape-shifting AI Model
Quick-Fire Startup News from Around Asia
Let’s get into it.
DeepSeek versus China’s Frontier AI Labs
The Scoop
DeepSeek, an AI lab based in Hangzhou, China, just fully open-sourced a model that competes with OpenAI's o1. DeepSeek's model, named R1, nearly matches OpenAI's best-in-class, closed-source o1 on almost every benchmark. This includes advanced math, language understanding, and the most complex multi-step reasoning questions.
Just to put the model's capabilities in perspective, on MMLU - a benchmark that tests an AI model's knowledge across 57 subjects including law, mathematics, ethics, and medicine - DeepSeek's open-source R1 model achieved a 90.8% accuracy rate, nearly matching OpenAI's o1 model's score of 91.8%.
In addition to DeepSeek giving away this model as a Chinese New Year present to software developers, it's worth noting that these developers can also access a hosted version of the model via DeepSeek's website. And R1 comes at a price point 90% less expensive than using OpenAI's o1 API.
Similarities between AI Firms in China & The US
It's important to note that DeepSeek is quite a different company (in a good way) compared to its Chinese competitors. In China, virtually all of the frontier labs are funded by the same two tech giants, each valued in the hundreds of billions of dollars: Alibaba & Tencent.
In the US, maybe you'd have venture firms like Sequoia, A16Z, and Founders Fund competing to fund a new Frontier Lab, in addition to the obvious players like Microsoft and Amazon - but in China, nope - it's mostly just Alibaba & Tencent.
Just to give context, these are arguably the top 5 AI Startup Labs in China:
Baichuan, which has raised $1.4 Billion
Zhipu, which has raised $1.17 Billion
Moonshot, which has raised $1 Billion
MiniMax, which has raised $600 million
DeepSeek, I actually don’t know
DeepSeek is the only one of the 5 not funded by Alibaba & Tencent. And it's the only one that doesn't publicly disclose its funding total.
DeepSeek is instead funded as a spinout startup by a quantitative hedge fund called High-Flyer Capital. High-Flyer Capital is a large firm, with $8 billion in assets under management as of 2025, and with over 160 employees.
So as of today, in China we have:
The "Big 4" Tech Companies, each with their own AI Labs: Tencent (Hunyuan), Alibaba (Qwen), Baidu (Ernie), and ByteDance (Doubao)
The "5 Dragons" of AI Startups, each with their own AI Labs: Zhipu, MiniMax, Moonshot, Baichuan, and DeepSeek
And to be fair, the Alibaba/Tencent strategy of running a corporate AI Lab while simultaneously funding competitive startup AI Labs isn't that different from the US. Microsoft has its own internal AI Lab with Phi & is backing OpenAI. Similarly, Amazon has its own internal AI Lab with Nova & is backing Anthropic.
Why it Matters
The difference between China's tech giants and the ones in the US is that Alibaba & Tencent are supposed to be competitors, but instead, they're co-investing partners - better known as BFFs. Unlike the US tech giants Microsoft & Amazon, which are each primarily backing a single horse in the race, the Chinese tech giants are attempting to back every horse to "guarantee" a winner.
The open-sourcing of frontier AI models in China, the US, and across the world will continue driving prices down for all software developers. It seems that when a frontier lab like DeepSeek is slightly behind a closed-source leader like OpenAI, it will opt to open-source its model. This drives developer adoption, builds brand credibility, and makes it easier to hire world-class AI scientists. But is this strategy sustainable? Or will it continue driving the price of intelligence closer and closer to zero?
Sakana AI creates a shape-shifting AI Model
The Scoop
Are you familiar with the Pokémon Ditto? When it goes into battle, it can shape-shift into another Pokémon to give itself an advantage. For example, if facing a Fire-type Pokémon, it can instantly transform into a Water-type. This demonstrates adaptive learning – Ditto assesses its environment and the challenge of defeating its opponent, then transforms into the most effective form for the situation.
What if artificial intelligence could do the same, dynamically adapting to unfamiliar situations by restructuring its own architecture in real-time? This is the vision behind Transformer², a groundbreaking research paper from Sakana AI. As a quick recap, Sakana AI is Japan's leading foundation model company, having raised over $344 million from the likes of Nvidia, Lux Capital, and Khosla Ventures.
Diving into the Tech
The journey toward truly adaptive AI has seen various approaches. Traditional methods rely on reinforcement learning from human feedback (RLHF) or on synthetic data generated through model interactions. At the architecture level, the most prominent approach has been the Mixture of Experts (MoE), which works like a sophisticated switchboard - maintaining 8-32 expert networks and routing inputs between them while keeping individual parameters static. It's like having multiple specialized Dittos, each trained for a specific type of battle, rather than one that can truly transform.
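For readers who like to see the switchboard idea in code, here is a minimal sketch of top-k expert routing. It is an illustration under stated assumptions, not any particular lab's implementation: the function name `moe_forward`, the random weights, and the choice of 8 experts with 2 active per input are all made up for the example.

```python
import numpy as np

def moe_forward(x, experts, router, top_k=2):
    """Send input x to the top_k highest-scoring experts and mix their outputs."""
    logits = router @ x                         # one routing score per expert
    chosen = np.argsort(logits)[-top_k:]        # indices of the top_k experts
    weights = np.exp(logits[chosen] - logits[chosen].max())
    weights /= weights.sum()                    # softmax over the chosen experts only
    # Each expert is a fixed matrix: routing picks experts, it never edits them.
    output = sum(w * (experts[i] @ x) for w, i in zip(weights, chosen))
    return output, chosen

rng = np.random.default_rng(0)
dim, num_experts = 4, 8                         # hypothetical sizes
experts = rng.normal(size=(num_experts, dim, dim))
router = rng.normal(size=(num_experts, dim))
x = rng.normal(size=dim)

y, chosen = moe_forward(x, experts, router)
print("experts used:", chosen)
```

The point to notice is that the expert matrices are read but never changed. The router only decides which pre-trained specialists get to answer.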
Transformer² takes a fundamentally different approach, inspired by nature's adaptability. During inference, in a matter of milliseconds, it modifies its parameter values using singular value decomposition (SVD). Think of it as Ditto not just mimicking surface features but actually reorganizing its entire genetic makeup in real-time. The system can amplify or dampen specific neural pathways - for instance, potentially boosting mathematical reasoning pathways by 80% while reducing language generation capabilities by 30% when tackling a complex math problem. This mirrors how biological systems adapt: not by switching between different pre-set configurations but by dynamically modifying their entire structure to meet new challenges.
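To make the "amplify or dampen" idea concrete, here is a minimal NumPy sketch. It assumes the adaptation amounts to rescaling a weight matrix's singular values with a small task vector, which is a simplification of whatever the full system does. The helper name `adapt_weight`, the random matrix, and the 1.8 and 0.7 scale factors (borrowed from the illustrative 80% and 30% figures above) are assumptions for the example, not values from the paper.

```python
import numpy as np

def adapt_weight(W, z):
    """Rescale the singular values of W by the vector z and rebuild the matrix."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(s * z) @ Vt

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 4))                # stand-in for one weight matrix

identity_z = np.ones(4)                    # scale of 1.0 everywhere: no change
math_z = np.array([1.8, 1.0, 1.0, 0.7])    # hypothetical "math task" vector

W_same = adapt_weight(W, identity_z)
W_math = adapt_weight(W, math_z)

print(np.allclose(W_same, W))              # the unscaled rebuild recovers W
print(np.linalg.svd(W, compute_uv=False))
print(np.linalg.svd(W_math, compute_uv=False))
```

The task vector has one number per singular value, so it is tiny compared with the matrix it reshapes. That is what makes swapping vectors at inference time cheap.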
Why it Matters
Just as Ditto transforms its entire structure to match its opponent, Transformer² introduces real-time weight transformation in LLMs through singular value decomposition (SVD). While traditional approaches either add new parameters (as LoRA does) or rely on prompting - equivalent to giving Ditto new moves or teaching it strategies - Transformer² actually reshapes its core parameters during inference, similar to how Ditto morphs its entire biological structure.
But why should you care? For a company running customer service AI, this means replacing 10 different fine-tuned models (taking up terabytes of space) with a single adaptable model and a few megabytes of transformation vectors. This isn't just cost-efficient - it's a glimpse into a future where AI systems can seamlessly evolve to handle any task thrown at them.
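Here is a back-of-the-envelope sketch of that storage argument. Every number in it is a hypothetical assumption chosen for round figures: a model made of 128 square weight matrices of size 4096 x 4096, 16-bit values, 10 tasks, and one scaling value per singular value of each matrix.

```python
def full_copy_params(shapes):
    """Values stored when a task keeps its own full copy of every matrix."""
    return sum(rows * cols for rows, cols in shapes)

def scaling_vector_params(shapes):
    """Values stored when a task keeps one scale per singular value."""
    return sum(min(rows, cols) for rows, cols in shapes)

# Hypothetical model: 32 layers with four 4096 x 4096 matrices each.
shapes = [(4096, 4096)] * (32 * 4)
num_tasks = 10
bytes_per_value = 2  # 16-bit weights

base_bytes = full_copy_params(shapes) * bytes_per_value
vector_bytes = scaling_vector_params(shapes) * bytes_per_value

separate_models = num_tasks * base_bytes
shared_model = base_bytes + num_tasks * vector_bytes

print(f"10 fine-tuned copies: {separate_models / 2**30:.1f} GiB")
print(f"1 model + 10 vectors: {shared_model / 2**30:.2f} GiB")
print(f"per-task vector:      {vector_bytes / 2**20:.1f} MiB")
```

Under these assumptions, each extra task costs about a megabyte instead of a full copy of the model. Real models have other matrix shapes and sizes, so treat the ratio as the takeaway rather than the absolute numbers.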
🇰🇷 Korea News
Starlink is reportedly coming to South Korea within the next two months. While the satellite internet service makes little sense for a city like Seoul, it's a game-changer for the 10+ million Koreans scattered across rural towns. Two products from Starlink are expected to hit the market: a residential router you can mount on your house for high-speed internet, and the far more interesting product, a "direct-to-cell" eSIM service that beams satellite internet straight to your iPhone or Galaxy device.
Public reports suggest Starlink is in talks with SK Networks, KT, and LG UPlus for a partnership. The reason is simple: Starlink works great in rural areas but has trouble in cities. Partnering with local telcos isn't a new strategy for Starlink. In the US, they have a partnership with T-Mobile, and in Japan with KDDI. For Starlink, the formula is quite clear: own the rural market and team up with local carriers to handle cities.
SK Telecom has launched a new GPU-as-a-Service platform in collaboration with Lambda Labs (a US startup with $1.2 billion in funding). The service, called “SKT GPUaaS,” appears to target only large enterprises and departments inside the Korean government, as suggested by the lack of a public-facing website.
It’s worth noting that the GPU-as-a-service market is heating up globally. In the US, companies like Together AI ($228 million) and SF Compute Co ($15 million) provide short-term GPU compute for training, while Baseten ($60 million), Replicate ($58 million), and Modal ($23 million) focus on inference services. Also, Korea's own Friendli ($12 million) is growing very fast - offering both inference and fine-tuning services to companies from small startups to large enterprises.
Samsung and OpenAI are reportedly developing a smart TV that integrates ChatGPT directly into the TV’s UI Layer. You would be able to have a natural voice conversation with ChatGPT about what you want to watch, while it analyzes your viewing history across Netflix, HBO, Disney+ and other streaming services to make recommendations. When you find something you like, ChatGPT then takes control of the TV’s interface to open the right app and start playing your content automatically.
🇯🇵 Japan News
Nintendo has announced the Switch 2 will launch in 2025, marking eight years since the original Switch's 2017 debut - an unusually long gap between generations. The original Switch was a massive success, generating $84 billion in lifetime revenue and becoming the third best-selling gaming device ever with 148 million units sold, trailing only the Nintendo DS (154 million) and PlayStation 2 (160 million).
🇸🇬 Singapore News
Singtel and Perplexity have announced a partnership targeting the Singaporean population. The AI Search startup, last valued at $9 Billion, is offering a free year of Perplexity Pro to all Singtel customers. Perplexity combines OpenAI's or Anthropic's less expensive models with real-time data, while its Pro version unlocks access to more sophisticated and expensive AI models. This move by Singtel puts Perplexity in direct competition with OpenAI's SearchGPT and ByteDance's Doubao, which are already popular AI search products in Singapore.
🇮🇩 Indonesia News
Grab and BYD have announced a major partnership, with Grab planning to purchase up to 50,000 electric vehicles to expand its Southeast Asian fleet, particularly in Indonesia. The deal showcases BYD's aggressive regional expansion - for context, the Chinese automaker is investing $1 billion in an Indonesian plant capable of producing 150,000 EVs annually by late 2025.
🇨🇳 China News
Tsinghua, Fudan, and Stanford students jointly released an AI agent development framework called "Eko". The Eko framework can take over users' computers and browsers to perform various tasks on behalf of humans. Developers can potentially use the framework as a drop-in replacement for bare-bones web automation tooling like Playwright or Puppeteer, or alongside it. The framework could be best described as a competitor to LangChain or Browser-use.
The open-source TypeScript framework was released under the GitHub organization of Fellou AI, which appears to be a startup based in China. Additionally, it seems that the core team of researchers is still attending university - that's badass!
The US Government has blacklisted 25 Singapore- and China-based AI companies for allegedly collaborating with the Chinese military. A notable company on the blacklist is Zhipu AI, one of China's leading foundation model startups. This blacklist is worth noting, as the companies on the list will face significant challenges accessing Nvidia training chips, and US-based software and tools will become virtually inaccessible to them. For instance, even basic resources like pip or npm packages created by US-based entities would theoretically be restricted.