Thailand News and Discussion Forum | ASEANNOW

Chinese start-ups such as DeepSeek are challenging global AI giants

Featured Replies

Their models are cheaper thanks to US export restrictions that have inadvertently spurred innovation

Last month, a Chinese start-up called DeepSeek astonished the international tech community with its latest open-source artificial intelligence model. DeepSeek-V3 delivers a performance comparable to that of better-funded US rivals such as OpenAI. This week it impressed once again with R1, its foray into AI reasoning.

https://archive.ph/51WIt#selection-2415.0-2419.1

 

I see lots of comments about how China just produces cheap copycat goods. Clearly this is an outdated view. Japanese manufacturing was regarded in the same way for about 25 years after WW2.


And just in time, Musk has made Grok subscription-only again. It was free last night but only shows a subscription prompt now. ChatGPT was already limited in the number of images it would generate for free, and Gemini won't generate images of people unless you pay. They are driving people to DeepSeek, although DS has no image function yet.

On 1/25/2025 at 4:39 PM, placeholder said:

Last month, a Chinese start-up called DeepSeek astonished the international tech community with its latest open-source artificial intelligence model. [...] I see lots of comments about how China just produces cheap copycat goods. Clearly this is an outdated view.

That's smart of the Chinese, to make it open source. By doing so, they will be able to freely use any improvement made by anyone in the world.

 

In the end, the total development resources dedicated to it may well exceed what is expected to be invested in ChatGPT and other LLMs.


DeepSeek was reportedly developed for almost nothing (~$5.6 million), and apparently you won't need the fancy chips NVIDIA is making to get full value out of it. Thus, NVIDIA lost around $560 billion in market cap on 1/27/25. The subscription models of ChatGPT and others are also undercut, since anybody can choose open-source DeepSeek for free, or close to it.

 

I'm beginning to think the $500 billion AI plan the President announced last week is just a thinly disguised Oligarch bailout, as the tech sector suddenly isn't worth the multiple the market gave it.

 

I wonder if tech insiders are going to be dumping even more. I wonder how bad the margin calls are going to be for all the leveraged buyers of tech. And I wonder what company Musk is actually going to run, or whether he will go full DOGE: Twitter is worth very little compared to what he paid, $40-80,000 Teslas are competing in a market with $12-25,000 BYDs, and his rocket just blew up last week (2nd stage) and rained debris all over the Caribbean.

 

I think we've seen lots of peaks in the last week: Peak Musk, Peak Equity Market, and already Peak 47's Presidency. It's all downhill from here.

 

 

They're doing it for a fraction of the cost the US companies are spending.

 

See the much more extensive post on this above mine.

On 1/25/2025 at 4:39 PM, placeholder said:

 

I see lots of comments about how China just produces cheap copycat goods. Clearly this is an outdated view.

 

Actually that's exactly what happened with DeepSeek. 

 

It may be the case it has managed to cut costs and compute, but we do know that it is built at least in part on the shoulders of the giants: it uses Nvidia chips – albeit older, cheaper versions – and utilises Meta's open-source Llama architecture, as well as AliBaba's equivalent Qwen.

 

https://www.bbc.com/news/articles/cx2jxvn0r51o

 

They copied Meta's open source. And by all accounts DeepSeek is not as good as the US competition; for instance, you can't ask it questions about Tiananmen Square. It censors itself when it comes to Chinese topics.

4 hours ago, Cameroni said:

Actually that's exactly what happened with DeepSeek. [...] it uses Nvidia chips – albeit older, cheaper versions – and utilises Meta's open-source Llama architecture, as well as AliBaba's equivalent Qwen.

Actually, it's broader than China vs the USA; it's also about open-source vs commercial AI. The open-source AI community is quite dynamic, and this is an additional contribution which will be improved not only by China but also by other open-source AI participants.

 

The choice of open source by the Chinese has two advantages:

- as you pointed out, it allows them to benefit from previous open-source knowledge and resources,

- it channels a share of open-source AI innovation by other actors towards complementary work that will help China cope with its current limitations on chips and other resources.

Here is an interesting video on the topic.

 

 


The above video by Microsoft engineer David Plummer is a clear intro to what DeepSeek is, and isn't. The clever part is its distillation process, which turns larger models into smaller, good-enough models that use less hardware. But I see a possible catch, and asked Grok to confirm (we're already a mix of human and AI!). I post my question and its answer for clarity. However it goes, DeepSeek's arrival will be transformative.

 

Me: My question is about the newly released DeepSeek AI software and its operation. The question is specific. Preface: DeepSeek is said to use larger AI models to train a number of smaller models that can run on smaller, and thus cheaper, hardware. Does this mean that 1) larger hardware and training is still required when training the smaller models? And 2) does the DeepSeek open-source code they released include the initial large training model, or does it only include code for running the smaller models?

 

(Spoiler: 1) yes, still required, and 2) no, not released.)

 

Grok:

  1. Larger Hardware and Training for Smaller Models: Yes, the process of training smaller models through distillation or transfer learning still requires the use of larger hardware and computational resources initially. DeepSeek uses a large model (like DeepSeek-R1 or DeepSeek-V3) to train these smaller models. The initial training of the large model necessitates significant computational resources, but once trained, the knowledge can be distilled into smaller models that require less computational power for deployment. This is evident from the strategies mentioned where DeepSeek-R1 knowledge is distilled into smaller models, which implies that the heavy lifting is done during the training of the larger model.

  2. Open Source Code and Model Inclusion: The open-source release by DeepSeek includes models that are distilled from the larger models, particularly the DeepSeek-R1 series. However, the training code for the initial large models (like DeepSeek-V3 or the base models used for distillation) does not seem to be open-sourced. The comments on X and technical reports indicate that while the distilled models and possibly some of their training code are available, the actual training pipeline for the larger models, including the data and the original training code, is not part of the open-source release. This means users can run and possibly fine-tune the smaller models but cannot replicate the training of the large models without additional resources or information.

In summary, while you can utilize and potentially further train the smaller models with less hardware, the groundwork of training the larger models still requires substantial resources. The open-source materials provided by DeepSeek focus on the smaller, distilled models rather than the full spectrum of their training process for the large models.

End of Grok.
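The distillation idea discussed above can be sketched in a few lines. To be clear, this is not DeepSeek's actual training code (which, per the Grok answer, was not released); it is a minimal, self-contained illustration of the soft-target loss commonly used in knowledge distillation, with made-up logit values standing in for real model outputs.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into a probability distribution; temperature > 1 softens it."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q): how far the student's distribution q is from the teacher's p."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Soft-target distillation loss: match the student's softened
    distribution to the teacher's. The T^2 factor keeps gradient
    magnitudes comparable across temperatures."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return (temperature ** 2) * kl_divergence(p, q)

# Made-up logits: a student that already mimics the teacher incurs zero loss,
# a contradicting student incurs a clearly positive loss.
teacher = [2.0, 1.0, 0.1]
print(distillation_loss(teacher, [2.0, 1.0, 0.1]))  # 0.0
print(distillation_loss(teacher, [0.1, 1.0, 2.0]))  # clearly positive
```

During real distillation, this loss is minimized over many training examples, which is why (as Grok notes) the expensive teacher model must exist first, even though only the cheap student is deployed.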
 

I think the scariest part is that at the moment nobody really knows the details. People claim this and that, but is all the information true? The market is obviously reacting already, because maybe it is.

 

AI is so hyped, and so many people invest in anything that looks promising, that the bubble is very unstable.

And I guess what just happened could happen again and again. Or maybe not. It's a high-risk investment, somewhere between making huge amounts of money and losing it all.

12 minutes ago, rabas said:

The above video by Microsoft engineer David Plummer is a clear intro to what DeepSeek is, and isn't. [...] I post my question and its answer for clarity.
 

Do Grok, and other AIs, have up-to-date data?

Not long ago I asked an AI about a recent event, and the answer was something like: "I don't have any data about that event."

11 minutes ago, OneMoreFarang said:

Do Grok, and other AIs, have up-to-date data?

Not long ago I asked an AI about a recent event, and the answer was something like: "I don't have any data about that event."

 

Good question. From my post you can see Grok is well aware of DeepSeek and how it works. How so, if it takes so long to train?

 

Learning and knowledge break down into three stages.

 

Stage 1: Build an initial model (the links held in memory), which takes many months at full power. Done once per major revision.

Stage 2: These large models can be updated at significant but lesser cost. Done periodically, as new information requires, not too often.

Stage 3: On the fly. When Grok answered my question, it first looked at a number of websites and X posts to learn. This is probably served from some current index; i.e., it is not yet trained on the material, but it can go 'read up' before answering my question.
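Stage 3 above can be sketched with a toy example. This is not how Grok is actually implemented; it is a hypothetical miniature of "retrieve fresh documents at question time, then answer from them", with invented document texts standing in for indexed websites and X posts.

```python
import re

def tokenize(text):
    """Lowercase bag of words; a crude stand-in for real indexing."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Stage 3 lookup: rank already-indexed documents by word overlap with the question."""
    q = tokenize(question)
    scored = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return scored[:top_k]

def build_prompt(question, documents):
    """Prepend retrieved context so a frozen (Stage 1/2) model can answer about recent events."""
    context = retrieve(question, documents)
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question

# Invented documents standing in for freshly indexed web pages:
docs = [
    "DeepSeek released its R1 reasoning model in January 2025.",
    "Bangkok traffic was heavy on Monday morning.",
]
print(build_prompt("What model did DeepSeek release?", docs))
```

The key point the sketch makes is that the model's weights never change at question time: only the prompt does, which is why an AI with an old training cutoff can still discuss last week's news, or reply "I don't have any data about that event" when nothing relevant was indexed.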

 
