Opened Feb 09, 2025 by Edward Coane (@edwardcoane445)

Applied AI Tools


AI keeps getting cheaper with every passing day!

Just a couple of weeks back we had the DeepSeek V3 model pushing NVIDIA's stock into a downward spiral. Well, today we have this new cost-effective model released. At this rate of innovation, I am thinking of selling NVIDIA stocks lol.

Developed by researchers at Stanford and the University of Washington, their s1 AI model was trained for a mere $50.

Yes - just $50.

This further challenges the dominance of multi-million-dollar models like OpenAI's o1, DeepSeek's R1, and others.

This breakthrough highlights how innovation in AI no longer requires massive budgets, potentially democratizing access to advanced reasoning capabilities.

Below, we explore s1's development, advantages, and implications for the AI engineering industry.

Here's the original paper for your reference - s1: Simple test-time scaling

How s1 was built: Breaking down the methodology

It is very interesting to see how researchers across the world are optimizing with limited resources to bring down costs. And these efforts are working too.

I have tried to keep it simple and jargon-free to make it easy to understand, read on!

Knowledge distillation: The secret sauce

The s1 model uses a technique called knowledge distillation.

Here, a smaller AI model mimics the reasoning processes of a larger, more sophisticated one.

Researchers trained s1 using outputs from Google's Gemini 2.0 Flash Thinking Experimental, a reasoning-focused model available through Google AI Studio. The team avoided resource-heavy techniques like reinforcement learning. They used supervised fine-tuning (SFT) on a dataset of just 1,000 curated questions. These questions were paired with Gemini's answers and detailed reasoning.

What is supervised fine-tuning (SFT)?

Supervised Fine-Tuning (SFT) is a machine learning technique. It is used to adapt a pre-trained Large Language Model (LLM) to a specific task. For this process, it uses labeled data, where each data point is labeled with the correct output.

Adopting specificity in training has several benefits:

- Boosts a model's performance on specific tasks
- Improves data efficiency
- Saves resources compared to training from scratch
- Allows for customization
- Improves a model's ability to handle edge cases and control its behavior

This approach allowed s1 to replicate Gemini's problem-solving strategies at a fraction of the cost. For comparison, DeepSeek's R1 model, designed to rival OpenAI's o1, reportedly required expensive reinforcement learning pipelines.
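
To make the SFT setup above a bit more concrete, here is a minimal Python sketch of how one distilled example (question, teacher reasoning, teacher answer) could be turned into a supervised training pair. The prompt template, the `DistilledExample` class, and the character-level `toy_tokenize` are all hypothetical stand-ins for illustration, not the format or tokenizer used by the s1 authors. Masking the prompt tokens with -100 is a common convention (it is the default ignore index of PyTorch's cross-entropy loss), assumed here rather than taken from the article.

```python
from dataclasses import dataclass

# Label value that PyTorch's cross-entropy loss ignores by default.
IGNORE_INDEX = -100


@dataclass
class DistilledExample:
    question: str
    reasoning: str  # step-by-step trace produced by the teacher model
    answer: str     # final answer produced by the teacher model


def to_prompt_and_target(ex: DistilledExample) -> tuple[str, str]:
    """Split one example into the text the student sees and the text it must learn.

    The template is a hypothetical one chosen for this sketch.
    """
    prompt = f"Question: {ex.question}\nThinking: "
    target = f"{ex.reasoning}\nAnswer: {ex.answer}"
    return prompt, target


def build_labels(prompt_ids: list[int], target_ids: list[int]) -> tuple[list[int], list[int]]:
    """Concatenate prompt and target; mask the prompt so loss covers only the target."""
    input_ids = list(prompt_ids) + list(target_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(target_ids)
    return input_ids, labels


def toy_tokenize(text: str) -> list[int]:
    """Stand-in tokenizer: one integer per character (a real run would use the model's tokenizer)."""
    return [ord(ch) for ch in text]


if __name__ == "__main__":
    ex = DistilledExample("What is 2 + 2?", "2 plus 2 equals 4.", "4")
    prompt, target = to_prompt_and_target(ex)
    input_ids, labels = build_labels(toy_tokenize(prompt), toy_tokenize(target))
    print(len(input_ids), labels.count(IGNORE_INDEX))
```

With only 1,000 such pairs, the whole dataset fits comfortably in memory, which is part of why the fine-tuning run is so short.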

Cost and compute efficiency

Training s1 took under 30 minutes using 16 NVIDIA H100 GPUs. This cost researchers roughly $20-$50 in cloud compute credits!
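
As a quick sanity check on those numbers, the Python sketch below works out the GPU-hours involved and the hourly price per GPU that the $20 and $50 figures would imply. It only uses the figures quoted above. Since training took under 30 minutes, the GPU-hours value is an upper bound, so the implied rates are lower bounds. The function name `implied_hourly_rate` is just an illustrative helper.

```python
GPUS = 16          # NVIDIA H100 GPUs, as quoted above
MAX_HOURS = 0.5    # "under 30 minutes", so this is an upper bound

# Upper bound on total GPU time consumed by the training run.
gpu_hours = GPUS * MAX_HOURS


def implied_hourly_rate(total_cost_usd: float, gpu_hours: float) -> float:
    """Price per GPU-hour that would produce the given total cost."""
    return total_cost_usd / gpu_hours


if __name__ == "__main__":
    print(f"GPU-hours (upper bound): {gpu_hours}")
    for cost in (20, 50):
        rate = implied_hourly_rate(cost, gpu_hours)
        print(f"${cost} total -> at least ${rate:.2f} per GPU-hour")
```

So the quoted range corresponds to at most 8 GPU-hours at somewhere between $2.50 and $6.25 per GPU-hour.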

By contrast, OpenAI's o1 and comparable models require thousands of dollars in compute resources. The base model for s1 was an off-the-shelf AI from Alibaba's Qwen, freely available on GitHub.

Here are some major factors that helped achieve this cost efficiency:

- Low-cost training: The s1 model achieved remarkable results with less than $50 in cloud computing credits! Niklas Muennighoff, a Stanford researcher involved in the project, estimated that the required compute could be rented for around $20. This showcases the project's incredible affordability and accessibility.
- Minimal resources: The team used an off-the-shelf base model. They fine-tuned it through distillation, extracting reasoning capabilities from Google's Gemini 2.0 Flash Thinking Experimental.
- Small dataset: The s1 model was trained on a small dataset of just 1,000 curated questions and answers. It included the reasoning behind each answer from Google's Gemini 2.0.
- Quick training time: The model was trained in less than 30 minutes using 16 NVIDIA H100 GPUs.
- Ablation experiments: The low cost allowed researchers to run many ablation experiments. They made small variations in configuration to find out what works best. For example, they measured whether the model should use 'Wait' and not 'Hmm'.
- Availability: The development of s1 offers an alternative to high-cost AI models like OpenAI's o1. This brings the potential for powerful reasoning models to a broader audience. The code, data, and training are available on GitHub.

These factors challenge the notion that massive investment is always necessary for creating capable AI models. They democratize AI development, enabling smaller teams with limited resources to achieve significant results.

The 'Wait' Trick

A clever innovation in s1's design involves adding the word "wait" during its reasoning process.

This simple prompt extension forces the model to pause and double-check its answers, improving accuracy without additional training.

The 'Wait' Trick is an example of how careful prompt engineering can significantly improve AI model performance. This improvement does not rely solely on increasing model size or training data.
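
The idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the s1 implementation: the `END_OF_THINKING` marker, the `think_with_wait` function, and the `stub_generate` model are all hypothetical. Whenever the model tries to finish its reasoning, the loop removes the end marker, appends "Wait," and asks the model to continue from there.

```python
from typing import Callable

# Hypothetical marker a model emits when it considers its reasoning finished.
END_OF_THINKING = "</think>"


def think_with_wait(prompt: str, generate: Callable[[str], str], extra_rounds: int = 1) -> str:
    """Extend the reasoning trace by replacing the end marker with 'Wait,'.

    `generate` takes the text so far and returns a continuation string.
    """
    trace = generate(prompt)
    for _ in range(extra_rounds):
        if not trace.endswith(END_OF_THINKING):
            break  # the model did not stop voluntarily, nothing to override
        # Drop the end marker and nudge the model to re-examine its work.
        trace = trace[: -len(END_OF_THINKING)].rstrip() + " Wait,"
        trace += generate(prompt + trace)
    return trace


def stub_generate(text: str) -> str:
    """Toy stand-in for a model: wrong at first, corrects itself after 'Wait,'."""
    if text.endswith("Wait,"):
        return " let me re-check: 6 * 7 = 42." + END_OF_THINKING
    return "6 * 7 = 41." + END_OF_THINKING


if __name__ == "__main__":
    print(think_with_wait("What is 6 * 7? ", stub_generate, extra_rounds=1))
```

Raising `extra_rounds` spends more compute at inference time on the same question, which is the sense in which this is a test-time technique: no weights change, only how long the model is made to think.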

Learn more about writing prompts - Why Structuring or Formatting Is Crucial In Prompt Engineering?

Advantages of s1 over industry-leading AI models

Let's understand why this development is important for the AI engineering industry:

1. Cost accessibility

OpenAI, Google, and Meta invest billions in AI infrastructure. However, s1 proves that high-performance reasoning models can be built with minimal resources.

For instance:

- OpenAI's o1: Developed using proprietary methods and expensive compute.
- DeepSeek's R1: Relied on large-scale reinforcement learning.
- s1: Achieved comparable results for under $50 using distillation and SFT.

2. Open-source transparency

s1's code, training data, and model weights are publicly available on GitHub, unlike closed-source models like o1 or Claude. This transparency fosters community collaboration and makes audits possible.

3. Performance on benchmarks

In tests measuring mathematical problem-solving and coding tasks, s1 matched the performance of leading models like o1. It also came close to the performance of R1. For example:

- The s1 model exceeded OpenAI's o1-preview by up to 27% on competition math questions from the MATH and AIME24 datasets.
- GSM8K (math reasoning): s1 scored within 5% of o1.
- HumanEval (coding): s1 achieved ~70% accuracy, comparable to R1.
- A key feature of s1 is its use of test-time scaling, which improves its accuracy beyond its initial capabilities. For example, it increased from 50% to 57% on AIME24 problems using this technique.

s1 does not surpass GPT-4 or Claude-v1 in raw capability. These models excel in specialized domains like clinical oncology.

While distillation techniques can replicate existing models, some experts note they may not lead to breakthrough advancements in AI performance.

Still, its cost-to-performance ratio is unmatched!

s1 is challenging the status quo

What does the development of s1 mean for the world?

Commoditization of AI Models

s1's success raises existential questions for AI giants.

If a small team can replicate advanced reasoning for $50, what differentiates a $100 million model? This threatens the "moat" of proprietary AI systems, pushing companies to innovate beyond distillation.

Legal and ethical concerns

OpenAI has earlier accused competitors like DeepSeek of improperly harvesting data via API calls. But s1 sidesteps this issue by using Google's Gemini 2.0 within its terms of service, which permit non-commercial research.

Shifting power dynamics

s1 exemplifies the "democratization of AI", enabling startups and researchers to compete with tech giants. Projects like Meta's LLaMA (which requires expensive fine-tuning) now face pressure from cheaper, purpose-built alternatives.

The limitations of the s1 model and future directions in AI engineering

Not all is perfect with s1 for now, and it is not fair to expect so with limited resources. Here are the s1 model limitations you should know before adopting it:

Scope of Reasoning

s1 excels in tasks with clear step-by-step logic (e.g., math problems) but struggles with open-ended creativity or nuanced context. This mirrors limitations seen in models like LLaMA and PaLM 2.

Dependency on parent models

As a distilled model, s1's capabilities are inherently bounded by Gemini 2.0's knowledge. It cannot surpass the original model's reasoning, unlike OpenAI's o1, which was trained from scratch.

Scalability questions

While s1 demonstrates "test-time scaling" (extending its reasoning steps), true innovation, like GPT-4's leap over GPT-3.5, still requires massive compute budgets.

What next from here?

The s1 experiment highlights two key trends:

- Distillation is democratizing AI: Small teams can now replicate high-end capabilities!
- The value shift: Future competition may center on data quality and unique architectures, not just compute scale.

Meta, Google, and Microsoft are investing over $100 billion in AI infrastructure. Open-source projects like s1 could force a rebalancing. This change would allow innovation to thrive at both the grassroots and corporate levels.

s1 isn't a replacement for industry-leading models, but it's a wake-up call.

By slashing costs and opening up access, it challenges the AI ecosystem to prioritize efficiency and inclusivity.

Whether this leads to a wave of low-cost competitors or tighter restrictions from tech giants remains to be seen. One thing is clear: the era of "bigger is better" in AI is being redefined.

Have you tried the s1 model?

The world is moving fast with AI engineering advancements - and this is now a matter of days, not months.

I will keep covering the latest AI models for you all to try. It is worth studying the optimizations made to cut costs or innovate. This is truly an interesting space which I am enjoying writing about.

If there is any issue, correction, or doubt, please comment. I would be happy to fix it or clear any doubt you have.

At AI Tools, we want to make learning accessible. You can find out how to use the many AI tools available for your personal and professional use. If you have any questions - email content@merrative.com and we will cover them in our guides and blogs.

Learn more about AI concepts:

- 2 key insights on the future of software development - Transforming Software Design with AI Agents
- Explore AI Agents - What is OpenAI o3-mini
- Learn what is tree of thoughts prompting method
- Make the most of Google Gemini - 6 latest Generative AI tools by Google to improve workplace productivity
- Learn what influencers and experts think about AI's impact on future of work - 15+ Generative AI quotes on future of work, impact on jobs and workforce productivity

You can subscribe to our newsletter to get notified when we publish new guides!

This post is written using resources of Merrative. We are a publishing talent marketplace that helps you create publications and content libraries.

Contact us if you would like to create a content library like ours. We specialize in the niche of Applied AI, Technology, Artificial Intelligence, or Data Science.

Reference: edwardcoane445/tehnika-sm#1