CoinRSS: Bitcoin, Ethereum, Crypto News and Price Data

DeepSeek Strikes Again: Does Its New Open-Source AI Model Beat DALL-E 3?

CoinRSS
Last updated: January 28, 2025 6:03 pm
CoinRSS Published January 28, 2025

DeepSeek, the Chinese AI lab that recently upended industry assumptions about AI development costs, has released a new family of open-source multimodal AI models that reportedly outperform OpenAI’s DALL-E 3 on key benchmarks.

Dubbed Janus Pro, the model family ranges from 1 billion parameters (extremely small) to 7 billion parameters (near the size of SD 3.5L) and is available for immediate download on the machine learning and data science hub Hugging Face.
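To put those parameter counts in perspective, here is a minimal back-of-the-envelope sketch of how much memory the weights alone would occupy. It assumes the nominal round figures of 1 billion and 7 billion parameters and common storage widths (4, 2, and 1 bytes per parameter). The function name `weight_memory_gb` is made up for this illustration, and the real checkpoints will differ somewhat.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Ignores activations, caches, and framework overhead, so real usage is higher.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}  # common storage widths


def weight_memory_gb(num_params: int, dtype: str = "fp16") -> float:
    """Return the approximate size of the weights in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Nominal sizes from the article; actual parameter counts are an assumption.
    for label, params in [("1B", 1_000_000_000), ("7B", 7_000_000_000)]:
        sizes = ", ".join(
            f"{dtype}: {weight_memory_gb(params, dtype):.0f} GB"
            for dtype in BYTES_PER_PARAM
        )
        print(f"Janus Pro {label} -> {sizes}")
```

By this estimate, the 7B version needs roughly 14 GB for its weights at 16-bit precision, while the 1B version needs about 2 GB, which is why the smaller model is so much easier to fit on consumer hardware.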

The largest version, Janus Pro 7B, beats not only OpenAI’s DALL-E 3 but also other leading models like PixArt-alpha, Emu3-Gen, and SDXL on industry benchmarks GenEval and DPG-Bench, according to information shared by DeepSeek AI.

Image: DeepSeek AI

Its release comes just days after DeepSeek made headlines with its R1 language model, which matched GPT-4’s capabilities while costing just $5 million to develop—sparking a heated debate about the current state of the AI industry.

The Chinese startup’s product has also triggered sector-wide concerns that it could upend incumbents and dent the growth trajectory of major chip manufacturer Nvidia, which suffered the largest single-day market cap loss in history on Monday.

DeepSeek’s Janus Pro model uses what the company calls a “novel autoregressive framework” that decouples visual encoding into separate pathways while maintaining a single, unified transformer architecture.
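The idea of separate visual pathways feeding one shared transformer can be sketched as a toy. This is not DeepSeek's implementation: every class and function name below is hypothetical, and the choice of continuous features for analysis versus discrete codes for generation is an assumption made only to show two different encoders routing into the same backbone.

```python
from dataclasses import dataclass
from typing import List

Token = List[float]  # one embedding vector


@dataclass
class SharedTransformer:
    """Stand-in for the single backbone; a real model applies attention here."""
    width: int

    def forward(self, sequence: List[Token]) -> List[Token]:
        # Both pathways must deliver tokens of the same width to share the model.
        assert all(len(tok) == self.width for tok in sequence)
        return sequence


def embed_text(words: List[str], width: int) -> List[Token]:
    """Toy text embedding: one vector per word."""
    return [[len(w) / 10.0] * width for w in words]


def understanding_path(pixels: List[float], width: int) -> List[Token]:
    """Hypothetical analysis encoder: keeps continuous pixel features."""
    return [[p] * width for p in pixels]


def generation_path(pixels: List[float], width: int, codebook: int = 16) -> List[Token]:
    """Hypothetical generation encoder: quantizes pixels to discrete codes."""
    codes = [min(int(p * codebook), codebook - 1) for p in pixels]
    return [[c / codebook] * width for c in codes]


def run(task: str, prompt: str, pixels: List[float], model: SharedTransformer) -> List[Token]:
    """Pick the visual pathway by task, then use the one shared backbone."""
    encode = understanding_path if task == "understand" else generation_path
    return model.forward(embed_text(prompt.split(), model.width) + encode(pixels, model.width))


if __name__ == "__main__":
    model = SharedTransformer(width=4)
    print(len(run("understand", "describe this", [0.1, 0.9], model)))
    print(len(run("generate", "draw this", [0.1, 0.9], model)))
```

The point of the sketch is the routing: the two tasks encode the same image differently, yet both sequences pass through a single `SharedTransformer` rather than two separate models.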

This design allows the model to both analyze images and generate images at 768×768 resolution.

“Janus Pro surpasses previous unified model and matches or exceeds the performance of task-specific models,” DeepSeek claimed in its release documentation. “The simplicity, high flexibility, and effectiveness of Janus Pro make it a strong candidate for next-generation unified multimodal models.”

Unlike with DeepSeek R1, the company didn’t publish a full whitepaper on the model but did release its technical documentation and made the model available for immediate download free of charge—continuing its practice of open-sourcing releases that contrasts sharply with the closed, proprietary approach of U.S. tech giants.

So, what’s our verdict? Well, the model is highly versatile.

However, don’t expect it to replace any of the most specialized models you love. It can generate text, analyze images, and generate photos, but when pitted against models that only do one of those things well, at best, it’s on par.

Testing the model

Note that there is currently no way to run it through the usual UIs: Comfy, A1111, Fooocus, and Draw Things are not compatible with it right now. This makes running the model locally somewhat impractical, since it requires working through text commands in a terminal.

However, some Hugging Face users have created Spaces to try the model. DeepSeek’s official Space is not available, so we recommend using NeuroSenko’s free Space to try Janus Pro 7B.

Check what you are actually running, as some titles may be misleading. For example, the Space run by AP123 says it runs Janus Pro 7B but instead runs Janus Pro 1.5B, which can cost you a lot of time testing the model and getting bad results. Trust us: we know because it happened to us.

Visual understanding

The model is good at visual understanding and can accurately describe the elements in a photo.

It showed good spatial awareness and a solid grasp of the relationships between different objects.

It is also more accurate than LLaVA, the most popular open-source vision model: it provides better descriptions of scenes and interacts more reliably with the user based on visual prompts.

However, it is still not better than GPT Vision, especially for tasks that require logic or some analysis beyond what is obviously being shown in the photo. For example, we asked the model to analyze an image and explain its message.

The model replied, “The image appears to be a humorous cartoon depicting a scene where a woman is licking the end of a long red tongue that is attached to a boy.”

It ended its analysis by saying that “the overall tone of the image seems to be lighthearted and playful, possibly suggesting a scenario where the woman is engaging in a mischievous or teasing act.”

In these situations where some reasoning is required beyond a simple description, the model fails most of the time.

ChatGPT, on the other hand, actually understood the meaning behind the image: “This metaphor suggests that the mother’s attitudes, words, or values are directly influencing the child’s actions, particularly in a negative way such as bullying or discrimination,” it concluded. Accurately, we might add.

A league of its own

Image generation appears robust and relatively accurate, though it does require careful prompting to achieve good results.

DeepSeek claims Janus Pro beats SD 1.5, SDXL, and PixArt-alpha, but it’s important to emphasize that this must be a comparison against the base, non-fine-tuned models.

In other words, the comparison is against the weakest versions of those models currently available. Arguably, nobody uses a base SD 1.5 for generating art when there are hundreds of fine-tunes capable of achieving results that can compete against even state-of-the-art models like Flux or Stable Diffusion 3.5.

So, the generations are not at all impressive in terms of quality, but they do seem better than what SD 1.5 or SDXL used to output when they launched.

For example, here is a side-by-side comparison of the images generated by Janus and SDXL for the prompt: “A cute and adorable baby fox with big brown eyes, autumn leaves in the background enchanting, immortal, fluffy, shiny mane, Petals, fairy, highly detailed, photorealistic, cinematic, natural colors.”

Janus beats SDXL in understanding the core concept: it generated a baby fox, whereas SDXL produced a mature one.

It also understood the photorealistic style better, and the other elements (fluffy, cinematic) were also present.

That said, SDXL generated a crisper image despite not sticking to the prompt. The overall quality is better, the eyes are realistic, and the details are easier to spot.

This pattern was consistent in other generations: good prompt understanding but poor execution, with blurry images that feel outdated considering how good current state-of-the-art image generators are.

However, it’s important to note that Janus is a multimodal LLM capable of generating text conversations, analyzing images, and generating them as well. Flux, SDXL, and the other models aren’t built for those tasks.

So, Janus is far more versatile at its core—just not great at anything when compared to specialized models that excel at one specific task.

Because Janus is open source, its future as a leader among generative AI enthusiasts will depend on a slew of updates that seek to improve on those points.

Edited by Josh Quittner and Sebastian Sinclair
