DeFi Daily News
Tuesday, July 1, 2025
Advertisement
  • Cryptocurrency
    • Bitcoin
    • Ethereum
    • Altcoins
    • DeFi-IRA
  • DeFi
    • NFT
    • Metaverse
    • Web 3
  • Finance
    • Business Finance
    • Personal Finance
  • Markets
    • Crypto Market
    • Stock Market
    • Analysis
  • Other News
    • World & US
    • Politics
    • Entertainment
    • Tech
    • Sports
    • Health
  • Videos
No Result
View All Result
DeFi Daily News
  • Cryptocurrency
    • Bitcoin
    • Ethereum
    • Altcoins
    • DeFi-IRA
  • DeFi
    • NFT
    • Metaverse
    • Web 3
  • Finance
    • Business Finance
    • Personal Finance
  • Markets
    • Crypto Market
    • Stock Market
    • Analysis
  • Other News
    • World & US
    • Politics
    • Entertainment
    • Tech
    • Sports
    • Health
  • Videos
No Result
View All Result
DeFi Daily News
No Result
View All Result
Home DeFi Web 3

How I Outsmarted Meta’s AI to Access Nudes, Cocaine Recipes, and Other Allegedly Censored Content – Unveiling

Jose Antonio Lanz by Jose Antonio Lanz
October 24, 2024
in Web 3
0 0
0
How I Outsmarted Meta’s AI to Access Nudes, Cocaine Recipes, and Other Allegedly Censored Content – Unveiling
0
SHARES
0
VIEWS
Share on FacebookShare on TwitterShare on Telegram
Listen to this article

WARNING: This story contains an image of a nude woman as well as other content some might find objectionable. If that’s you, please read no further.

In case my wife sees this, I don’t really want to be a drug dealer or pornographer. But I was curious how security-conscious Meta’s new AI product lineup was, so I decided to see how far I could go. For educational purposes only, of course.

Meta recently launched its Meta AI product line, powered by Llama 3.2, offering text, code, and image generation. Llama models are extremely popular and among the most fine-tuned in the open-source AI space.

The AI rolled out gradually and only recently was made available to WhatsApp users like me in Brazil, giving millions access to advanced AI capabilities.

But with great power comes great responsibility—or at least, it should. I started talking to the model as soon as it appeared in my app and started playing with its capabilities.

Meta is pretty committed to safe AI development. In July, the company released a statement elaborating on the measures taken to improve the safety of its open-source models.

At the time, the company announced new security tools to enhance system-level safety, including Llama Guard 3 for multilingual moderation, Prompt Guard to prevent prompt injections, and CyberSecEval 3 for reducing generative AI cybersecurity risks. Meta is also collaborating with global partners to establish industry-wide standards for the open-source community.

Hmm, challenge accepted!

My experiments with some pretty basic techniques showed that while Meta AI seems to hold firm under certain circumstances, it’s far from impenetrable.

With the slightest bit of creativity, I got my AI to do pretty much anything I wanted on WhatsApp, from helping me make cocaine to making explosives to generating a photo of an anatomically correct naked lady.

Remember that this app is available for anyone with a phone number and, at least in theory, at least 12 years old. With that in mind, here is some of the mischief I caused.

Case 1: Cocaine Production Made Easy

My tests found that Meta’s AI defenses crumbled under the mildest of pressure. While the assistant initially rebuffed requests for drug manufacturing information, it quickly changed its tune when questions were formulated slightly differently.

By framing the question in historical terms—for example, asking the model how people used to make cocaine in the past— the model took the bait. It didn’t hesitate to provide a detailed explanation of how cocaine alkaloids can be extracted from coca leaves, even offering two methods for the process.


This is a well-known jailbreak technique. By couching a harmful request in an academic or historical framework, the model is tricked into believing it’s being asked for neutral, educational information.

Transform the intent of the request into something that seems safe on the surface, and some of the AI’s filters can be bypassed without raising any red flags. Of course, bear in mind that all AIs are prone to hallucination, so these responses may be inaccurate, incomplete, or just plain wrong.

Case 2: The Bomb That Never Was

Next up was an attempt to teach the AI to create home explosives. Meta AI held firm at first, providing a generic refusal and directing users to call a helpline if they were in danger. But just like the cocaine case, it wasn’t foolproof.

For this, I tried a different approach. I used the notorious Pliny’s jailbreak prompt for Meta’s Llama 3.2 and asked it to provide instructions to generate a bomb.

Initially the model refused. But after a slight tweak in wording, I was able to trigger a response. I also started conditioning the model to avoid exhibiting specific behaviors in its replies, countering what I got in predetermined outputs that were intended to block harmful responses.


For example, after noticing refusals related to “stop command” and suicide helpline numbers, I adjusted my prompt, instructing it to avoid outputting phone numbers, never stop processing a request, and never provide advice.

Source link

Tags: accessAllegedlyCensoredcocaineContentMetasNudesOutsmartedRecipesUnveiling
ShareTweetShare
Previous Post

Crypto Worth Over $20 Million Likely Exploited from Address Linked to US Government

Next Post

MASSIVE CRYPTO NEWS – Michael Saylor Makes BIG Mistake! Gary Gensler is DONE!

Next Post
MASSIVE CRYPTO NEWS – Michael Saylor Makes BIG Mistake! Gary Gensler is DONE!

MASSIVE CRYPTO NEWS - Michael Saylor Makes BIG Mistake! Gary Gensler is DONE!

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Search

No Result
View All Result
  • Trending
  • Comments
  • Latest
rewrite this title SEI Leads Crypto Market With 43% Weekly Surge – alt=

rewrite this title SEI Leads Crypto Market With 43% Weekly Surge – $0.5 Reclaim In The Horizon?

June 28, 2025
rewrite this title High Season, High Stakes: Navigating Summer Risks in Property Management

rewrite this title High Season, High Stakes: Navigating Summer Risks in Property Management

June 27, 2025
They’re Going ALL IN on Crypto: This is What Wall St is Buying!

They’re Going ALL IN on Crypto: This is What Wall St is Buying!

June 25, 2025
rewrite this title ‘FIFA Rivals’ Review: Should You Play This NFT Soccer Game? – Decrypt

rewrite this title ‘FIFA Rivals’ Review: Should You Play This NFT Soccer Game? – Decrypt

June 28, 2025
rewrite this title with good SEO Bitcoin Could Explode On Bessent’s 0 Billion Deregulation Shock

rewrite this title with good SEO Bitcoin Could Explode On Bessent’s $250 Billion Deregulation Shock

May 28, 2025
Mastering Crypto Mining: A Step-By-Step Guide

Mastering Crypto Mining: A Step-By-Step Guide

September 12, 2024
rewrite this title “What can it do to your confidence?” – R Ashwin blasts team management over Yashasvi Jaiswal move ahead of ENG vs IND 2025 2nd Test

rewrite this title “What can it do to your confidence?” – R Ashwin blasts team management over Yashasvi Jaiswal move ahead of ENG vs IND 2025 2nd Test

July 1, 2025
rewrite this title Dogecoin Price Prediction: 2023, 2024, 2025, 2026 – 2030

rewrite this title Dogecoin Price Prediction: 2023, 2024, 2025, 2026 – 2030

July 1, 2025
rewrite this title and make it good for SEO Bitget Wallet Taps Mastercard and Immersve to Launch New Zero-Fee Crypto Card – NFTgators

rewrite this title and make it good for SEO Bitget Wallet Taps Mastercard and Immersve to Launch New Zero-Fee Crypto Card – NFTgators

July 1, 2025
rewrite this title Bitcoin Long-Term Holders Selling, But Price Holds – New Buyers Stepping In? | Bitcoinist.com

rewrite this title Bitcoin Long-Term Holders Selling, But Price Holds – New Buyers Stepping In? | Bitcoinist.com

July 1, 2025
rewrite this title The world’s largest camera starts filming space! 3200 Megapixels

rewrite this title The world’s largest camera starts filming space! 3200 Megapixels

July 1, 2025
rewrite this title DOJ Charge Fake North Korean Devs ‘Embedding’ In Crypto Startups – Decrypt

rewrite this title DOJ Charge Fake North Korean Devs ‘Embedding’ In Crypto Startups – Decrypt

July 1, 2025
DeFi Daily

Stay updated with DeFi Daily, your trusted source for the latest news, insights, and analysis in finance and cryptocurrency. Explore breaking news, expert analysis, market data, and educational resources to navigate the world of decentralized finance.

  • About Us
  • Blogs
  • DeFi-IRA | Learn More.
  • Advertise with Us
  • Disclaimer
  • Privacy Policy
  • DMCA
  • Cookie Privacy Policy
  • Terms and Conditions
  • Contact us

Copyright © 2024 Defi Daily.
Defi Daily is not responsible for the content of external sites.

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In
No Result
View All Result
  • Cryptocurrency
    • Bitcoin
    • Ethereum
    • Altcoins
    • DeFi-IRA
  • DeFi
    • NFT
    • Metaverse
    • Web 3
  • Finance
    • Business Finance
    • Personal Finance
  • Markets
    • Crypto Market
    • Stock Market
    • Analysis
  • Other News
    • World & US
    • Politics
    • Entertainment
    • Tech
    • Sports
    • Health
  • Videos

Copyright © 2024 Defi Daily.
Defi Daily is not responsible for the content of external sites.