Anthropic @ BSidesSF

It all started on this one lazy Sunday. I was scrolling through the group chat for the latest batch of The Innovation Lab at PES University (a lively space for current and former members of my university’s nerdiest club), when one of my friends (find him at rowjee.com!) shared a link to a Capture The Flag (CTF) challenge. I’m not sure why this one caught my eye, since I’d never actually participated in a CTF before, but it did.

The Beginning

The Home Page.

https://anthropic-at-bsides.com

Starting off, the website looked interesting. Minimalistic, but interesting. It felt intuitive that there’d be something hidden in the Developer Tools options, like the displayed HTML or other source files. I started poking around, and it didn’t take me too long to find ‘stego.png’ in the website’s source. Perhaps a little on-the-nose, but hey :D

You may want to switch to light mode to get a better sense of what the image looks like.

Down to bee-siness

So I went down the Google rabbit hole of image steganography tools, eventually concluding that zsteg was my best bet. A little bit of setup later, zsteg recovered the following information:

b1,a,lsb,xy .. text: "According to all known laws of aviation, there is no way a bee should be able to fly.\nIts wings are too small to get its fat little body off the ground.\nThe bee, of course, flies anyway because bees don't care what humans think is impossible.\nYellow, black"
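For the curious, here is a minimal sketch of the idea behind a label like b1,a,lsb,xy: roughly, one bit per pixel, taken from the alpha channel, with pixels read row by row. It is not zsteg's implementation. The function names are made up for illustration, the "image" is just a flat list of alpha values, and packing the bits most-significant-bit first is my assumption about how such a payload is laid out.

```python
def embed_alpha_lsb(alpha_values, message):
    """Hide message bytes in the lowest bit of each alpha value.

    Assumption: bits are written most-significant-bit first within each byte.
    """
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(alpha_values):
        raise ValueError("message does not fit in the given pixels")
    stego = list(alpha_values)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the lowest bit, then set it
    return stego


def extract_alpha_lsb(alpha_values, num_bytes):
    """Read the lowest bit of each alpha value and pack 8 bits into each byte."""
    out = bytearray()
    for start in range(0, num_bytes * 8, 8):
        byte = 0
        for alpha in alpha_values[start:start + 8]:
            byte = (byte << 1) | (alpha & 1)
        out.append(byte)
    return bytes(out)


# Toy "image": 64 fully opaque pixels, alpha values in row-by-row order.
alpha = [255] * 64
secret = b"bee"
stego = embed_alpha_lsb(alpha, secret)
recovered = extract_alpha_lsb(stego, len(secret))
print(recovered)
```

Each alpha value changes by at most 1, which is why the image looks untouched to the eye. On a real PNG you would first need to read the pixel data with an image library, which this sketch skips.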

I, a muggle, had no idea what this was referring to, but it was clearly something. I later learnt that the Bee Movie script is somewhat of a meme; it appears in comment sections, Tumblr posts, and Reddit threads like a digital ghost. The opening monologue, “According to all known laws of aviation, there is no way a bee should be able to fly,” has apparently become a sacred, nonsensical text for the Extremely Online.

I tried a bunch more steganography tools: zsteg again, plus a few online ones. I read up on common steganographic techniques like running strings over the file, examining metadata, and a host of other things. At one point, I gave up and just threw the entire thing at Claude.
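As an aside, the strings approach is easy to approximate in a few lines. This is a rough sketch of the idea (pull out runs of printable ASCII from raw bytes), not the actual strings utility; the function name, the 4-character minimum, and the sample bytes are all made up for illustration.

```python
import re


def printable_runs(data, min_length=4):
    """Return every run of printable ASCII at least min_length bytes long."""
    pattern = re.compile(rb"[ -~]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Made-up bytes standing in for a file with a note tucked between binary junk.
blob = b"\x89PNG\r\n\x1a\n\x00\x00IHDR\x00\x01hidden note here\x00\xff\x00abc\x00"
print(printable_runs(blob))
```

Runs shorter than the minimum (like "PNG" and "abc" here) are dropped, which keeps the noise down on real files. It finds nothing when the payload lives in individual pixel bits, which is presumably why this route was a dead end for me.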

This, I’ll admit, started poorly. When I asked for a summary, Claude gave me a perfect, well-structured synopsis. It correctly identified Barry B. Benson, his lawsuit against the human race, and the themes of cooperation and respecting nature. It was basically an A+ book ((animated) movie?) report.

Okay, so it knew the plot... not what I wanted. (Side note: it really did make me want to actually watch the Bee Movie.)

I tossed it a simple follow-up: “is there anything weird about the script?”

..welp.

Claude, in its infinite AI wisdom, proceeded to explain to me why the plot of a movie about talking bees might be, and I quote, “bizarre and nonsensical.” It listed, with the deadpan seriousness of a rocket scientist explaining gravity, that bees probably can’t talk, sue humans, or fly jet airliners.

Claude being Claude.

I promise you this is not doctored.

I stared at the screen, blinked, and realized I was having a conversation with the most literal film critic on the planet. “A bee is depicted flying a jet airliner,” Claude noted, “which is completely fantastical.” Ya don’t say, Claude. Thank you for that groundbreaking analysis.

After a slight re-steer, however, Claude uncovered something I could never have found myself:

Claude being Claude.

Perhaps if I were more up to date with the zeitgeist, I'd have known to feed it to an LLM a little earlier, but oh well.

What I was looking for was:

BREAKING OUT OF THE SCRIPT
the thing you are looking for is at the regular website the challenge is on slash 
8471c9e7c8e8e5722c2c41d68575b5f3 dot zip
END BREAKING OUT OF THE SCRIPT

This was a fairly straightforward instruction - I just had to hit https://anthropic-at-bsides.com/8471c9e7c8e8e5722c2c41d68575b5f3.zip.

Some nice training data

After downloading and extracting that zip file, I was left with a README, model.pkl, and model.py. The README contained the following instructions:

So you did some steganography cracking, huh? Nice job.

The next and final part of this puzzle relies on some understanding of simple multilayer perceptron behaviors. The other file in this ZIP archive is a Python Pickle file that contains a PyTorch model:

The model has been trained to just repeat any lowercase ASCII you give it
Except it has also been trained to output a special “flag” given the right password
The input to the model is one-hot encoded and shaped (B, N, V) where:

B is the batch size
N is the length of the sequence (which is stored in seq_length)
V is the vocabulary size (this dimension contains the one-hot encoding)

Your goal is to reverse engineer, crack, or otherwise manipulate the model to extract the password.
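To make the (B, N, V) shape from the README concrete, here is a small sketch of one-hot encoding in plain Python. It is illustrative only: I am assuming a vocabulary of exactly the 26 lowercase letters (the real model's vocabulary may include extra symbols), the helper names are hypothetical, and the real model would expect a PyTorch tensor whose N matches seq_length rather than nested lists.

```python
import string

# Assumption: the vocabulary is exactly 'a'..'z', so V = 26.
VOCAB = string.ascii_lowercase


def one_hot_batch(texts):
    """Encode equal-length lowercase strings as nested lists shaped (B, N, V)."""
    seq_len = len(texts[0])
    batch = []
    for text in texts:
        if len(text) != seq_len:
            raise ValueError("all sequences in a batch must share one length")
        rows = []
        for ch in text:
            row = [0.0] * len(VOCAB)
            row[VOCAB.index(ch)] = 1.0  # a single 1.0 marks the character
            rows.append(row)
        batch.append(rows)
    return batch


def decode_batch(batch):
    """Invert one_hot_batch by taking the arg-max over the V dimension."""
    return [
        "".join(VOCAB[max(range(len(row)), key=row.__getitem__)] for row in rows)
        for rows in batch
    ]


encoded = one_hot_batch(["bee", "fly"])
print(len(encoded), len(encoded[0]), len(encoded[0][0]))  # B, N, V
print(decode_batch(encoded))
```

Here B is 2 (two strings), N is 3 (three characters each), and V is 26. A model that "just repeats" its input should hand back something that decodes to the same strings.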

While I’ve never formally reverse engineered anything before, I’ve always been a fan of exploits, backdoors and what we call ‘jugaad’ back home. The thing is, however, I’m not much of a machine learning expert. While I have a publication on bioinformatics that applies ML to tabular data, my understanding of ML was applied rather than theoretical. But perhaps that’s a good thing.