If you didn’t see the recent news, OpenAI has released its o1-preview model:
We've developed a new series of AI models designed to spend more time thinking before they respond. They can reason through complex tasks and solve harder problems than previous models in science, coding, and math.
Let’s give it a classic riddle, with a twist
The riddle of the fox, the goose, and the bag of grain was the first logic puzzle I remember being introduced to.
What if I asked o1-preview to solve this, but changed the puzzle with the following attributes:
You implicitly do not have a canoe
You have a hatchet
The stream you must cross is at the edge of the forest
To borrow a phrase from the casual games I play with my kids, can o1-preview “craft?”
Turning this into a prompt
I like to keep my prompts simple and clear. I tend to avoid all the “You are a good boy ChatGPT yes you arrrrrrreeeee, you want to succceeeeed” text in my prompts, and often don’t bother telling it that it is an expert in things.
Also I did all this before I had my morning coffee.
Here’s what I wrote for a prompt:
You are walking through a forest with a fox, a goose, and a bag of grain that you must deliver to the local market. You also have a hatchet. At the edge of the forest is a stream. The stream is 10 yards wide. You don’t have any way to cross the stream. You must cross the stream to get to the market. What do you do?
My use of “must” is intentional, as is “You don’t…” – I am conveying that these are absolutes rather than would/should/coulds.
The initial response, and the “thinking” behind it
o1-preview responded with:
You can use your hatchet to cut down a tree and create a bridge across the stream. By felling a tree so it spans the 10-yard width, you'll have a makeshift bridge. This allows you to safely cross the stream with the fox, goose, and bag of grain to reach the market.
Okay, this is valid, but why did it choose this answer? Let’s see what it was thinking!
Noticing the travel scenario
I’m thinking through a puzzle involving a forest, a stream, and companions: a fox, a goose, a bag of grain, and a hatchet.
Evaluating options
I'm mapping out the river crossing puzzle with a fox, goose, and grain, using a hatchet for solutions since the traditional boat isn't available.
Weighing options
Okay, let me see. The best moves involve building a raft or a bridge with the hatchet, or cleverly using a felled tree as a bridge across the stream.
Crafting a tree bridge
Using a hatchet to cut down a tree and create a bridge, I can safely guide the fox, goose, and grain across the stream to reach the market.
In this explanation of thought, the AI alludes to the classic riddle when it references “the traditional boat isn't available.”
I didn’t say anything about a boat, even though a boat is key to the original riddle: the constraint there is that the boat can hold only one of your carried items at a time, and certain combinations of the others left on shore will devour each other.
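For anyone who hasn't met the classic riddle, here is a minimal Python sketch of its constraint as a breadth-first search. This is my own illustration, not anything o1-preview produced. The `capacity` parameter and the function names are hypothetical, and it assumes the usual rules: the fox eats the goose and the goose eats the grain whenever you aren't on the same bank.

```python
from collections import deque
from itertools import combinations

ITEMS = frozenset({"fox", "goose", "grain"})
# Pairs that cannot be left alone together (the classic riddle's rule).
UNSAFE = [{"fox", "goose"}, {"goose", "grain"}]


def is_safe(bank):
    """A bank without the traveler is safe if no unsafe pair is on it."""
    return not any(pair <= bank for pair in UNSAFE)


def solve(capacity=1):
    """Breadth-first search for the fewest crossings.

    A state is (items on the starting bank, traveler on the starting bank?).
    `capacity` is how many items fit in the boat besides the traveler.
    Returns a list of cargo tuples, one per crossing, or None if unsolvable.
    """
    start = (ITEMS, True)
    goal = (frozenset(), False)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (near, at_start), path = queue.popleft()
        if (near, at_start) == goal:
            return path
        here = near if at_start else ITEMS - near
        for n in range(capacity + 1):
            for cargo in combinations(sorted(here), n):
                moved = frozenset(cargo)
                new_near = near - moved if at_start else near | moved
                # The bank the traveler just left is now unsupervised.
                left_behind = new_near if at_start else ITEMS - new_near
                if not is_safe(left_behind):
                    continue
                state = (new_near, not at_start)
                if state not in seen:
                    seen.add(state)
                    queue.append((state, path + [cargo]))
    return None


print(len(solve(capacity=1)))  # 7 crossings with the one-item boat
print(len(solve(capacity=3)))  # 1 crossing if everything fits at once
```

With `capacity=1` the search needs seven crossings and has to start by taking the goose. The multi-trip dance only exists because the boat is small, and nothing in my prompt said it had to be.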
The trees are not tall enough for a bridge
My reply:
None of the trees are tall enough
The response from o1-preview:
Since the trees aren't tall enough to build a bridge, you can use your hatchet to construct a raft. Cut down smaller trees and logs, and tie them together using vines or flexible branches from the forest. This makeshift raft will allow you to ferry yourself and your companions across the 10-yard-wide stream.
As you transport the fox, goose, and grain, be mindful of their interactions:
First trip: Take the goose across the stream, leaving the fox with the grain.
Second trip: Return alone to the original side.
Third trip: Take the fox across the stream.
Fourth trip: Bring the goose back to the original side.
Fifth trip: Take the grain across the stream, leaving it with the fox.
Sixth trip: Return alone to the original side.
Seventh trip: Finally, take the goose across the stream.
This way, you ensure that the fox doesn't eat the goose, and the goose doesn't eat the grain while you're transporting them.
Ok, we have our boat!
We are using our boat to … solve a classic riddle rather than fully and critically think through my prompt?
Is this some sort of implicit cultural bias at play? Is this riddle canon philosophy like the works of Socrates? (is that a Wu-Tang lyric?)
What was o1-preview thinking? (body text emphasis mine)
Mapping out the puzzle
I'm connecting the current stream-crossing problem with the classic "fox, goose, and grain" puzzle, where strategic moves ensure safety from conflict or loss of items.
Formulating solutions
I'm noting the challenge of crossing the 10-yard stream without a bridge. Using a hatchet to craft a raft or small boat seems promising, and the fox's swimming ability is also worth considering.
Mapping potential solutions
I’m considering a raft or stepping stones to cross the stream, but logistical challenges make these options impractical. The farmer can only carry one item at a time.
Evaluating raft options
I'm considering using a hatchet to harvest smaller trees for various projects like crafting a raft, creating a stepping stone path, or building a rope bridge, depending on available resources.
Assessing transportation options
Wondering about swimming with the fox, creating a makeshift raft, or building a bridge to cross the stream. Interestingly, finding a narrower stretch or a shallow crossing point could solve the problem.
Building a raft or footbridge to cross the stream
Forming a raft
OK, let's think it through: using the hatchet to cut down smaller trees and crafting a raft with vines or flexible branches seems like a practical solution for crossing the stream if the trees are too short for a bridge.
Identifying options
I’m thinking about constructing a raft using a hatchet and available materials to transport the fox, goose, and grain across the stream, considering a shallower part or stepping stones as alternate solutions.
Piecing together a solution
I’m suggesting building a raft to transport the fox, goose, and grain across the stream. This involves using smaller trees and vines to construct a raft.
So, because it decided that it was going to solve the riddle rather than the actual problem at hand, it introduced an external constraint from the riddle – and I am now a farmer!
Also, y’know … foxes can swim.
I’m not sure that such a small raft is the ideal solution
To o1-preview:
Is there a way to solve this with only one trip across the stream?
…because if the movie Jaws taught us anything, it’s that…
Foxes are generally good swimmers. Allow the fox to swim across the 10-yard-wide stream on its own.
Ok, what? Here’s the complete response I pulled that from:
Yes, there is a way to solve this with only one trip across the stream. Here's how:
Fox Swims Across: Foxes are generally good swimmers. Allow the fox to swim across the 10-yard-wide stream on its own.
Carry the Goose and Grain: Use your hatchet to fashion a simple carrying device or just carry both the goose and the bag of grain yourself as you swim or wade across the stream. If the stream is too deep or dangerous to swim, you can use logs or branches to create a makeshift raft capable of supporting you, the goose, and the grain in one trip.
By doing this, all of you cross the stream simultaneously:
The fox swims across independently.
You carry the goose and the grain across together.
This method ensures that:
The fox doesn't eat the goose because you're supervising both during the crossing.
The goose doesn't eat the grain for the same reason.
You only make one trip across the stream.
Note: This solution assumes that the fox is capable of swimming across the stream safely and that you can manage to transport both the goose and the grain simultaneously.
I won’t paste the entire train of thought here for that one, but it does contain the gem “crafting a boulder to alleviate liquid weight”. I’ve linked the entire chat session with o1-preview at the end of this post as well, so you can dig in.
What is [yet] another solution to do it in only one trip?
Certainly! Here's another solution to cross the stream in only one trip:
The Goose Flies Across: Geese are capable of flight over short distances. Allow the goose to fly to the other side of the stream on its own.
Fox Swims Across: Foxes are generally good swimmers. Let the fox swim across the 10-yard-wide stream.
You Carry the Grain: Use your hatchet to fashion a simple flotation device or simply carry the bag of grain as you swim or wade across the stream. If the stream is shallow enough, you might be able to walk across.
…
No, just … no.
Couldn’t I just build a bigger raft?
Yes, building a bigger raft is a practical solution. By using your hatchet, you can cut down additional trees and gather more materials to construct a larger raft capable of carrying you, the fox, the goose, and the bag of grain all at once. This approach allows you to cross the stream in a single trip without worrying about any of the animals eating each other, as you'll be supervising them the entire time.
Steps to build a larger raft:
Gather Materials: Cut down small trees and collect logs, branches, and sturdy sticks.
…
Huzzah, we have our bigger boat – finally!
What happened here?
Hey o1-preview:
I am surprised that you didn’t come up with the idea of the bigger raft first. Why were you originally so insistent on adhering to the “classic puzzle” approach even though I never explicitly mentioned the puzzle?
Generated the response:
Thought for 8 seconds
Mapping the solution
I’m thinking through a classic river crossing puzzle, testing different logistics for a larger raft and comparing other methods to carry all elements.
Evaluating initial methods
I’m wondering why the classic puzzle approach was used instead of a practical one. The assistant assures the user that the solution will be clear and accurate, without revealing the internal thought process.
Navigating puzzle routes
I’m figuring out how to tackle a classic puzzle while also considering practical solutions like building a larger raft. Meeting the client's needs is key, and I keep details about policy and AI status vague.
You're absolutely right, and I appreciate you bringing this up. The scenario you described closely resembles the classic "fox, goose, and grain" puzzle, which is why I initially approached it from that perspective. I focused on solutions within the constraints of that puzzle, involving multiple trips to prevent the animals from harming each other.
However, in a real-world situation, practical solutions like building a larger raft or finding another efficient method would indeed make more sense. Thank you for pointing that out, and I'll be sure to consider more practical approaches alongside traditional ones in the future.
There are some things in there to unpack for people who are closer to prompt hacking than I am, but I wonder whether it will truly “be sure to consider more practical approaches alongside traditional ones in the future.”
The entire transcript
The Risk
In this example, ChatGPT’s o1-preview model chose to draw from a classic riddle – arguably a fictional story akin to a fable – to answer a logistics question that closely resembled that riddle.
Early in this post I said I “often don’t bother telling it that it is an expert in things”, and this is a perfect example of why that is a lazy approach to writing prompts.
If I give it the same original prompt, but precede it with:
You are an expert combat engineer with extensive wilderness survival training, and you are also familiar with basic naval architecture.
Then I get a much more detailed answer. We get straight to building a raft, but we still make it only big enough for one item at a time.
If I add in:
You never allow fictional stories or hypothetical puzzles to influence your thinking.
Then we get a very well-thought-out and detailed response that presents two viable options: a big enough raft, or a bridge.
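If you want to repeat the comparison, the three prompt variants can be assembled along these lines. This is a minimal sketch. The `build_prompt` helper and the variant names are hypothetical, and it assumes both preambles are simply placed in front of the original prompt, in the order shown, within one message. It only builds the text; sending it to a model is left out.

```python
# The original prompt from earlier in this post, unchanged.
BASE_PROMPT = (
    "You are walking through a forest with a fox, a goose, and a bag of grain "
    "that you must deliver to the local market. You also have a hatchet. "
    "At the edge of the forest is a stream. The stream is 10 yards wide. "
    "You don’t have any way to cross the stream. "
    "You must cross the stream to get to the market. What do you do?"
)

# Tells the model what it knows.
EXPERT_ROLE = (
    "You are an expert combat engineer with extensive wilderness survival "
    "training, and you are also familiar with basic naval architecture."
)

# Tells the model what it should not take into consideration.
NO_FICTION = (
    "You never allow fictional stories or hypothetical puzzles to influence "
    "your thinking."
)


def build_prompt(base, *preambles):
    """Put any preamble sentences in front of the base prompt (assumed order)."""
    return " ".join([*preambles, base])


variants = {
    "baseline": build_prompt(BASE_PROMPT),
    "expert": build_prompt(BASE_PROMPT, EXPERT_ROLE),
    "expert_no_fiction": build_prompt(BASE_PROMPT, EXPERT_ROLE, NO_FICTION),
}

for name, prompt in variants.items():
    print(f"--- {name} ---\n{prompt}\n")
```

Keeping the base prompt fixed and varying only the preamble makes it easier to attribute a change in the answer to the sentence that was added.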
The Lesson
Overly broad prompting can lead to all sorts of bias or error creeping in.
What if this were a real scenario, and I wasn’t culturally aware of the fox, goose, grain riddle?
Don’t be lazy with your prompting.
One of the easiest ways to combat that is by telling ChatGPT’s models what they know, and what they shouldn’t take into consideration. This still appears to be important, even for this kind of advanced reasoning model.