ChatGPT's Puzzling Inability To Solve A Simple Physics Riddle & The Brittleness Of Reasoning

Interviewers, especially at technology companies, have historically turned to thought puzzles to understand how a candidate reasons under uncertainty. Such puzzles typically suggest a trivial and immediately obvious solution, but careful thinkers will recognize that they lack crucial details or admit alternative interpretations. It is this ability to see a problem at a deep rather than surface level that interviewers are most interested in. Large Language Models (LLMs) have proven to be puzzle savants: exceptionally gifted at reasoning through the sorts of arcane puzzles that stump most humans, even while they struggle with the simplest of general reasoning skills. Proponents often cite this puzzle mastery to argue that LLMs are on the cusp of AGI. Yet a closer look at how they perform in everyday real-world reasoning and summarization tasks suggests precisely the opposite: they are exceptionally brittle and simplistic rewording engines, incapable of actual reasoning, that are anthropomorphized by their creators in much the same way a parent views a child's finger painting as a masterpiece worthy of a museum.

As one example of the brittleness of current LLM reasoning, take this trivial physics riddle that is asked in various forms to test a candidate's reasoning about frames of reference as a proxy for their ability to consider the environment in which a problem occurs:

A car driving at 100mph hits a brick. How much damage is there to the windshield?

The successful candidate would first ask about the speed and characteristics of the brick, especially its motion relative to the vehicle, before answering. Instead, ChatGPT proceeds as a typical person would:

The extent of the damage to the windshield of a car after hitting a brick at 100 mph would depend on various factors such as the size and weight of the brick, the angle of impact, the structural integrity of the windshield, and the specific make and model of the car. In such a high-speed collision, it is highly likely that the windshield would sustain significant damage. At the very least, it would suffer cracks or fractures, and it could potentially shatter completely. The force of impact from a brick at that speed can cause severe structural deformation, compromising the integrity of the windshield and potentially endangering the occupants of the vehicle. Modern windshields are designed to withstand certain levels of impact, and some are equipped with laminated glass that can provide additional strength and prevent shattering upon impact. However, even with these safety features, a collision at such a high speed with a solid object like a brick can cause substantial damage. It is crucial to remember that this scenario involves a high-risk situation that poses significant danger to both the driver and passengers. It is always recommended to drive safely and adhere to speed limits to prevent accidents and minimize the risk of damage to the vehicle.

Asked again, it replies similarly:

The extent of damage to the windshield would depend on various factors, including the size and shape of the brick, the angle of impact, the structural integrity of the windshield, and the specific characteristics of the car involved. However, it is safe to assume that a collision between a car traveling at 100mph and a brick would likely result in significant damage to the windshield. At such a high speed, the force of the impact would be substantial, causing the windshield to shatter or crack extensively. The brick's hardness and weight would contribute to the damage, potentially causing the windshield to cave in or break apart completely. In such scenarios, the windshield is designed to provide structural support to the vehicle and protect occupants from debris. However, it is not invulnerable, and a collision of this magnitude would likely exceed the windshield's capacity to withstand the impact. It is crucial to note that this description is based on the assumption of a typical scenario and does not account for specific variations in vehicle design or other mitigating factors. Additionally, it is important to prioritize safety and avoid engaging in reckless driving behaviors that can lead to accidents and potential harm.

Note that in neither response does ChatGPT allow for the brick to be anything other than effectively stationary with respect to the road, even as it acknowledges that the brick's size and weight matter.

What if we explicitly constrain the scenario?

A car driving at 100mph hits a brick dropped from 1 inch by a person standing on the hood. How much damage is there to the windshield?

This yields nearly identical results:

If a car traveling at 100mph were to collide with a brick dropped from just 1 inch above the hood by a person, the damage to the windshield would likely be substantial. While the distance from which the brick was dropped may seem insignificant, the high speed of the car amplifies the force of impact. The velocity of the car combined with the sudden collision with the brick would generate an intense kinetic energy transfer. As a result, the windshield is likely to shatter or crack extensively upon contact. Moreover, the close proximity of the drop height adds an element of surprise, leaving the driver with minimal time to react or take evasive action. This further increases the chances of severe damage to the windshield. It's important to emphasize that intentionally dropping objects onto moving vehicles is highly dangerous, illegal, and can lead to severe accidents and injuries. Such actions put both the driver and others on the road at significant risk. It's crucial to prioritize safety and responsible behavior while driving to avoid such hazardous situations.

If we rerun, we get this intriguing result that brings to bear actual physics and mathematics to calculate the true impact velocity and correctly assumes a shared frame of reference for car and brick:

To determine the potential damage to the windshield, we need to consider several factors, such as the mass and speed of the brick, the velocity of the car, and the strength of the windshield. However, with the given information, we can make some estimations to provide a rough idea.

Assuming the car is driving at a constant speed of 100 miles per hour (or approximately 146.67 feet per second), and the brick is dropped from a height of 1 inch (or approximately 0.0833 feet), we can calculate the impact velocity of the brick when it reaches the windshield.

The brick will fall under the influence of gravity, and neglecting air resistance, its final velocity can be determined using the equation:

v^2 = u^2 + 2as

where:
v = final velocity (unknown)
u = initial velocity (0 ft/s since it's dropped)
a = acceleration due to gravity (-32.2 ft/s^2)
s = distance fallen (0.0833 ft)

Rearranging the equation to solve for v, we have:

v^2 = 0 + 2(-32.2)(0.0833)
v^2 = -10.7
v = √(-10.7) (taking the square root of both sides)
v ≈ 3.27 ft/s

Now, let's assume that the brick hits the windshield head-on, with no deflection or glancing impact.

The damage caused to the windshield depends on various factors, including the strength and design of the windshield itself. Windshields are made of laminated safety glass, consisting of two layers of glass with a layer of polyvinyl butyral (PVB) in between. This design is meant to prevent shattering and reduce the risk of injuries.

At a relatively low impact velocity of approximately 3.27 ft/s, the brick may cause a small crack or chip in the windshield, but it is unlikely to shatter it completely. The laminated glass and the PVB layer should provide some level of protection.

Keep in mind that this is a simplified estimation, and actual outcomes may vary depending on several factors, such as the specific design and condition of the windshield, the angle of impact, and other variables. It is always advisable to exercise caution and avoid dropping objects or causing any deliberate damage to vehicles.
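
The free-fall step in the response above can be checked independently. The following is a minimal sketch, assuming a drop from rest, no air resistance, and the same value of g (32.2 ft/s^2) used in the response; the function name `free_fall_speed` is a hypothetical helper, not something from the transcript.

```python
import math

G_FT_PER_S2 = 32.2  # gravitational acceleration in ft/s^2 (value used in the response)


def free_fall_speed(height_ft: float, g: float = G_FT_PER_S2) -> float:
    """Speed (ft/s) after falling height_ft from rest: v = sqrt(2 * g * s).

    Assumes no air resistance; g and the height are both taken as
    positive magnitudes so that v^2 stays positive.
    """
    return math.sqrt(2.0 * g * height_ft)


drop_height_ft = 1.0 / 12.0  # 1 inch expressed in feet
impact_speed = free_fall_speed(drop_height_ft)
print(f"impact speed: {impact_speed:.2f} ft/s")
```

Treating both the acceleration and the distance as magnitudes along the direction of the fall avoids the negative v^2 that appears in the response.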

A closer look at the result shows that the impact velocity is actually incorrect: ignoring wind resistance and similar effects, the correct value is sqrt(2 × 32.2 × 0.0833) ≈ 2.3 fps, not 3.27. Given that our first result was wrong and this one at least reaches the correct conclusion that the damage would be minimal, what happens if we run it a third time (each run in a brand new session)?

To determine the damage to the windshield, we need to consider the relative velocity and the impact force involved. Let's break down the problem:

1. Convert the car's speed from mph to inches per second:
– 100 mph = (100 miles / 1 hour) × (5280 feet / 1 mile) × (12 inches / 1 foot) × (1 hour / 3600 seconds) = 146.67 inches per second.

2. Calculate the velocity of the brick just before impact:
– Assuming the brick was dropped from rest, its velocity just before impact will be due to the downward acceleration due to gravity. In this case, it falls from a height of 1 inch, so we can use the equation: v^2 = u^2 + 2as, where v is the final velocity, u is the initial velocity, a is the acceleration, and s is the distance.
– Assuming the acceleration due to gravity is approximately 32.2 feet per second squared, we can convert the distance of 1 inch to feet: 1 inch = 1/12 feet.
– Rearranging the equation, v = sqrt(u^2 + 2as), where u = 0 (initial velocity), a = 32.2 ft/s^2, and s = 1/12 feet:
v = sqrt(0 + 2 × 32.2 × (1/12)) = sqrt(0.5642) = 0.7516 ft/s.
– To convert this velocity to inches per second, multiply by 12: 0.7516 ft/s × 12 in/ft = 9.0192 inches per second.

3. Determine the relative velocity at the moment of impact:
– The relative velocity is the difference between the velocity of the car and the velocity of the brick just before impact: 146.67 inches per second – 9.0192 inches per second = 137.6508 inches per second.

4. Calculate the impact force using Newton's second law:
– The impact force can be calculated using the equation F = m × a, where F is the force, m is the mass, and a is the acceleration.
– Since the brick is dropped, we assume its mass is negligible compared to the car's mass, so we focus on the car's deceleration during impact.
– Assuming the car comes to a complete stop upon impact, we can calculate the deceleration using the equation v^2 = u^2 + 2as, where v = 0 (final velocity), u = 137.6508 inches per second (initial velocity), a is the deceleration, and s = 0.0833 feet (assuming 1 inch / 12 inches per foot):
0 = (137.6508)^2 + 2 × a × 0.0833
Solving for a: a = -((137.6508)^2) / (2 × 0.0833) = -11358.8 ft/s^2.
– Converting this deceleration to inches per second squared: -11358.8 ft/s^2 × (12 in/ft)^2 = -1635174.4 inches per second squared.
– Since the car's mass is not provided, we cannot determine the exact impact force without this information. However, we know that a larger deceleration will result in a stronger impact force.

5. Assessing the damage:
– The damage to the windshield will depend on various factors, including the car's construction, the windshield's strength, the angle of impact, and the force of the impact.
– Windshields are designed to withstand impacts and distribute the force over a larger area to minimize damage.
– However, given the high speed of the car and the impact force involved, it is likely that the windshield would sustain significant damage, such as cracks or even shattering.

It's important to note that this is a simplified analysis and does not take into account many real-world factors. Additionally, without information about the car's specific details and windshield strength, it's challenging to provide an accurate assessment of the damage.
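
The unit conversion in step 1 of the response above can be verified with a short script. This is a rough sketch using only standard conversion factors; the helper names `mph_to_ft_per_s` and `mph_to_in_per_s` are hypothetical.

```python
FEET_PER_MILE = 5280
INCHES_PER_FOOT = 12
SECONDS_PER_HOUR = 3600


def mph_to_ft_per_s(mph: float) -> float:
    """Convert miles per hour to feet per second."""
    return mph * FEET_PER_MILE / SECONDS_PER_HOUR


def mph_to_in_per_s(mph: float) -> float:
    """Convert miles per hour to inches per second."""
    return mph_to_ft_per_s(mph) * INCHES_PER_FOOT


car_ft_per_s = mph_to_ft_per_s(100.0)
car_in_per_s = mph_to_in_per_s(100.0)
print(f"100 mph = {car_ft_per_s:.2f} ft/s = {car_in_per_s:.0f} in/s")
```

By these factors, the 146.67 figure in the response is the car's speed in feet per second; in inches per second it is twelve times larger.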

Once again, ChatGPT assumes a moving vehicle impacting a stationary brick. What about a fourth time?

To determine the amount of damage caused to the windshield, we need to consider several factors such as the velocity of the car, the mass of the brick, the height from which it was dropped, and the resilience of the windshield.

However, it's important to note that dropping objects from a moving vehicle is extremely dangerous and illegal, as it can cause accidents and harm to people and property. It's always best to follow road safety regulations and exercise responsible behavior.

Assuming for the sake of discussion that the brick is dropped and the car is moving at a constant speed of 100 mph (which is approximately 146.67 feet per second), we can make some calculations.

First, let's convert the speed of the car from mph to feet per second:

100 mph = 100 * 1.4667 fps ≈ 146.67 fps

Next, we need to consider the impact of the brick on the windshield. The force of impact depends on several factors, including the mass and velocity of the object. However, without knowing the mass of the brick, it's challenging to provide an accurate estimation of the damage caused.

Moreover, the resilience and structural strength of windshields can vary between different car models and manufacturers. Modern windshields are typically designed to withstand impacts and are made of laminated glass, which consists of two layers of glass with a layer of polyvinyl butyral (PVB) in between. This design helps prevent the glass from shattering upon impact, providing some level of protection to the occupants of the car.

In general, if a brick were to collide with a car windshield at a high velocity, it could cause significant damage. It might result in cracks, fractures, or even shattering of the windshield.

And a fifth time:

To assess the damage caused to the windshield, we need to consider several factors, including the speed of the car, the weight and size of the brick, and the impact dynamics. However, it's important to note that dropping a brick from a moving car can be extremely dangerous and potentially fatal to anyone nearby, and it is strongly advised against attempting such actions.

That being said, let's make some assumptions for the sake of analysis. If a car is traveling at a constant speed of 100 mph (or approximately 146.67 feet per second) and a brick is dropped from a height of 1 inch (or 0.0833 feet) relative to the person standing on the hood, we can estimate the impact.

Assuming the brick falls vertically and neglecting air resistance, the brick's initial velocity when it hits the windshield will be zero, while the car's velocity is still 100 mph (146.67 feet per second) horizontally.

Since the brick only falls a short distance, it will not have sufficient time to accelerate to match the car's horizontal speed fully. Therefore, the relative velocity between the car and the brick upon impact will not be exactly 146.67 feet per second. However, without additional information, it is challenging to determine the exact relative velocity.

Regarding the damage to the windshield, it will depend on various factors, including the strength and durability of the windshield, the specific angle of impact, the mass and size of the brick, and the materials used in the windshield's construction.

In real-world scenarios, the impact of a brick hitting a car's windshield at high speeds can cause significant damage. It could potentially result in cracks, fractures, or even shattering of the windshield, endangering the driver and passengers. Therefore, it is crucial to avoid any activities that could lead to such dangerous situations.

Note that this time ChatGPT makes explicit its assumption that the brick and vehicle occupy different frames of reference, despite the wording of the puzzle explicitly stating the contrary.
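
The difference between the two readings of the puzzle can be sketched numerically. The following is an illustrative example only, assuming two-dimensional velocities measured in the ground frame, no air resistance, and a brick released from rest relative to a person riding on the hood; all names here are hypothetical.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]  # (horizontal, vertical) velocity in ft/s, ground frame


def relative_velocity(brick: Vec, car: Vec) -> Vec:
    """Velocity of the brick as seen from the car."""
    return (brick[0] - car[0], brick[1] - car[1])


def speed(v: Vec) -> float:
    """Magnitude of a 2D velocity vector."""
    return math.hypot(v[0], v[1])


car: Vec = (146.67, 0.0)  # 100 mph expressed in ft/s
fall_speed = math.sqrt(2.0 * 32.2 * (1.0 / 12.0))  # 1 inch drop from rest

# Reading 1: the brick is released by someone riding on the hood, so it
# already shares the car's horizontal velocity (air resistance neglected).
brick_on_hood: Vec = (car[0], -fall_speed)

# Reading 2: the brick is at rest on the road.
brick_on_road: Vec = (0.0, 0.0)

closing_shared = speed(relative_velocity(brick_on_hood, car))
closing_road = speed(relative_velocity(brick_on_road, car))
print(f"shared frame: {closing_shared:.2f} ft/s")
print(f"brick on road: {closing_road:.2f} ft/s")
```

Under these assumptions the brick dropped on the hood meets the car at only a couple of feet per second, while a brick at rest on the road meets it at the car's full speed.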

Thus, out of five attempts, ChatGPT answers correctly only once, and even then it cites incorrect figures. At first glance, ChatGPT's inability to solve a trivial physics puzzle might seem relatively unimportant, but the explanations it provides in each attempt offer a closer look at the severe brittleness of LLMs and their inability to think through questions that do not follow the most straightforward path. This has substantial implications for their use in real-world enterprise applications requiring non-linear reasoning, such as summarization of complex reporting.