Written in July 2025
Part 1: Multidimensional Models Are Weird
All modern AI is based on high-dimensional spaces, so here is a collection of fun, counterintuitive facts about them.
For instance, in high-dimensional space, the volume of a unit sphere converges to zero as the number of dimensions grows!
- In high-dimensional spaces, most of the volume of a hypersphere is concentrated near its surface, not in the center. For example, in 100 dimensions, nearly all the volume lies in a thin shell just beneath the surface. It's like a balloon where everything happens near the skin!
- In high-dimensional spaces, random vectors are almost always nearly orthogonal to each other. That means if you pick two points at random in, say, 1000-dimensional space, the angle between them will be close to 90°.
- In high-dimensional spaces, almost all of a hypercube's volume is in the corners. So if you pick a point at random, it's likely to be far from the center and near one of the extreme edges, which is very different from low-dimensional intuition!
- In high-dimensional spaces, the distance between any two random points becomes almost the same. This "distance concentration" means notions like "nearest neighbor" get fuzzy: everything is similarly far apart!
- In high-dimensional spaces, simple shapes behave counterintuitively. For example, a hypersphere inscribed in a hypercube fills almost none of the cube's volume. As dimensions increase, the sphere "shrinks" relative to the cube!
- In high dimensions, projecting data down to just a few dimensions (like with random projections) can still preserve distances surprisingly well. This is thanks to the Johnson-Lindenstrauss lemma. Weirdly, losing dimensions doesn't always mean losing structure!
- In high-dimensional spaces, adding more dimensions can actually make classification easier. That's because data points become more separable with more features, a key reason why high-dimensional methods like support vector machines work so well.
- In high dimensions, the triangle inequality starts to "break down" in practice. It still holds mathematically, but distances between points become so similar that the bounds it gives on an unknown distance are too loose to rule anything out.
- In high-dimensional spaces, uniform random samples tend to cluster near the boundary of the space. So if you sample points from a high-dimensional unit cube, most will lie close to the outer surface rather than the center.
- In high dimensions, most of the volume of a hypercube lies outside any inscribed hypersphere. So even though a sphere fits snugly inside a cube in 2D or 3D, in higher dimensions it barely fills any of the cube's volume.
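Several of the facts above can be checked numerically. Here is a minimal sketch using only the Python standard library. The function names, sample sizes, and the choice of Gaussian vectors for the angle test are assumptions made for illustration, not something taken from a reference implementation.

```python
import math
import random

def random_gaussian_vector(d, rng):
    """Draw a vector with d independent standard normal coordinates."""
    return [rng.gauss(0.0, 1.0) for _ in range(d)]

def angle_degrees(u, v):
    """Angle between two vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.degrees(math.acos(cos))

def mean_deviation_from_90(d, n_pairs, rng):
    """Average of |angle - 90 degrees| over n_pairs random pairs of vectors."""
    total = 0.0
    for _ in range(n_pairs):
        u = random_gaussian_vector(d, rng)
        v = random_gaussian_vector(d, rng)
        total += abs(angle_degrees(u, v) - 90.0)
    return total / n_pairs

def distance_spread(d, n_points, rng):
    """(max - min) / min over all pairwise distances of uniform points in the unit cube."""
    points = [[rng.random() for _ in range(d)] for _ in range(n_points)]
    dists = [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]
    return (max(dists) - min(dists)) / min(dists)

def inscribed_ball_fraction(d):
    """Fraction of the cube [-1, 1]^d filled by the inscribed ball of radius 1."""
    # Work in log space: math.gamma overflows for large d.
    log_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    return math.exp(log_ball - d * math.log(2))

rng = random.Random(0)
for d in (2, 10, 100, 1000):
    print(
        f"d={d:4d}  "
        f"mean |angle - 90| = {mean_deviation_from_90(d, 200, rng):6.2f} deg  "
        f"distance spread = {distance_spread(d, 50, rng):8.3f}  "
        f"ball/cube = {inscribed_ball_fraction(d):.3e}"
    )
```

As d grows, the angles should crowd around 90°, the relative spread of pairwise distances should shrink, and the share of the cube filled by the inscribed ball should collapse toward zero.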
The above concepts may feel strange and counterintuitive, but it's possible to understand them and even develop a good intuition around them!
Part 2: Building Intuition
Multidimensional spaces are famously difficult to represent.
Most of the work done on them involves equations, and you simply have to accept the results because, well, they are true.
However, I wanted to build some intuition and see if an LLM could help.
So I started questioning one, and it also helped me make graphs.
For instance, here's the volume of the unit sphere from dimension 1 to 20. It peaks around dimension 5, then starts decreasing. Past that point, adding more dimensions makes the sphere's interior volume shrink.
Not expected!
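The curve can be reproduced from the standard closed-form volume of a d-dimensional ball of radius 1, V(d) = π^(d/2) / Γ(d/2 + 1). The sketch below is not the code behind the graph, just a quick way to check the numbers; the function name `unit_ball_volume` is made up for the example.

```python
import math

def unit_ball_volume(d):
    """Volume of the d-dimensional ball of radius 1: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

volumes = {d: unit_ball_volume(d) for d in range(1, 21)}
peak = max(volumes, key=volumes.get)  # dimension with the largest volume

for d, v in volumes.items():
    print(f"d={d:2d}  volume={v:.4f}")
print("largest volume at d =", peak)
```

In 1D the "ball" is the segment [-1, 1] with length 2, in 2D it is the disc with area π, and in 3D the familiar 4π/3, so the formula matches the low-dimensional cases we already know.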
Me:
Why does it increase at first?
LLM:
Adding more dimensions means adding new "directions" for volume to spread into.
- 1D: just a line segment
- 2D: a filled circle → more space
- 3D: a ball → even more room, etc.
Me:
But why does it start shrinking after ~5D?
LLM:
As you keep adding dimensions, you dilute the volume: the center of the sphere contributes less and less. Almost all of the volume concentrates near the surface (in a thin shell), not in the "core".
Wait! What???
Almost all of the volume concentrates near the surface (in a thin shell), not in the "core"????
What does that even mean?
It took me a while to understand, but I finally got it!
Spoiler: it has to do with 2-liter bottles...
Enlarging a 1-liter bottle to make it a 2-liter bottle doesn't make it twice as large in every direction.
In a way, the second liter requires less extra radius than the first one. So even in this case, the volume is not distributed evenly along the radius.
Let's go back to the sphere. The more liters you want to fit into a sphere, the larger the radius must be. However, this increase isn't linear; you don't need to double the radius to double the sphere's volume.
And this effect gets amplified when you increase the dimensions. The more dimensions, the less you have to increase the radius to fit more liters in the sphere. That's fairly intuitive: in 1D the radius grows linearly with the volume, in 2D it is proportional to the square root of the volume, in 3D to the cube root, and so on.
If we represent this effect for every dimension, we see that we need less and less radius to fit additional liters.
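To put numbers on this: since volume scales as radius to the power d, multiplying the volume by k only requires multiplying the radius by k^(1/d). A minimal sketch (the helper name `radius_factor` is made up for this example):

```python
def radius_factor(volume_ratio, d):
    """Factor by which the radius must grow to multiply a d-dimensional volume by volume_ratio."""
    return volume_ratio ** (1.0 / d)

# How much larger must the radius be to double the volume?
for d in (1, 2, 3, 10, 100):
    growth = (radius_factor(2, d) - 1) * 100
    print(f"d={d:3d}  radius x {radius_factor(2, d):.4f}  (+{growth:.1f}%)")
```

In 1D you have to double the radius, in 3D you need about 26% more, and in 100 dimensions less than 1% more radius is enough to double the volume.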
If you've made it this far, you should now understand the sentence that seemed weird at first: "Almost all of the volume concentrates near the surface, in a thin shell".
You've got this!
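The same scaling gives the thin-shell statement directly. A ball of radius 1 - t holds a fraction (1 - t)^d of the unit ball's volume, so the outer shell of thickness t holds the remaining 1 - (1 - t)^d. Here is a small sketch, where the 5% thickness is an arbitrary choice for illustration:

```python
def outer_shell_fraction(d, thickness):
    """Fraction of a unit ball's volume lying within `thickness` of its surface."""
    return 1.0 - (1.0 - thickness) ** d

# Share of the volume in the outermost 5% of the radius
for d in (1, 2, 3, 10, 100):
    print(f"d={d:3d}  outer 5% shell holds {outer_shell_fraction(d, 0.05):.1%} of the volume")
```

In 3D that outer shell holds roughly 14% of the volume; in 100 dimensions it holds more than 99%.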
If you look at the horizontal line corresponding to a radius of one and follow it along the d-axis, you'll see that the volume it reaches first increases (crossing the blue and then the green curve) and then decreases, crossing the red and yellow curves.
Et voilà!
I didn't expect that interacting with an LLM could make such a difference!
I clearly wouldn't have taken the time to watch a YouTube video on the topic, and I'm not sure it would have helped me much in building some personal intuition either.
There has never been a better time to be alive for curious minds!