The external storage data unit and the shannon are both called bits precisely because they’re both base 2. That does not mean they’re the same. As the article explains it, a shannon is like one yes/no question in a game of 20 questions.
Wrong. They are called the same because they are fundamentally the same. That's how you measure information.
In some contexts, one wants to distinguish between the theoretical information content and what is actually stored on a technical device. But that's a fairly subtle thing.
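To make that distinction concrete, here is a rough Python sketch. The function names are my own and purely illustrative, and it assumes the stored bits are independent, which real data often is not. It computes the information content in shannons of one outcome with probability p, and the average information carried by one stored binary digit whose value is 1 with probability p.

```python
import math

def self_information(p):
    """Information content, in shannons, of an outcome with probability p."""
    return -math.log2(p)

def binary_entropy(p):
    """Average information, in shannons, carried by one stored binary digit
    that is 1 with probability p (digits assumed independent)."""
    if p in (0.0, 1.0):
        return 0.0  # a digit whose value is certain carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# A fair yes/no outcome carries exactly one shannon.
fair = self_information(0.5)

# A stored digit that is 1 half the time also carries one shannon on average.
unbiased_digit = binary_entropy(0.5)

# A stored digit that is 1 nine times out of ten still occupies one bit
# of storage, but carries less than half a shannon on average.
biased_digit = binary_entropy(0.9)
```

When every stored digit is equally likely to be 0 or 1, the two measures agree, which is the usual case where they are not different at all. They come apart only when the stored digits are predictable: with p = 0.9 the digit still takes up one bit on the device but carries roughly 0.47 shannons.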
I don't see how that can be a subtle difference. How is a bit of external storage data only subtly different from information content that tells you an event had a probability of ½ of occurring?
It's a bit like asking what is the difference between the letter "A" and ink on a page in the shape of the letter "A". Of course, first one would have to explain how they are usually not different at all.
BTW, I don't know what you mean by "external storage data". The expression doesn't make sense.