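Your conclusion that padding is unnecessary is right. For a single encoded string, it's always possible to determine the length of the input unambiguously from the length of the encoded sequence: every full group of four characters carries three bytes, and a trailing group of two or three characters carries one or two bytes respectively.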
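As a rough sketch of that point in Python (the helper names decoded_length and decode_unpadded are made up for illustration, not part of any library), the missing padding can be worked out from the length alone and restored before decoding:

```python
import base64


def decoded_length(unpadded_len: int) -> int:
    """Number of bytes represented by an unpadded base64 string of this length."""
    if unpadded_len % 4 == 1:
        # A lone trailing character carries only 6 bits, less than one byte,
        # so no valid encoding has this length.
        raise ValueError("not a valid unpadded base64 length")
    return unpadded_len * 3 // 4


def decode_unpadded(text: str) -> bytes:
    """Restore the padding implied by the length, then decode normally."""
    decoded_length(len(text))  # rejects impossible lengths
    return base64.b64decode(text + "=" * (-len(text) % 4))


print(decode_unpadded("SQ"))    # one byte
print(decode_unpadded("QU0"))   # two bytes
print(decode_unpadded("VEpN"))  # three bytes
```

This only works because the decoder sees exactly one encoded string, so its length is known.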

However, padding is useful in situations where base64 encoded strings are concatenated in such a way that the lengths of the individual sequences are lost, as might happen, for example, in a very simple network protocol.

If unpadded strings are concatenated, it's impossible to recover the original data, because the information about how many leftover bytes (one or two) sit at the end of each individual sequence is lost. However, if padded sequences are used, there's no ambiguity, and the sequence as a whole can be decoded correctly.

Edit: An Illustration

Suppose we have a program that base64-encodes words, concatenates them and sends them over a network. It encodes "I", "AM" and "TJM", sandwiches the results together without padding and transmits them.

  • I encodes to SQ (SQ== with padding)
  • AM encodes to QU0 (QU0= with padding)
  • TJM encodes to VEpN (still VEpN with padding, since three bytes fill a whole group)

So the transmitted data is SQQU0VEpN. The receiver base64-decodes the first eight characters as I\x04\x14\xd1Q) and is left with a stray N that cannot be decoded on its own, instead of recovering the intended IAMTJM. The result is nonsense because the sender has destroyed information about where each word ends in the encoded sequence. If the sender had sent SQ==QU0=VEpN instead, the receiver could have decoded this as three separate base64 sequences which would concatenate to give IAMTJM.
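The illustration can be reproduced with a short Python sketch. The helpers encode_unpadded and split_padded are hypothetical names invented here; split_padded assumes the stream is a clean concatenation of padded base64 strings and treats any four-character group ending in = as the end of a sequence.

```python
import base64


def encode_unpadded(data: bytes) -> str:
    """Base64-encode and strip the trailing '=' padding."""
    return base64.b64encode(data).decode("ascii").rstrip("=")


def split_padded(stream: str) -> list:
    """Decode a concatenation of padded base64 strings.

    Walks the stream four characters at a time; a group that ends
    in '=' closes the current sequence.
    """
    pieces = []
    current = ""
    for i in range(0, len(stream), 4):
        group = stream[i:i + 4]
        current += group
        if group.endswith("="):
            pieces.append(base64.b64decode(current))
            current = ""
    if current:  # final sequence needed no padding
        pieces.append(base64.b64decode(current))
    return pieces


words = [b"I", b"AM", b"TJM"]

unpadded = "".join(encode_unpadded(w) for w in words)
padded = "".join(base64.b64encode(w).decode("ascii") for w in words)

print(unpadded)                         # SQQU0VEpN
print(base64.b64decode(unpadded[:8]))   # b'I\x04\x14\xd1Q)'
print(padded)                           # SQ==QU0=VEpN
print(b"".join(split_padded(padded)))   # b'IAMTJM'
```

Only the first eight characters of the unpadded stream are decoded here, because the ninth character does not form a valid group on its own.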

Why Bother with Padding?

Why not just design the protocol to prefix each word with an integer length? Then the receiver could decode the stream correctly and there would be no need for padding.

That's a great idea, as long as we know the length of the data we're encoding before we start encoding it. But what if, instead of words, we were encoding chunks of video from a live camera? We might not know the length of each chunk in advance.

If the protocol used padding, there would be no need to transmit a length at all. The data could be encoded as it came in from the camera, each chunk terminated with padding, and the receiver would be able to decode the stream correctly.
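A minimal sketch of what such a sender might look like in Python is shown here. The ChunkEncoder class and the sample bytes are invented for illustration and do not come from any real protocol. It encodes data as it arrives, holding back at most two bytes between calls, and only emits padding when the chunk is finished, so the chunk length never has to be known up front.

```python
import base64


class ChunkEncoder:
    """Hypothetical incremental encoder: feed bytes as they arrive, pad on finish."""

    def __init__(self):
        self._pending = b""  # 0 to 2 bytes not yet filling a 3-byte group

    def feed(self, data: bytes) -> str:
        buf = self._pending + data
        cut = len(buf) - len(buf) % 3  # largest prefix made of whole groups
        self._pending = buf[cut:]
        return base64.b64encode(buf[:cut]).decode("ascii")

    def finish(self) -> str:
        # Encoding the leftover bytes produces the padding, if any is needed.
        tail = base64.b64encode(self._pending).decode("ascii")
        self._pending = b""
        return tail


enc = ChunkEncoder()
out = enc.feed(b"fr") + enc.feed(b"ame") + enc.feed(b"-1") + enc.finish()

# The pieces add up to exactly what a one-shot encoding would have produced.
print(out == base64.b64encode(b"frame-1").decode("ascii"))
```

Note that a chunk whose length happens to be a multiple of three ends without any = characters, so padding marks where the leftover bytes are rather than acting as a general-purpose delimiter.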

Obviously that's a very contrived example, but perhaps it illustrates why padding might conceivably be helpful in some situations.