Add compile-time option PB_BUFFER_ONLY.

This allows slight optimizations when only memory buffer support
(as opposed to stream callbacks) is needed. On ARM, enabling it
reduces execution time by 12% and code size by 4%.
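
To illustrate where the savings could come from, here is a minimal
sketch of a buffer-only write path. All names (demo_ostream_t,
demo_write, DEMO_BUFFER_ONLY) are hypothetical stand-ins and are not
the library's actual code. The assumption is that, with the option
defined, the stream struct can drop its callback pointer and writes
can become a direct memcpy instead of an indirect call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for building with -DPB_BUFFER_ONLY. */
#define DEMO_BUFFER_ONLY 1

typedef struct demo_ostream {
#ifndef DEMO_BUFFER_ONLY
    /* General build: every write is routed through a user callback. */
    bool (*callback)(struct demo_ostream *s, const uint8_t *buf, size_t n);
#endif
    uint8_t *dest;        /* destination memory buffer */
    size_t max_size;      /* capacity of dest in bytes */
    size_t bytes_written; /* bytes stored so far */
} demo_ostream_t;

/* Append n bytes to the stream; returns false if they do not fit. */
static bool demo_write(demo_ostream_t *s, const uint8_t *buf, size_t n)
{
    assert(s != NULL && buf != NULL);

    if (n > s->max_size - s->bytes_written)
        return false;

#ifdef DEMO_BUFFER_ONLY
    /* Buffer-only build: copy directly, no function pointer involved. */
    memcpy(s->dest + s->bytes_written, buf, n);
#else
    if (!s->callback(s, buf, n))
        return false;
#endif

    s->bytes_written += n;
    return true;
}
```

In this sketch the buffer-only build saves one pointer per stream and
one indirect call per write. The actual gains depend on the target and
compiler.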
6 files changed