Why does byteArray have a length of 22 instead of 20?

We are trying to convert a String to a byte[] using the following Java code:

String source = "0123456789";
byte[] byteArray = source.getBytes("UTF-16");

We get a byte array of 22 bytes, and we are not sure where the extra two bytes come from. How do we get an array of length 20?

Asked by: Freddie279 | Posted: 21-01-2022

Answer 1

Alexander's answer explains why it's there, but not how to get rid of it. You simply need to specify the endianness you want in the encoding name:

String source = "0123456789";
byte[] byteArray = source.getBytes("UTF-16LE"); // Or UTF-16BE
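
To check the difference, here is a minimal sketch comparing the three encodings. It assumes the StandardCharsets constants (Java 7+), which also avoid the checked UnsupportedEncodingException; the class name Utf16Lengths and the helper encodedLength are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Utf16Lengths {
    // Hypothetical helper: number of bytes produced when encoding s with cs.
    static int encodedLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String source = "0123456789";
        // Plain UTF-16: Java's encoder prepends a 2-byte byte order mark.
        System.out.println(encodedLength(source, StandardCharsets.UTF_16));   // 22
        // Explicit endianness: no byte order mark is written.
        System.out.println(encodedLength(source, StandardCharsets.UTF_16LE)); // 20
        System.out.println(encodedLength(source, StandardCharsets.UTF_16BE)); // 20
    }
}
```

Whoever decodes the bytes then has to be told which byte order was used, since the array no longer carries that information itself.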

Answered by: Kimberly590 | Posted: 22-02-2022

Answer 2

Maybe the first two bytes are the byte order mark (BOM). It specifies the order of the bytes in each 16-bit word used in the encoding.

Answered by: Charlie256 | Posted: 22-02-2022

Answer 3

Try printing out the bytes in hex to see where the extra 2 bytes are added - are they at the start or end?

I expect you'll find a byte order mark (0xFEFF) at the start. This allows anyone consuming (receiving) the byte array to recognise whether the encoding is little-endian or big-endian.
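
One way to print the bytes in hex is sketched below. The class name HexDump and the helper toHex are hypothetical names used only for this example; the short input "01" keeps the output readable.

```java
import java.nio.charset.StandardCharsets;

class HexDump {
    // Hypothetical helper: formats each byte as two uppercase hex digits, space-separated.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] byteArray = "01".getBytes(StandardCharsets.UTF_16);
        // The first two bytes are the byte order mark, then two bytes per character.
        System.out.println(toHex(byteArray)); // FE FF 00 30 00 31
    }
}
```

The extra two bytes show up at the start, and the characters that follow are in big-endian order.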

Answered by: Madaline982 | Posted: 22-02-2022

Answer 4

UTF-16 output can start with a byte order mark that tells the reader which byte order the stream is encoded in. As the other users have pointed out:
1st byte is 0xFE
2nd byte is 0xFF
the remaining bytes are the encoded characters of the string (two bytes each for this input)

Answered by: Sam451 | Posted: 22-02-2022

