Suppose you are "lucky" enough to have a Java application that interfaces with a C/C++ application (e.g. some kind of server), which sends you TCP/UDP network messages that you need to parse. An example of such a C/C++ structure is shown below:
enum Gender { MALE, FEMALE };
struct msg {
#ifdef INTEL_STYLE
    UCHAR spare4:4;
    UCHAR octal:3;
    UCHAR bool:1;
#else
    UCHAR bool:1;
    UCHAR octal:3;
    UCHAR spare4:4;
#endif
    UINT uint;
    char str[5];
    float flt;
    enum Gender gender;
};
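If your C/C++ application runs on an Intel (x86) based machine, you receive multi-byte values in little-endian byte order; otherwise (e.g. on SPARC machines) you receive them in big-endian order. Compilers for the two kinds of machine also typically lay out bit fields within a byte in opposite order, which is what the INTEL_STYLE switch above compensates for. Note that Java is big-endian, too: big-endian is the default byte order of java.nio.ByteBuffer. In the following we assume a big-endian architecture.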
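To make the byte-order point concrete, here is a minimal sketch that reads the same four bytes with both orders using only java.nio. The class and method names (ByteOrderDemo, readInt) are made up for illustration. If your sender is little-endian, Javolution's Struct is documented to have an overridable byteOrder() method for this purpose, but please check that against the version you use.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; not part of the original article or of Javolution.
final class ByteOrderDemo {

    // Reads the first four bytes as a 32-bit integer using the given byte order.
    static int readInt(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] wire = { 0x00, 0x00, 0x00, 0x02 };
        // A freshly wrapped ByteBuffer is big-endian unless told otherwise.
        System.out.println(ByteBuffer.wrap(wire).order());           // BIG_ENDIAN
        System.out.println(readInt(wire, ByteOrder.BIG_ENDIAN));     // 2
        System.out.println(readInt(wire, ByteOrder.LITTLE_ENDIAN));  // 33554432 (0x02000000)
    }
}
```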
In this blog entry we are going to see how you can parse such a message in your receiving Java application.
What will you need?
- the javolution library to parse the C/C++ struct in Java
- a calculator that handles binary, hexadecimal and decimal numbers (Windows, Linux and Mac OS X already provide such calculators; however, they don't convert floating-point numbers, so an online converter will prove useful, too).
The following table shows how the C/C++ data types correspond to Javolution Struct members.

| C | Java (Javolution Struct) |
|---|---|
| UCHAR | Unsigned8 |
| UWORD | Unsigned16 |
| UINT | Unsigned32 |
| byte | Signed8 |
| short | Signed16 |
| int | Signed32 |
| long | Signed64 |
| long long | Signed64 |
| float | Float32 |
| double | Float64 |
| pointer | Reference32 |
| char[] | UTF8String |
| enum | Enum32 |
Let's get started.
The following Java class represents the above C/C++ struct in Java:
import java.nio.ByteBuffer;

public class Message extends javolution.io.Struct {
    // The three bit fields share a single byte (1 + 3 + 4 bits).
    private final Unsigned8 bool = new Unsigned8(1);
    private final Unsigned8 octal = new Unsigned8(3);
    private final Unsigned8 spare4 = new Unsigned8(4);
    private final Unsigned32 uint = new Unsigned32();
    private final UTF8String str = new UTF8String(5);
    private final Float32 flt = new Float32();
    // Typed with Gender so that get() returns a Gender rather than a raw Enum.
    private final Enum32<Gender> gender = new Enum32<Gender>(Gender.values());

    public Message(byte[] b) {
        this.setByteBuffer(ByteBuffer.wrap(b), 0);
    }

    public boolean getBool() {
        return bool.get() != 0;
    }

    public int getOctal() {
        return octal.get();
    }

    public long getUInt() {
        return uint.get();
    }

    public String getStr() {
        return str.get();
    }

    public float getFlt() {
        return flt.get();
    }

    public Gender getGender() {
        return gender.get();
    }
}

enum Gender { MALE, FEMALE };
Our Message class corresponds to the C msg struct. It extends javolution.io.Struct, which reads its fields from a backing java.nio.ByteBuffer (set here via setByteBuffer()). A crash course on Java's ByteBuffer provides useful background information.
The C/C++ struct starts with a UCHAR, which corresponds to Unsigned8, i.e. one byte. The numbers after the colons (:) denote how many bits inside that byte each bit field occupies. Thus, octal:3 means that 3 bits represent the octal field; in Javolution this is written as Unsigned8(3).
UINT is represented by Unsigned32 in Javolution, which is 4 bytes long.
The string char[5] is represented by UTF8String(5), and float by Float32.
Finally, the enum is represented by Enum32.
Let's create a unit test to test the above:
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

public class MessageTest {
    private Message msg;

    @Before
    public void setUp() {
        byte[] bb = new byte[] {
            (byte) 0x90, // 1001 0000
            (byte) 0x00, (byte) 0x00, (byte) 0x00, // alignment with previous!
            (byte) 0x00, (byte) 0x00, (byte) 0x00, (byte) 0x02, // uint
            (byte) 0x48, (byte) 0x41, (byte) 0x4C, (byte) 0x4C, (byte) 0x4F, // str
            (byte) 0x00, (byte) 0x00, (byte) 0x00, // alignment with previous!
            (byte) 0x3F, (byte) 0xC0, (byte) 0x00, (byte) 0x00, // flt
            (byte) 0x00, (byte) 0x00, (byte) 0x00, (byte) 0x01, // gender
        };
        msg = new Message(bb);
    }

    @After
    public void tearDown() {
    }

    @Test
    public void testMessage() {
        assertTrue(msg.getBool()); // 1 = true
        assertEquals(1, msg.getOctal()); // 001
        assertEquals(2, msg.getUInt());
        assertEquals("HALLO", msg.getStr());
        assertEquals(1.5, msg.getFlt(), 0.0);
        assertEquals(Gender.FEMALE, msg.getGender());
    }
}
The first byte 0x90 corresponds to the binary value 1001 0000. The first bit (1) represents bool:1, the next three (001) octal:3, and the last four (0000) spare4:4.
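The same decomposition can be checked by hand with shifts and masks. This is only a sketch of what the bit-field members amount to for the big-endian layout assumed here; BitFieldDemo and its methods are hypothetical names, not Javolution API.

```java
// Hypothetical demo class: extracts bool:1, octal:3 and spare4:4 from one byte,
// assuming bool occupies the most significant bit (the non-INTEL_STYLE layout).
final class BitFieldDemo {

    // Top bit of the byte.
    static boolean bool(byte b) {
        return ((b >> 7) & 0x01) != 0;
    }

    // The three bits below the top bit.
    static int octal(byte b) {
        return (b >> 4) & 0x07;
    }

    // The four least significant bits.
    static int spare(byte b) {
        return b & 0x0F;
    }

    public static void main(String[] args) {
        byte first = (byte) 0x90; // 1001 0000
        System.out.println(bool(first));  // true
        System.out.println(octal(first)); // 1
        System.out.println(spare(first)); // 0
    }
}
```

The masks matter because a Java byte is signed: shifting (byte) 0x90 to the right sign-extends it, and the `& 0x01` / `& 0x07` discard the extra bits.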
Be careful with the alignment! The bit-field byte is followed by 3 bytes of padding, so the next field (uint) starts at the 5th byte and not at the 2nd, as you might have expected.
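Where the padding comes from can be worked out with a few lines of arithmetic. The sketch below assumes a typical ABI in which 4-byte fields are aligned to 4-byte boundaries and bytes/chars to 1; AlignmentDemo is a hypothetical helper, not how Javolution computes its layout internally.

```java
import java.util.Arrays;

// Hypothetical helper: computes the start offset of each field from its size
// and alignment requirement (both in bytes). Trailing struct padding is not
// modelled, which does not matter for this message.
final class AlignmentDemo {

    // Rounds offset up to the next multiple of alignment.
    static int alignUp(int offset, int alignment) {
        return (offset + alignment - 1) / alignment * alignment;
    }

    // Returns the offset of every field, followed by the total size.
    static int[] offsets(int[] sizes, int[] alignments) {
        int[] result = new int[sizes.length + 1];
        int offset = 0;
        for (int i = 0; i < sizes.length; i++) {
            offset = alignUp(offset, alignments[i]);
            result[i] = offset;
            offset += sizes[i];
        }
        result[sizes.length] = offset;
        return result;
    }

    public static void main(String[] args) {
        // Fields: bit-field byte, uint, char[5], float, enum
        int[] sizes      = { 1, 4, 5, 4, 4 };
        int[] alignments = { 1, 4, 1, 4, 4 }; // assumed alignment rules
        System.out.println(Arrays.toString(offsets(sizes, alignments)));
        // [0, 4, 8, 16, 20, 24]
    }
}
```

The uint lands at offset 4 (the 5th byte), and the whole message is 24 bytes long. Passing an alignment of 1 for every field gives the layout with no padding at all.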
The next 4 bytes correspond to the uint. The next 5 ASCII characters correspond to the string "HALLO". Then come another 3 bytes of padding, followed by the float field. The last 4 bytes represent the Gender enum, which contains the value 1, i.e. Gender.FEMALE.
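If you want to double-check the test data without Javolution (or without a hex calculator), the remaining fields can be read straight from a java.nio.ByteBuffer at the offsets described above. This is a cross-check sketch only; ManualParseDemo and describe() are hypothetical names, and the offsets 4, 8, 16 and 20 are the padded ones from this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical cross-check: decodes the padded, big-endian test message by hand.
final class ManualParseDemo {

    static final byte[] PADDED = {
        (byte) 0x90, 0, 0, 0,          // bit fields + 3 padding bytes
        0, 0, 0, 2,                    // uint
        0x48, 0x41, 0x4C, 0x4C, 0x4F,  // str
        0, 0, 0,                       // 3 padding bytes
        0x3F, (byte) 0xC0, 0, 0,       // flt
        0, 0, 0, 1                     // gender
    };

    // Returns "uint str flt gender" decoded from the fixed offsets.
    static String describe(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);  // big-endian by default
        long uint = buf.getInt(4) & 0xFFFFFFFFL;  // widen to keep it unsigned
        String str = new String(bytes, 8, 5, StandardCharsets.US_ASCII);
        float flt = buf.getFloat(16);             // IEEE 754 single precision
        int gender = buf.getInt(20);              // enum ordinal
        return uint + " " + str + " " + flt + " " + gender;
    }

    public static void main(String[] args) {
        System.out.println(describe(PADDED)); // 2 HALLO 1.5 1
    }
}
```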
Packed
However, your data might be packed, i.e. no alignment padding is inserted between the fields. To parse such data, override the isPacked() method of javolution.io.Struct:
public class Message extends javolution.io.Struct {
    ...
    @Override
    public boolean isPacked() {
        return true;
    }
    ...
}
Now your test case data should contain no padding in order to pass:
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

public class MessageTest {
    private Message msg;

    @Before
    public void setUp() {
        byte[] bb = new byte[] {
            (byte) 0x90, // 1001 0000
            // (byte) 0x00, (byte) 0x00, (byte) 0x00, // alignment with previous!
            (byte) 0x00, (byte) 0x00, (byte) 0x00, (byte) 0x02, // uint
            (byte) 0x48, (byte) 0x41, (byte) 0x4C, (byte) 0x4C, (byte) 0x4F, // str
            // (byte) 0x00, (byte) 0x00, (byte) 0x00, // alignment with previous!
            (byte) 0x3F, (byte) 0xC0, (byte) 0x00, (byte) 0x00, // flt
            (byte) 0x00, (byte) 0x00, (byte) 0x00, (byte) 0x01, // gender
        };
        msg = new Message(bb);
    }

    @After
    public void tearDown() {
    }

    @Test
    public void testMessage() {
        assertTrue(msg.getBool()); // 1 = true
        assertEquals(1, msg.getOctal()); // 001
        assertEquals(2, msg.getUInt());
        assertEquals("HALLO", msg.getStr());
        assertEquals(1.5, msg.getFlt(), 0.0);
        assertEquals(Gender.FEMALE, msg.getGender());
    }
}
Conclusion
This concludes what you need to know to parse a C/C++ struct in Java. However, keep in mind the following gotchas of Javolution:
- All Structs should be declared final.
- Javolution doesn't support nested structs; you need to flatten your C/C++ structs in Java. E.g.
struct Identification {
    byte b;
    long l;
};
struct msg {
    struct Identification id;
    int i;
};
should be represented by:
public class Message extends javolution.io.Struct {
    private final Signed8 b = new Signed8();
    private final Signed64 l = new Signed64();
    private final Signed32 i = new Signed32();
    ...
}
- A Javolution array can only accept members of type Struct.Member. The following will not work:
private final Reference32[] refs = array(new Reference32[2]);
and you need to replace it by:
private final Signed32[] refs = array(new Signed32[2]);
The following won't work either:
private final AStruct[] aStruct = array(new AStruct[2]);
Happy parsing!
(You may wish to write a parser that automatically reads the C/C++ source file and generates a Java javolution.io.Struct class based on the above mappings. If you do, please let me know.)
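As a possible starting point for such a generator, the mapping table from the beginning of this entry can be turned into a lookup that emits one member declaration per simple C field. This is a rough sketch with made-up names (MemberGenerator, toMember); it only handles plain "type name;" declarations and ignores bit fields, arrays, multi-word types and nested structs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a code generator based on the C-to-Javolution table.
final class MemberGenerator {

    static final Map<String, String> TYPES = new HashMap<>();
    static {
        TYPES.put("UCHAR", "Unsigned8");
        TYPES.put("UWORD", "Unsigned16");
        TYPES.put("UINT", "Unsigned32");
        TYPES.put("byte", "Signed8");
        TYPES.put("short", "Signed16");
        TYPES.put("int", "Signed32");
        TYPES.put("long", "Signed64");
        TYPES.put("float", "Float32");
        TYPES.put("double", "Float64");
    }

    // Turns e.g. "UINT uint;" into a Javolution member declaration line.
    static String toMember(String cDeclaration) {
        String[] parts = cDeclaration.trim().replace(";", "").split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Unsupported declaration: " + cDeclaration);
        }
        String javaType = TYPES.get(parts[0]);
        if (javaType == null) {
            throw new IllegalArgumentException("Unknown C type: " + parts[0]);
        }
        return "private final " + javaType + " " + parts[1] + " = new " + javaType + "();";
    }

    public static void main(String[] args) {
        System.out.println(toMember("UINT uint;"));
        // private final Unsigned32 uint = new Unsigned32();
        System.out.println(toMember("float flt;"));
        // private final Float32 flt = new Float32();
    }
}
```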