Java - File size is not incomprehensible

Accname

2D-Graphics enthusiast
Reaction score
1,462
Edit: Dang, what a typo in the title. It is of course: "File size is incomprehensible".

Hi guys.
I want to save a class in Java using serialization.
This class holds a lot of data.
In fact, inside is a two-dimensional array of shorts.
The array is huge: it has 2048 * 2048 entries, each of which is a short (16 bits).

But when I saved it, I sure didn't expect that file size. The file is 88 MB on my hard drive.
That's insane.

So I first checked everything else in the object:
I removed the two-dimensional array, just for testing, and saved again. It's 200 KB now.
But with the array it's 88.something MB.

By my calculation it should be:
((2048 * 2048) * 2 bytes per short) / 1024 / 1024 == 8 MB
So how come it's over ten times that amount?
Please, I really need your help here.
 

Artificial

Without Intelligence
Reaction score
326
I could not replicate your problem. Using this code:
[gist]9f859ef5ccc42508b6cd[/gist]
produced a test.dat file of size 8.1 MB:
Code:
/home/felix $ javac a.java && java a && du -h test.dat
8.1M	test.dat

Then again, I haven't used Java in a while so I might've made some mistakes in there. Anyhow, maybe you could try to provide a minimal working example of the problem?
 

Accname

2D-Graphics enthusiast
Reaction score
1,462
Thanks for your reply.
Upon further investigation I think I found the cause.

The array I was using was not actually a two-dimensional array but a four-dimensional array with the third and fourth dimensions set to a size of 1.
So it was 2048 * 2048 * 1 * 1 as a four-dimensional array.
It seems the overhead produced by this construct is quite large.
Changing the array to a two-dimensional 2048 * 2048 array resulted in a file size of 8.3 MB.
With 2048 * 2048 * 1 * 1 it was 88.6 MB, though.

I can understand that there is a certain overhead, but I can't explain why it would be 1100%.

I will probably change the array to be two-dimensional and emulate the third and fourth dimensions by extending either the width or the height as required.
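
A minimal sketch of that idea, assuming the third and fourth dimensions are folded into the width. The class name FlatGrid, its fields, and the get/set methods are hypothetical and only illustrate the index arithmetic; they are not taken from the actual project.

```java
import java.io.Serializable;

/**
 * Hypothetical wrapper: exposes a width x height x d3 x d4 grid,
 * but stores it as a plain short[height][width * d3 * d4].
 */
final class FlatGrid implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int d3;
    private final int d4;
    private final short[][] cells;

    FlatGrid(int width, int height, int d3, int d4) {
        this.d3 = d3;
        this.d4 = d4;
        // One row per y; the three remaining indices share a single column index.
        this.cells = new short[height][width * d3 * d4];
    }

    /** Maps (x, k, l) to a column inside one row. */
    private int column(int x, int k, int l) {
        return (x * d3 + k) * d4 + l;
    }

    short get(int x, int y, int k, int l) {
        return cells[y][column(x, k, l)];
    }

    void set(int x, int y, int k, int l, short value) {
        cells[y][column(x, k, l)] = value;
    }
}
```

With d3 and d4 both set to 1 this behaves like the plain 2048 * 2048 array, and growing either dimension later only widens the rows instead of adding more nested array objects.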
 

Artificial

Without Intelligence
Reaction score
326
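Ah, that does indeed explain it. Java's serialization stores more information about the objects than just their contents. For example, for every array it saves at least the object's type (1 byte), size (integer, 4 bytes), and class description (handle, integer, 4 bytes). In a 2048 * 2048 * 1 * 1 array, each short sits inside its own one-element array, which in turn sits inside another one-element array, so every 2-byte value carries two of those array headers, and each of the 2048 rows adds one more. That means the size it takes is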
Code:
2048*2048*(1+4+4+1+4+4+2)+2048*(1+4+4) bytes = 83 904 512 bytes
so about 83.9 MB. Storing also the type of the value (short) using 1 byte brings it up to about 88 MB. If you want to learn more about the way the data is saved, I'd recommend checking out the serialization protocol (the grammar is especially useful if you understand it).
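
To see the effect without writing large files, here is a rough sketch that serializes both array shapes into memory and compares the byte counts. The class SerializedSizeDemo and the helper serializedSize are made-up names for illustration, and the side length of 256 is an assumption chosen only to keep the run quick; only standard java.io classes are used.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class SerializedSizeDemo {

    /** Serializes obj into an in-memory buffer and returns the number of bytes written. */
    static int serializedSize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        int n = 256; // assumption: smaller than 2048 so the demo runs quickly

        int flat = serializedSize(new short[n][n]);
        int nested = serializedSize(new short[n][n][1][1]);

        System.out.println("short[n][n]:       " + flat + " bytes");
        System.out.println("short[n][n][1][1]: " + nested + " bytes");
        System.out.println("bytes per value (nested): " + (double) nested / (n * n));
    }
}
```

The nested variant should come out many times larger than the flat one, since every value is wrapped in two extra array objects that each get their own header in the stream.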
 