Java - File size is not incomprehensible

Accname

2D-Graphics enthusiast
Reaction score
1,456
Edit: Dang, what a typo in the title. It is of course: "File size is incomprehensible".

Hi guys.
I want to save an object in Java using serialization.
The class holds a lot of data.
In fact, inside is a two-dimensional array of shorts.
The array is huge: it has 2048 * 2048 entries, each of which is a short (16 bits).

But when I saved it, I sure didn't expect that file size. The file is 88 MB on my hard drive.
That's insane.

So I first checked everything else in the object:
I removed the two-dimensional array, just for testing, and saved again. It's 200 KB now.
But with the array it's 88.something MB.

By my calculation it should be:
2048 * 2048 * 2 bytes (since a short is 2 bytes) / 1024 / 1024 = 8 MB
So how come it's over ten times that amount?
Please, I really need your help here.
 

Artificial

Without Intelligence
Reaction score
326
I could not replicate your problem. Using this code:
[gist]9f859ef5ccc42508b6cd[/gist]
produced a test.dat file of size 8.1 MB:
Code:
/home/felix $ javac a.java && java a && du -h test.dat                             
8.1M	test.dat
Then again, I haven't used Java in a while so I might've made some mistakes in there. Anyhow, maybe you could try to provide a minimal working example of the problem?
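
For anyone who can't open the gist, a test of this kind might look roughly like the sketch below. This is not the gist's actual code: the class name FlatArraySizeTest and the method writeAndMeasure are made up for illustration, and it assumes the array is a plain short[2048][2048] written with ObjectOutputStream.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical test class; not the code from the gist.
class FlatArraySizeTest {

    // Serializes an n x n short array to the given file and returns the file size in bytes.
    static long writeAndMeasure(Path file, int n) throws IOException {
        short[][] data = new short[n][n];
        try (OutputStream raw = Files.newOutputStream(file);
             ObjectOutputStream out = new ObjectOutputStream(raw)) {
            out.writeObject(data);
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws IOException {
        long bytes = writeAndMeasure(Paths.get("test.dat"), 2048);
        System.out.printf("test.dat: %d bytes (%.2f MiB)%n", bytes, bytes / (1024.0 * 1024.0));
    }
}
```

The raw payload alone is 2048 * 2048 * 2 = 8 388 608 bytes, so anything slightly above that is what you'd expect for a true two-dimensional array.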
 

Accname

Thanks for your reply.
Upon further investigation I think I found the cause.

The array I was using was not actually a two-dimensional array but a four-dimensional array with the third and fourth dimensions set to a size of 1.
So it was 2048 * 2048 * 1 * 1 as a four-dimensional array.
It seems the overhead produced by this construct is quite large.
Changing the array to a two-dimensional 2048 * 2048 array resulted in a file size of 8.3 MB.
With 2048 * 2048 * 1 * 1 it was 88.6 MB, though.

I can understand that there is a certain overhead, but I can't explain why the file would be more than ten times the size.

I will probably change the array to be two-dimensional and emulate the third and fourth dimensions by extending either the width or the height as required.
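
The emulation idea could be sketched roughly like this, assuming the extra dimensions get folded into the second index. ShortGrid4D and its methods are hypothetical names for illustration, not code from the actual project.

```java
import java.io.Serializable;

// Hypothetical helper: stores a width x height x depth3 x depth4 grid of shorts
// in a 2D array by widening the second index, so only `width` inner arrays exist.
class ShortGrid4D implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int height;
    private final int depth3;
    private final int depth4;
    private final short[][] cells; // cells[x][(y * depth3 + z) * depth4 + w]

    ShortGrid4D(int width, int height, int depth3, int depth4) {
        this.height = height;
        this.depth3 = depth3;
        this.depth4 = depth4;
        this.cells = new short[width][height * depth3 * depth4];
    }

    // Maps (y, z, w) to a column in the widened second dimension.
    private int column(int y, int z, int w) {
        if (y < 0 || y >= height || z < 0 || z >= depth3 || w < 0 || w >= depth4) {
            throw new IndexOutOfBoundsException("(" + y + ", " + z + ", " + w + ")");
        }
        return (y * depth3 + z) * depth4 + w;
    }

    short get(int x, int y, int z, int w) {
        return cells[x][column(y, z, w)];
    }

    void set(int x, int y, int z, int w, short value) {
        cells[x][column(y, z, w)] = value;
    }
}
```

With depth3 and depth4 both set to 1 this degenerates to a plain width x height array, and growing either depth only makes the inner arrays longer instead of creating more of them.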
 

Artificial

Ah, that does indeed explain it. Java's serialization stores more information about objects than just their contents. For arrays, for example, it saves at least the object's type (1 byte), size (integer, 4 bytes), and class description (handle, integer, 4 bytes) for every array instance, and in a 2048 * 2048 * 1 * 1 array each short sits inside two one-element arrays of its own. That means the size it takes is
Code:
2048*2048*(1+4+4+1+4+4+2)+2048*(1+4+4) bytes = 83 904 512 bytes
so about 83.9 MB. Storing also the type of the value (short) using 1 byte brings it up to about 88 MB. If you want to learn more about the way the data is saved, I'd recommend checking out the serialization protocol (the grammar is especially useful if you understand it).
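
If you'd rather measure the per-array overhead than derive it, a rough sketch like the following serializes both layouts into memory and prints the bytes used per short. SerializedSizeDemo and serializedSize are made-up names, and n is kept at 256 here only to make the run quick.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Hypothetical measurement helper; names are made up for this sketch.
class SerializedSizeDemo {

    // Returns the number of bytes ObjectOutputStream produces for the given object.
    static long serializedSize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.size();
    }

    public static void main(String[] args) throws IOException {
        int n = 256; // smaller than 2048 to keep the run quick
        long flat = serializedSize(new short[n][n]);
        long nested = serializedSize(new short[n][n][1][1]);
        long shorts = (long) n * n;
        System.out.printf("flat:   %d bytes, %.2f bytes per short%n", flat, (double) flat / shorts);
        System.out.printf("nested: %d bytes, %.2f bytes per short%n", nested, (double) nested / shorts);
    }
}
```

If I read the stream grammar right, each array instance costs a tag byte, a 5-byte back-reference to its class description, and a 4-byte length, which would put the nested layout at roughly 22 bytes per short. Treat that as an estimate and go by the printed numbers.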
 