Entities with large Text attributes or Binary Data

Entities with large Text attributes or Binary Data

I'd like to question the community about this issue. This is purely an architectural question, but nonetheless one that I come up with on most projects.
It is a best practise to separate entities with binaryData and Text attributes bigger than 2000, creating them with just two columns (ID and Text or ID and BinaryData).

Imagine that you have an application that needs to store Header images and Word Files. The Header Images are used on the header of the application itself and Word Files are the CVs of the employees. My question is:

- Would you create 2 entities (one called CVFile and the other called HeaderImage) and used them on their own screens OR would you create only 1 entity called File and use this same entity to store both filetypes on each screen?
- The same question can be applied to large Text attributes like a Text(4000) column.

I'd like your input on this. Can you argue for both sides, please?
Hi Gonçalo,

This is indeed a curious question. The way I'd personally look into it would be to separate the entities in two, yes. The way I look into tables is that each table contains a different "object" in my application. As such, I'd have a CVFile entity and a Header Image entity, for sure - otherwise, if someone in the future wants to maintain the application and evolve it, they would risk using a CV in the header of the application, or showing a header image for a specific employee CV, given that that data consistency would have to be enforced in application logic.

I mean, in my case it all boils down to preference, but if that would be the case you could argue for an extreme case of the TEXT attributes, where all TEXT fields in your database would be stored in a TEXT entity, that only has ID and String attributes - regardless of them being a user's name, address, company name, email content, etc. 

I'm sure that people in the field - unlike me - will have better opinions on this subject, but I wanted to chime in here with my two cents. What's your take on this?


Paulo Tavares
Hi Paulo,

I was hoping to generate some debate over this, since I find advantages and disadvantages on both scenarios.
Advantages of storing all binary content (or large TEXT) in the same entity:
- If you want to find TEXT regardless of its context, it takes a single query to do it;
- Less entities in the data model = less space ocupied on disk;

Disadvantages of storing all binary content (or large TEXT) in the same entity:
- Executing queries and maintenance tasks on that entity will become increasingly slower as time passes;
- No functional distinction between the data that the entity contains;

Can you guys add to these lists?