Should I normalize my data before storing it in MongoDB?

Technology CommunityCategory: MongoDBShould I normalize my data before storing it in MongoDB?
VietMX Staff asked 3 years ago

It depends from your goals. Normalization will provide an update efficient data representation. Denormalization will make data reading efficient.

In general, use embedded data models (denormalization) when:

  • you have “contains” relationships between entities.
  • you have one-to-many relationships between entities. In these relationships the “many” or child documents always appear with or are viewed in the context of the “one” or parent documents.

In general, use normalized data models:

  • when embedding would result in duplication of data but would not provide sufficient read performance advantages to outweigh the implications of the duplication.
  • to represent more complex many-to-many relationships.
  • to model large hierarchical data sets.

Also normalizing your data like you would with a relational database is usually not a good idea in MongoDB. Normalization in relational databases is only feasible under the premise that JOINs between tables are relatively cheap. The $lookup aggregation operator provides some limited JOIN functionality, but it doesn’t work with sharded collections. So joins often need to be emulated by the application through multiple subsequent database queries, which is very slow (see question MongoDB and JOINs for more information).