How to denormalize a heavily normalized database system?

How to denormalize a heavily normalized database system?

等你爱我 发布于 2021-11-27 字数 651 浏览 654 回复 5 原文

I'm looking to introduce some database denormalization to a heavily normalized system.

I have a large set of databases that have evolved over ten years and are under increasing amounts of load so I'm looking to improve PERFORMANCE and possibly reduce the complexity of some queries.

It is not uncommon to do 10 joins to accomplish any given task in a stored procedure. I've been told that more then 6 stinks.

Should I keep the table structure as is and provide some materialized views or denormalized "cache" tables.

Some advice on best-practices or a push in the right direction would help.


如果你对这篇文章有疑问,欢迎到本站 社区 发帖提问或使用手Q扫描下方二维码加群参与讨论,获取更多帮助。



需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。


雨巷深深 2022-06-07 5 楼

I would agree with Elijah. Be sure you benchmark anything you are about to change.

Also, depending on your setup indexed views may be an option.

乖乖 2022-06-07 4 楼

I have to agree, 10 joins is evil and will kill your performance.

Much depends on how your applications interact with the databases. If your code sticks strictly to stored procedures (no direct SQL calls) then your life will be easier.

If you are going to the trouble of denormalization, I would not add new "cache" tables. This is really just patching the problem. I would go ahead and formulate a complete plan to denormalize the database with a new optimized schema.

玩物 2022-06-07 3 楼

try to index heavily and wisely.
try to use indexed views.
Try precompiled stored procedures.
if this fails, remember that denormalizing and caching often require heavy synchronization works, so you should carefully look to each case before doing it.

少女净妖师 2022-06-07 2 楼

Some things to investigate

  • Benchmark all query run times - give yourself a metric to compare against.
  • Investigate indexing is done properly.
  • Read up on table partitioning.
  • Explore sharding as an option.
  • Look at your joins closely. Are you always joining the same tables together? If the answer isn't so obvious, you still can create logical divisions (like your denormalized tables) using views.
野味少女 2022-06-07 1 楼

You don't say what the problem is. Is it performance? If so, on what tables?

Is it really the joins that are causing the problem? Or is it the stored procedures? You don't know (or at least, you don't say).

Best practice: figure out where your bottlenecks are first, before trying to solve a problem you haven't yet diagnosed.

On edit: I'm reminded of a time when on a job when we had performance problem. very slow stored procs, that could take minutes to complete. It turned out that these sps were doing utterly normal table updates, but using cursors. For stuff as simple as update t set c = c + 1 where id = n.

So to do an update, we were cursoring through a bunch of rows with a costly update cursor and doing an declare cursor for "select c from t where id = n" for update; then an open cursor and a read and an error check and a loop with a read and error check and then select c into @c; @c = c + 1; update t set c = @c where current of cursor; for each and every update.

Turned out the guy who wrote this didn't realize that we could just issue an update statement. And so he'd written dozens of these slow stored procs. We didn't even need to get rid of the stored procs (though that would have gained us some speed too, it would have changed our client); we just got rid of the cursors, replacing them with update statements. Performance problem gone.