What is a crashloop?

VietMX Staff asked 3 years ago

A crashloop is when a process crashes and is restarted by a watchdog daemon, over and over, indefinitely.

That is, the history is:

  • Process starts at time T.
  • Process crashes at time T+1.
  • Watchdog daemon restarts process.
  • Process starts at time T+2.
  • Process crashes at time T+3.
  • Watchdog daemon restarts process.
  • Process starts…etc.
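
The history above can be reproduced with a small simulation. This is only a sketch: `run_watchdog`, `always_crashes`, and the `max_restarts` cap are hypothetical names invented for illustration, and a real watchdog would supervise an actual OS process and keep looping rather than give up.

```python
from typing import Callable, List


def always_crashes() -> None:
    """Hypothetical task that fails every time it is started."""
    raise RuntimeError("simulated crash")


def run_watchdog(task: Callable[[], None], max_restarts: int) -> List[str]:
    """Start `task` and restart it whenever it crashes.

    A real watchdog would loop indefinitely; `max_restarts` is an
    assumption added here so the sketch terminates.
    """
    history: List[str] = []
    t = 0  # logical clock standing in for T, T+1, ...
    restarts = 0
    while True:
        history.append(f"T+{t}: process starts")
        t += 1
        try:
            task()
        except Exception:
            history.append(f"T+{t}: process crashes")
            t += 1
        else:
            history.append(f"T+{t}: process exits cleanly")
            return history
        if restarts == max_restarts:
            history.append("watchdog gives up (sketch only)")
            return history
        restarts += 1
        history.append("watchdog restarts process")


if __name__ == "__main__":
    for event in run_watchdog(always_crashes, max_restarts=2):
        print(event)
```

With a task that never succeeds, the printed events alternate between start, crash, and restart, which is the crashloop pattern.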

In general, in distributed computing, if you want something to eventually succeed, you have to write down your intent for it to be completed, and you need a worker that loops continually, acting on that intent until it is fulfilled. This is “at least once delivery” of a work item.
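
A minimal sketch of that pattern, assuming the intents are kept in an in-memory dictionary (a real system would persist them) and that `work_until_done`, `make_flaky`, and `max_passes` are hypothetical names chosen for this example:

```python
from typing import Callable, Dict


def make_flaky(failures_before_success: int) -> Callable[[str], bool]:
    """Build a hypothetical action that fails a fixed number of times per item."""
    calls: Dict[str, int] = {}

    def attempt(name: str) -> bool:
        calls[name] = calls.get(name, 0) + 1
        return calls[name] > failures_before_success

    return attempt


def work_until_done(
    intents: Dict[str, bool],
    attempt: Callable[[str], bool],
    max_passes: int = 10,
) -> int:
    """Retry every pending intent until all are done; return attempts made.

    `max_passes` is an assumption so the sketch terminates; a real worker
    would keep looping.
    """
    attempts = 0
    for _ in range(max_passes):
        pending = [name for name, done in intents.items() if not done]
        if not pending:
            break
        for name in pending:
            attempts += 1
            if attempt(name):
                intents[name] = True  # record that the intent is fulfilled
    return attempts


if __name__ == "__main__":
    intents = {"send-report": False}  # the written-down intent
    print(work_until_done(intents, make_flaky(2)), intents)
```

Because the worker retries until the intent is marked done, a work item may be attempted more than once, so the action should be safe to repeat.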

Here, the intent is that the task runs, and the watchdog is the worker running the loop that constantly tries to make sure the task is running. This is why a task is restarted when it crashes. When the task crashes again after every restart, the two together produce a crashloop.