Some systems are experiencing issues

Past Incidents

Thursday 13th June 2019

No incidents reported

Wednesday 12th June 2019

cleverapps.io domains cleverapps.io domains are timing out

Some cleverapps.io domains are experiencing timeouts; we are investigating the issue.

EDIT 18:41 UTC: The problem has been fixed for the past couple of minutes. We have gathered information about why it happened and will work to narrow down the root cause.
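
If you want to check from your side whether a particular cleverapps.io domain is affected, a small probe like the sketch below can help. It is a minimal Python example; example.cleverapps.io is a placeholder and the 5-second budget is an arbitrary assumption, not an official threshold.

    # Minimal reachability probe for a cleverapps.io domain.
    # example.cleverapps.io is a placeholder; the 5 s timeout is an assumption.
    import urllib.error
    import urllib.request

    def probe(url, timeout=5.0):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status  # e.g. 200 when the app answers in time
        except (urllib.error.URLError, TimeoutError) as exc:
            return f"failed: {exc}"  # timeouts and connection errors land here

    print(probe("https://example.cleverapps.io/"))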

Tuesday 11th June 2019

Deployments False positives, automatic redeployments and deployment queue

A human error triggered a lot of false positives regarding application statuses. This in turn queued hundreds of automatic deployments.

The issue is now fixed, but deployments will take a little while longer to start until the queue is consumed.

EDIT: Incident over at 09:40 UTC
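
As a rough illustration of why queued deployments started late (all numbers below are made up for the example, not taken from the incident): with a first-in-first-out queue drained by a fixed pool of workers, the wait grows linearly with the queue length.

    # Back-of-the-envelope wait estimate for a FIFO deployment queue.
    # All numbers are hypothetical and only illustrate the linear growth.
    queued = 300        # "hundreds of automatic deployments"
    workers = 20        # assumed number of parallel deployment slots
    avg_minutes = 2.0   # assumed average duration of one deployment

    # The last queued deployment starts once everything ahead of it has
    # been processed, i.e. after about queued / workers batches.
    wait = (queued / workers) * avg_minutes
    print(f"last deployment starts after ~{wait:.0f} minutes")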

Cellar A cellar node restarted, timeouts or HTTP 500 errors sent

A node from the old Cellar cluster restarted at 21:30 UTC. While it initially behaved well thanks to the restart of a few nodes a few days ago, it then started emitting HTTP 500 errors or timeouts, as it did before. Service should be back online in a few hours, once the cluster has stabilized again.

The new Cellar cluster is not impacted by those issues.

EDIT 23:40 UTC: The cluster now seems to be in good shape again.
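
Cellar is S3-compatible, so while a cluster misbehaves like this, clients can often ride out transient HTTP 500s and timeouts with retries and exponential backoff. The sketch below is a generic pattern, not Clever Cloud tooling; do_request and TransientError are hypothetical stand-ins for your own request function and your own detection of retryable failures.

    # Generic retry-with-backoff wrapper for transient failures (HTTP 500s,
    # timeouts). do_request and TransientError are hypothetical stand-ins.
    import random
    import time

    class TransientError(Exception):
        """Marker for a failure worth retrying (e.g. a 500 or a timeout)."""

    def with_retries(do_request, attempts=5, base_delay=0.5):
        for attempt in range(attempts):
            try:
                return do_request()
            except TransientError:
                if attempt == attempts - 1:
                    raise  # give up after the last attempt
                # Exponential backoff plus jitter, so clients do not all
                # hammer the recovering cluster at the same moment.
                time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))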

Monday 10th June 2019

Deployments Deployments keep looping through the build phase

Some deployments seem to keep building: even when a build succeeds, another build starts. We are looking into it.

EDIT 12:33 UTC: We may have identified the root cause. It may be due to a change that happened this morning. We will revert it.

EDIT 12:43 UTC: The change has been reverted and we confirm that it resolves the issue. Sorry for the inconvenience.

Sunday 9th June 2019

Cellar A cellar node is restarting

One node of our old Cellar cluster is restarting; some requests are failing (timeouts or HTTP 500 errors). This will be resolved once the node has fully restarted. We may need to restart more nodes right after.

EDIT 23:30 UTC: Other nodes need to be restarted. We saw <1% of requests fail; expect a similar rate for the remaining restarts.

EDIT 02:00 UTC: The nodes have been restarted; the share of failing requests keeps decreasing and is still under 1%.

Saturday 8th June 2019

No incidents reported

Friday 7th June 2019

API API Unavailable or very slow

Our main API is currently having trouble responding to requests in a timely manner. We are investigating.

EDIT 15:32 UTC: The issue has been identified; we are currently redeploying the API. The Console is still unavailable.

EDIT 15:34 UTC: The API has been successfully redeployed and is now available. The Console is available again too. The incident is over.

Logs system part restart

We will restart a part of our logs system; it will take about 2 minutes. After this interruption, the logs produced during the restart will still be available, but their ordering will be lost. This restart is part of our new logs system development.

EDIT 14:27 UTC: Finished.
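
Since the lines produced during the restart remain available and only their ordering is lost, consumers that need strict ordering can re-sort the affected window by timestamp on their side. A minimal sketch, assuming each line starts with an ISO 8601 timestamp (an assumption about the line format, not a documented guarantee):

    # Re-sort a window of log lines whose ordering was lost, assuming each
    # line starts with an ISO 8601 timestamp (the format is an assumption).
    from datetime import datetime

    def reorder(lines):
        # Parse the leading timestamp of each line and sort on it.
        return sorted(lines, key=lambda line: datetime.fromisoformat(line.split(" ", 1)[0]))

    shuffled = [
        "2019-06-07T14:26:03 second event",
        "2019-06-07T14:26:01 first event",
    ]
    for line in reorder(shuffled):
        print(line)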