Security material incorrectly revoked when an application attempt fails but more attempts remain

Description

When an application finishes or fails, we must revoke all security material (X.509/JWT) issued for it. If the application is RUNNING and an attempt fails, the AMLauncher#cleanup method is called, which also revokes the security material.

When YARN is configured to allow more than one application attempt, the application is rescheduled after a failed attempt. New security material is not issued for the new attempt, but the existing material was already revoked when the first attempt failed. From that moment on, the application is doomed to fail because its X.509 certificate or JWT is invalid.
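One possible direction is to revoke on attempt failure only when no attempts remain. The sketch below is a standalone illustration of that rule, not the actual Hops/YARN code: the class RevocationSketch, its methods, and the in-memory set standing in for issued X.509/JWT material are all hypothetical names introduced for this example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, self-contained model of the revocation decision.
// It does not use any real YARN or Hops classes.
class RevocationSketch {
    // Stand-in for the security material (X.509/JWT) currently valid, keyed by application id.
    private final Set<String> validMaterial = new HashSet<>();
    // Assumed to mirror the configured maximum number of application attempts.
    private final int maxAttempts;

    RevocationSketch(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    // Material is issued once per application, not once per attempt.
    void issue(String appId) {
        validMaterial.add(appId);
    }

    boolean isValid(String appId) {
        return validMaterial.contains(appId);
    }

    // Called when the attempt with the given 1-based number fails.
    // Revoke only if this was the last allowed attempt; otherwise the
    // rescheduled attempt still needs the same material.
    void onAttemptFailed(String appId, int attemptNumber) {
        if (attemptNumber >= maxAttempts) {
            validMaterial.remove(appId);
        }
    }

    // Called when the application as a whole reaches a final state.
    void onApplicationFinished(String appId) {
        validMaterial.remove(appId);
    }
}
```

With maxAttempts set to 2, a failure of attempt 1 leaves the material valid for the rescheduled attempt, while a failure of attempt 2 revokes it.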

Status

Assignee

Antonis Kouzoupis

Reporter

Antonis Kouzoupis

Labels

None

Fix versions

Affects versions

Priority

High