NPE when RM is recovering with Yarn JWT enabled and app is running

Description

Yarn is configured with JWT support disabled and an application is running. We then enable JWT support on Yarn and restart the RM. During recovery, the RM tries to register the running application with the RMAppSecurityManager so that its JWT can be renewed.

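Because the application was started while JWT support was disabled, no JWT is associated with it. Computing the renewal delay in JWTSecurityHandler then throws a NullPointerException, and the ResourceManager service fails in state STARTED.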

2019-06-25 16:46:55,379 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state STARTED; cause: java.lang.NullPointerException: temporal
java.lang.NullPointerException: temporal
	at java.util.Objects.requireNonNull(Objects.java:228)
	at java.time.Instant.from(Instant.java:371)
	at java.time.Instant.until(Instant.java:1144)
	at java.time.Duration.between(Duration.java:473)
	at org.apache.hadoop.yarn.server.resourcemanager.security.JWTSecurityHandler.computeScheduledDelay(JWTSecurityHandler.java:217)
	at org.apache.hadoop.yarn.server.resourcemanager.security.JWTSecurityHandler.registerRenewer(JWTSecurityHandler.java:208)
	at org.apache.hadoop.yarn.server.resourcemanager.security.RMAppSecurityManager.registerWithMaterialRenewers(RMAppSecurityManager.java:217)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$RMAppRecoveredTransition.transition(RMAppImpl.java:1151)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$RMAppRecoveredTransition.transition(RMAppImpl.java:1056)
	at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:387)
	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:304)
	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:48)
	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:455)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:864)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:127)
	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:357)
	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:499)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1589)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ResourceTrackingServices.serviceStart(ResourceManager.java:705)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMSchedulerServices.serviceStart(ResourceManager.java:842)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startSchedulerServices(ResourceManager.java:1254)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1320)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1316)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1929)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1316)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1389)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1626)
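
The top frames of the trace show Duration.between receiving a null temporal from JWTSecurityHandler.computeScheduledDelay. The standalone sketch below reproduces that failure mode with plain java.time and shows one possible guard. It is not the project's code: the class name, the method names unguardedDelay and guardedDelay, and the idea that the null value is the token's expiration instant are all assumptions made for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

// Hypothetical sketch; none of these names come from the Hadoop/Hops code base.
public class JwtRenewalDelaySketch {

    // Mirrors the failing pattern: a null end instant reaches Duration.between,
    // which throws NullPointerException.
    public static Duration unguardedDelay(Instant now, Instant expiration) {
        return Duration.between(now, expiration);
    }

    // One possible guard: an application without a JWT has no expiration,
    // so there is nothing to schedule and an empty Optional is returned.
    public static Optional<Duration> guardedDelay(Instant now, Instant expiration) {
        if (expiration == null) {
            return Optional.empty();
        }
        Duration delay = Duration.between(now, expiration);
        // Clamp an already-expired token to an immediate renewal.
        return Optional.of(delay.isNegative() ? Duration.ZERO : delay);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2019-06-25T16:46:55Z");

        try {
            unguardedDelay(now, null);
        } catch (NullPointerException e) {
            System.out.println("Unguarded call failed: " + e);
        }

        System.out.println("Guarded, no JWT: " + guardedDelay(now, null));
        System.out.println("Guarded, JWT expiring in 60s: "
            + guardedDelay(now, now.plusSeconds(60)));
    }
}
```

With a guard of this kind, a caller such as the renewer registration could skip applications that were recovered without a JWT instead of failing RM startup. Whether skipping renewal or issuing a fresh token is the right behavior is a design decision for the actual fix.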

Status

Assignee

Antonis Kouzoupis

Reporter

Antonis Kouzoupis

Labels

None

Fix versions

Affects versions

2.8.2.7

Priority

Medium