I'm working on a Hadoop project on my computer. When I start a job that consumes most of the available resources (100% CPU, high RAM usage), 'something' kills my user's session and all of its processes.

syslog:

Jun  8 21:38:46 michalo-desktop systemd[1]: Created slice User Slice of hadoop.
Jun  8 21:38:46 michalo-desktop systemd[1]: Starting User Manager for UID 1001...
Jun  8 21:38:46 michalo-desktop systemd[1]: Started Session 21 of user hadoop.
Jun  8 21:38:46 michalo-desktop systemd[11932]: Reached target Paths.
Jun  8 21:38:46 michalo-desktop systemd[11932]: Reached target Sockets.
Jun  8 21:38:46 michalo-desktop systemd[11932]: Reached target Timers.
Jun  8 21:38:46 michalo-desktop systemd[11932]: Reached target Basic System.
Jun  8 21:38:46 michalo-desktop systemd[11932]: Reached target Default.
Jun  8 21:38:46 michalo-desktop systemd[11932]: Startup finished in 126ms.
Jun  8 21:38:46 michalo-desktop systemd[1]: Started User Manager for UID 1001.
Jun  8 21:38:46 michalo-desktop systemd[1]: Stopping User Manager for UID 1001...
Jun  8 21:38:46 michalo-desktop systemd[11932]: Stopped target Default.
Jun  8 21:38:46 michalo-desktop systemd[11932]: Stopped target Basic System.
Jun  8 21:38:46 michalo-desktop systemd[11932]: Stopped target Timers.
Jun  8 21:38:46 michalo-desktop systemd[11932]: Stopped target Sockets.
Jun  8 21:38:46 michalo-desktop systemd[11932]: Reached target Shutdown.
Jun  8 21:38:46 michalo-desktop systemd[11932]: Starting Exit the Session...
Jun  8 21:38:46 michalo-desktop systemd[11932]: Stopped target Paths.
Jun  8 21:38:46 michalo-desktop systemd[11932]: Received SIGRTMIN+24 from PID 12000 (kill).
Jun  8 21:38:46 michalo-desktop systemd[1]: Stopped User Manager for UID 1001.
Jun  8 21:38:46 michalo-desktop systemd[1]: Removed slice User Slice of hadoop.

The log doesn't contain any information about the reason for the kill, and kern.log doesn't show anything interesting.

Zoltan
michalo2882
  • How often does that repeat? Strange that it kills it within 1 second of it starting. –  Jun 08 '16 at 20:35
  • How much RAM is available and in use? If it was the OOM killer, you could try `dmesg` to see whether it did anything. – Wilf Jun 08 '16 at 20:37
  • It's 100% reproducible as a normal user, and the issue does not occur as root; this is my workaround for now. There's about 1 GB of free RAM (out of 8 GB) during computation. – michalo2882 Jun 08 '16 at 20:49
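
To follow up on the `dmesg` suggestion in the comments, here is a minimal Python sketch that filters kernel log text for lines that look like OOM-killer activity. The function names and the pattern list are my own assumptions, not an exhaustive or official set of kernel messages, and reading `dmesg` may require elevated privileges on some systems.

```python
import re
import subprocess

# Substrings the Linux kernel commonly logs when the OOM killer acts.
# This list is an assumption and may not cover every kernel version.
OOM_PATTERN = re.compile(r"invoked oom-killer|Out of memory", re.IGNORECASE)


def find_oom_lines(log_text):
    """Return the lines of log_text that look like OOM-killer messages."""
    return [line for line in log_text.splitlines() if OOM_PATTERN.search(line)]


def read_kernel_log():
    """Return the output of `dmesg` as text (may need root on some systems)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True)
    return result.stdout
```

Calling `find_oom_lines(read_kernel_log())` and getting an empty list would suggest, though not prove, that the OOM killer was not involved.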

1 Answer

Hadoop itself is trying to kill some of its own processes in order to reduce resource usage on the machine, but because of a bug in the kill command it ends up killing every process it can.
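
To illustrate the difference between signalling one group of processes and signalling everything, here is a minimal Python sketch for POSIX systems. It is not Hadoop's code, and the exact command Hadoop runs is not shown in this thread; the helper names are hypothetical. The sketch starts a child in its own process group, probes the group with signal 0 (which delivers nothing), and then terminates only that group.

```python
import os
import signal
import subprocess
import sys


def spawn_in_own_group():
    """Start a sleeping child that leads its own session and process group."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        start_new_session=True,  # child calls setsid(), so its pgid equals its pid
    )


def group_is_alive(pgid):
    """Probe a process group with signal 0, which checks existence only."""
    try:
        os.killpg(pgid, 0)
        return True
    except ProcessLookupError:
        return False


def terminate_group(child):
    """Send SIGTERM to the child's process group only, then reap the child."""
    os.killpg(child.pid, signal.SIGTERM)
    child.wait()

# Caution: a target of -1, as in os.kill(-1, sig), addresses every process
# the caller is allowed to signal. A tool that mistakenly resolves its
# target to -1 would therefore hit the whole user session.
```

The scope of a kill depends entirely on how the target is interpreted, which is why a parsing bug in a kill utility can turn a narrow request into one that affects all of a user's processes.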

Zoltan