Postgresql restart issue


(Harinder Singh) #1

Postgres is not starting after reboot. I am getting the following error. I tried to remove “postmaster.pid” file as suggested but still not starting.

Note: I imported the matrixdemo VM from Squiz manuals website into Digital Ocean and configured it.

[root@matrix-5-5-2-0 ~]# systemctl start postgresql.service
Job for postgresql.service failed because the control process exited with error code. See "systemctl status postgresql.service" and "journalctl -xe" for details.

[root@matrix-5-5-2-0 ~]# systemctl status postgresql.service
● postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2019-07-19 12:33:59 AEST; 6min ago
  Process: 1848 ExecStart=/usr/bin/pg_ctl start -D ${PGDATA} -s -o -p ${PGPORT} -w -t 300 (code=exited, status=1/FAILURE)
  Process: 1828 ExecStartPre=/usr/bin/postgresql-check-db-dir ${PGDATA} (code=exited, status=0/SUCCESS)

Jul 19 12:33:58 matrix-5-5-2-0 systemd[1]: Starting PostgreSQL database server...
Jul 19 12:33:58 matrix-5-5-2-0 pg_ctl[1848]: pg_ctl: another server might be running; trying to start server anyway
Jul 19 12:33:58 matrix-5-5-2-0 pg_ctl[1848]: FATAL:  could not remove old lock file "postmaster.pid": Permission denied
Jul 19 12:33:58 matrix-5-5-2-0 pg_ctl[1848]: HINT:  The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
Jul 19 12:33:59 matrix-5-5-2-0 pg_ctl[1848]: pg_ctl: could not start server
Jul 19 12:33:59 matrix-5-5-2-0 pg_ctl[1848]: Examine the log output.
Jul 19 12:33:59 matrix-5-5-2-0 systemd[1]: postgresql.service: control process exited, code=exited status=1
Jul 19 12:33:59 matrix-5-5-2-0 systemd[1]: Failed to start PostgreSQL database server.
Jul 19 12:33:59 matrix-5-5-2-0 systemd[1]: Unit postgresql.service entered failed state.
Jul 19 12:33:59 matrix-5-5-2-0 systemd[1]: postgresql.service failed.

I ran the following “journalctl -xe” and got the following log

-- Unit postgresql.service has begun starting up.
Jul 19 12:46:00 matrix-5-5-2-0 pg_ctl[25469]: pg_ctl: another server might be running; trying to start server anyway
Jul 19 12:46:00 matrix-5-5-2-0 pg_ctl[25469]: FATAL:  could not remove old lock file "postmaster.pid": Permission denied
Jul 19 12:46:00 matrix-5-5-2-0 pg_ctl[25469]: HINT:  The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
Jul 19 12:46:01 matrix-5-5-2-0 systemd[1]: Started Session 19 of user vagrant.
-- Subject: Unit session-19.scope has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit session-19.scope has finished starting up.
--
-- The start-up result is done.
Jul 19 12:46:01 matrix-5-5-2-0 CROND[25483]: (vagrant) CMD (nice -10 /colette -Xms8m -Xmx32m -jar /home/vagrant/.goberge 0 1>.loggeded)
Jul 19 12:46:01 matrix-5-5-2-0 pg_ctl[25469]: pg_ctl: could not start server
Jul 19 12:46:01 matrix-5-5-2-0 pg_ctl[25469]: Examine the log output.
Jul 19 12:46:01 matrix-5-5-2-0 systemd[1]: postgresql.service: control process exited, code=exited status=1
Jul 19 12:46:01 matrix-5-5-2-0 systemd[1]: Failed to start PostgreSQL database server.
-- Subject: Unit postgresql.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit postgresql.service has failed.
--
-- The result is failed.
Jul 19 12:46:01 matrix-5-5-2-0 systemd[1]: Unit postgresql.service entered failed state.
Jul 19 12:46:01 matrix-5-5-2-0 systemd[1]: postgresql.service failed.
Jul 19 12:46:01 matrix-5-5-2-0 polkitd[1530]: Unregistered Authentication Agent for unix-process:25452:72619 (system bus name :1.53, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected from bus)Jul 19 12:46:01 matrix-5-5-2-0 postfix/pickup[2004]: 681884009CB2: uid=1000 from=<vagrant>
Jul 19 12:46:01 matrix-5-5-2-0 postfix/cleanup[2008]: 681884009CB2: message-id=<20190719024601.681884009CB2@matrix-5-5-2-0.localdomain>
Jul 19 12:46:01 matrix-5-5-2-0 postfix/qmgr[2005]: 681884009CB2: from=<vagrant@matrix-5-5-2-0.localdomain>, size=836, nrcpt=1 (queue active)
Jul 19 12:46:01 matrix-5-5-2-0 postfix/local[23724]: 681884009CB2: to=<vagrant@matrix-5-5-2-0.localdomain>, orig_to=<vagrant>, relay=local, delay=0.01, delays=0.01/0/0/0, dsn=2.0.0, status=sent (delivered to mailbox)
Jul 19 12:46:01 matrix-5-5-2-0 postfix/qmgr[2005]: 681884009CB2: removed
Jul 19 12:46:12 matrix-5-5-2-0 kernel: Firewall: *TCP_IN Blocked* IN=eth0 OUT= MAC=0a:03:24:74:1f:c0:ec:38:73:0b:38:30:08:00 SRC=185.175.93.105 DST=167.99.79.7 LEN=40 TOS=0x00 PREC=0x00 TTL=245 ID=37620 PROTO=TCP SPT=51373 DPT=10001 WINDOW=1024 RES=0x00 SYN URGP=0
Jul 19 12:46:34 matrix-5-5-2-0 kernel: Firewall: *TCP_IN Blocked* IN=eth0 OUT= MAC=0a:03:24:74:1f:c0:ec:38:73:0c:78:30:08:00 SRC=165.22.195.196 DST=167.99.79.7 LEN=40 TOS=0x00 PREC=0x00 TTL=248 ID=54321 PROTO=TCP SPT=55130 DPT=55555 WINDOW=65535 RES=0x00 SYN URGP=0
Jul 19 12:47:01 matrix-5-5-2-0 systemd[1]: Started Session 20 of user vagrant.
-- Subject: Unit session-20.scope has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit session-20.scope has finished starting up.
--
-- The start-up result is done.
Jul 19 12:47:01 matrix-5-5-2-0 CROND[26938]: (vagrant) CMD (nice -10 /colette -Xms8m -Xmx32m -jar /home/vagrant/.goberge 0 1>.loggeded)
Jul 19 12:47:01 matrix-5-5-2-0 postfix/pickup[2004]: 86ACD4009CB1: uid=1000 from=<vagrant>
Jul 19 12:47:01 matrix-5-5-2-0 postfix/cleanup[2008]: 86ACD4009CB1: message-id=<20190719024701.86ACD4009CB1@matrix-5-5-2-0.localdomain>
Jul 19 12:47:01 matrix-5-5-2-0 postfix/qmgr[2005]: 86ACD4009CB1: from=<vagrant@matrix-5-5-2-0.localdomain>, size=836, nrcpt=1 (queue active)
Jul 19 12:47:01 matrix-5-5-2-0 postfix/local[23724]: 86ACD4009CB1: to=<vagrant@matrix-5-5-2-0.localdomain>, orig_to=<vagrant>, relay=local, delay=0.01, delays=0.01/0/0/0, dsn=2.0.0, status=sent (delivered to mailbox)
Jul 19 12:47:01 matrix-5-5-2-0 postfix/qmgr[2005]: 86ACD4009CB1: removed
Jul 19 12:47:03 matrix-5-5-2-0 kernel: Firewall: *TCP_IN Blocked* IN=eth0 OUT= MAC=0a:03:24:74:1f:c0:ec:38:73:0b:38:30:08:00 SRC=89.248.172.16 DST=167.99.79.7 LEN=44 TOS=0x10 PREC=0x00 TTL=113 ID=30421 PROTO=TCP SPT=30378 DPT=389 WINDOW=18998 RES=0x00 SYN URGP=0

I tried my best but could not figure it out. Any suggestion please…

Thanks in advance


(Aleks Bochniak) #2

did you try deleting the lock file?


(Harinder Singh) #3

Yes, I deleted the file but it is not starting.

Before reboot, 98% of CPU usage was consumed by Postgres. This is why I decided to reboot.

I have an another instance(v5.4) which is working fine even after reboot.


(David Schoen) #4

The original output suggests the lock file (usually at /var/lib/pgsql/data/postmaster.pid) exists and had permissions change in some way. If you removed that file manually, I would expect systemctl status postgresql -l to produce a different error if it’s still failing to restart (systemctl restart postgresql) - can you check whether the error has changed and let us know what it is now?

As you’ve migrated the VM, it’s possible something in DO has messed with something lower level, confirming the postgres user still exists and that it maps to the postgres directory would be good e.g:

# id postgres
uid=26(postgres) gid=26(postgres) groups=26(postgres)
# ls -la /var/lib/pgsql/data/
total 108
drwx------. 15 postgres postgres  4096 Jul  4 15:57 .
drwxr-xr-x.  5 root     root      4096 May 31 17:03 ..
drwx------.  9 postgres postgres  4096 Jul 22 02:14 base
drwx------.  2 postgres postgres  4096 Jul 22 02:41 global
drwx------.  2 postgres postgres  4096 May 28 16:10 pg_clog
-rw-r-----.  1 postgres postgres   678 May 28 16:14 pg_hba.conf
-rw-r-----.  1 postgres postgres   455 May 28 16:14 pg_ident.conf
drwx------.  2 postgres postgres  4096 Jun  3 00:00 pg_log
drwx------.  4 postgres postgres  4096 May 28 16:10 pg_multixact
drwx------.  2 postgres postgres  4096 Jul  4 15:57 pg_notify
drwx------.  2 postgres postgres  4096 May 28 16:10 pg_serial
drwx------.  2 postgres postgres  4096 May 28 16:10 pg_snapshots
drwx------.  2 postgres postgres  4096 Jul 22 08:10 pg_stat_tmp
drwx------.  2 postgres postgres  4096 Jul 20 08:51 pg_subtrans
drwx------.  2 postgres postgres  4096 May 28 16:10 pg_tblspc
drwx------.  2 postgres postgres  4096 May 28 16:10 pg_twophase
-rw-------.  1 postgres postgres     4 May 28 16:10 PG_VERSION
drwx------.  3 postgres postgres  4096 Jul 22 02:47 pg_xlog
-rw-------.  1 postgres postgres 20346 Jul 15 13:32 postgresql.conf
-rw-------.  1 postgres postgres    57 Jul  4 15:57 postmaster.opts
-rw-------.  1 postgres postgres    83 Jul  4 15:57 postmaster.pid
-rw-r--r--.  1 postgres postgres  1115 May 28 16:10 server.crt
-rw-------.  1 postgres postgres  1675 May 28 16:10 server.key


(Harinder Singh) #5

Apologies for delay.

After removing the lock file, I granted the permissions to this path(/var/lib/pgsql/data/). I managed to restart the postgres after that.

[root@matrix-5-5-2-0 pgsql]# ls -la /var/lib/pgsql/data/
total 48
drwx------. 15 postgres postgres  4096 Jul 21 00:43 .
drwxrw-r-x.  5 postgres postgres    66 Jun 24 18:31 ..
drwx------.  6 postgres postgres    54 Mar 12 11:05 base
drwx------.  2 postgres postgres  4096 Jul 21 00:43 global
drwx------.  2 postgres postgres    30 Jul 22 16:58 pg_clog
-rw-------.  1 postgres postgres   276 Jun 24 18:26 pg_hba.conf
-rw-------.  1 postgres postgres  1636 Mar 12 11:05 pg_ident.conf
drwx------.  2 postgres postgres   188 Jun 23 00:00 pg_log
drwx------.  4 postgres postgres    36 Mar 12 11:05 pg_multixact
drwx------.  2 postgres postgres    18 Jul 21 00:43 pg_notify
drwx------.  2 postgres postgres     6 Mar 12 11:05 pg_serial
drwx------.  2 postgres postgres     6 Mar 12 11:05 pg_snapshots
drwx------.  2 postgres postgres    25 Jul 24 11:44 pg_stat_tmp
drwx------.  2 postgres postgres    18 Jul 22 16:58 pg_subtrans
drwx------.  2 postgres postgres     6 Mar 12 11:05 pg_tblspc
drwx------.  2 postgres postgres     6 Mar 12 11:05 pg_twophase
-rw-------.  1 postgres postgres     4 Mar 12 11:05 PG_VERSION
drwx------.  3 postgres postgres   252 Jul 24 08:04 pg_xlog
-rw-------.  1 postgres postgres 19805 Mar 12 11:05 postgresql.conf
-rw-------.  1 postgres postgres    57 Jul 21 00:43 postmaster.opts
-rw-------   1 postgres postgres    84 Jul 21 00:43 postmaster.pid

Thanks Aleks and David for your suggestions.