CentOS 7.2 (Build 1511) installer sometimes quits unexpectedly when installing via network with a kickstart file

Question

I have been experiencing an intermittent problem when installing CentOS 7 from a USB stick, using the network installer. The kickstart file is found via URL and installation proceeds normally through setting up the drives. After the screen switches to "starting installer," the installer sometimes quits immediately and reboots (unless I give the inst.nokill option, in which case it halts instead of rebooting). At other times the process works correctly without any change in the procedure I follow.

I managed to save the log files in /tmp from one such occurrence and found nothing in them to indicate what went wrong. What should I be looking at to diagnose this problem? I am willing to post logs etc., but I would like to know what would be most useful to post.

A colleague of mine has been experiencing the same problem with an entirely independently created kickstart file, installed using the netinstaller from a DVD.

Here is my kickstart file (changed only slightly to not give out my root password hash):

# Automatically generated file. DO NOT EDIT DIRECTLY. Instead, edit the source
# files that are used to create this file.
#

install
lang en_US.UTF-8
keyboard us
network --onboot yes --device eth0 --bootproto dhcp --noipv6
timezone --utc America/New_York
rootpw  --iscrypted xxx
selinux --disabled
authconfig --enableshadow --passalgo=sha512 --enablefingerprint
firstboot --disable
%include /tmp/ks-platform
part /boot --fstype="ext4" --size=500
part pv.1 --fstype="lvmpv" --size=500 --grow 
volgroup vg1 pv.1
logvol / --vgname=vg1 --size=500 --grow --fstype=ext4 --name=root --label="Fedora"


# Current releases
url --url="http://mirror.centos.org/centos/$releasever/os/$basearch"
repo --name=epel --baseurl=http://dl.fedoraproject.org/pub/epel/$releasever/$basearch/

# CentOS-specific stuff
eula --agreed
graphical
xconfig --startxonboot
%packages
@base
@core
@^graphical-server-environment
@network-file-system-client
@networkmanager-submodules
@x11
epel-release
epel-release.noarch
cinnamon
kernel-devel
kernel-headers
yum-plugin-priorities
gdb
strace
gcc
-gnome-initial-setup
%end
%pre
#!/bin/bash -x
#
# Changes made at runtime are all done here

export PATH=$PATH:/mnt/sysimage/sbin:/mnt/sysimage/bin

f=/tmp/ks-platform
rm -f $f

radeon=0
nvidia=0
apple=0
drive=sda

lspci | grep -q -i radeon
if [[ $? == 0 ]]; then radeon=1; fi

lspci | grep -q -i nvidia
if [[ $? == 0 ]]; then nvidia=1; fi

grep -q -i "Apple Inc" /sys/firmware/dmi/entries/*/*
if [[ $? == 0 ]]; then apple=1; fi

cat /proc/partitions | grep -q -i nvme0n1
if [[ $? == 0 ]]; then drive=nvme0n1; fi

echo clearpart --initlabel --drives=$drive --all >> $f
net_device=($(cat /proc/net/dev | grep : | grep -v lo: | sort -n -r -k2 | sed -e 's,:.*,,'))
for g in "${net_device[@]}"; do
  echo network --bootproto=dhcp --device=$g --noipv6 --activate --onboot yes >> $f
done
echo firewall --enable --trust=${net_device[0]} >> $f

if (( $apple )); then
  # Apple needs special macefi partition type
  echo part /boot/efi --fstype=\"macefi\" --size=200 --label=\"Linux HFS+ ESP\" >> $f
else
  echo part /boot/efi --fstype=\"efi\" --size=200 --label=\"Linux HFS+ ESP\" >> $f
fi

if (( $nvidia )); then 
  # nvidia needs to disable kernel mode setting with nouveau
  echo bootloader --location=mbr --driveorder=$drive --boot-drive=$drive --append=\"nouveau.modeset=0\" >> $f
else
  # Most use default autodetected driver (radeon, intel)
  echo bootloader --location=mbr --driveorder=$drive --boot-drive=$drive >> $f
fi
%end

If the ks file never changes and you have intermittent issues, I bet you will find after increasing the logging HBruijn provided, that anaconda isn't waiting for interface negotiation long enough and sends premature dhcp requests. If that is the case, there are 3 flags you can pass to the installer. `linksleep=30 nicdelay=30 dhcptimeout=120` At least, that is the intermittent issue I often run in to with ks on auto-neg interfaces and dhcp. — Aaron, Sep 15 '16 at 14:53
This may be the case, but it seems unlikely since the kickstart is obtained over the network and parsed correctly; it's only when the packages start to be installed that it quits. I rely on dhcp to get the network up to get the kickstart file. — Eric Sokolowsky, Sep 15 '16 at 17:16
That's what I used to think too, but there are actually 2 network interface resets and 2 dhcp requests that occur, one for pxe and one for anaconda. It's the second one that usually fails, depending on the network gear used and auto-neg time. — Aaron, Sep 16 '16 at 18:01
I tried using the linksleep, nicdelay, and dhcptimeout parameters and it did not seem to affect the reliability of the installer. Any other suggestions? — Eric Sokolowsky, Oct 26 '16 at 15:14
I think we would need more details around the physical and logical configuration of your network. e.g. any special bonding options or access+vlan config on the ports, anything forced on/off, any non defaults on the access switches, anything special about your dhcp? e.g. dhcp network relay ... and so on. — Aaron, Oct 26 '16 at 19:13

HBruijn · Answer 1 · 2016-09-15T13:56:29.120

Is this your actual download URL?

url --url="http://mirror.centos.org/centos/$releasever/os/$basearch"

Or have you obfuscated that part of your config as well?

Although in theory you can do a kickstart install from any random internet server, that is not the best idea. Most people set up their own mirror (an NFS share or a trivial web server with a copy of the RPM tree from the installation DVD), which they can then access at LAN speeds.

That will make your deployments go faster and behave much more consistently.

mirror.centos.org is a round-robin DNS record (probably geo-targeted) so one install might get an extremely fast mirror, the next install might get another one that is much slower, making your install that much slower as well.
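
As a rough sketch, the url and repo lines in the kickstart file might look like this when pointed at a local mirror. The host name mirror.example.lan and the paths under it are placeholders invented for illustration, not something taken from the question; adjust them to wherever your copy of the tree actually lives.

```
# Hypothetical local mirror: host name and paths are assumptions
url --url="http://mirror.example.lan/centos/7/os/x86_64"
repo --name=epel --baseurl=http://mirror.example.lan/epel/7/x86_64/
```

Pinning the release and architecture in the path also makes the install source explicit instead of depending on variable substitution.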

BTW you can use the ALT + F1-F6 keys to open the alternate consoles during your installation to monitor progress. You can increase verbosity by raising the log level to debug with an option in your kickstart file:

logging --level=debug
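
With debug logging on, the files saved from /tmp are the place to look. A minimal sketch of a helper that scans them for suspicious lines follows. It assumes the log names listed in the RHEL 7 installation guide (anaconda.log, program.log, storage.log, packaging.log, syslog); the function name and the search pattern are my own guesses at what is worth flagging, not an official list.

```shell
#!/bin/bash
# Sketch: print error-looking lines from the installer logs in a directory.
# Assumption: the pattern below is only a starting point; widen or narrow it.

scan_installer_logs() {
  local dir="${1:-/tmp}"
  local name
  for name in anaconda.log program.log storage.log packaging.log syslog; do
    # Skip logs that this particular run did not produce.
    [ -f "$dir/$name" ] || continue
    # -H prints the file name, -n the line number, -i ignores case.
    grep -H -n -i -E 'error|traceback|fail|killed' "$dir/$name" || true
  done
  return 0
}
```

From a shell on one of the alternate consoles, or on a machine holding the saved copies, call it as `scan_installer_logs /tmp` (or pass the directory where you saved the logs). Since the quit happens right as package installation starts, packaging.log is probably the first file worth posting.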

Would this cause it to abruptly fail without a message? Or if there is a message, where might it be? While your advice is good, I do not see how it answers my question about the installer quitting. — Eric Sokolowsky, Sep 15 '16 at 13:44
I posted a couple of links to the RHEL 7 manual which also apply to CentOS — HBruijn, Sep 15 '16 at 13:57
