
NetworkManager CentOS 7.. Everything is fine

Anyone else getting tied up with NetworkManager on CentOS 7?

For example: building out a new machine with two interfaces, bonded and on a VLAN. When the machine boots, the network doesn’t start, and you see:

RTNETLINK answers: File exists

Ouch..

If you run “service NetworkManager status” you then see something like:

Oct 30 07:16:06 thisisfine.mydomain.com NetworkManager[2826]: <info> Failed to activate ‘Vlan bond0.10’: Connection ….ime

If you try and restart with “service network restart” it fails..

Since this is part of a bond, I checked out the interfaces attached. I first looked at my em3 adapter:

[root@thisisfine network-scripts]# cat ifcfg-em3
BOOTPROTO="none"
DEVICE="em3"
HWADDR="ec:f4:zz:e4:zz:b4"
ONBOOT=yes
PEERDNS=no
PEERROUTES=no
NM_CONTROLLED=no
MASTER=bond0
SLAVE=yes

Then I checked out the other adapter in the bond – em4

[root@thisisfine network-scripts]# cat ifcfg-em4
BOOTPROTO="none"
DEVICE="em4"
HWADDR="ec:f4:zz:e4:zz:b5"
ONBOOT=yes
PEERDNS=no
PEERROUTES=no
NM_CONTROLLED=no
MASTER=bond0
SLAVE=yes

Looks good, nothing out of the ordinary. I even ran “ip link” and grepped for UP; em3 and em4 were the adapters with link up.

2: em3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT qlen 1000
4: em4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT qlen 1000

Next I checked out the bond0 config..

[root@thisisfine network-scripts]# cat ifcfg-bond0
BOOTPROTO="none"
DEVICE="bond0"
ONBOOT=yes
PEERDNS=no
PEERROUTES=no
DEFROUTE=no
TYPE=Bond
BONDING_OPTS="miimon=100 mode=active-backup"
BONDING_MASTER=yes
NM_CONTROLLED=no

looks good..

Aww.. here we go. My VLAN on bond0..

[root@thisisfine network-scripts]# cat ifcfg-bond0.10
BOOTPROTO="none"
IPADDR="10.10.10.10"
NETMASK="255.255.252.0"
GATEWAY="10.10.10.1"
DEVICE="bond0.10"
ONBOOT=yes
PEERDNS=no
PEERROUTES=no
VLAN=yes

Apparently my kickstart doesn’t add NM_CONTROLLED=no to the vlan on a bond.. off to fix.
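
A quick way to spot this kind of omission without reading every file by hand is to list the ifcfg files that don't carry the setting. The following is only a rough sketch: the function name `list_nm_managed` is made up for illustration, and it assumes the setting is written as `NM_CONTROLLED=no` with optional double quotes around the value.

```shell
#!/bin/sh
# Hypothetical helper: print every ifcfg-* file in a directory that does
# NOT contain an NM_CONTROLLED=no line. The directory defaults to the
# usual CentOS 7 network-scripts location.
list_nm_managed() {
    dir="${1:-/etc/sysconfig/network-scripts}"
    for f in "$dir"/ifcfg-*; do
        # Skip the unexpanded glob when nothing matches.
        [ -f "$f" ] || continue
        # The loopback config is not interesting here.
        case "$f" in */ifcfg-lo) continue ;; esac
        # Accept NM_CONTROLLED=no and NM_CONTROLLED="no".
        if ! grep -Eq '^NM_CONTROLLED="?no"?[[:space:]]*$' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Usage: list_nm_managed                  (checks the default directory)
#        list_nm_managed /some/other/dir
```

On the machine above this would have pointed straight at ifcfg-bond0.10.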

[root@thisisfine network-scripts]# cat ifcfg-bond0.10
BOOTPROTO="none"
IPADDR="10.10.10.10"
NETMASK="255.255.252.0"
GATEWAY="10.10.10.1"
DEVICE="bond0.10"
ONBOOT=yes
PEERDNS=no
PEERROUTES=no
VLAN=yes
NM_CONTROLLED=no

That worked..
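
For the kickstart side, something along these lines could run in a %post section so the setting is always present. This is a sketch under assumptions, not what my kickstart actually contains: `ensure_nm_unmanaged` is a made-up name, and it relies on GNU sed's `-i` option, which CentOS ships.

```shell
#!/bin/sh
# Hypothetical helper: make sure an ifcfg file ends up with exactly one
# NM_CONTROLLED=no line. Safe to run more than once on the same file.
ensure_nm_unmanaged() {
    f="$1"
    [ -f "$f" ] || return 1
    if grep -q '^NM_CONTROLLED=' "$f"; then
        # A setting already exists: force its value to "no" in place.
        sed -i 's/^NM_CONTROLLED=.*/NM_CONTROLLED=no/' "$f"
    else
        # Add a newline first if the file does not end with one, so the
        # new setting does not get glued onto the last line.
        if [ -n "$(tail -c1 "$f")" ]; then
            printf '\n' >> "$f"
        fi
        printf 'NM_CONTROLLED=no\n' >> "$f"
    fi
}

# Usage: ensure_nm_unmanaged /etc/sysconfig/network-scripts/ifcfg-bond0.10
```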

However, it just seems that NetworkManager is the culprit.. if you simply do “service NetworkManager stop” and do “service network restart” everything is fine. (not really..)

Anyone else running into similar issues with NetworkManager? Are you simply disabling it entirely by turning it off? Are you migrating to it, or are you managing what NetworkManager controls and using it only for some things?

Sometimes I really miss the simplicity of just defining an interface and knowing it’s there.. I understand NetworkManager’s role on a desktop/workstation to manage wifi access and “automagically” manage your devices, but on servers? What is your take? Are you disabling it if you don’t use DHCP? Can it safely be disabled, or should the tooling grow up around it and let it manage what it expects to manage?


Puppet & Oracle Enterprise Linux – When Modules Fail

Puppet & Oracle Linux – Fixing those failing manifests!

  • OS: Oracle Enterprise Linux 6.3 / 6.4
  • Puppet: Puppet 3.1.1
  • Facter: 1.6 or higher

Puppet runs great on OEL 6. There really isn’t any pain in implementing it until you get to installing some modules: a lot of them simply don’t know what OEL or OracleLinux is. The fix, however, is pretty straightforward if you run current versions of facter and puppet.

The first thing you want to do is validate that you have a current version of facter that has the osfamily fact.

[bmiller@puppet ~]$ facter | grep osfamily
osfamily => RedHat
[bmiller@puppet ~]$

If you have osfamily support, you will want to edit the manifest that is failing. We will use The Foreman as an example. The Foreman 1.1 has multiple setup files that implement the following code. (You will often find these in install.pp manifests, or if you run “puppet agent -t --noop” it should show the failing manifest during a puppet run.)

class apache::install {
  case $::operatingsystem {
    redhat,centos,fedora,Scientific: {
      $http_package = 'httpd'
    }
    Debian,Ubuntu: {
      $http_package = 'apache2'
    }
    default: {
      fail("${::hostname}: This module does not support operatingsystem ${::operatingsystem}")
    }
  }

  package { $http_package:
    ensure => installed,
    alias  => 'httpd'
  }
}

This code fails miserably with a message that OracleLinux isn’t a supported operating system. The fix? Instead of looking for each and every type of RedHat distro and adding more OS variants to check for (some have a LONG list), use facter to get the $osfamily. It is as simple as this code snippet:

class apache::install {
  case $::osfamily {
    RedHat: {
      $http_package = 'httpd'
    }
    Debian: {
      $http_package = 'apache2'
    }
    default: {
      fail("${::hostname}: This module does not support osfamily ${::osfamily}")
    }
  }

  package { $http_package:
    ensure => installed,
    alias  => 'httpd'
  }
}

I’ve submitted the fixes to The Foreman so future versions can install without a hitch, and I hope other developers start implementing the “osfamily” check in lieu of “operatingsystem” checks. The osfamily check makes cleaner code and happier admins 🙂
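
If you want to see which of your installed modules still check “operatingsystem”, a recursive grep gets you a list of candidates. This is just a sketch: the function name `find_operatingsystem_checks` is made up, /etc/puppet/modules is assumed as the module directory (adjust to your modulepath), and `--include` is a GNU grep option.

```shell
#!/bin/sh
# Hypothetical helper: list .pp manifests under a module directory that
# reference $::operatingsystem. Note this also matches related facts
# such as $::operatingsystemrelease, so treat hits as candidates only.
find_operatingsystem_checks() {
    dir="${1:-/etc/puppet/modules}"
    # -r recurse, -l print file names only, -F match the literal string.
    grep -rl --include='*.pp' -F '$::operatingsystem' "$dir" 2>/dev/null | sort
}

# Usage: find_operatingsystem_checks
#        find_operatingsystem_checks /path/to/modules
```

Each file it prints is worth opening to see whether the case statement can switch on $::osfamily instead.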

Hope this helps other people running into issues on OEL / Oracle Linux / OracleLinux (Or whatever they want to call it!)
