Tuesday, April 9, 2013

Marching Off the Map

The title is not a new one, but it is a great image of what I feel the Devops community is doing. The Map is the way businesses have been run for the last 100 years, which the IT industry adopted in the '80s and mostly kept even during the dotcom days. In the '80s and '90s, Enterprise is what everyone wanted. In the '00s, as the large web operations started growing, "Enterprise" became a dirty word among the cool kids. Now, Enterprise does describe some very large companies, but many of the Enterprise ways are found in smaller companies too (generally older ones). Damon Edwards used the term "Classic Organization" and I think that is a much more inclusive and less emotionally charged term than Enterprise, so I will use it to mean "Organizations operating with the culture and processes akin to Enterprise". Classic Organizations are the epitome of the "before" picture in the Devops transformation. Devops (building on Lean, Agile, and others) is marching off the map of business models and, I think, incorporating much of the best of the past into new models to lead us into the future.

Recently, I realized my personal life has been paralleling my professional life in many ways. I'm seeing the core principles of Devops echoed throughout my life. Many people are discovering that things happening in the tech industry also work for running a household or other community. I'm not sure if this is behind a paywall, but the Wall Street Journal ran a story by Bill Gates in January where he describes what sounds a lot like Lean thinking as a way to fix global problems: http://online.wsj.com/article/SB10001424127887323539804578261780648285770.html. There were also some interesting replies that generally uphold Lean principles and illustrate the challenges of applying Lean in a "Classic" culture: http://online.wsj.com/article/SB10001424127887324156204578275993802414124.html. There are also many articles around the internet on running a household on Agile principles.

Then I heard a few podcasts from Growing Leaders describing the need to look for new ways of communicating with and educating young people today: http://growingleaders.com/blog/podcast-7-an-interview-with-dan-pink/ and http://growingleaders.com/blog/podcast-8-the-benefits-of-a-gap-year/. One theme they share is that in school you are measured on (roughly) 75% IQ and 25% EQ, but in the workforce the proportions are reversed. This tweet illustrates that shift.
The conclusion is that school is not teaching people how to be productive workers. For Devops and Lean to work, there needs to be more focus on EQ development in people. It is said that your IQ is relatively fixed from birth, but that EQ can be trained and developed. When you have your technical people thinking more with their "Right Brain" (Big Picture, Context, Synthesis), you should see the culture fall into place much more easily. The "Left Brain" logical, analytical stuff is so easy, probably too easy, that we use it as a crutch to avoid working on people, culture, empathy, systems thinking, and such. Just read some of the stories about the Etsy Hacker Grant program and its effects. "Right Brain" thinking can be developed and learned.

This post is rambling a bit through multiple topics but my main point is that I feel Devops is on the right path because its driving principles are echoed throughout life and so many cultures. I put a lot of weight on "uncommon, common sense" where you re-discover eternal truths that are built into human nature (respect, empathy, purpose, quality) and build on top of those. The name "Devops" doesn't really matter and will pass away, but the principles behind it should always be the foundation of all we do. I'm marching off the map at work and marching my kids off the map in their education at home. It's a little scary, but exciting to be doing something new and discovering a vibrant community around you to let you know you are not alone.

Wednesday, January 23, 2013

Upgrading Chef Server from 0.10.8 to 10.18.2

Here is my story of upgrading Chef from 0.10.8 to 10.18.2 while moving to a new server and updated OS. Someone please comment and tell me where I may have been able to do better.

We are running Chef Server 0.10.8 on CentOS 5.4 with Ruby 1.8.7, and I want to upgrade to the latest release of Chef while moving to CentOS 6.3 and Ruby 1.9.3 at the same time. That ruled out an in-place upgrade on the existing server. I needed to migrate my Chef server to a new system and upgrade everything.

My first plan was to take the cautious route, since I wasn't sure Chef could be updated across that many revisions: export all the data as JSON, build a new Chef 10.18.2 server, then import all the JSON. It worked perfectly EXCEPT that none of the clients could authenticate to the server, even though I had imported their public keys. I could create a new client key in the Chef server and the node could authenticate, but it wouldn't with an imported key. I spent about a day on this with no resolution. Maybe someone else will have better results.

Next I tried to just copy the CouchDB database. Unfortunately I flubbed things up a few times and spun my wheels for a few days because things didn't work (mostly my fault). Finally I found this method that works:

1) Compile Ruby 1.9.3 and rubygems 1.8.23
2) Install Chef via chef-solo http://wiki.opscode.com/display/chef/Installing+Chef+Server+using+Chef+Solo
3) Fix the CentOS 6.3 bug in the rabbitmq init script, documented at https://bugzilla.redhat.com/show_bug.cgi?id=878030. We decided to change these two lines in the rabbitmq init script to work around the bug:

CONTROLPROG=/usr/sbin/rabbitmqctl
CONTROL="sudo -u ${USER} ${CONTROLPROG}"

4) Add the chef vhost, user, and permissions by hand, because rabbitmq was broken when chef-solo tried to do it. Also change the Solr maxFieldLength to 100000 to work around the problem of indexing nodes with lots of attributes.

/usr/sbin/rabbitmqctl add_vhost /chef
/usr/sbin/rabbitmqctl add_user chef testing
/usr/sbin/rabbitmqctl set_permissions -p /chef chef ".*" ".*" ".*"

ex /var/lib/chef/solr/home/conf/solrconfig.xml
:%s/<maxFieldLength>10000/<maxFieldLength>100000/g
:wq
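
The ex edit in step 4 can also be done non-interactively, which is handy if you script the build. This is only a sketch: the function name raise_max_field_length is made up for illustration, and it assumes each maxFieldLength element sits on one line with its closing tag, as in the stock solrconfig.xml.

```shell
#!/bin/sh
# Sketch only: a non-interactive version of the ex substitution in step 4.
# raise_max_field_length is a hypothetical helper name, not part of Chef or Solr.
raise_max_field_length() {
    conf="$1"
    # -i.bak edits in place and keeps a backup copy at "$conf.bak".
    # Matching the closing tag as well means a second run changes nothing;
    # a bare match on "10000" would also hit the front of "100000".
    sed -i.bak \
        's|<maxFieldLength>10000</maxFieldLength>|<maxFieldLength>100000</maxFieldLength>|g' \
        "$conf"
}
```

To apply it to the file from step 4, call raise_max_field_length /var/lib/chef/solr/home/conf/solrconfig.xml as root.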

5) Shut down couchdb and rename /var/lib/couchdb/chef.couch to chef.couch.bak
6) Copy the couchdb database from the old server (can still be running)
7) chown chef.couch to be "couchdb:couchdb"
8) Start couchdb back up
9) Start rabbitmq, chef-solr, chef-expander, chef-server, chef-server-webui (in that order)
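
Steps 5 through 9 could be scripted roughly like this. Treat it as a sketch, not the exact commands I ran: OLD_HOST is a placeholder hostname, the old server is assumed to keep chef.couch at the same path, and the service names (rabbitmq-server, chef-solr, and so on) are assumed to match the init scripts on your system. It only prints the commands unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of steps 5-9. Assumptions (not from the original steps):
#   OLD_HOST   - placeholder hostname of the old Chef server
#   init names - rabbitmq-server and chef-solr are assumed service names
OLD_HOST="${OLD_HOST:-old-chef.example.com}"
COUCH_DB=/var/lib/couchdb/chef.couch
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

migrate_couchdb() {
    run service couchdb stop                          # step 5: stop couchdb
    run mv "$COUCH_DB" "$COUCH_DB.bak"                # step 5: keep the fresh db as a backup
    run scp "root@$OLD_HOST:$COUCH_DB" "$COUCH_DB"    # step 6: copy from the old server
    run chown couchdb:couchdb "$COUCH_DB"             # step 7: fix ownership
    run service couchdb start                         # step 8: start couchdb again
    # step 9: start the remaining services in order
    for svc in rabbitmq-server chef-solr chef-expander chef-server chef-server-webui; do
        run service "$svc" start
    done
}
```

Run migrate_couchdb once in dry-run mode to check the printed commands, then again as root with DRY_RUN=0.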

I wish I could say it was working at this point, but all the cookbooks were broken. Maybe someone from Opscode can tell me where the cache of cookbook files lives so it can be copied along with the database. I tried reloading all the cookbooks with knife cookbook upload -a -d, but that still didn't give me working cookbooks. In the UI, clicking a cookbook showed "end of file reached" and no data. The name and version were there, but no contents. I had to knife cookbook delete -p each cookbook, and then once they were added back MOST of the cookbooks worked. Some still gave the "end of file" error, and I had to purge and upload those one at a time.
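
The purge-and-reupload pass can be looped instead of typed per cookbook. This is a sketch under a few assumptions: every cookbook on the server also exists in your local cookbook_path, knife cookbook list prints the cookbook name in the first column, and you are fine deleting all versions (--all) without a prompt (--yes). The function name reload_cookbooks and the KNIFE variable are just illustrative names.

```shell
#!/bin/sh
# Sketch: purge every cookbook from the server, then upload it again.
# KNIFE can be overridden; reload_cookbooks is a hypothetical helper name.
KNIFE="${KNIFE:-knife}"

reload_cookbooks() {
    # Take the first column (the cookbook name) of each line of the listing.
    "$KNIFE" cookbook list | awk '{print $1}' | while read -r name; do
        [ -n "$name" ] || continue
        # --purge removes the stored files too; stdin is redirected so
        # knife cannot consume the rest of the cookbook list.
        "$KNIFE" cookbook delete "$name" --all --purge --yes </dev/null
        "$KNIFE" cookbook upload "$name" </dev/null \
            || echo "upload failed: $name" >&2
    done
}
```

Any cookbook that still shows "end of file reached" afterwards would need the same delete and upload repeated by hand.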

I hope this helps someone. Let me know if you want more detail on any step. I still haven't done extensive testing on the new server but the few clients I tested seem happy. I'm really looking forward to an omnibus install for Chef Server 11 and hope the migration is not painful.