sysadmin HQ

bridging the gap

The Top Six Pitfalls of Technology Workers


Careers in technology can be an exercise in self-defeat. Many of us in technical positions will find ourselves faced with numerous internal and external problems in the workplace. One of the most challenging aspects of succeeding in a technical career path is having to step back from the technology and look at ourselves, our coworkers and our workplaces. The very aspects of technology that drive us and enable us to succeed in technical endeavors often blind us to “the soft side” of working – interaction, socialization and office politics.

#1. “I’m Indispensable”

It’s 7:45 PM and the entire office has already gone home, except for you. You’re busy resolving a critical issue to meet a deadline or to get your company’s main assets back online. Long after everyone has left for their homes and families, the success of your employer rests solely on your shoulders. Or does it? With tight deadlines, non-technical managers and high levels of expectation, it’s easy to fall into the trap of truly believing that your employer cannot survive without you.

IT professionals are the creators and caretakers of the very technology that allows their employer to succeed, and we’ve all been called upon to do the job that nobody else was capable of, or willing to, complete. It’s no wonder that the number one pitfall is believing that you’re irreplaceable.

It’s a compound problem that plagues organizations across the world. Take the case of Terry Childs, the former San Francisco network administrator who held his department hostage by falling into this trap. The story quickly became a media frenzy with allegations of data tampering and “millions of dollars in damages.” Terry was the sole administrator, designer and implementor of the network, and he refused to allow anyone else to have administrative rights to it. He truly believed that nobody else in his department was capable of managing the network, and that was the way the network had been managed since the time it was implemented. That attitude, and the misguided belief that he had the right to restrict and protect that network, landed him in jail.

So what went wrong? From day one of the implementation of the city’s FiberWAN network, Terry was the one who called the shots. He was the sole implementor and maintainer of the network, and naturally that led him to believe that he was the only one capable of maintaining it. That network should have been documented and the information shared among his coworkers. The passwords and access should have been treated as critical assets, more valuable than the network itself. Terry was the only one with administrative access; both his coworkers and his managers knew it, and everyone accepted that fact. His managers made a critical error in allowing a single person to hold the keys to a critical asset. The very fact that this was allowed is not only the root of the issue, but an indicator of other problems. What if Terry had been in an accident and was unable to communicate the passwords to anyone else? (The “hit by a bus” scenario, or “hit by a beer truck” if you prefer.)

Terry thought he was irreplaceable and didn’t trust anyone; now he’s in jail.

#2. “Management doesn’t understand what we/I do.”

IT departments are often autonomous departments regardless of organizational size. This autonomy and freedom creates a systemic problem: the rest of the organization comes to see IT as a brick wall, and the activities of the department or individual become unknown. People don’t like what they don’t understand (whether this gives way to fear or not is a whole other topic). When departments and people become a black hole of information, the result is usually distrust. There cannot be trust without understanding.

This distrust often spreads between departments or individuals and creates black holes of information and silos of responsibility. When we should be working together, it separates us further, thus exacerbating the problem. At the very first sign of a problem, it is imperative that extra effort is made to head off the issue before it is allowed to grow.

So what can we do? Managers need to make sure that they have at least a rudimentary understanding of what their employees are doing, and this requires both manager and employee to work together to establish that understanding. It can be as simple as spending 15 minutes at a whiteboard. Unfortunately, many non-technical managers of IT departments do not take the time to understand what their employees are doing. As a manager, I would never want to be caught in the position of being asked what my team was doing and be unable to answer.

The ability to explain a topic to someone who knows nothing about it is a direct indicator of your own knowledge of that topic.

#3. Dismissing Ideas / “That will never work.”

We’ve all seen this one – a potential solution for a problem or an implementation detail is presented and someone immediately pipes up, stating “that will never work.” They’ve usually got a point too, but the point isn’t that it won’t work; it’s that there is a concern, even if the objector isn’t aware of it. Concerns are rooted in fear and misunderstanding, and as such the number one cause for rejection of an idea is simply not understanding it. The root of the issue could be inexperience with the idea or solution presented, negative past experiences or other reasons that seem valid from the objector’s point of view.

This is one of those situations where experience is sometimes a hindrance. We’ve been around long enough to know what works and what doesn’t, but those same attributes, which are the culmination of years of labour, can also lead us in the wrong direction, limiting our view. People get comfortable with a way of doing things, and deviation from that norm is foreign and uncomfortable. It’s uncomfortable because we don’t trust it. We’ve learned to rely on our experience and our knowledge, so it seems counterintuitive to consider alternatives; after all, our experience has been reliable and hasn’t posed a problem.

In cases like this it’s usually best to step back with an open mind as soon as you feel yourself dismissing the idea. Being open to an idea doesn’t mean acceptance, and it doesn’t mean you’re comfortable with it, but being receptive and working together towards a goal will broaden your horizons and lead you down a better path, a path you likely never considered.

#4. Seeing technology as the entire solution, as opposed to a tool that supports a solution.

When experienced technology professionals are asked for their input to a product or solution they more often than not immediately begin rattling off ideas in their heads about the programs or code they’ll have to write to support what they’re being asked. We’ve amassed enough experience and exposure in our careers to have a solution to a given problem that we can immediately begin thinking about how we’re going to implement it.

There are usually many good reasons for this, notably time constraints. It’s easier to rehash something you’ve done before than it is to fly back up to 10,000 feet and understand what’s really being asked of you before you dive into the details. The downside of this method of thinking is the danger of implementing either an incomplete, or outright wrong, solution. Let’s face it, we as tech workers are in this line of work because (hopefully) we love technology, so of course we want to immediately roll up our sleeves and start building.

It’s important to put the brakes on, and make sure you have a complete picture of the requirements and intended objectives of your project before you start building the tool. After all, it might be easier than you thought and you’ll be more of a team player than just a technical guru.

#5. “That’s non-standard, so I / my team can’t allow it”

Surprisingly, many organizations have no standards or policies defined at a management level. Maybe this isn’t so surprising, since some organizations view their IT infrastructure no differently than their plumbing or electrical infrastructure. At best, senior management delegates the creation of any IT policies to the resident floor manager or senior systems admin. You often get a “whatever you want is fine with me” response from management, which unfortunately results in a couple of problems. First, you have IT creating policies that directly impact the business, when they likely don’t have a large enough view of the organization to truly gauge the impact of said policies. Second, since management didn’t participate in the creation of the policy, they have not bought into it, and therefore the policy is toothless.

Now, this post is themed around the pitfalls of tech workers, not the companies they work for, so we will focus on the first of the two points mentioned above. Many systems admins, especially those in charge of internal systems, live and die by the standards they have set forth for the operating environment. Whether it be the operating system, hardware or office automation suite, they refuse to even consider allowing something outside the norm into the environment. This isn’t necessarily an invalid stance to take, depending on the resources in your IT department and its mandate, but unfortunately it often stems from a fear of the unknown or, in the worst cases, plain laziness. Absolutism is dangerous and often alienates people. I’ve seen more than a few organizations that report major “PR” problems between the IT department and the rest of the company. Much of the time these problems stem from IT being viewed as a hindrance to getting things done, as opposed to a valued resource.

#6. Platform Zealotry

As you may have guessed, Platform Zealotry (or PZ) means you’re one of those guys who lives and breathes one particular operating system or tool and will turn up your nose at anything and everything else under the sun. PZ is a great way to ensure your career will remain on a flat plateau. This is because no one takes a Platform Zealot seriously. PZ is synonymous with narrow views, rigid thinking and inflexibility. It is also a surefire way to demonstrate you can’t cope in a heterogeneous environment. Sysadmins who get respect and advancement always bring a balanced, unbiased viewpoint.

It’s fine to have a preference and it’s fine to have opinions. Just make sure you temper them with objective thinking, facts, and (at least) the appearance of an open mind.

Written by Adam Serediuk

March 4, 2009 at 9:31 pm

Cacti: Graphing custom script data by extending snmp.


The true power of Cacti comes from its ability to gather data from sources outside of what snmp will give you out of the box.  By creating custom data input methods, you can graph everything and anything your heart desires – be it page load times for a URL, or the number of messages in a sendmail queue.

For a long time I would write custom data input methods (generally bash scripts employing a lot of sed and awk), and would then struggle to find ways to get my data into Cacti. I’ve used a lot of cheap hacks in the past, even going so far as simply having my shell script write to a file on a shared NFS volume. Then my “data input method” would be a shell script that would do “tail -1” on the file! (Oh my, the hackness!)

Then I realized there is a better way. Cacti is a powerful snmp monitoring tool, and the net snmp agent is designed to be easily extendable to custom scripts. Combine the two, and you no longer have to worry about how you’re going to get your data points from your monitored host to your cacti server.

Let’s say you’re interested in monitoring a memory statistic on a Linux host that you can’t get from the built-in MIBs. For example, “HugePages_Free.”

Writing a shell script to grab this value is pretty easy, and you could go about it any number of ways. Here’s my kiss-principle based method:

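#!/bin/bash
# /usr/local/bin/meminfo.sh
# Print the value of one /proc/meminfo field, e.g. HugePages_Free.

memvar=$1

# Anchor the match to the field name so similarly named fields are not picked up.
memresult=$(grep "^${memvar}:" /proc/meminfo | awk '{print $2}')

printf '%s\n' "$memresult"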

In the above example, I’ve chosen to pass whatever meminfo field I want to graph as a command line argument; that way, if I decide later that I want something else, the script is already set up. Now that bash scripting for 1st graders is over, we can get on with the business of sending the return value of this script to cacti.
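For readers who would rather not shell out to grep and awk, the same lookup can be sketched in Python. This is only an illustration, not part of the original setup: the function name `read_meminfo_field` and the sample text are made up for the example (only the 2154 figure comes from this post), and it assumes the usual "Name:   value [unit]" layout of /proc/meminfo.

```python
def read_meminfo_field(field, text):
    """Return the integer value of one field from /proc/meminfo-style text.

    Returns None when the field is not present.
    """
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() == field:
            tokens = rest.split()
            # The value is the first token after the colon; a unit such as
            # "kB" may follow it and is ignored here.
            if tokens:
                return int(tokens[0])
    return None


# Illustrative sample in the /proc/meminfo layout (values are made up).
SAMPLE = """MemTotal:       16384256 kB
HugePages_Total:    4096
HugePages_Free:     2154
Hugepagesize:       2048 kB
"""

print(read_meminfo_field("HugePages_Free", SAMPLE))  # prints 2154
```

On a real host you would pass in the contents of /proc/meminfo instead of the sample string.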

Open up your /etc/snmp/snmpd.conf file, and append something like this at the bottom.

extend meminfo /usr/local/bin/meminfo.sh HugePages_Free

“extend” is the directive, “meminfo” is an arbitrary name that you can use as a reference later, and the last part is just the path to my shell script, including the command line argument I set the script up to accept.

The extend directive allows you to either specify an OID, or just let the value get returned to the nsExtendOutput1Table.  As you can see in my example, I chose the default table. However you do it is up to you. For more details, do a man on snmpd.conf.

 
After you’ve restarted your snmp daemon, it’s time for a quick test. Try using snmpwalk from your cacti server and see what you get back (wrapped for readability):

> snmpwalk -c rocommstring -v1 remoteservername 'NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."meminfo"'

Should produce:

# NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."meminfo" = STRING: 2154

Note1 – inside the quotes we specified the name that we had used in our snmpd.conf file for this extension.

What you’re hoping for here is that the string you get back through snmpwalk matches what the real script outputs back on “remoteservername”. In my case, the system had 2154 huge pages free at the time I did this query.
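If you want to automate that comparison, a minimal Python sketch might look like the following. It assumes the snmpwalk output has exactly the shape shown above (a quoted extension name followed by "= STRING: value"); the helper names `parse_extend_line` and `matches_local` are hypothetical, and output formatting can differ depending on the snmpwalk options you use.

```python
import re

# Matches lines such as:
#   NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."meminfo" = STRING: 2154
LINE_RE = re.compile(
    r'nsExtendOutput1Line\."(?P<name>[^"]+)"\s*=\s*STRING:\s*(?P<value>.*)$'
)


def parse_extend_line(line):
    """Return (extension_name, value) from one snmpwalk line, or None."""
    match = LINE_RE.search(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("value").strip()


def matches_local(line, local_output):
    """True when the value reported over SNMP equals the script's own output."""
    parsed = parse_extend_line(line)
    return parsed is not None and parsed[1] == local_output.strip()


walked = 'NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."meminfo" = STRING: 2154'
print(parse_extend_line(walked))
print(matches_local(walked, "2154\n"))
```

Here `walked` stands in for a line captured from snmpwalk, and the second argument to `matches_local` stands in for what meminfo.sh prints when run directly on the monitored host.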

Note2 – the use of single and double quotes in the snmpwalk command is significant. The command won’t work without them. This is not an issue when we get to the cacti phase, as you’ll see shortly.

So, we know our script has been brought under the umbrella of snmp and can be referenced accordingly. Now over to cacti.

For starters, create a new data template.

If you’re a regular cacti user, the various fields here are familiar, so I won’t go into them. The first one you’re interested in is “Data Input Method”. You will choose “Get SNMP Data”.


Depending on your cacti version, you might have to save the template at this point, and re-open it, in order to see the updated fields at the lower half of the screen. The next field we’re interested in is the OID field. We will input something very similar to what we used in the snmpwalk command, for example NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."meminfo"; the difference is we need not worry about the single quotes.


… and there you have it. You can now proceed to creating your data sources, and associated graph templates just as you normally would for any other data source.

Footnote:

Most of the documentation I’ve seen on this topic references the snmp “exec” directive instead of the “extend” directive. The exec directive works fine, but is deprecated in newer versions of snmp. If you happen to be using a system with an older version of snmp, you follow the same steps, but substitute “exec” for “extend” in your snmpd.conf file. These entries are not indexed by name, and are instead rooted under OID .1.3.6.1.4.1.2021.8.1. So, assuming you had a single exec value in your snmpd.conf, you would substitute .1.3.6.1.4.1.2021.8.1.101.1 for the OID string above.
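As a small illustration of how that OID is put together, here is a Python sketch. The helper name `exec_output_oid` is hypothetical. It also assumes that additional exec lines are numbered 2, 3, and so on in the order they appear in snmpd.conf; the footnote above only covers the single-entry case, so verify the index with snmpwalk on your own system.

```python
# Output column for "exec" entries, taken from the footnote above:
# a single exec entry answers at .1.3.6.1.4.1.2021.8.1.101.1
EXT_OUTPUT_BASE = ".1.3.6.1.4.1.2021.8.1.101"


def exec_output_oid(index):
    """Build the OID for the output of the exec entry at the given index.

    Assumption: entries are numbered from 1 in snmpd.conf order.
    """
    if index < 1:
        raise ValueError("exec entries are numbered from 1")
    return "%s.%d" % (EXT_OUTPUT_BASE, index)


print(exec_output_oid(1))  # prints .1.3.6.1.4.1.2021.8.1.101.1
```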

Written by George Heppner

January 28, 2009 at 9:25 pm

Posted in Cacti, Utility Belt


Batch Processing With Python


System administration problems often require a programmatic approach, one that straddles the line between admin and developer. At times it’s more accurate to refer to myself as an IT Technologist or System Infrastructure Engineer than as a System Administrator. As the systems we work on become increasingly complex in both scale and technology, the methods used to manage them change accordingly.

Recently, when faced with the task of processing over 500 million files (nearly 5TB) as a batch job, I used the opportunity to exercise my developer side. Bash was by no means cut out for the task (have fun forking!). I had used Python extensively in the past to automate other system functions, and most recently to interface with the excellent func libraries (which I highly recommend you explore; more on that later), so Python was the natural choice. After spending a couple of hours writing my own thread pool class for Python, I came across an _excellent_ example from an ActiveState contributor and immediately ditched mine.

The recipe from ActiveState provides a great foundation for most threaded Python applications that do batch-type processing. By creating a thread pool with callback functions, one can easily insert tasks into a queue. For my particular task I had a deeply nested tree of directories and files that needed to be queued and processed, and their output written. The most basic example follows:

#!/usr/bin/env python

"""
threadpool.py: ThreadPool Example.
	-i, --input	Input Directory
	-o, --output	Output Directory
	-t, --threads	Number of Threads
	-h, --help	Help
"""

import sys
import getopt
import os
import threading
import subprocess
import glob
from time import sleep

def usage():
    # Calling print as a function works on both Python 2 and Python 3.
    print(__doc__)

# ThreadPool recipe from ActiveState: http://code.activestate.com/recipes/203871/
# The recipe's True/False shim for Python older than 2.2.1 is omitted here:
# assigning to True or False is a syntax error on Python 3.

class ThreadPool:

    """Flexible thread pool class.  Creates a pool of threads, then
    accepts tasks that will be dispatched to the next available
    thread."""

    def __init__(self, numThreads):

        """Initialize the thread pool with numThreads workers."""

        self.__threads = []
        self.__resizeLock = threading.Condition(threading.Lock())
        self.__taskLock = threading.Condition(threading.Lock())
        self.__tasks = []
        self.__isJoining = False
        self.setThreadCount(numThreads)

    def setThreadCount(self, newNumThreads):

        """ External method to set the current pool size.  Acquires
        the resizing lock, then calls the internal version to do real
        work."""

        # Can't change the thread count if we're shutting down the pool!
        if self.__isJoining:
            return False

        self.__resizeLock.acquire()
        try:
            self.__setThreadCountNolock(newNumThreads)
        finally:
            self.__resizeLock.release()
        return True

    def __setThreadCountNolock(self, newNumThreads):

        """Set the current pool size, spawning or terminating threads
        if necessary.  Internal use only; assumes the resizing lock is
        held."""

        # If we need to grow the pool, do so
        while newNumThreads > len(self.__threads):
            newThread = ThreadPoolThread(self)
            self.__threads.append(newThread)
            newThread.start()
        # If we need to shrink the pool, do so
        while newNumThreads < len(self.__threads):
            self.__threads[0].goAway()
            del self.__threads[0]

    def getThreadCount(self):

        """Return the number of threads in the pool."""

        self.__resizeLock.acquire()
        try:
            return len(self.__threads)
        finally:
            self.__resizeLock.release()

    def queueTask(self, task, args=None, taskCallback=None):

        """Insert a task into the queue.  task must be callable;
        args and taskCallback can be None."""

        if self.__isJoining == True:
            return False
        if not callable(task):
            return False

        self.__taskLock.acquire()
        try:
            self.__tasks.append((task, args, taskCallback))
            return True
        finally:
            self.__taskLock.release()

    def getNextTask(self):

        """ Retrieve the next task from the task queue.  For use
        only by ThreadPoolThread objects contained in the pool."""

        self.__taskLock.acquire()
        try:
            if self.__tasks == []:
                return (None, None, None)
            else:
                return self.__tasks.pop(0)
        finally:
            self.__taskLock.release()

    def joinAll(self, waitForTasks = True, waitForThreads = True):

        """ Clear the task queue and terminate all pooled threads,
        optionally allowing the tasks and threads to finish."""

        # Mark the pool as joining to prevent any more task queueing
        self.__isJoining = True

        # Wait for tasks to finish
        if waitForTasks:
            while self.__tasks != []:
                sleep(.1)

        # Tell all the threads to quit
        self.__resizeLock.acquire()
        try:
            # Keep references to the workers first: shrinking the pool to
            # zero empties self.__threads, leaving nothing to join.
            threads = list(self.__threads)
            self.__setThreadCountNolock(0)

            # Wait until all threads have exited
            if waitForThreads:
                for t in threads:
                    t.join()

            # Reset the pool for potential reuse
            self.__isJoining = False
        finally:
            self.__resizeLock.release()

class ThreadPoolThread(threading.Thread):

    """ Pooled thread class. """

    threadSleepTime = 0.1

    def __init__(self, pool):

        """ Initialize the thread and remember the pool. """

        threading.Thread.__init__(self)
        self.__pool = pool
        self.__isDying = False

    def run(self):

        """ Until told to quit, retrieve the next task and execute
        it, calling the callback if any.  """

        while self.__isDying == False:
            cmd, args, callback = self.__pool.getNextTask()
            # If there's nothing to do, just sleep a bit
            if cmd is None:
                sleep(ThreadPoolThread.threadSleepTime)
            elif callback is None:
                cmd(args)
            else:
                callback(cmd(args))

    def goAway(self):

        """ Exit the run loop next time through."""

        self.__isDying = True

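def myTask(data):
    # data is the (filename, inputdir, outputdir) tuple passed to queueTask.
    filename, inputdir, outputdir = data

    print("Processing %s" % filename)
    # Do Stuff.
    return data

def myTaskCallback(data):
    # Receives whatever myTask returned.
    filename = data[0]
    # Do Stuff. Check output status, cleanup, etc.
    print("Finished %s" % filename)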

def main(argv):
    inputdir = outputdir = threads = None
    try:
        # A trailing "=" marks long options that take an argument.
        opts, args = getopt.getopt(argv, "hi:o:t:", ["help", "input=", "output=", "threads="])
    except getopt.GetoptError as err:
        print(str(err))
        sys.exit(2)
    for opt, arg in opts:
        if opt in ("-h", "--help"):
            usage()
            sys.exit()
        elif opt in ("-i", "--input"):
            inputdir = arg
        elif opt in ("-o", "--output"):
            outputdir = arg
        elif opt in ("-t", "--threads"):
            threads = arg

    # All three options are required.
    if None in (inputdir, outputdir, threads):
        usage()
        sys.exit(2)

    # Initialize the threadpool with the number of requested threads.
    pool = ThreadPool(int(threads))

    print("Starting...")
    print("")

    for filename in os.listdir(inputdir):
        print("Queueing %s" % filename)
        # Insert the tasks into the queue.
        pool.queueTask(myTask, (filename, inputdir, outputdir), myTaskCallback)

    # When all tasks are finished, allow the threads to terminate
    pool.joinAll()

if __name__ == "__main__":
    main(sys.argv[1:])

 

While this example doesn’t actually do anything, it should provide a workable framework for your own tools. I’ve removed my particular code to protect my client’s privacy and replaced it with the skeleton you see here. (I certainly don’t recommend queueing millions of tasks in a single process. While it is entirely possible, you’ll get better performance by distributing that task.) You can quickly see how portable and extensible this type of framework is.
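On current Python versions, the same queue-and-workers pattern can be sketched with the standard library’s concurrent.futures module instead of the ActiveState recipe. This is a swapped-in alternative, not the code I used for the job described above. The names `iter_files`, `process_file` and `run_batch` are hypothetical, and `process_file` is only a placeholder that returns each file’s size.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def iter_files(root):
    """Yield the path of every file under root, walking nested directories."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def process_file(path):
    """Placeholder task: return the path and its size in bytes."""
    return path, os.path.getsize(path)


def run_batch(root, num_threads=4):
    """Process every file under root with a pool of worker threads."""
    results = []
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() hands each path to a worker and yields results in input order.
        for path, size in pool.map(process_file, iter_files(root)):
            results.append((path, size))
    return results
```

Calling `run_batch("/some/input/dir", num_threads=8)` returns a list of (path, size) pairs. Note that `map()` submits every task up front, so the earlier caveat still applies: for very large trees you would want to feed the pool in bounded batches or distribute the work across processes or hosts.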

 

Till next time.

Written by Adam Serediuk

December 9, 2008 at 12:51 pm

Posted in Code, Operating Systems

On Communication, Ego and Perception


Throughout the course of your working life, I predict you will encounter some of the greatest villains ever to step into the arena of office politics. Conniving, self-serving, petty excuses for human beings that want nothing more than to use your face for a rung as they scramble up the corporate ladder.

Sounds pretty grim, doesn’t it? Well, I also predict that these villains will (the majority of the time) not exist outside the realm of your own imagination and perceptions. Perhaps this makes you breathe a sigh of relief, assuming you’re taking me seriously of course, but these self-conjured bad guys are every bit as dangerous to your career, well-being and mental health. And they are a hell of a lot more common.

During my modest career, I’ve had the good fortune to work with two teams of sysadmins that I can genuinely say I enjoyed being around. You know, the kind of group that has that perfect blend of professional integrity, mutual respect and good humor that makes you actually enjoy going into the office. I have also had the misfortune to witness both those teams disintegrate, and morph into the exact opposite – a tense group of unhappy sysadmins, questioning each other’s motives, harboring resentment and using up their sick days every month because the prospect of heading into the office makes them feel physically ill.

After stepping back and looking honestly at both these situations, I saw a pattern.

Lack of Communication –> Skewed Perceptions –> Assumptions –> Resentment –> Lack of Communication –> Even more skewed perceptions –> etc. etc. (you can see the pattern, I’m sure.)

Before I go further, let me explain why I’m writing an amateurish pseudo-psychology article on this technology-focused sysadmin web site. In a nutshell, I think that the technology industry is fertile territory for this particular phenomenon. My intent is not to stereotype all techies as introverts who don’t know how to manage simple interpersonal relationships (but let’s face it, they’re out there, and in large numbers). I do believe, however, that certain elements of that stereotype, combined with the types of communication tools popular with our crowd, make sysadmins more susceptible to this danger than others.

The HR person at my last organization sent out an article talking about email flame wars. It’s here if you want to read it. The gist of the article was that we generally misinterpret the tone of email 50% of the time. That’s pretty significant, and I can sure believe it, speaking as someone who’s been on both ends of such an exchange. Think about it. Ever worked with someone who you didn’t know well, and had very little face time with? Maybe you’ve been in meetings with him or her and had differing opinions on how to proceed on an issue. Ever had an email from that guy (or gal) that rubbed you the wrong way?

How did you react?

Did you walk over to their desk and talk to them about it – clear the air, so to speak? Or did you take your interpretation of that email and use it to continue building your ongoing personality profile on this individual? If you chose the former, I applaud you – you’re probably not the kind of person to fall into this trap. If you’re like a lot of people though, you probably couldn’t resist speculating wildly about this person’s motivations. Why would they send you such a hostile-sounding email? And why did they CC their manager? Was it just ass covering? Or are they trying to make you look like an idiot in front of middle management (god no!)? It doesn’t take long before this line of thinking has you convinced this person is out to get you.

The reality is, you have no idea what someone’s motivations are until you talk to them about it. Not everyone is receptive to this of course; as I said, there are plenty of techies out there who turn to stone when pressed to interact in real time with someone. Nonetheless, I must stress how important it is to remember that, in all likelihood, the guy across the office floor isn’t sitting in his cubicle maniacally twirling his villain’s mustache and plotting your downfall. He’s just a regular guy like you, dealing with his own hang-ups and skewed perceptions of the world.

Call him on it. If he turns to stone on you, then at least you can say you took the higher ground and tried to get to the heart of the matter. And above all else, remember that thoughts are not reality. Don’t be a slave to your own perceptions of others and yourself.

Written by George Heppner

October 5, 2008 at 2:28 pm