Wednesday, April 9, 2014

Installing OpenTSDB 2.0 "next" on an HBase cluster in Amazon's ElasticMapReduce (EMR) Service

The remainder of this post will assume you've already gotten an HBase cluster installed on EMR.

Note:  This is not intended as an endorsement of either the performance or the cost-effectiveness of using EMR to back your OpenTSDB deployment.  Rather, there are definite use cases for being able to quickly bring up a cluster and install OpenTSDB on it, so if you have one of those, this should get it done.


  1. ssh to the master server in your cluster.  If you took the defaults when creating your cluster, you can identify the master in the console by the name of its security group: the master belongs to ElasticMapReduce-Master, and the other nodes belong to ElasticMapReduce-Slave.  Then connect:

         
    ssh -i nameOfYourKeyPair.pem hadoop@emrMasterPublicDnsName
  2. telnet to the local zookeeper port to make sure zookeeper is installed and running, e.g.:

         telnet localhost 2181
    and you should see:

         Trying 127.0.0.1...
         Connected to localhost.
         Escape character is '^]'.
    which means your telnet was successful.  Now enter:
         stats
    and you should see:

       Zookeeper version: 3.4.5-1392090, built 09/30/12 17:52 GMT
       Clients:
        /10.XXX.XXX.XXX:50333[1](queued=0,recved=4048,sent=4048)
        /10.XXX.XXX.XXX:50360[1](queued=0,recved=4052,sent=4052)
        /127.0.0.1:37169[0](queued=0,recved=1,sent=0)
        /10.XXX.XXX.XXX:50343[1](queued=0,recved=12209,sent=12212)
        /10.XXX.XXX.XXX:50331[1](queued=0,recved=4053,sent=4054)
        /10.XXX.XXX.XXX:45165[1](queued=0,recved=12174,sent=12175)
        /10.XXX.XXX.XXX:45160[1](queued=0,recved=4079,sent=4087)
        /10.XXX.XXX.XXX:50350[1](queued=0,recved=4259,sent=4260)
        /10.XXX.XXX.XXX:45168[1](queued=0,recved=12174,sent=12175)
        /10.XXX.XXX.XXX:36065[1](queued=0,recved=4057,sent=4057)
        /10.XXX.XXX.XXX:50336[1](queued=0,recved=8114,sent=8114)
        /10.XXX.XXX.XXX:50332[1](queued=0,recved=4051,sent=4051)
        /10.XXX.XXX.XXX:50335[1](queued=0,recved=4140,sent=4153)
        /10.XXX.XXX.XXX:50337[1](queued=0,recved=20258,sent=24309)
        /10.XXX.XXX.XXX:36062[1](queued=0,recved=4077,sent=4085)

       Latency min/avg/max: 0/1/206
       Received: 101746
       Sent: 105832
       Connections: 15
       Outstanding: 0
       Zxid: 0x1ff1
       Mode: standalone
       Node count: 34
       Connection closed by foreign host.

    So far, so good.
  3. Install git:

         sudo yum install git
  4. In the directory above the location you'd like to have the OpenTSDB repo live, run:
         git clone https://github.com/OpenTSDB/opentsdb.git
  5. To build OpenTSDB, we need to add some more basic dev tools to the base system.  The following command is the nuclear option for adding dev tools:
         sudo yum groupinstall 'Development Tools'
  6. And we also need gnuplot:

         sudo yum install gnuplot
  7. Change directory into the directory containing OpenTSDB:
         cd opentsdb
  8. Pull all of the branches from github:
         git fetch
  9. Checkout "next":
         git checkout next
  10. Now you've got the right code and the right tools.  Let 'er rip!:
         ./build.sh
  11. When complete, change directory into the build directory:
         cd ./build
  12. To verify success, you're looking for the .jar file created by the build process.  In this case, we've built a file named tsdb-2.0.0.jar and a script named tsdb.
         ls tsdb*
  13. And with that, we now have OpenTSDB built and ready to install.  To install it, run:
    sudo make install

  14. Now we'll create the required tables in HBase.  The good folks at OpenTSDB made this easy by providing a script that does the heavy lifting for us.  By default, this script will enable compression for your HBase tables.  If the script complains that HBASE_HOME is not set, export that variable to point at your HBase installation and try again.  First we'll change directory to the location of the script, then execute it:

    cd ../src

    ./create_table.sh 


  15.  With the tables created, we can configure the OpenTSDB process by editing the opentsdb.conf file and moving it to a place where the process can find it:

    vi ./opentsdb.conf

    and give appropriate values to the following variables (safe recommendations shown, but you should provide values appropriate for your system/configuration):

    tsd.http.cachedir = /dev/shm/tsd
    tsd.http.staticroot = /home/hadoop/opentsdb/
    tsd.storage.hbase.zk_quorum = localhost

    and move the file to a recognized configuration directory:

    sudo mv ./opentsdb.conf /etc  



  16.  Now that you're fully configured, you can start OpenTSDB:

    cd ..

    ./build/tsdb tsd 


  17. You now have OpenTSDB running and accepting requests (both telnet and HTTP!) on port 4242 (assuming you took the defaults).
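
If you'd like a quick smoke test of the running TSD, here is a minimal sketch.  It assumes the defaults used above (localhost, port 4242) and that nc and curl are available on the master.  The function name tsdb_put_line, the TSDB_TS variable, the metric smoke.test and the tag host=emr-master are all made up for illustration.  The function only builds one line in OpenTSDB's telnet-style "put" format (metric, timestamp in seconds, value, then at least one tag); the commented-out commands show one way you might send it and check the HTTP side.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenTSDB): print one telnet-style "put" line.
# Usage: tsdb_put_line METRIC VALUE TAG=VALUE [TAG=VALUE ...]
tsdb_put_line() {
  # OpenTSDB requires at least one tag on every data point.
  if [ "$#" -lt 3 ]; then
    echo "usage: tsdb_put_line METRIC VALUE TAG=VALUE [TAG=VALUE ...]" >&2
    return 1
  fi
  metric=$1
  value=$2
  shift 2
  # Timestamp in seconds since the epoch; set TSDB_TS to get repeatable output.
  ts=${TSDB_TS:-$(date +%s)}
  printf 'put %s %s %s %s\n' "$metric" "$ts" "$value" "$*"
}

# Example session, left commented out so sourcing this file has no side effects.
# Run it from the top of the opentsdb checkout, with the TSD already started:
#
#   ./build/tsdb mkmetric smoke.test          # register the made-up metric name
#   tsdb_put_line smoke.test 1 host=emr-master | nc -w 5 localhost 4242
#   curl -s http://localhost:4242/api/version
#   curl -s 'http://localhost:4242/api/query?start=10m-ago&m=sum:smoke.test'
```

The mkmetric step is only needed if you haven't configured the TSD to create metrics automatically.  Keeping the line formatting in its own function makes it easy to eyeball exactly what will be sent before pointing it at a live TSD.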

Sunday, March 9, 2014

How I Learned to Stop Worrying and Love Daylight Savings Time

I woke up on this, the ugliest of calendar-imposed-lack-of-sleep mornings, feeling great.  No -- even better than that.  I'm going with "excellent".  And I was commiserating with friends last night about how this "spring forward" thing sucks.  And despite that, I accidentally stumbled into the secret combination to make "losing an hour sleep" work for you.

First, sleep in.  Get a little extra sleep over your typical workday.  Since I've been getting about 5 hours sleep a night (or less) for the past many months, getting more than this wasn't a problem.  I've also been trying to sleep in a little on the weekends to make up for particularly arduous, sleep-deprived weekdays.  I ended up getting about 9 hours last night, which was a critical part of this whole "feeling good in the AM" thing.

Second, get an Internet-connected alarm clock.  A smartphone will work here, but I happen to use a now-discontinued-and-out-of-business Chumby.  This is important for two reasons:

  1. You never have to go through the depressing "I've got to set my clock forward and this is going to suck in the morning" ritual right before falling asleep, and 
  2. When you wake up in the morning and look at your clock, you think you've gotten even more sleep than you expected because it automatically reset the time when it was supposed to.  Having a poor memory doesn't hurt here at all.

So about two hours after I woke up, my oldest started making his way around the house resetting clocks, and it wasn't until then that I remembered I'd lost an hour, but I was already up, productive, and feeling good before it could slow me down.  And now I'm pumped that I beat Daylight Savings Time!

A good night's sleep FTW!