Showing posts with label smart data center. Show all posts
Showing posts with label smart data center. Show all posts

Wednesday, May 29, 2013

Dtrace training -- its worth it!

While its fresh in my mind, I wanted to cover the wondering Dtrace training I had with Max Bruning and Brendan Gregg. The three day course is occasionally offered by Joyent, and here is an example synopsis. Not only do you get the meaty book on the subject, but the workbook and the labs you do are challenging. Sadly I came in a not too distant second on the 'ole leader board for solving the most labs.

What is more interesting was how quickly I found myself using the tool after the class. Although I heavily utilize KVM instances in Joyent's SDC, I got a request the following day as students were approaching "tape out" for their chips. Various students weren't cleaning up after themselves, with simulations running endlessly. The usual quip that "the internet is slow" or some such came quickly, in this case the statement that the network to and from the primary VM for some Ansys tools was very slow. Looking inside this 16 core, 64GB VM running Linux, it was not entirely apparent what was going wrong. The load was high, and in the end, I could have done some basic thread counting and discovered the answer if I knew where to look. However, I was able to use Dtrace in an opaque way and see from the hardware node that the qemu/kvm process was using its 16 cpu threads full tilt, and it was not I/O bound. Looking back into the Linux VM, it took little time to find that it was beyond the 16 CPU cores, trying to run at least 17 full time tasks. One task had over 20GB of memory mapped, and each time it yielded execution time and reacquired its CPU, it would undoubtedly spend extra effort in dealing with its large memory payload.

Killing off just one process here resolved the performance problem. VMs are sort of the worse case scenario for Dtrace, but it still guided me to the solution. The trick with the book is that one needs real world examples and enough practice to get the specific predicates and formation of queries down. Once you are over the hump, and I'm not sure I'm there yet, you'll find Dtrace to be indispensable.

Friday, May 03, 2013

How To: Pluribus NAT Routing

Its no secret that at Stanford we do a lot with OpenFlow. We get to play with some new and interesting stuff that we integrate into our OpenFlow network. One of these is the Pluribus Network switch, which combines system and network virtualization with a high bandwidth 48+ port 10GB switch fabric. We have been running this in our network, and for months it has been handling the heaviy lifting duties for our SmartOS-based private cloud.

Various features including OpenFlow functionality have been tested, but the products user interface is still being crafted and changes some what over time. Recently, we needed to enable NAT routing for the private administrative network for the SmartOS private cloud. This network is not attached to a router interface, and applying something outside the network fabric to enable NAT or routing will create an undesired point of failure. Pluribus has full routing functionality tied to their virtual network capability. Here is the current command sequence used to enable routing between the private 10.0.x.0/16 administrative address space (could be larger) to an external routable network. I've added the VLAN to attach externally as VLAN 4444, and the fabric name is sdc-global:


> nat-create name sdc-global-gateway vnet sdc-global
> nat-interface-add nat-name sdc-global-gateway ip 10.0.27.1/24 if data
> nat-interface-add nat-name sdc-global-gateway ip 172.20.1.1/24 if data vlan 4444
> nat-map-add nat-name sdc-global-gateway name sdc-global-nat ext-interface sdc.global.gateway.eth0 network 172.20.1.0/24

sdc.global.gateway.eth0 should be the external port, as seen from "nat-interface-show"

UPDATE: A bug when first did this prevents the zone managing the NAT from having a correct default gateway. You'll need shell access and "zlogin sdc-global-gateway" or the like to enter the zone, add add /etc/defaultrouter with the IP of that router there for future use. Then you can exit the zone and run "zoneadm -z sdc-global-gateway reboot" to get it working.

Thursday, April 04, 2013

Joyent SDC 6.5.6 released -- Upgrade workaround

Just a heads up that Joyent has released Smart Data Center 6.5.6 as noted here:

http://wiki.joyent.com/wiki/display/sdc/Upgrading+SDC+6.5.3+or+6.5.4+to+SDC+6.5.6

First upgrade attempt fails at the very end when selecting the correct platform. Joyent Support noted that it has seen this before, and that a "sdc-restore" from the pre-upgrade backup and then a reattempt should work. In my case, it did just that. I did the quick restore (no -F here). Rebooting the head node as I write this.

Tuesday, April 02, 2013

Save time in backing up Joyent Smart Data Center

One does not frequently backup the head node USB key with Joyent's Smart Data Center. Generally you do it prior to upgrades. Therefore, its commonly a "do it twice" process as it has a quirky bug with regards to terminal emulation that I never seem to remember until its too late:

[root@headnode (CIS:0) ~]# sdc-backup -U c2t0d0p0
Disk c2t0d0p0 will relabled, reformatted and all data will be lost [y/n] y
labeling disk
creating PCFS file system
mounting target disk c2t0d0
mounting source disk c0t0d0
copying files
setting up grub
Sorry, I don't know anything about your "xterm-256color" terminal.
Error: installing grub boot blocks

Yep, my OSX default xterm-256color is not known, and the many-minutes long backup process dies at the end. To address this, override the terminal setting in root user's .bash_profile file with the line:

 export TERM=vt100

Simple, but its not every day you can increase performance by 50%, so to speak.

Thursday, February 28, 2013

The fine art of SmartOS image creation in SDC

I suspect many organizations that run Joyent's Smart Data Center have them operated by Joyent staff themselves. Template creation of SmartOS images is something any private cloud operator will need to do, and Joyent has basic information on how to do so. However, certain steps require tools and code generally only available or known to Joyent staff. I wanted to impart my knowledge on how to go about doing this here for my own notes and for others.

First, one can follow the instructions at http://wiki.joyent.com/wiki/display/jpc2/Creating+Your+Own+SmartMachine+Image

I found that creating the snapshot locally to the compute node, as mentioned near the bottom, was insufficient, but your mileage may vary. I used the UI to snapshot the templated VM. In my case, I used the Smart64 image as my base OS image to then customize as mentioned, such as adding tomcat, services, and configurations.

One step that I found problematic is the meta data creation. The commands for doing this were found only in Smart64 or similar instances, and not the underlying nodes or SDC head node zones. I created a new Smart64 instance for template manipulation, pointed it to my cloudapi host using the "sdc-setup" command, and after configuration, used sdc-updatemachinemetadata from /opt/local/bin. The specific command I used for my meta data example was:


sdc-updatemachinemetadata -m image_name="tomcat" -m image_version="1.0.1" -m image_description="tomcat appserver" 99199472-bae6-4c89-a7ef-d6d4cf736feb

The final part of that line is the zone uuid after it has been shutdown. The final step is to run sdc-create-image, a script that is only available internal to Joyent. Please contact your team rep to get this. Once you have it, your image publishing is a trivial command, run from the head node:

./sdc-create-image 99199472-bae6-4c89-a7ef-d6d4cf73757

With that, your new template is created, and your users can not pick your new application image to instantiate.



Followers