Thursday, May 09, 2013

Decoding MPT_SAS drives in Nexenta/Illumos

Another group on campus had SAS resets on a drive, but the drive never failed. All we ever got was something like this in dmesg reports:


May  9 07:51:50 hostname scsi: [ID 365881 kern.info] /pci@0,0/pci8086,d138@3/pci1000,3010@0 (mpt_sas0):
May  9 07:51:50 hostname  Log info 0x31080000 received for target 9.
May  9 07:51:50 hostname  scsi_status=0x0, ioc_status=0x804b, scsi_state=0x0

We have a target address, but zpool status or its equivalents never give us that target number, just a long string like "c0t50014EE057D81DE7d0". How do we find the drive to offline and replace?
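As a small aid, the three syslog lines above can be picked apart programmatically. The Python sketch below pulls out the target number and splits the two hex words into fields. The field layout (top bit of ioc_status as a "log info available" flag, and the log info word as bus type / originator / code / subcode) is my reading of the LSI MPI headers, so treat it as an assumption. The function names are hypothetical.

```python
import re

# Sample lines copied from the dmesg output above.
SAMPLE = """\
May  9 07:51:50 hostname scsi: [ID 365881 kern.info] /pci@0,0/pci8086,d138@3/pci1000,3010@0 (mpt_sas0):
May  9 07:51:50 hostname  Log info 0x31080000 received for target 9.
May  9 07:51:50 hostname  scsi_status=0x0, ioc_status=0x804b, scsi_state=0x0
"""

LOG_INFO_RE = re.compile(r"Log info (0x[0-9a-fA-F]+) received for target (\d+)")
IOC_STATUS_RE = re.compile(r"ioc_status=(0x[0-9a-fA-F]+)")


def parse_mpt_sas(text):
    """Return a list of (target, log_info, ioc_status) tuples found in text."""
    events = []
    pending = None  # (target, log_info) waiting for its ioc_status line
    for line in text.splitlines():
        m = LOG_INFO_RE.search(line)
        if m:
            pending = (int(m.group(2)), int(m.group(1), 16))
            continue
        m = IOC_STATUS_RE.search(line)
        if m and pending is not None:
            events.append((pending[0], pending[1], int(m.group(1), 16)))
            pending = None
    return events


def split_fields(log_info, ioc_status):
    """Split both hex words into fields (layout is an assumption, see above)."""
    return {
        "bus_type": (log_info >> 28) & 0xF,
        "originator": (log_info >> 24) & 0xF,
        "code": (log_info >> 16) & 0xFF,
        "subcode": log_info & 0xFFFF,
        "log_info_available": bool(ioc_status & 0x8000),
        "ioc_status_masked": ioc_status & 0x7FFF,
    }


for target, log_info, ioc_status in parse_mpt_sas(SAMPLE):
    print("target", target, split_fields(log_info, ioc_status))
```

This only tells you which target number is complaining and how often. Mapping that number to a physical drive still needs lsiutil.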

For future reference, you'll need to follow this link:

https://www.meteo.unican.es/trac/meteo/blog/SolarisSATADeviceName

You'll also need lsiutil, which is part of the LSIUtil Kit 1.63. For reference: One valid download link. It's next to impossible to find this on LSI's site, and the Oracle references all seem to be down now.


Tuesday, May 07, 2013

NexentaStor auto-tier ACL workaround

Let us say that you created an auto-tier and said yes to preserving ACLs, but you used rsync as the protocol. Well, this is not realistic, and you'll see "Error from ACL callback (before): OperationFailed: the job was not completed successfully" in your logs 0 seconds after running the auto-tier.

How to remove the ACL requirement? Delete and re-create the service? Not necessary.

If you run "show auto-tier :data-fsname-000" or the like, you'll see a value like "flags : copy ACLs" that is not otherwise addressable in properties. Just run "setup auto-tier :data-fsname-000 property flags" and change the value "1024" to "0" to remove the flag. That's it.


Friday, May 03, 2013

How To: Pluribus NAT Routing

It's no secret that at Stanford we do a lot with OpenFlow. We get to play with some new and interesting stuff that we integrate into our OpenFlow network. One of these is the Pluribus Network switch, which combines system and network virtualization with a high bandwidth 48+ port 10GB switch fabric. We have been running this in our network, and for months it has been handling the heavy lifting duties for our SmartOS-based private cloud.

Various features including OpenFlow functionality have been tested, but the product's user interface is still being crafted and changes somewhat over time. Recently, we needed to enable NAT routing for the private administrative network for the SmartOS private cloud. This network is not attached to a router interface, and applying something outside the network fabric to enable NAT or routing would create an undesired point of failure. Pluribus has full routing functionality tied to their virtual network capability. Here is the current command sequence used to enable routing between the private 10.0.x.0/16 administrative address space (could be larger) and an external routable network. I've added the VLAN to attach externally as VLAN 4444, and the fabric name is sdc-global:


> nat-create name sdc-global-gateway vnet sdc-global
> nat-interface-add nat-name sdc-global-gateway ip 10.0.27.1/24 if data
> nat-interface-add nat-name sdc-global-gateway ip 172.20.1.1/24 if data vlan 4444
> nat-map-add nat-name sdc-global-gateway name sdc-global-nat ext-interface sdc.global.gateway.eth0 network 172.20.1.0/24

sdc.global.gateway.eth0 should be the external port, as seen from "nat-interface-show"
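Before typing the commands in, it can help to sanity-check the addressing plan. The Python sketch below uses only the standard ipaddress module and the values from the commands above. It assumes that "10.0.x.0/16" means 10.0.0.0/16 and that the network given to nat-map-add is meant to match the external interface's subnet, as it does in the example. The function name is hypothetical.

```python
import ipaddress


def check_nat_plan(internal_if, external_if, mapped_net, admin_space):
    """Return a list of problems; an empty list means the plan looks consistent."""
    internal = ipaddress.ip_interface(internal_if)
    external = ipaddress.ip_interface(external_if)
    mapped = ipaddress.ip_network(mapped_net)
    admin = ipaddress.ip_network(admin_space)

    problems = []
    # The internal NAT interface should live inside the private admin space.
    if not internal.network.subnet_of(admin):
        problems.append(f"{internal} is not inside {admin}")
    # In the example, the nat-map-add network equals the external subnet.
    if external.network != mapped:
        problems.append(f"{external} is not on {mapped}")
    # The two sides of the NAT must not share address space.
    if internal.network.overlaps(external.network):
        problems.append(f"{internal.network} overlaps {external.network}")
    return problems


# Values taken from the nat-interface-add and nat-map-add commands above.
print(check_nat_plan("10.0.27.1/24", "172.20.1.1/24",
                     "172.20.1.0/24", "10.0.0.0/16"))
```

An empty list means the three checks passed. Anything else is a hint to re-read the command arguments before applying them to the switch.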

UPDATE: A bug present when I first did this prevents the zone managing the NAT from having a correct default gateway. You'll need shell access and "zlogin sdc-global-gateway" or the like to enter the zone, and then add /etc/defaultrouter containing the IP of that router for future use. Then you can exit the zone and run "zoneadm -z sdc-global-gateway reboot" to get it working.

Thursday, May 02, 2013

Grails Fixtures in Bootstrap.. the missing pieces

One nifty way to load a lot of data into either development or perhaps even production instances of Grails apps is the Fixtures Plugin. You can more easily define loads of data and multiple relationships with this plugin. This plugin is designed to be used for integration test data, but as noted here, nothing should prevent you from loading it in your Grails bootstrap step. However, I ran into curious errors such as "Fixture does not have bean". For my future self, here are the solutions to avoid incorrect assumptions:


  • In the BootStrap.groovy file, define the service fixtureLoader within the BootStrap class (def fixtureLoader).
  • All fixtures must be in a closure titled "fixture {}"
  • Each domain class needs importing at the top of each fixture file. This is the resolution to the bean error noted above.


Thursday, April 04, 2013

Joyent SDC 6.5.6 released -- Upgrade workaround

Just a heads up that Joyent has released Smart Data Center 6.5.6 as noted here:

http://wiki.joyent.com/wiki/display/sdc/Upgrading+SDC+6.5.3+or+6.5.4+to+SDC+6.5.6

My first upgrade attempt failed at the very end, when selecting the correct platform. Joyent Support noted that it has seen this before, and that an "sdc-restore" from the pre-upgrade backup followed by a second attempt should work. In my case, it did just that. I did the quick restore (no -F here). Rebooting the head node as I write this.

Tuesday, April 02, 2013

Save time in backing up Joyent Smart Data Center

One does not frequently back up the head node USB key with Joyent's Smart Data Center. Generally you do it prior to upgrades. Therefore, it's commonly a "do it twice" process, as it has a quirky bug with regard to terminal emulation that I never seem to remember until it's too late:

[root@headnode (CIS:0) ~]# sdc-backup -U c2t0d0p0
Disk c2t0d0p0 will relabled, reformatted and all data will be lost [y/n] y
labeling disk
creating PCFS file system
mounting target disk c2t0d0
mounting source disk c0t0d0
copying files
setting up grub
Sorry, I don't know anything about your "xterm-256color" terminal.
Error: installing grub boot blocks

Yep, my OSX default xterm-256color is not known, and the many-minutes long backup process dies at the end. To address this, override the terminal setting in root user's .bash_profile file with the line:

 export TERM=vt100

Simple, but it's not every day you can increase performance by 50%, so to speak.
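If you would rather not touch .bash_profile, the same override can be applied to a single process. The Python sketch below copies the current environment, forces TERM to vt100, and runs one command with it. The helper names are hypothetical, and the sdc-backup call is left as a comment because it reformats the target disk.

```python
import os
import subprocess


def env_with_term(term="vt100", base=None):
    """Copy an environment and force TERM to a widely known terminal type."""
    env = dict(os.environ if base is None else base)
    env["TERM"] = term
    return env


def run_with_term(argv, term="vt100"):
    """Run argv with TERM overridden for that one process only."""
    return subprocess.run(argv, env=env_with_term(term), check=True)


# Example, deliberately not executed here because it is destructive:
# run_with_term(["sdc-backup", "-U", "c2t0d0p0"])
```

The calling shell keeps its own TERM, so nothing else in the session changes.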

Thursday, February 28, 2013

The fine art of SmartOS image creation in SDC

I suspect many organizations that run Joyent's Smart Data Center have it operated by Joyent staff. Template creation of SmartOS images is something any private cloud operator will need to do, and Joyent has basic information on how to do so. However, certain steps require tools and code generally only available or known to Joyent staff. I wanted to share what I know about doing this here, for my own notes and for others.

First, one can follow the instructions at http://wiki.joyent.com/wiki/display/jpc2/Creating+Your+Own+SmartMachine+Image

I found that creating the snapshot locally to the compute node, as mentioned near the bottom, was insufficient, but your mileage may vary. I used the UI to snapshot the templated VM. In my case, I used the Smart64 image as my base OS image to then customize as mentioned, such as adding tomcat, services, and configurations.

One step that I found problematic is the metadata creation. The commands for doing this were found only in Smart64 or similar instances, and not on the underlying nodes or in the SDC head node zones. I created a new Smart64 instance for template manipulation, pointed it to my cloudapi host using the "sdc-setup" command, and after configuration, used sdc-updatemachinemetadata from /opt/local/bin. The specific command I used for my metadata example was:


sdc-updatemachinemetadata -m image_name="tomcat" -m image_version="1.0.1" -m image_description="tomcat appserver" 99199472-bae6-4c89-a7ef-d6d4cf736feb

The final part of that line is the zone uuid, used after the zone has been shut down. The final step is to run sdc-create-image, a script that is only available internally at Joyent. Please contact your team rep to get this. Once you have it, publishing your image is a trivial command, run from the head node:

./sdc-create-image 99199472-bae6-4c89-a7ef-d6d4cf736feb

With that, your new template is created, and your users can now pick your new application image to instantiate.
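Since both commands hinge on a hand-copied zone uuid, and a dropped character is easy to miss by eye, a small check before running them can save a confusing failure. The Python sketch below validates the uuid's shape and assembles the sdc-updatemachinemetadata command line from a dictionary. The helper names are hypothetical, and the only flag used is the -m key=value form shown above.

```python
import re
import shlex

# Canonical lowercase 8-4-4-4-12 hexadecimal form.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def is_canonical_uuid(value):
    """True if value looks like a complete, lowercase, hyphenated UUID."""
    return bool(UUID_RE.match(value))


def metadata_command(zone_uuid, metadata):
    """Build the metadata command as a shell-quoted string."""
    if not is_canonical_uuid(zone_uuid):
        raise ValueError(f"not a well-formed UUID: {zone_uuid!r}")
    argv = ["sdc-updatemachinemetadata"]
    for key, value in metadata.items():
        argv += ["-m", f"{key}={value}"]
    argv.append(zone_uuid)
    return shlex.join(argv)  # requires Python 3.8 or later


print(metadata_command(
    "99199472-bae6-4c89-a7ef-d6d4cf736feb",
    {
        "image_name": "tomcat",
        "image_version": "1.0.1",
        "image_description": "tomcat appserver",
    },
))
```

The function only builds the string. Paste the result into the Smart64 instance once it looks right.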


