Monday, July 28, 2014

Using iptables to route DNS searches away from Google DNS rate-limiting

I have a local CloudFoundry deployment where the Google DNS server is placed near the top of the /etc/resolv.conf file for my Ubuntu environment. Long story short, the applications in there started to get a little too busy in their RESTful requests to external servers and ran afoul of the Google rate-limiting policy for DNS lookups (https://developers.google.com/speed/public-dns/docs/security#rate_limit).

We had a couple of local DNS servers at hand…blush…but I wanted to experiment with the results before committing to a full reconfiguration of the environment to push new resolv.conf files to all DEAs, which would entail evacuating all applications from their wardens and waiting for Health Manager to restart them on other DEAs. I also had the option of launching a distributed script across all VMs to modify the resolv.conf file for each DEA and all of its wardens, but that felt risky since some buildpacks could have DNS caches in place, which would be a major blind spot in my testing.

Enter iptables, something I had somewhat successfully avoided all these years, other than the occasional OUTPUT rule. OUTPUT rules alone would have worked very well if it were not for one caveat: my local DNS servers were on the private network for my VMs, whereas the Google DNS lookups were obviously being routed through the public interface. The obvious fix was to also replace the source address in the packets with the private IP address of the VM, which would move the DNS traffic to the internal interface. That last step, by itself, precludes a solution based on changes to resolv.conf files.

After some searching I came across this excellent tutorial* to help me morph the conceptual solution into the final iptables instructions: http://www.linuxhomenetworking.com/wiki/index.php/Quick_HOWTO_:_Ch14_:_Linux_Firewalls_Using_iptables#.U4kU3RC1dPF

src_dns_server=8.8.8.8
target_dns_server=x.x.x.x
dea_private_ip=y.y.y.y

# a. Replace the destination of all DNS requests from the DEA VM directed at the
# original DNS server with the address of the target DNS server.
iptables -t nat -A OUTPUT -p udp --dport 53 -d $src_dns_server -j DNAT --to-destination $target_dns_server


# b. In the POSTROUTING phase, with the DNS requests already pointing at the target DNS,
# replace the source of the packet with the private IP address of the VM, so that the
# request to the target DNS server will go out through the private interface.
iptables -t nat -A POSTROUTING -p udp -d $target_dns_server -j SNAT --to-source $dea_private_ip

# c. Make the same change as #a for the wardens. I did not have an explicit warden-output
# chain, but outbound warden traffic was already routed through the warden-prerouting chain.
iptables -t nat -I warden-prerouting 1 -p udp --dport 53 -d $src_dns_server -j DNAT --to-destination $target_dns_server

# d. Make the same change as #b for the wardens.
iptables -t nat -I warden-postrouting 1 -p udp -d $target_dns_server -j SNAT --to-source $dea_private_ip
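
With the four rules in place, a quick listing confirms they landed in the right chains and, through the packet counters in the -v output, whether they are actually matching traffic (a minimal sketch; chain names as above):

# the pkts/bytes counters show whether each rule is being hit
iptables -t nat -L OUTPUT -n -v --line-numbers
iptables -t nat -L POSTROUTING -n -v --line-numbers
iptables -t nat -L warden-prerouting -n -v --line-numbers
iptables -t nat -L warden-postrouting -n -v --line-numbers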

As a concrete and slightly trite example, if an application requested a DNS lookup for "restful.service.com", the IP addresses in the DNS lookup packet would originally be generated as

source=$dea_public_ip
target=8.8.8.8
target_port=53

rule #c will transform the target DNS address to

source=$dea_public_ip
target=x.x.x.x
target_port=53

and rule #d will transform the source request address to

source=$dea_private_ip
target=x.x.x.x
target_port=53
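
To exercise the path end to end, I could issue a lookup that still targets the original Google DNS address from inside the DEA and watch the rewritten traffic leave through the private interface instead (a minimal sketch; eth1 as the private interface is an assumption, adjust to your VM's network layout):

# issue a lookup against the original Google DNS address
dig @$src_dns_server restful.service.com

# in a second terminal, confirm the rewritten requests flow out of the
# private interface toward the internal DNS server
tcpdump -n -i eth1 udp port 53 and host $target_dns_server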

* I also bookmarked this practical cookbook for a different challenge on a different day: http://www.thegeekstuff.com/2011/06/iptables-rules-examples/

Monday, January 20, 2014

Heading for the clouds

Last week marked the 4th year of my tenure in Jazz for Service Management (it was known by different names at first), as the lead developer for Registry Services, a data reconciliation service for the multitude of IBM products developed in-house and acquired over the years.

Looking back…

For the first three years I did not write much about the Registry Services experience, other than offering general impressions on leadership and focus on quality in the context of the challenges at the time, both in "Guns, Police, Alcohol, and Leadership" and in "On Mountains, Beliefs and Leadership". In 2013, I did write a more detailed technical account in the Service Management Connect blog.

Without getting into the gritty details of the pace-setting procedural excellence we achieved in these four years (modesty did improve a bit, but the progress we all made earned us a few seconds of self-congratulatory pats on the back :-), it is worth noting that over 4000 functional points are now validated in an automated manner in less than 12 hours across the key platforms, and within 24 hours across the complete set of platforms, every single day of the week.

This marked evolution was only possible through disciplined execution of the Agile development method, a few steps at a time, guided by stakeholders one sprint demo at a time and honed by the team's feedback one retrospective session at a time.

I think it will be hard to top the combination of the motivation of the people executing the tasks and the vision and support of the management team, but four years was also enough time for several people on the team to grow to a level of leadership, skill, and confidence where the smartest thing to do was to make room for the next level of collective growth.

...to look forward

As I started looking for a position that would make room for the overall transition, it was to my surprise and delight that I heard about the IBM Cloud Operating Environment team just starting to look for a Scrum Master and Dev/Ops lead for its core deployment (based on Pivotal's Cloud Foundry). To distill the letter soup, this is essentially IBM's Platform as a Service (PaaS) offering, now officially known as Bluemix. Within two weeks both teams had lined up the whole transition, and starting this week this is my new professional home.

Cloud and Dev/Ops is a space I have been meaning to enter for a long while. It is an opportunity to work with the most recent technologies in a very competitive space, against established giants, but backed by another giant flexing a multi-billion-dollar effort across all of its divisions.

I am looking forward to a brilliant 2014 in the company of the Bluemix team, learning how to bring the same level of excellence into a different development model, adapting to the new cultures in our geographically dispersed teams, closing the gaps between development and production deployments (a tooling geek's paradise), and moving closer to our customers' experience, both in the way we support IBM offerings in Bluemix and in watching what development shops around the world will do with it.

Monday, October 28, 2013

Registry Services by the DASHboard light

Quite excited about the prospects of our most recent work on getting Registry Services data (IT resource management) into DASH, the dashboarding component of Jazz for Service Management.

I wrote an entry on the Jazz for Service Management blog:

https://www.ibm.com/developerworks/community/blogs/69ec672c-dd6b-443d-add8-bb9a9a490eba/entry/you_can_see_registry_by_the_dashboard_light?lang=en

The possibilities of integration with datasets from other systems management products offering OSLC interfaces make it really cool.

And there is even a video recording:

Registry Services data on JazzSM DASH UI

Friday, September 20, 2013

Registry Services on the glass

Another entry related to my current project, Jazz for Service Management. I am quite thrilled about having mixed jQuery and rdfQuery to interact with our OSLC server directly from a web browser. Look, ma, only 2 tiers:

https://www.ibm.com/developerworks/community/blogs/69ec672c-dd6b-443d-add8-bb9a9a490eba/tags/rsotg?lang=en

Wednesday, September 18, 2013

Renewing expired certificates for wsadmin invocations against WebSphere Application Server

We have a verification lab with about a dozen machines and on a periodic basis we hit a problem where wsadmin cannot connect to an instance of WebSphere Application Server due to a certificate expiration.

Note: Since this is a closed test environment, this kind of problem can be minimized or avoided altogether by creating the profile in Advanced mode through the Profile Management Tool and choosing a longer expiration period than the default one-year period used when the profile is created with the manageprofiles command-line tool.
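
For profiles created from the command line, manageprofiles also accepts validity-period parameters, so the longer expiration can be scripted as well. A minimal sketch, with the caveat that the parameter names below are quoted from memory and should be double-checked against the WAS Information Center for your release ($WAS_HOME stands in for the product install root; values are in years):

$WAS_HOME/bin/manageprofiles.sh -create \
   -profileName AppSrv01 \
   -templatePath $WAS_HOME/profileTemplates/default \
   -personalCertValidityPeriod 15 \
   -signingCertValidityPeriod 20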

The connection error message returned by wsadmin will not be all that helpful, since wsadmin essentially does not know why its connection request was refused, but the server logs ($profile/logs/server1/SystemOut.log or TextLog_<timestamp>.log, depending on your troubleshooting settings) will contain a very telling message like this:

[9/18/13 10:22:53:664 EDT] 0000001c WSX509TrustMa E   CWPKI0022E: SSL HANDSHAKE FAILURE:  A signer with SubjectDN "CN=hostname, OU=hostNode01Cell, OU=hostNode01, O=IBM, C=US" was sent from target host:port "unknown:0".  The signer may need to be added to local trust store "$profile/AppSrv01/config/cells/hostNode01Cell/nodes/hostNode01/trust.p12" located in SSL configuration alias "NodeDefaultSSLSettings" loaded from SSL configuration file "security.xml".  The extended error message from the SSL handshake exception is: "PKIX path validation failed: java.security.cert.CertPathValidatorException: The certificate expired at Wed Aug 21 21:54:27 EDT 2013; internal cause is:        java.security.cert.CertificateExpiredException: NotAfter: Wed Aug 21 21:54:27 EDT 2013".
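
With a dozen machines in the lab, a quick grep for the CWPKI0022E message narrows down which servers are affected before digging into keystores (a minimal sketch; log locations as mentioned above):

# list the server logs that recorded an expired-certificate handshake failure
grep -l CWPKI0022E $profile/logs/*/SystemOut.log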

It took me a little while to figure out the keystore used by wsadmin, which turned out to be:

$profile/etc/key.p12

Listing the certificates in the file quickly showed the offending certificate (keep in mind that the default keystore password in WAS is WebAS):

bash-2.04# keytool -list -storetype PKCS12 -keystore ./profiles/AppSrv01/etc/key.p12 -storepass <password> -v
Keystore type: PKCS12
Keystore provider: IBMJCE
Your keystore contains 1 entry
Alias name: default
Creation date: Aug 22, 2012
Entry type: keyEntry
Certificate chain length: 2
Certificate[1]:
Owner: CN=hostname, OU=hostNode01Cell, OU=hostNode01, O=IBM, C=US
Issuer: CN=hostname, OU=Root Certificate, OU=hostNode01Cell, OU=hostNode01, O=IBM, C=US
Serial number: 2623bc78cd3b
Valid from: 8/21/12 9:54 PM until: 8/21/13 9:54 PM
Certificate fingerprints:
         MD5:  D2:FB:3F:8A:53:6F:19:B2:6C:77:AB:00:AC:40:EC:0B
         SHA1: 64:76:8F:7F:2A:0A:6A:F0:C9:21:86:FF:90:5B:C5:FA:FF:64:61:B4
Certificate[2]:
Owner: CN=hostname, OU=Root Certificate, OU=hostNode01Cell, OU=hostNode01, O=IBM, C=US
Issuer: CN=hostname, OU=Root Certificate, OU=hostNode01Cell, OU=hostNode01, O=IBM, C=US
Serial number: 2622d2d16331
Valid from: 8/21/12 9:54 PM until: 8/18/27 9:54 PM
Certificate fingerprints:
         MD5:  C2:CB:AB:C3:3A:6C:D2:77:6A:87:DA:D7:21:2E:DC:E4
         SHA1: 0C:09:CD:37:96:F8:FE:91:18:92:5A:93:05:AB:15:D6:6D:36:A1:AA

One can (nay, should) use the keytool command line to recreate the expired certificate, but this being a closed test environment where we do not test SSL functionality, and since no new certificates had been added to either the wsadmin or the server keystore after the test system was installed, I did the unthinkable: I overwrote the wsadmin keystore with the server keystore.

cp $profile/config/cells/hostNode01Cell/nodes/hostNode01/key.p12 $profile/etc/key.p12

Note: The last step was a very specific decision for a test environment running WebSphere Application Server. **DO NOT** take that step in a production environment or anywhere company policies do not allow it, and even in a test environment take it only after carefully examining the two keystores to ensure all of their fields (except the expiration dates) are exact matches.
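
A quick way to do that comparison is to list the identifying fields of both keystores side by side (a minimal sketch; the paths and the default WebAS password are taken from the listing above):

for ks in $profile/etc/key.p12 \
          $profile/config/cells/hostNode01Cell/nodes/hostNode01/key.p12
do
   echo "== $ks"
   # print only the identifying fields, leaving out dates and fingerprints
   keytool -list -v -storetype PKCS12 -keystore $ks -storepass WebAS \
      | grep -E "^(Alias name|Owner|Issuer|Entry type):"
done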

Thursday, May 30, 2013

I am on GitHub…bookmarks, linked data, and more to come.

While writing more extensively for the Service Management Connect blog, I ended up creating some source code examples, which quickly evolved into interesting side projects.

At the same time, formatting these examples in blog editors became a major hassle with ephemeral results. Blog editors are definitely not geared for handling the heavy syntax of source code (yes, yes, good luck with "pre" tags), and even when you achieve good results, they are completely botched months later when the website hosting the blogs makes even the most minute change to its CSS stylesheets.

I am now sharing the code examples in a new GitHub account, at https://github.com/nastacio.

My first project, unimaginatively titled “lctodel”, was developed a couple of years ago and copies bookmarks from a Lotus Connections account to a del.icio.us account (I know, I know, where is the bridge to bit.ly?).

The second project is still unimaginatively, but less cryptically, dubbed dw.article.rdf. It contains a web-based project that showcases Javascript interactions with my current project (Jazz for Service Management) and is based on jQuery and RDFQuery. I started writing about that exercise here.

Back to GitHub itself, I think it has increasingly become a necessity for software developers to organize and publicize their hobby projects outside their professional work. In my role I often interview candidates for development positions within the company, and I cannot overstate the importance we give to a well-executed technical online presence.

Thursday, April 25, 2013

Iterating through response pages queried from Registry Services using Java

This entry is related to my current project, Jazz for Service Management. Instead of reposting it here, here is a link to the original on the developerWorks website:

Iterating through response pages queried from Registry Services using Java.