Friday, December 19, 2008

Leveraging Misunderstanding To Generate Great Ideas

Today we had a little meeting to talk about the priorities for our next iteration. We had a really odd, fun experience.

We're trying to figure out how to implement a feature that none of our competitors' products have really addressed, but that the client needs. It's a bit tricky, though, so we were talking through the problem.

The team lead explained a possible solution to the problem. It sounded like a great idea, and I re-stated it back to him, but I had misunderstood what he said. He, in turn, heard my response incorrectly and thought that I had had a great idea. But the thing he thought I was saying actually was a great idea, and it solved the problem beautifully and elegantly. We all had a good laugh, since we had arrived at a terrific solution by going back and forth misunderstanding each other.

So, whose great idea was it? Nobody's, really, and everybody's. So cool.

Now, of course, we have something of a puzzler - how do we effectively leverage our confusion to produce more groundbreaking ideas?

Sunday, December 7, 2008

Windows Vista And My Network

Well, if you've been paying attention to Apple's commercials and your friends with new Windows computers, you may have heard that Windows Vista has issues. Here's one that got me.

When I hooked up a new Windows Vista computer to my wireless hub, it asked me for a PIN. Hm. That's interesting; I'd never seen that before. It told me to find the PIN on the underside of the WRT54G2 wireless hub, so I got it and entered it. It then showed me a network SSID, which I carelessly clicked through. Suddenly, all the other devices hooked to the wireless network were dropped. Hmmm, that's weird.

Here's what happened. Apparently, using the PIN gives your computer complete control over the network hub. It changed the network SSID, randomly generated a new WPA shared key, and reset the hub. At that point everything else was dropped. What a pain! Shame on me for missing the fact that it was going to change the SSID, but I certainly didn't want the shared key changed, locking out all the other network devices. Yuck!

Bridging WAP54G and WRT54G2

Let's start off by stating the goal: I have a network in a room off the kitchen where my DSL comes into the house. Downstairs is an XBox 360 connected to the family TV. Rather than run a cable down to the basement, I decided to set up a wireless bridge between the basement and the network upstairs.

Despite Linksys's warnings to the contrary, you can use a WAP54G to bridge to devices other than another WAP54G. In my case, I have a WAP54G in the basement bridged to the WRT54G2 upstairs. It's not too hard if you follow these instructions.

First, be very very careful not to follow the instructions provided by Linksys. It's easy to accidentally read them, thinking they'll tell you what you need to do.

Next, hook a cable directly from your PC to the WAP54G. Configure your PC with a static IP address of 192.168.1.200 (most other addresses on 192.168.1.x will work, too). Open a browser and go to 192.168.1.245, which is the default address of the WAP54G. You will be asked for a user ID and password, and here again the Linksys instructions are wrong. What actually works: leave the user ID blank and use a password of "admin".

In the Setup tab, click on "AP Mode" and select the radio button next to "Wireless Bridge." The WAP54G needs the MAC address of the WRT54G2, which is printed on the underside of the WRT54G2. Enter it in the first "Remote Wireless Bridge's LAN MAC Addresses" box and click "Save Settings."

At this point, I reconnected my PC to the network, and moved the WAP54G downstairs. I plugged in the WAP54G, connected the XBox 360 to the WAP54G, and everything's good to go. For me, the XBox 360 won't dynamically pick up an address, so it's statically configured to 10.5.128.10. Your mileage may vary.

Friday, December 5, 2008

Favorite Projects Series, Installment 2

I chose the previous project in this series as a favorite because of the impact it had for the user. I consider this second project a favorite for the interesting technical challenges it presented. While it certainly had an impact on a lot of users, that impact was far less visible to me.

I led a team of about three developers on this project at A.G. Edwards. The project was named BLServer after a service of the same name that we used from an external provider. The provider's service was implemented as a COBOL program that handled requests over a limited number of network connections. It had a number of limitations that we needed to overcome, which we did by wrapping it with a message-driven Web Service running in a WebLogic cluster.

The function of the BLServer service was to receive messages for account creation and modification, including changes in holdings of various securities. The provider's BLServer had a nightly maintenance window (I think it was about four hours) and used a proprietary message format. It was secured with a dedicated leased line and a single common password.

Our wrapper BLServer service was required 1) to expose a standards-based (SOAP) interface, 2) to provide 24x7 uptime, 3) to preserve message ordering, 4) to secure access to the service without exposing the single common password, and 5) to queue requests during maintenance windows and deliver them when the provider's BLServer became available again. There were also scalability and performance requirements which, in combination with the message ordering requirement, drove us to an interesting solution. I'm not sure about the exact scalability numbers, since that was four years ago, but if I remember correctly, we initially had to handle about 300,000 requests during a normal 8-hour business day, with the ability to handle peak loads of around 1.5 million per day.

The first benefit our service provided was to expose a standards-based (SOAP) interface while interacting with the provider BLServer, which took requests and delivered responses using a proprietary protocol and message format. Application Web Services then used our service to provide customer value.
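
Just to make that concrete, here's a minimal sketch of what the SOAP-facing contract might have looked like. The names, parameters, and the correlation-ID style are my own illustration, not the actual interface we shipped.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical JAX-RPC-style endpoint interface; names and signatures are
// invented for illustration and do not reflect the real BLServer contract.
public interface BLServerPort extends Remote {

    // Accepts an account creation/modification request and returns a correlation ID
    // the caller can use later to pick up the queued response.
    String submitRequest(String requestType, String payload) throws RemoteException;

    // Returns the provider's response for the given correlation ID,
    // or null if it hasn't come back yet (e.g. during a maintenance window).
    String fetchResponse(String correlationId) throws RemoteException;
}
```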

In order to meet the scalability and availability requirements, we proposed standing up a small cluster of WebLogic engines to host a BLServer WebService. This WebService would receive requests and (using JMS) queue them for processing. Responses would then be queued for later retrieval by the calling services. By queuing requests in this way, we could use the transactionality of the JMS provider to guarantee that each message was processed once and only once. Furthermore, we could queue up a backlog of messages and feed them through the finite number of connections made available by the provider BLServer.
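
To sketch the front end of that flow: the Web Service implementation wouldn't need to do much more than stamp each request with its ordering key and drop it on a JMS queue inside a transaction. The JNDI names and the "keyGroup" property below are assumptions for illustration, not what we actually used.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class RequestEnqueuer {

    // Hypothetical JNDI names; the real ones lived in the WebLogic domain configuration.
    private static final String FACTORY_JNDI = "jms/BLServerConnectionFactory";
    private static final String REQUEST_QUEUE_JNDI = "jms/BLServerRequestQueue";

    public void enqueue(String keyGroup, String requestPayload) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup(FACTORY_JNDI);
        Queue requestQueue = (Queue) ctx.lookup(REQUEST_QUEUE_JNDI);

        Connection connection = factory.createConnection();
        try {
            // A transacted session: the enqueue either commits or rolls back as a unit.
            Session session = connection.createSession(true, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(requestQueue);

            TextMessage message = session.createTextMessage(requestPayload);
            // The key group lets a consumer use a JMS selector to claim only its own range.
            message.setStringProperty("keyGroup", keyGroup);
            producer.send(message);

            session.commit();
        } finally {
            connection.close();
        }
    }
}
```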

By using a cluster, we would be able to handle the necessary load of incoming requests, queue them, and run them through the provider BLServer, keeping it as fully loaded as possible over the finite number of available connections.

Aye, but here's the rub. We had a cluster of WebLogic engines pulling messages from the queue. How do you maintain message order while also leveraging the parallelization of the cluster to handle the load? Consider what happens if you have two messages in the queue, in order. The first is a stock sell that results in an increase in available cash. The second is a buy that uses that cash to buy some other stock. You can see that these must be processed in order. If one server in the cluster grabs the first from the queue and another grabs the second, there's no guarantee that the sell will be handled first by the provider BLServer. Therefore, we had to guarantee the order in our BLServer service.

How to do that? The solution became more obvious once we realized that total message ordering was not required. What's really required is that messages within certain groups be correctly ordered. These groups are identified by a key, and all messages for a given key must be ordered. Depending on the type of request, that key might be a CUSIP, an account number, or some other identifier.
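
A hypothetical sketch of that key selection, assuming the request type tells us whether to order by CUSIP or by account number (the real rules came from the provider's message format):

```java
// Hypothetical helper for choosing the ordering key and its coarse grouping.
// The request types, fields, and two-character grouping are invented for illustration.
public final class OrderingKeys {

    private OrderingKeys() {}

    // Pick the identifier whose messages must stay in order.
    public static String orderingKeyFor(String requestType, String cusip, String accountNumber) {
        if ("SECURITY_MAINTENANCE".equals(requestType)) {
            return cusip;          // all changes for one security are ordered together
        }
        return accountNumber;      // account creates, updates, and trades are ordered per account
    }

    // Collapse keys into coarse groups (here, the first two characters), so a lease
    // can cover a whole range such as '00' through '03'.
    public static String keyGroup(String orderingKey) {
        return orderingKey.substring(0, 2);
    }
}
```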

Now message ordering with scalability becomes simpler. If all messages for a certain key are handled by a given engine, then we can guarantee ordering by pulling a single message at a time from the queue and processing it to completion before beginning the next message. Other engines in the cluster will be doing the same thing at the same time for other keys. Thus, we gain some scalability.

Oooh, but we've just introduced Single Points Of Failure (SPOFs) for each key. If a given server handles keys that start with '17', for example, and that server crashes, then messages for those keys won't be processed, and we have failed to meet our availability requirements. That's where the second bit of creativity came into play. We employed a lease mechanism. Leases were stored in a highly available database. Upon startup, a given engine would go to the database and grab a lease record. Each lease represented a group of keys. For example, a lease record might exist for all keys in the range '00' to '03'. An engine starts up, finds that this lease is the next available, and grabs it. To 'grab' a lease, an engine updates the lease with a time in the not-too-distant future, say, five minutes out. As long as the engine is up, it continues to update the lease every two minutes or so with a new time. If the engine crashes, the time expires, and some other engine grabs the lease.
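
Here's a rough sketch of that lease idea, assuming a LEASES table keyed by range with OWNER and EXPIRES_AT columns. The schema, names, and timings are illustrative, not the real implementation.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

// Illustrative lease manager; the real implementation and schema were different.
public class LeaseManager {

    private final Connection db;
    private final String engineId;

    public LeaseManager(Connection db, String engineId) {
        this.db = db;
        this.engineId = engineId;
    }

    // Try to claim a lease whose previous owner has gone quiet; returns true if we got it.
    public boolean grabLease(String rangeStart, String rangeEnd, long leaseMillis) throws SQLException {
        PreparedStatement ps = db.prepareStatement(
            "UPDATE LEASES SET OWNER = ?, EXPIRES_AT = ? " +
            "WHERE RANGE_START = ? AND RANGE_END = ? AND (EXPIRES_AT IS NULL OR EXPIRES_AT < ?)");
        long now = System.currentTimeMillis();
        ps.setString(1, engineId);
        ps.setTimestamp(2, new Timestamp(now + leaseMillis));   // e.g. five minutes out
        ps.setString(3, rangeStart);
        ps.setString(4, rangeEnd);
        ps.setTimestamp(5, new Timestamp(now));
        // One updated row means the lease is ours; zero means someone else still holds it.
        return ps.executeUpdate() == 1;
    }

    // Called every couple of minutes while the engine is healthy, pushing the expiry out again.
    public void renewLease(String rangeStart, String rangeEnd, long leaseMillis) throws SQLException {
        PreparedStatement ps = db.prepareStatement(
            "UPDATE LEASES SET EXPIRES_AT = ? WHERE RANGE_START = ? AND RANGE_END = ? AND OWNER = ?");
        ps.setTimestamp(1, new Timestamp(System.currentTimeMillis() + leaseMillis));
        ps.setString(2, rangeStart);
        ps.setString(3, rangeEnd);
        ps.setString(4, engineId);
        ps.executeUpdate();
    }
}
```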

As long as an engine has a lease for a given range, it can use a selector to receive messages from the queue for that given range. We now have scalability, message ordering and high availability. Everybody say, "Woah, that's so cool!"
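
Putting the lease and the selector together, each engine's consumer loop might look roughly like this: pull one message at a time for the leased key group, and commit only after the provider call succeeds. Again, the property name and JNDI names are assumptions for illustration.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class RangeConsumer {

    // Drain messages for the key group this engine currently holds a lease on.
    public void drain(String keyGroup) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/BLServerConnectionFactory");
        Queue requestQueue = (Queue) ctx.lookup("jms/BLServerRequestQueue");

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(true, Session.AUTO_ACKNOWLEDGE);
            // The selector restricts delivery to messages stamped with our leased key group,
            // so ordering within the group is preserved by this single consumer.
            MessageConsumer consumer =
                session.createConsumer(requestQueue, "keyGroup = '" + keyGroup + "'");
            connection.start();

            Message message;
            // One message at a time: process to completion, then commit before taking the next.
            while ((message = consumer.receive(1000)) != null) {
                callProviderBLServer((TextMessage) message);
                session.commit();   // the message leaves the queue only after a successful call
            }
        } finally {
            connection.close();
        }
    }

    private void callProviderBLServer(TextMessage message) throws JMSException {
        // Placeholder for the request/response exchange over the provider's proprietary protocol.
    }
}
```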

At this point, we've solved a significant technical issue that should be captured as an architectural pattern. We never did that. It may be that this solution is documented somewhere as a pattern, but I'm not aware of it.

At the end of the project I moved on to other things, and left BLServer in the capable hands of my friend Brian S. I heard some months down the road that the service was in active use in production, and had seen only one minor bug. I've always been proud of the product quality that our team delivered. We went through at least four or five variations of possible solutions before arriving at the one described above. In each case, we'd get into the details of the solution only to ask, "yeah, but what happens if..." and realize that we were close, but had some problem because of the distributed nature of the environment, or whatever. It was very satisfying to finally arrive at the elegant solution that we delivered.