How I became “glue”

2021-01-18

A piece I read recently, “Systems Design Explains The World,” reminded me of a story. Well, a bunch of stories. And since I was recently asked for a story by a very special person and didn’t have one ready, here it goes.

Really, you should read that blog post and the ones it links to, but if you don’t want to, here is the critical excerpt that smacked me in the face. Fair credit: it’s only a tiny piece of what he discusses and you should really read the whole thing.

There were two groups of misfits:

1. People who maxed out as a senior engineer (building things) but didn’t seem to want to, or be able to, make it to staff engineer (translating business problems).
2. People who were ranked at junior levels, but were much better at translating business problems than at fixing bugs.

Group #1 was already formally accounted for: the official word was that most employees should never expect to get past Senior Engineer. That was the whole point of calling that level Senior.

People in group #2 weren’t supposed to exist. They were doing some hard jobs – translating business problems into designs – with great expertise, but these accomplishments weren’t interesting to the junior-level promotion committees, who had been trained to look for “exactly one level up” attributes like deep technical knowledge in one or two specific areas, a history of rapid and numerous bug fixes, small independent launches, and so on.

As has often been the case in my life, I was in a category of people who were not supposed to exist. I was that “Type II misfit.” How I got there is the story.

*In training. I was skinnier then, had more hair, and wore a tie.*

When I joined the tech business it was a much smaller place, and one where the number of roles was growing faster than the number of qualified people. So we had things called “training programs” that most large companies ran to get people without the right tech background up to speed. The Morgan Stanley IT training program was one of the top places you could go and I was proud to have been selected. At the time Morgan was a lot smaller than it is today; this was well before they moved into the consumer space by buying Dean Witter and later merging in a whole bunch of other firms. It was a purely institutional trading and investment banking house, competing with Goldman Sachs and Salomon Bros (RIP) for those purely institutional deals. They made a point of hiring people with limited tech backgrounds, and in fact I was one of the first four who had a computer science degree, though there were lots of other engineers from other disciplines.

The approach was based on the recognition that most of our programming problems were not very hard, but that the larger problems we were solving could be very challenging. It helped that the guy who ran our tech programs was from the business ops side of things, not from a tech org. They hired for generic problem solving, not tech skill, and they put you through 6 months of hell to come up to speed on enough tech to get started.

[As is true today, the tech you tended to learn in school diverged a lot from what was in common use in IT organizations. Most corporate environments were still using a variety of proprietary systems and architectures that were not taught in school, so they usually needed to train new hires even if you had a CS degree.]

I was one of the few who had the tech background, but everybody went through the same training. I realized later that this was also a great filter for the company. When it was over they knew who was good at what. Most of us went into software development roles, but some went into data center ops, some worked on internal customer service teams, and others went on to specialties like database management or systems programming. You had six months to demonstrate what role you’d be best suited for.

Training was a full-time job. You spent 8 hours a day paying your dues doing grunt work in the data center, and 4 hours a day in training. Depending on which shift you were on that week, you worked 8-4 and trained 4-8, trained 12-4 then worked 4-midnight, or trained 8-midnight, then worked through to 8am. Unless you had a weekend shift, in which case you worked two 12-hour shifts on the weekend, normal shifts 2 other days, and did one day of solid training.

We still went out for beers after work, even after getting out at 8am. It was NYC, there were lots of people working a variety of night shifts, and the bars open early. The bar that we went to is still there, and from what I’m told still does a solid early-morning business for Wall Streeters who are up all night. [There are plenty of bars in NYC that will lock the doors and stop serving at 4am as the law requires, but allow you to remain and “finish your drink,” which could be a full pitcher, until they re-open at 6 or 7. If you’re in such a place, please be polite, tip well, and pick up your feet when asked so they can mop the floor.]

I graduated from the training program, then got moved uptown into a development and ops support job.

It was different then. We always deployed on Fridays, because as a financial institution we were closed weekends, and that meant we would always have a full weekend to recover if things really screwed up. That was preferable to having to be shut down for a day during the week if something went really wrong. Everything we did had potentially major downstream impacts, and many things were done in batch, running just once per cycle. So just reverting to previous software was rarely sufficient, as you might also have to mitigate all the downstream problems.

Code was monolithic, running a single version on a single platform. Ops was relatively straightforward. I had to wear a tie. We hadn’t yet started calling ourselves engineers; instead we were just programmers or programmer/analysts. The programming we did was not incredibly technical and we worked in straightforward languages. Programming for financial operations is mostly reads, sorts, selects, and occasionally writes or updates. We used standard libraries for everything complex. Very little of my CS background was critical. I’ve never in my life actually implemented a sort or search algorithm because there’s always a standard library written by somebody much smarter than me. Has anybody had to? Really?

I spent six months doing small maintenance tasks, some reporting, and a lot of ops support. I got to support our Tokyo operation’s first major audit by the Japanese Ministry of Finance. They were surprised at how quickly we were able to collect and present the data they wanted. At that time, our international trading tech led the world, everybody knew it, and we were proud of it.

The person who had previously supported our Tokyo office ops had quit and I moved into the role. My boss brought me up to speed and turned me loose. Colleagues working on our “primary” (US markets) systems helped me figure out how to manage what was essentially a streamlined fork of their larger environment with a few special pieces added. There wasn’t anybody else managing our daily “Tokyo flow” and time differences meant that “overnight” processing happened during our early morning. I had a fairly independent schedule and role.

Part of my job was being the one who was pulled in to support major projects being done by the New York systems teams, to ensure that our Tokyo environment was maintained at parity with nothing breaking. (This experience later came in handy for dealing with some of AWS’s “special” regions.) The expectation was that in most cases, changes made to the major systems would be incorporated by all the regional forks unless there was a compelling reason to build a localized solution, in which case I would have to develop something else. The New York systems teams had the specific expertise and staffing to do much of the work, but testing and validation had to include the local tech and ops people, which for Tokyo was me!

Unlike most of my classmates, who were assigned to teams with a narrow focus, I functioned more as a generalist. I had to have decent knowledge of all the systems used by our relatively small office rather than highly detailed knowledge of a single functional area. I learned the breadth of all our major systems, rather than one of them in depth.

Along the way I really impressed somebody who was managing a disruptive change to the way we processed and recorded end of day balances and positions, a key chokepoint in the nightly processing flow. The change to the database structures and the standard routines that updated them had to be done as a “big bang” implementation in one weekend. We created two parallel streams: the old and the new. The plan was to do a full nightly processing flow, then switch to the new database, and run it again from the same starting point, using the same inputs. Every report, balance and position had to match the second time around. No other changes were permitted to any systems that week in order to keep the comparison clean. It was all locked down (or so we thought).

This was also a really good example of why making changes on Fridays worked. We had the whole weekend to figure it out if anything went wrong.
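As a rough illustration of the parallel-run check described above, here is a minimal Python sketch. It is not the tooling actually used; the function name `compare_runs`, the report names, and the account keys are all hypothetical, and it assumes each run's output can be loaded as a mapping of report name to key/value rows.

```python
from decimal import Decimal


def compare_runs(old_run, new_run):
    """Return a sorted list of (report, key, old_value, new_value) mismatches.

    Each run is assumed to be a dict: report name -> {row key -> value}.
    A row or report missing from one side shows up with None for that side.
    """
    mismatches = []
    for report in sorted(set(old_run) | set(new_run)):
        old_rows = old_run.get(report, {})
        new_rows = new_run.get(report, {})
        for key in sorted(set(old_rows) | set(new_rows)):
            old_value = old_rows.get(key)
            new_value = new_rows.get(key)
            if old_value != new_value:
                mismatches.append((report, key, old_value, new_value))
    return mismatches


# Hypothetical outputs from the "old" stream, used as the baseline.
old_run = {
    "balances": {"ACCT-1": Decimal("1000.00")},
    "positions": {"ACCT-1": Decimal("100.00"), "ACCT-2": Decimal("250.50")},
}

# A clean rerun on the new structures should match the baseline exactly.
new_run_clean = {
    "balances": {"ACCT-1": Decimal("1000.00")},
    "positions": {"ACCT-1": Decimal("100.00"), "ACCT-2": Decimal("250.50")},
}

# A rerun where something else changed in between produces a discrepancy.
new_run_changed = {
    "balances": {"ACCT-1": Decimal("1000.00")},
    "positions": {"ACCT-1": Decimal("100.00"), "ACCT-2": Decimal("250.75")},
}

clean_result = compare_runs(old_run, new_run_clean)
changed_result = compare_runs(old_run, new_run_changed)
```

An empty result means every report, balance and position matched the second time around. Any entry in the list points at the exact report and row to investigate, which is also why freezing every other change during the comparison matters: a single unrelated change is indistinguishable from a migration bug until someone traces it.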

Time zones meant that Tokyo went first, so I was the guinea pig. I spent many weeks writing test scripts to compare all the key outputs from Tokyo-specific reports and systems to avoid the need for any manual checking. All that test infrastructure only had to be run once, as it all worked! After the conclusion of the initial processing run on Friday morning, I directed the switchover, reset and restart. By mid-afternoon on Friday my testing confirmed that it had all worked. The London/Europe stream then ran successfully. We were in great shape.

Well, except the part where a “star programmer” on the NY trading system decided that he had to make a change to reports that a “star trader” needed, a change that he was sure wouldn’t have any impact. Somehow, he also happened to have a senior person’s override code to approve the change/deployment despite the lockdown. But that introduced some changes to some of the back end reporting for New York. Tokyo and London didn’t use that particular functionality, but when New York finally ran, it caused the test suite to flag discrepancies across a range of trader position reports.

Fortunately, we were ahead of schedule and they decided to use the extra time to troubleshoot before reverting. Somebody noticed that there had been a change to a program and asked “hey, who the hell approved this?” The fact that he could get a change through when none were being permitted taught me a lot about permissions, security, and how to improve process to make sure such a thing could never happen again. The follow up was the first in a long list of Correction of Errors or Incident reviews that I’ve been part of.

The manager of that program moved on to a new role and asked me to join her team. It was also internationally focused, but working on smaller geographies where we didn’t have a physical presence and did business through intermediaries. I was to figure out how to automate processes that previously had been managed on paper with faxes and odd-hour phone calls. Two weeks later and barely a year out of school, I was in London.

Much of the work I needed to do leveraged systems that had been developed by our London team for local use. They were developing applications for the London and Euro markets and the idea was that I would learn from them while building new pieces of software. In retrospect, I used a lot less of their work than anybody thought I would. I could have gotten all I needed in about a week, but I was there for six. I wrote some of my best code in that job during that period, and also learned about the things I should never be allowed to do, like design and build user interfaces.

I spent my 1-year work anniversary drinking beers in London with a bunch of Lebanese guys I had met in a pub near my hotel rather than with my classmates who had gathered at our old “8am after the night shift” bar. It was a fun summer and while I could have done a much shorter trip, I also learned a lot about working cross-border and cross-culture by spending as much time there as I did. Shortly after my 23rd birthday, I flew back to New York.

I spent a couple of months finishing what I had started in London. The tech world was smaller then. When I was experiencing problems between my software, the network card that we expected our counterparts overseas to use, and Microsoft drivers, I could just pick up the phone, call 3Com, call Microsoft, ask for “whoever is working on X,” find the right people and figure it out with them. For some particularly quirky network stuff, I worked with an outside programmer whose office was in WTC Tower 2, the only time I had ever been in one of those lost buildings.

It was a lot of coordination, understanding the larger problems, and working around them based on the realities of what we needed to achieve and the limitations of the technologies and communication infrastructure. There was no map and no process for getting from where we were to where we needed to be, so I made up my own. Sometimes it was the right map. Over the years, I’ve found that I’m at my best when allowed to define the process for doing my own work and my most powerful work is done when I’m able to work all the way back from the customer problem to delivering the solution. That wasn’t entirely the case in this instance: I wasn’t yet engaged enough with the customers to be able to identify business problems that technology could solve. That would come later.

By October it was done, and we were ready to onboard the first partner. Italy was first. I was off to Milan.

Milan set the tone for the next 6-12 months of customer onboarding. I arrived on an early morning flight at the start of Milan Fashion Week because of a lack of rooms in Milan. This would become a recurring theme. Almost every customer visit we did seemed to be at the worst possible time for the place we were doing it, and often at a horrible time for the business overall. Along with the rest of my team, I learned a lot about the kinds of questions to ask when trying to schedule something in an unfamiliar location.

I was the sole technical member of a team of business development and market operations people. We gave presentations, we did training, and we helped our counterparts understand the docs, which were all in English.

My role was to hold together all the pieces, to be the “glue” on the technical side that ensured that the software I wrote, the software the guy at WTC 2 wrote, systems software on a mainframe in New York, a leased network, and all the other pieces worked together. None of these pieces were supposed to work together, and my job was to understand enough about them that I could drive the people with deep expertise to a solution. I had to solve problems as they came up and address them in the most flexible and agile manner possible. There was no model and no metrics to use as a guide and nobody to tell me what to do, so I developed and incrementally improved the process as we went along. The key was to know who to call and to be able to accurately explain what was needed.

We were working with a bank that had emerged from the bank at the center of the Vatican Bank Scandal. Its former chairman inexplicably jumped off a bridge in London with a rope around his neck and bricks in his pockets. Several people who had been part of the organization disappeared or found themselves in Italian prison.

All this made for an interesting environment. On the one hand, they were eager for the business and an opportunity to be part of something forward-looking that would break from their past. On the other hand, it was… weird. Our primary contact was a banker we nicknamed “Lucky” because he always seemed to be focused on other opportunities and possibilities. (He loved the nickname!) He spent most of his time monitoring the prices of gold and South African shares. Several months later, he disappeared. I didn’t ask questions.

[Years later, he resurfaced in Switzerland. Today, LinkedIn shows him running his own asset management firm, so my worst guesses at the time were apparently wrong.]

The same repeated elsewhere. I held things together dealing with the kinds of messy problems that aren’t solved through the application of algorithms and specific technical knowledge or through a carefully-planned and micromanaged software development/deployment process. Over time, the “on-site” team was reduced to just me, one bizops person, and one business development representative who would come along for the first day of introductions, wining and dining, then leave.

I have great memories, but none of them are about actual tech work. There was the time that I, by far the most junior person on the team, ended up in the most expensive hotel in Paris while everybody else was in a cheaper place down the street. (I was added to the team last, and by then the cheap rooms were gone, so corporate travel put me in the more expensive place.) The rest of the team got me back by drinking in my hotel’s bar every night and charging it to my room. My boss was not amused at the $2,000 bar bill. I learned a lot about dealing with colleagues, and I learned that Paris just before Christmas is a wonderful place to be, but a terrible place to try to get work done.

My boss was also not amused when I called from Melbourne on my first day there to explain that they had sent me during Melbourne Cup week, that starting the next day, the entire country would be drunk for the rest of the week, and by the way I’m going to have to extend my stay another week to get anything done. As usual, I was there at the worst possible time, but we found a way to take advantage of the chaos, scheduling a last-minute visit to Auckland to visit our NZ banking partner. (There wasn’t really much point to the visit, but as my manager said “it’ll make our partners feel more important than they are.” Another thing I learned.)

But through this I built a lot of trust with both my internal and external partners, and we did something that a year prior nobody thought was possible.

Years later, a manager I worked with said my greatest strength was my ability to sit at the back of the room, take everything in, figure out the direction things were going, and only say something when the team needed to be put back on course, or when they hit a stumbling block that — he thought — I had probably anticipated before the meeting even started and had prepared for. (He exaggerated, I’m not that good.) That’s the skill I learned in those few early years.

During all this I had been promoted, ahead of the normal schedule. There’s a downside to that. I had not done the detailed coding and ops work that everybody else who came up with me had done. When it was over, I moved into a related role with a new organization using variants of the systems I had built. In that role I learned one of the most important lessons I’ve ever had about where and when engineering and technology matter.

The manager of that division was smart about how we could use our technology to disrupt the world he had come out of. But there was a cold realism to his approach. After one successful customer launch, he reminded us:

“Our customers don’t come to us because we have the best technology, they come to us because we provide a better value. They’re happy to see us using great technology and are impressed with our mastery of it, but in the end, if we can provide the same results at the same price by putting a million monkeys in front of a million typewriters, they’d be just as interested in the value proposition.”

[Years later, I realized that his idea may be the first known example of a serverless architecture as well as a precursor to ChatGPT. I rarely use this exact phrase when I tell the story anymore. First, almost nobody under 40 knows what a typewriter was. Second, the name of our primate evolutionary cousins has become polluted by racist misuse and this phrase is too easily misconstrued by those who are unfamiliar with it. “A million cats walking across a million keyboards” is how I tell it today, but I wanted to be factual in this writing.]

I have never forgotten the “million monkeys” analogy. I credit him for my tendency to always be skeptical of technology even as I work to master it. I’ve saved my clients and employers millions of dollars by warning them off of expensive “investments” that would only generate value to the vendor, and resume-padding for the people working on it.

I’ve pissed off hundreds of software and services sales people by keeping my clients focused on the question of “what problem does this solve at what cost?” rather than “is this cool tech?” or even “could this work?” To this day, I use less tech in my home than a lot of people who aren’t in the tech business, because, really, what problem is it solving?

I also learned something from him about pricing. We were delivering a better and more accurate service, and were able to do it cheaper than the competition. Some thought we could charge less and just take over the market. But we charged more. When you’re doing a better job, you charge more. If you don’t, the smart customers will wonder why. If you’re able to do it at lower internal cost, that just means more profit to you.

Competitors eventually caught up and prices/profitability declined, but I learned not to under-price, and the importance of using higher prices as a signal of value. This has held true for me working independently. I was never more in demand than when I raised my rates.

My further success in that job was short-lived. It’s tough to come back to the “standard progression” when you’ve gone off in a completely different direction for 2-3 years. By any standard, my first few years of work were… bizarre. I had spent my time working on very messy problems of very high importance, but where rigorous engineering was rarely the key to success. I loved every minute of it, but it came back to haunt me when things slowed down, and the focus moved to incremental efficiencies in existing products and refactoring code. That is to say, work that required the skills I had mostly not developed while doing other things.

I became the Type II misfit cited at the start. Once you jump off the standard progression and start dealing with the incredibly messy problems of inserting systems into different organizations in different countries with different expectations of how they would and should work, it’s hard to go back. I had been doing project management, onboarding, service, and customer relationships since I was a year out of school and I was good at it, but I moved off the normal technical progression one would expect, and into something that didn’t really fit any standard role.

Like other “Glue” types in the Type II Misfit category, I was in an odd position: an early/mid-level employee with senior or better skills in many critical but less technical areas, and far less rigorous engineering experience, because when you’re in front of a customer and need a solution right now, rigorous detailed engineering is not the answer. I eventually took a few months off to ski, then on the advice of one of my first managers I applied to business school and never looked back.

[Business school may have been a mistake for other reasons, but the decision to refocus was not.]

I do still wonder what might have been had I been slotted into a different job early, or had I decided to take a step or two back to return to hardcore engineering. Of the members of my original training class, I’m one of only two who remained in a tech-related field over the long haul. The other remained in that organization for his entire career and eventually occupied the CTO role. But maybe that’s to be expected. We were all hired for more generic problem-solving skills, not because we had a CS degree. Not surprising that many of us found lots of other places to apply those skills.

Epilogue:

As a technical program manager I’m still very much in the “glue” category, and I still wonder where the future will take me. “Glue” has had a lot of different names in my lifetime and “TPM” is just the most recent one. I’m long past the point of going back to engineering, and I’m also not likely to want to move into more senior management. But “glue,” whatever the current popular job name, is still viable and mostly pays well when you find it.