Monday, July 1, 2024

Redundancy, Disaster Recovery and The Cloud

If you pay attention to tech news, you know that cyber attacks are commonplace these days.  Attack vectors vary, from malicious software finding its way onto a computer because someone opened a link in an email they shouldn't have, to recent examples where social engineering was used to access a network-attached device and download malware onto it.  Regardless of the method, once an attacker has a beachhead in your network, you are likely in for outages and loss of production or income.  Your defenses today include a wide array of active measures: firewalls, anti-virus, VPNs and device policies.  All of these help prevent an attack - but what do you do once you've been taken offline by an intrusion?  In this post, we'll look at how some of the biggest companies harden their networks for specific levels of redundancy and disaster recovery.


High Redundancy

In 2005 I worked for one of the largest privately owned insurance agencies in the nation.  You might say their footprint was nationwide.  To ensure uptime and recovery, they used a redundancy system that was truly impressive.  Today they have likely improved on this plan, but 20 years ago, this was big, rock-solid stuff.

It started with an A/B power system in the data center.  With redundant power, the primary supply could go down and the backup would take over in seconds.  The help desk firm I worked for in the late '90s had a failover response on its standby generators of half a second; the lights barely flickered.  A/B power schemes usually have backup capacity that can weather an outage of a certain duration, but they are ultimately limited by the fuel supply and how long it lasts.

The second layer of redundancy was a North/South data center clone in their home city.  This ensured that an outage that compromised an entire site, like a local natural disaster or other unforeseeable interruption at one location, wouldn't stop business, because the secondary location could pick up the slack.

But what if we had an earthquake? That's where the East/West redundancy came in.  In a Southwestern state, they had another data center that formed the South component of a two-site redundancy pair.  In another state up in the north-central plains, they had the North location.

This provided regional failover capability.  The glue that held it all together, at the time, was Cisco Smart Switches.  A demo conducted one weekend for the executives involved tracking packet loss on the network while an entire data center was shut down.  Thanks to the high-speed routing and error checking of the smart switches, zero packets were lost and there was no measurable downtime.

On top of this highly redundant network, they used clustering and cloning.  IBM mainframes provided a high-performing computational platform on which servers were cloned in real time across mirrored Linux partitions.  Every transaction was relayed to seven other clones of the same server as it happened.


Disaster Recovery

The same company used what was, at the time, the best disaster recovery you could get.  Nightly backups were taken from every tier-one server - all of the business-critical systems - and written to optical disc or magnetic tape, depending on the system.  These were shipped every day to cold storage - literally, an old salt mine converted to secure storage deep underground.  In the event of a disaster, the nightly backups could be pulled and restored within hours.  This ensured minimal downtime in the event a network-wide failure of some sort, like a cyberattack, managed to affect all systems and all locations.

Many companies still use this strategy today.  The business of securing and transporting backup media was, until recently, a lucrative one.  Improvements in network bandwidth have made it an almost antiquated approach, though: the cloud has distributed not only the networks and compute resources companies build their infrastructure on, but also the backups themselves, which can now be taken from the cloud provider's data center rather than from your on-premises systems.

Restoring Service

While I was working for a major automotive manufacturer, the second of my career, they got hacked.  Worse, it was ransomware - more than 80 percent of the computers connected to the network had their user directories encrypted.  Rather than pay the hackers, the company opted to reimage all of the machines... every server, workstation and laptop.  I participated directly in this effort and it took several days to complete; I pulled at least two 14-hour shifts.  The entire outage lasted approximately 5 days.  The speed of recovery had a lot to do with the skill of the IT department, and with the relatively small flaw that was exploited on a Microsoft Domain Controller.  It should have been patched, but was a couple of days behind a zero-day bug announcement.  The real culprit was the process for evaluating patches - certifying a patch in the lab took too long and left the window of vulnerability open.

A more recent story saw a regional manufacturer taken offline for upwards of two weeks.  Without knowing all of the internal details, we can infer that they either had trouble isolating the problem or had insufficient redundancy or DR planning in place.  One thing that could have helped would have been to use any of the many network scanning tools now available.  Running up-to-date commercial anti-virus on all of their devices might have quarantined the malware quickly.  Reports indicate that their VPN was put out of commission for a long while, leaving remote workers incapacitated.  A potential workaround would have been a Microsoft hybrid virtual network, combining the Network Gateway SaaS offering with a secure virtual network switch to their on-premises network.  Using Microsoft two-factor authentication would have helped ensure only their employees could access the VPN, and all they would have needed on the client side was the VPN client built into Windows.

Getting Better

Many companies today still retain a heavy on-premises component due to security or cost concerns.  There is a fair amount of distrust surrounding cloud computing, which is hard to argue with when big-name providers deprecate and turn off functions regularly.  They are also not immune to breaches.  Auth0, a provider of OAuth services that enable SSO across applications, recently had an outage that took all of their customers offline for 45 minutes in the middle of the work day.  Centralized services like this provide cost savings and a distributed service that can serve both your internal and external apps, but a poorly architected offering presents a single point of failure that can cost you and your business a lot of money.

Having the right mix of properly vetted service providers - ones that have hardened their systems and applied architectural best practices to provide redundancy and resiliency - is an important step not to miss in the design of your network and infrastructure.  Striking the right balance between low-priority systems and mission-critical systems with layers of redundancy can prevent embarrassing outages that turn customers off or delay the delivery of goods and services, which in turn costs you money and business.

The old adage of not keeping all of your eggs in one basket still applies.  And while the cloud is in fact just someone else's computer, properly applying the design patterns it affords is the key to keeping your costs low and your uptime -- up.  If all you do is move a server from your in-office rack to a virtual clone in the cloud, you're only turning a capital expense into an operational expense.  By properly designing your applications and services, you can take advantage of distributed architectures, high redundancy, and high availability, while realizing lower capital costs with tightly controlled and monitored operational expenses.

The key to a successful migration to, or integration with, a cloud service provider begins with engaging an enterprise architect who has broad exposure to service providers and current offerings.  Be wary of flashy consulting firms with name-brand recognition; they don't always deliver.  Watch out, too, for consultants who pad their resumes with a never-ending stream of certificates but have little actual experience.  The long-time veterans who have seen some things and survived a few of these scenarios remain your best shot at fortifying your systems and redesigning your network and service layers for optimal security and performance.

Thursday, March 9, 2023

Building Competency

Over the past year, my main responsibility has been helping the people I manage grow their skills and develop competencies.  Usually we have a steady supply of client work to cut our teeth on, but this past year presented some challenges, and for almost 50% of our workforce there was little to do but learn.  So how did we do it without losing our minds watching training videos?

There are, of course, decades' worth of training videos on numerous online platforms.  You can also find several million miles of text describing almost any facet of technology you would care to learn.  Everyone and his cousin's brother has a blog about something or other.  Learning material is not in short supply.  Quality material can be a challenge to locate, but even that is becoming simpler with tools like Bing AI and ChatGPT, which can give you a well-sourced answer to most questions.  But reading and videos only take a person so far.  What can you do to actually hone technical skills?

Simply put: hands-on training.  In the office we tried a couple of different ways of doing this, and each met with differing levels of accomplishment and engagement.  That last part, engagement, is a huge key.  Without engaged minds, the learning is not going to take root in the gray matter.  While a good portion of our workforce is located in one city, a smaller portion is remote.  Getting those remote folks engaged, and keeping them engaged, is difficult, but it can be done.  It is, in my opinion, still not as effective as face-to-face group project work, but it can yield results.  And that reveals the big takeaway: face-to-face group project work is, we have found through much trial and error, the most effective and engaging way to train people on new skills.  Let's take a look at each of these mechanisms and see what worked well and what didn't.

Remote paired programming is pretty good for getting the job done when there is a client-driven deadline, requirement or some other forcing factor.  When the project at hand is a voluntary upskilling lab exercise, those forcing factors are not present, so we found we needed to introduce them, even if they were arbitrary and artificial.  Setting deadlines also provides people with a challenge, which turns out to be fulfilling when it's a target they can hit by pushing themselves slightly.

What does not work in this space is demeaning people for their lack of skill and then obligating them to fix that deficit in the way you determine, at a pace you determine.  Prescriptive, derogatory leadership in a field of creative technical contributors is like pouring water on a campfire.  We learned what not to do here by observing some minor failures.  It's important for leaders to understand that culture matters, and talking down to talent is a big mistake.

Face-to-face projects benefit from the same mechanisms and suffer from the same failures of leadership.  It's important to find tasks that the people in the lab will enjoy and want to learn.  Asking developers to learn network mapping is going to be met with a lot of hesitancy and might even turn people off to the point where you start losing talent to employers who understand human psychology better than you do.  But taking solid back-end developers and asking them to learn a new service-oriented architectural pattern, or front-end developers and having them pick up a new tool chain for web development, makes for good career growth that leverages their existing passions and interests.  It's OK to ask them to stretch into other areas, but make sure you tie it in well to what they already know.

In summary, people learn better when they have a study buddy, attainable goals, and interesting objectives.  If you would like help designing your training activities, drop an email to datatribe@gmail.com - I'd be happy to help you out on a contract basis.

Wednesday, March 16, 2022

Job Change

Over the years, as a consultant, I've filled various roles.  Two of my favorite roles - or duties, really - were mentoring and leading a team.  I've been fortunate enough to have that opportunity more than a few times, but I recently moved out of self-employment and back into a salaried position to take on full-time team leadership, with an emphasis on coaching and mentoring.

This is something I've wanted to do for about 5 years.  When I first started thinking about it, I had already had the opportunity to co-lead a large team in a fast-paced, tightly managed environment.  Eventually, the stress from the tightly managed aspect of that burned me out, and I fled to a lower-paying job that nonetheless gave me job and income stability for well over a decade, including through previous tough economic downturns like the one we're presently heading into.  A little over halfway through that 12-and-a-half-year period, I began seeing more turnover in the office at my client site.  Lots of young minds were streaming in under the leadership of a young manager whom I reported to for a brief time.  She had a talent for finding and recruiting bright minds, and it was my joy to interact with them, find out what motivated them, and occasionally recruit them into my projects or tap their expertise.  We also spent a good amount of time just talking while walking around the parking lot on our breaks.

As a consultant, I didn't hold their fates in my hands, and I didn't have the bandwidth or authority to guide their careers; while I was certainly influential (in their words), I wished that could be my main job.  As time went on, I found the role I was in suffering a sort of organizational atrophy.  We went through several reorgs in short succession, and the changing scope of activities and expectations, along with the work-from-home shift in the daily routine, deprived me of the camaraderie we'd enjoyed and alerted me to a decline in demand, within the company I served, for my technical services.

At length, I decided it was time to look for a new job.  While many people looked at my resume and wanted to put me into the same sort of position I was leaving, one recruiting firm in Columbus, Ohio, looked at me as a person and picked out a couple of very promising roles, each of which I interviewed for all the way to the third round.  Ultimately, the best fit for me was the one offer I received, and now I find myself quite happy to have a new set of responsibilities and a team brimming with young minds to guide along their career paths!

As a result of this change, the nature of what I blog about here is likely to change as well.  My personal focus is migrating from technical aspects to leadership aspects.  A recommendation from my new manager was to get a copy of "Turn the Ship Around!" by L. David Marquet and read it.  I highly recommend the book to anyone who wants to change the way they interact with subordinates, or the way they view being a member of a team.  It takes wide adoption to realize all of the gains covered in the book, but there is a wealth of individually applicable advice that will help anyone in any job enjoy, thrive and grow in their current or future occupation.

On the technical side, we're doing a lot with Azure, so I plan to share some of what I learn about that in a future piece.  I'll just say for now that there's a lot to absorb and if you want to get into it, getting a Udemy or Pluralsight subscription is not a bad place to go after you consume all of the free content you can find on the internet.

To wrap up this post, I want to underscore the most valuable insight from Marquet's book: Learning is core to everything we do, so look at everything you do as a learning opportunity.

Monday, December 6, 2021

2021 Retrospect

It has been a while since I posted, and I just wanted to give a short update.  Obviously the last couple of years have been atypical.  Thanks be to God, I am still gainfully employed as of this writing.  I've been engaged in more varied work outside of tech, as well as expanding my skillset as a developer and working a lot more with my church, where I've expanded my role with a (temporary) chairmanship to help us navigate post-COVID attendance and giving slumps.

On the technical side, I've been working more with Spring and JPA and getting a deeper appreciation for how much utility is there.  Adding Thymeleaf early this year was a tough choice for me, but it has become a staple technology for my UI work these past 14 months or so.  The reason is that, while it harkens back to JSPs, and I dislike JSP (for reasons), it does provide a server-side composition method for views (in our trusty old MVC pattern) and works particularly well with JPA.

Surfacing a model or transitory bean in a Thymeleaf template is trivial once you have a working understanding of it.  There is no shortage of free documentation and tutorials, so rather than walk through a full example, I'll summarize: once you define your JPA bean, taking advantage of javax, Lombok and Spring annotations (you only need to define your fields - not even getters or setters), the generation of your database structure, the inherited repository classes and the simplified controller authoring move the bulk of the work into your controller or helper layer and into authoring your templates and any supporting JavaScript.  In short: the work to create all of the layers of an MVC application is greatly reduced, leaving you to focus mostly on the UI and the business logic.
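To give a flavor of the shape this takes, here is a minimal sketch.  The names (Customer, CustomerRepository, CustomerController, the customers.html template) are made up for illustration, not from a real project, and it assumes a standard Spring Boot setup with the data-jpa, web, thymeleaf and Lombok dependencies.

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import lombok.Data;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.GetMapping;

@Data      // Lombok generates the getters, setters, equals/hashCode and toString
@Entity    // JPA maps the class to a table and can generate the schema for you
public class Customer {
    @Id
    @GeneratedValue
    private Long id;
    private String name;
    private String email;
}

// The inherited repository: no implementation needed for basic CRUD and paging.
interface CustomerRepository extends JpaRepository<Customer, Long> {
}

// The controller is mostly glue: fetch, put it on the model, name the view.
@Controller
class CustomerController {
    private final CustomerRepository repository;

    CustomerController(CustomerRepository repository) {
        this.repository = repository;
    }

    @GetMapping("/customers")
    String list(Model model) {
        model.addAttribute("customers", repository.findAll());
        return "customers";  // resolves to templates/customers.html, where
                             // Thymeleaf iterates ${customers} into markup
    }
}

The entity, the repository and the controller together are a few dozen lines; the rest of the effort goes into the template and whatever business rules sit behind it.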

Why does this matter?  Because speed of delivery matters.  Maintainability also matters.  This is why I have backed away from using Spring Roo.  While I think the pattern Roo offers provides a lot of utility, the generated code, while fully leveraging everything Spring has to offer, now lags behind and produces more code than is readily maintainable.  As a bootstrap to a quick demo it has tremendous utility, but if you're going to hand the code to someone to maintain later, you set them up for an impossible task unless they are a Spring and Java master.

On the JavaScript front, I've been working on and off on my simplified modular JS framework approach.  For a time, I thought JSON-defined forms were going to be the way to get a SPA the way I wanted it, without the arcana of Angular or React.  When I adopted Thymeleaf, I backed off from this a bit.  Instead, I've adopted a custom reflection engine that gives me a standard way to generate jQuery DataTables from beans, whereby a schema query is the only thing you really need to add to a custom repository implementation.  I still very much like the modular approach, and I still think it is simpler than Vue, React or Angular.  You don't need to learn anything beyond standard JavaScript classes and modules to make use of it, so long term it's easy to describe to others and highly maintainable without any framework-specific knowledge.
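To give a flavor of the reflection idea, here is a simplified sketch of my own (DataTableSchema is a made-up name, and this is not the actual engine): the server side derives the DataTables column definitions from a bean's fields, so the client never hand-writes table configuration.

import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class DataTableSchema {

    // Builds the "columns" array jQuery DataTables expects, for example:
    // [{"data":"id","title":"Id"},{"data":"name","title":"Name"}]
    public static String columnsFor(Class<?> beanType) {
        List<String> columns = new ArrayList<>();
        for (Field field : beanType.getDeclaredFields()) {
            String name = field.getName();
            String title = Character.toUpperCase(name.charAt(0)) + name.substring(1);
            columns.add("{\"data\":\"" + name + "\",\"title\":\"" + title + "\"}");
        }
        return "[" + String.join(",", columns) + "]";
    }
}

The client side just feeds that JSON to the DataTables constructor, so adding a field to the bean adds a column to the table.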

And why, you may wonder, would a guy who has created frameworks of his own eschew the popular frameworks?  In short, because the frameworks have become self-serving.  They exist more to establish the dominance of the framework and maintain its relevance than to improve your life as a developer or system owner.  Yes, you can do great and amazing things with them, but you buy a lot of future pain and agony when your webpack or npm build blows up due to incompatibilities.  Someone said several years ago that "modern web development has become a minefield," and they were 100% correct.  The approach I'm taking gets off the train to Griefsville and lets us focus on development and quick turnarounds.  Ultimately, your value as a provider of solutions hinges on speed of delivery and cost to the client.  You can make more money making people happy with low cost and short turnarounds than you can milking a project for several years.  My goal, maybe not yours, is to help as many people as possible get the most out of their investments in IT.

In other technical developments, I've been working on building my knowledge of Azure.  I tried AWS for a while but didn't like how difficult it was to estimate costs.  Azure goes out of its way to illustrate and project costs, and provides ample free access to help you get a solution to a testable stage.  One caveat: if you're trying to use integrated authentication with the technologies discussed above, you'll need to invest in an HTTPS certificate to make it work, unless you're only testing on localhost.  Azure itself is undergoing some changes, and as a massively distributed cloud platform with an astonishing amount of abstraction of services into small slices, there are things being pruned from the vine as time goes on.  To me, if you need to host something on Azure, the benefit is mainly in the ability to federate an existing client domain with the authentication you want to provide on an application.  That's a narrow view, I know, because my experience and needs are thus far narrow when it comes to this platform.  If I were starting from nothing, I might use more native MS technologies rather than try to apply my approach to app dev to their platform.  As it stands, I may or may not continue with this work, for external reasons.

Outside of tech, I've spent time this year getting more into landscape design and architecture and have made a lot of progress on my homestead.  I hope to write more about that in the future on my other blog.  I'll summarize with some stats: 150 trees put into the nursery this spring (now being planted out as time permits), 1.5 tons of boulders hauled, 40 cubic yards of gravel spread, 30 cubic yards of dirt moved, 1 driveway redone, one grade revised (raised), 20 pumpkins, 40-ish squash, and 15 lbs of potatoes grown.

Heading into 2022, I don't know if I'll still have a full time job, but I'll be busy.


Saturday, September 19, 2020

Automating with PowerShell

This isn't going to be a deep dive on the subject; it's presented more as an informational overview of system automation using Microsoft PowerShell.

PowerShell has been around for a while now and has even gone cross-platform in recent years, with the release of PowerShell as an Open Source project that can also run on Linux operating systems. This ubiquity provides flexibility of resources, meaning the people and hardware involved in system automation, and brings a lot of power to bear on a wide range of needs within any size of IT infrastructure.

At a basic level, PowerShell can be used for simple tasks like copying files or even creating files programmatically.  Coupled with the Windows Task Scheduler or Linux cron, you can set up recurring activities like backups and exports.

At a more advanced level, however, PowerShell gives the system administrator or developer access to everything from Windows Management Instrumentation (WMI) to web services.  It supports common scripting idioms like function declarations and modules for creating reusable code.  It is not a compiled language, though, and it largely does without heavier concepts like inheritance and namespaces (newer versions do support classes), which is part of what makes it simple to use.  No compiler is needed and no build step is required to produce an executable artifact.  At a minimum, all you need is PowerShell and a text editor, and that's enough to make it work for you.

My preference has been Visual Studio Code, a free development tool from Microsoft, with the PowerShell extension installed.  This, combined with the integrated terminal running PowerShell, makes scripting quick and easy.  Adding good coding practices - building test routines, making code modular and reusable, and using a source control system like Git - brings a commercial, industrial level of code management and performance to a PowerShell implementation.

Given a sufficient set of resources, PowerShell can run as a regularly executed background task handling repetitious administrative and system management activities, freeing up system administrators to focus on other areas of concern.  A good rule of thumb for choosing PowerShell is similar to the way we decide to extract a function when writing good code: when you find yourself repeating an action or set of operations, you have found an opportunity for abstraction.  When coding, that means it's time to write a callable function; when working as a system administrator, it's a good indicator the task should be automated, and PowerShell is a purpose-made tool for that automation.

Planning Phase for Scholarships

I've been thinking about this for a while and have decided this year to start putting the plan in gear.  Datatribe Softwerks, Ltd. will, at some point in the future, offer scholarships for students seeking careers in Information Technology related fields.  The structure, requirements, amounts and other details remain to be determined, but this seems like a great way to start realizing my dream of helping younger talent find their way up and into a field they will enjoy and have a lifelong passion for.

Meanwhile, if you have comments or suggestions or interest, drop us a note or comment here.  

Thanks!

Monday, January 13, 2020

Constantly Evolving - Mobile Development

When I started writing software that could run on a BlackBerry, I was counselled to avoid writing directly to the operating system and instead pursue a lowest-common-denominator web application to ensure resiliency over time.  This advice paid off marvelously: an 18-month project to create a tag-parsing framework that exposes Lotus Notes applications uniformly on any web browser has saved a client of mine close to $20 million over 9 years.  When they moved from BlackBerry to iPhone, I only had to update one style rule in a master CSS file.

Recently, however, we have been making increasingly sophisticated applications that require low-level hardware access on the Android platform, and hence we have begun some limited Android development efforts.  Coming from web development, my impressions of Android development - specifically with Android Studio and Java - have not been all that favorable.

Android is complex, but not beyond mastery.  The complexity is only one of a few problems I see.  The biggest is the constant evolution of the environment.  It doesn't change gradually; it has changed dramatically and rapidly.  That makes researching a solution to a given problem difficult: the internet is littered with examples and discussions dating back 10 years as of this writing, and finding information for the correct version can be a real test of your google-fu.

Another thing I would call a fault is the lack of consistency.  There are numerous classes, plugins and frameworks for building parts of the UI.  They each have different super classes and hence expose different methods.  Again, not an insurmountable problem, but an annoyance.  You might, for example, as I did today, find an example of using a SwitchPreference in a PreferenceFragmentCompat, but only an example of using that same SwitchPreference to toggle other inputs in an ActivityCompat or something similar.  The pieces are made to be interchangeable, but not universal.  A mix of XML and Java (or Kotlin, now formally the primary supported language) must be used to get the most out of the parts provided.
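For reference, here is roughly what the PreferenceFragmentCompat case looks like in Java - a minimal sketch assuming the androidx.preference library, with a made-up preference key and XML resource (R.xml.settings and "notifications_enabled" are mine, not from any real project):

import android.os.Bundle;
import androidx.preference.PreferenceFragmentCompat;
import androidx.preference.SwitchPreferenceCompat;

public class SettingsFragment extends PreferenceFragmentCompat {

    @Override
    public void onCreatePreferences(Bundle savedInstanceState, String rootKey) {
        // Inflate the preference hierarchy defined in res/xml/settings.xml
        setPreferencesFromResource(R.xml.settings, rootKey);

        // Look up the switch by its key and react when the user toggles it
        SwitchPreferenceCompat notifications = findPreference("notifications_enabled");
        if (notifications != null) {
            notifications.setOnPreferenceChangeListener((preference, newValue) -> {
                boolean enabled = (Boolean) newValue;
                // Enable or disable other preferences, kick off work, etc.
                return true; // returning true lets the new value be persisted
            });
        }
    }
}

Wiring up the very same switch inside an Activity goes through a different base class and different lookup methods, which is exactly the kind of inconsistency I'm complaining about.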

Then there's Unity, a completely different set of tools for making games.  And now there's Flutter, which throws out all of the above, except Android Studio, and takes a completely different approach to developing for Android more akin to React Native.

It's all well and good - we should certainly grow as we go forward - but we seem to be making the same mistakes over and over again.  Application development is still presented as a coding exercise, when we have long passed the point where we know how to build configuration-driven application code generators.  We should be creating new systems that vastly reduce the coding effort needed to get the same, or better, performance and capabilities as yesterday.  But we - and I mean techies as a breed - keep churning out new ways to relive the same misery over and over again.

I would hope that the industry will grow suspicious of these trends and do better going forward.  I plan to do my part where I can, partly by calling attention to this silliness, and partly by working to make things better.  Hopefully I'll have something to share in that regard in the future.