Software on a Manned Interstellar Expedition
A manned interstellar expedition will be fraught with challenges: propulsion, life support, communication, and many more. Under the communication umbrella sits the significant challenge of maintaining software on such an expedition. Multi-year latency and limited bandwidth throw any normal support process completely out the window. Earth needs a plan for what software and information to send, with very little knowledge of what could happen aboard the spacecraft. The spacecraft needs to be able to incorporate software advances from Earth and send back notable advances of its own. For this paper, we will assume a colonization mission without FTL travel capability. In three parts, we will cover what the software environment could look like, what technology we would use today, and what to watch for in the future.
The Software We Need
There are generally two camps in the software world: Free and Open Source Software (FOSS) and proprietary software. FOSS refers to software whose source code is available for anyone to see, use, modify, and share. Some examples are Linux, Firefox, Apache, and Ubuntu. Proprietary software usually means that the company that produced the product is the only one who can see, use, or modify it in any way. Some examples are Windows, Photoshop, and AutoCAD. As with most things, it's not always black and white; for example, the Ubuntu Linux distribution (an operating system and collection of software) also provides proprietary firmware and drivers to help make hardware work. When the colonists are more than six months of communication time from Earth, there is no point in asking someone on Earth for immediate assistance with a software issue. At that point, the colonists will want to have all of the information they need to solve their own problems. Figuring out what they will likely need is critical. I break it down into seven different categories. For the examples, I will generally select items used by Ubuntu because it is the Linux distribution with which I am most familiar; however, most major Linux distributions have similar options.
Source Code
Source code is the set of instructions humans write to tell a computer program what we want it to do. Open source software is commonly developed using a revision control system that tracks changes to the source code, which helps keep the project organized. Each change is typically called a commit (or patch), and each commit should include a description of what was changed. By sending only the changes to the source code, along with their commit messages, we can reduce the bandwidth needed to transmit software to the spacecraft. Having the commit messages also gives the colonists basic context around the changes and provides insight into what they should prioritize.
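As a rough sketch of what this could look like in practice, the snippet below packages every commit between two releases, commit messages included, into one compressed file on Earth and applies it to the ship's copy of the repository. It assumes a git checkout plus the standard git and xz tools; the repository paths and tag names are placeholders.

```python
# Sketch only: package changes (with commit messages) for uplink, then apply
# them on the ship's copy of the repository. Paths and tags are placeholders.
import subprocess

def package_changes(repo, old_tag, new_tag, out_file):
    # Export every commit in the range as a patch, each with its commit message.
    patches = subprocess.run(
        ["git", "-C", repo, "format-patch", "--stdout", f"{old_tag}..{new_tag}"],
        check=True, capture_output=True).stdout
    # Compress the patch stream before queueing it for transmission.
    compressed = subprocess.run(
        ["xz", "-9", "--stdout"], input=patches,
        check=True, capture_output=True).stdout
    with open(out_file, "wb") as f:
        f.write(compressed)

def apply_changes(repo, bundle_file):
    # On the ship: decompress and apply the patches, preserving commit messages.
    patches = subprocess.run(
        ["xz", "--decompress", "--stdout", bundle_file],
        check=True, capture_output=True).stdout
    subprocess.run(["git", "-C", repo, "am"], input=patches, check=True)

if __name__ == "__main__":
    package_changes("linux", "v3.18", "v3.19", "linux-delta.xz")   # Earth side
    apply_changes("ship-linux", "linux-delta.xz")                   # ship side
```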
To get a sense of the bandwidth required to keep the colonists relatively in sync, I analyzed the Linux kernel, one of the fastest-changing software projects in the world. Linux releases a new version approximately every three months, and the amount of data needed to describe the changes from one release to the next is only about 8 MB.
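The analysis itself can be reproduced with something along these lines; the tag list is illustrative and not necessarily the exact set of releases behind the 8 MB figure.

```python
# Rough bandwidth estimate: compressed size of the whole-tree diff between
# successive kernel releases, assuming a local clone with release tags.
import subprocess

def compressed_delta_bytes(repo, old, new):
    diff = subprocess.run(["git", "-C", repo, "diff", old, new],
                          check=True, capture_output=True).stdout
    xz = subprocess.run(["xz", "-9", "--stdout"], input=diff,
                        check=True, capture_output=True).stdout
    return len(xz)

if __name__ == "__main__":
    tags = ["v3.16", "v3.17", "v3.18", "v3.19"]   # illustrative releases
    for old, new in zip(tags, tags[1:]):
        mb = compressed_delta_bytes("linux", old, new) / 1e6
        print(f"{old} -> {new}: {mb:.1f} MB compressed")
```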
How to build it
Since the colonists will only be receiving source code, they need a way to build that code so that the result is consistent and contains the most useful executables possible. A Linux distribution like Ubuntu takes the source code of upstream projects and packages it in a consistent way. Ubuntu's builds happen behind an open source web service called Launchpad, which not only builds Ubuntu but also tracks bugs, hosts code, and much more. By syncing the Launchpad project and Ubuntu's packaging details, we provide the colonists with the tools they will need to build the software we send them.
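A minimal shipboard rebuild loop might look like the sketch below, assuming Debian/Ubuntu-style source packages; the directory layout and log location are hypothetical.

```python
# Sketch of rebuilding every received source tree into binary packages.
import pathlib
import subprocess

INCOMING = pathlib.Path("/srv/incoming-src")   # unpacked source trees from Earth
LOG = pathlib.Path("/srv/build-log.txt")

def build_all():
    with LOG.open("a") as log:
        for src in sorted(p for p in INCOMING.iterdir() if p.is_dir()):
            # dpkg-buildpackage builds binary packages from a Debianized tree;
            # -us -uc skips signing, since the ship manages its own archive keys.
            result = subprocess.run(["dpkg-buildpackage", "-us", "-uc"], cwd=src)
            log.write(f"{src.name}: {'ok' if result.returncode == 0 else 'FAILED'}\n")

if __name__ == "__main__":
    build_all()
```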
How to deploy it
Example configurations will be essential for giving the colonists a starting point from which to build their own production systems. I can see using a configuration management tool such as Puppet, or Canonical's Juju, which has aptly been described as an "executable white paper" because it is designed to give you a well-tested setup right out of the box. The colonists' software deployments will change throughout the journey, and as their needs change we will need to provide them with the tools to remain adaptable.
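The sketch below is not Puppet or Juju, just a toy illustration of the underlying idea: describe the desired state of a system and apply it idempotently, so the colonists can re-run it safely as their needs change. The paths and settings are invented.

```python
# Toy configuration management: write files only when they differ from the
# desired state, so repeated runs are harmless. Paths/values are made up.
import pathlib

DESIRED = {
    "/etc/hydroponics/pump.conf": "cycle_minutes=30\n",
    "/etc/airlock/alarm.conf": "threshold_kpa=95\n",
}

def apply(desired):
    for path, content in desired.items():
        p = pathlib.Path(path)
        if not p.exists() or p.read_text() != content:
            p.parent.mkdir(parents=True, exist_ok=True)
            p.write_text(content)
            print(f"updated {path}")

if __name__ == "__main__":
    apply(DESIRED)
```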
How to use it
Along with configs, documentation for users and administrators is essential. Ideally, this could be generated from the source code so it stays in lock-step with the actual code.
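As a small illustration of documentation generated from source, a docstring lives next to the code it describes, so tools such as pydoc or Sphinx can regenerate the manual from whatever revision the colonists are actually running. The function here is hypothetical.

```python
# Documentation that travels with the code: the docstring below can be
# rendered by pydoc or Sphinx from the exact source revision in use.

def co2_scrub_rate(cabin_ppm: float, target_ppm: float, hours: float) -> float:
    """Return the scrubbing rate (ppm per hour) needed to reach the target.

    Hypothetical example; the point is that this text stays in lock-step
    with the code because it is part of the code.
    """
    return max(cabin_ppm - target_ppm, 0.0) / hours

if __name__ == "__main__":
    import pydoc
    print(pydoc.render_doc(co2_scrub_rate))
```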
How to debug it
Knowledge of how to debug a program can be very domain specific. It often takes the form of back-and-forth communication between someone with a problem and the project maintainers, whether on a bug report, a mailing list, or in IRC or another chat client. Sending all of this information won't be possible, but of these sources, bug reports seem like the best place to start. Commit messages often include bug reference numbers, and we would like the colonists to be able to refer to at least part of the more detailed bug report.
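A sketch of that linkage: scan the commit messages in a range for bug numbers (using the "LP: #NNNN" convention Ubuntu uses) and look each one up in a locally synced bug archive. The repository name and archive layout here are assumptions.

```python
# Sketch: connect commit messages to a local, partially synced bug archive.
import re
import subprocess

BUG_RE = re.compile(r"LP:\s*#(\d+)")           # Ubuntu-style bug references

def bugs_referenced(repo, rev_range):
    log = subprocess.run(["git", "-C", repo, "log", "--format=%B", rev_range],
                         check=True, capture_output=True, text=True).stdout
    return sorted(set(BUG_RE.findall(log)))

def bug_summary(bug_id, archive="/srv/bugs"):
    # Assumed layout: one text file per bug, first line is the summary.
    try:
        with open(f"{archive}/{bug_id}.txt") as f:
            return f.readline().strip()
    except FileNotFoundError:
        return "(not in local archive)"

if __name__ == "__main__":
    for bug in bugs_referenced("some-package", "1.0..1.1"):   # placeholder repo/tags
        print(bug, bug_summary(bug))
```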
How best to modify it
Every software project has rules around what kind of code will be accepted, and many also have coding styles that they prefer. For consistency, the colonists should adopt the conventions of each specific project. Many open source projects don't provide roadmaps, since development happens through many contributors acting in their own interests, and even attempts at providing more detail rarely look more than a couple of months ahead. A roadmap becomes more useful for the colonists because it tells them where not to focus their work for the next release. Right now, that kind of information is usually stored informally in mailing lists; OpenStack and Ubuntu both use Launchpad's blueprint system to track big projects that are being proposed. Having an up-to-date, even if basic, roadmap in the source code could be very useful for preventing duplication of effort.
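One way to keep such a roadmap usable on the ship would be a simple machine-readable file in the source tree; the format below is invented purely for illustration, and real projects might point at Launchpad blueprints instead.

```python
# Toy machine-readable roadmap: flag items already owned upstream so the
# colonists know where not to duplicate effort. Format is invented.
import json

EXAMPLE = """
[
  {"cycle": "next",   "item": "new scheduler",     "owner": "upstream"},
  {"cycle": "next+1", "item": "filesystem rework", "owner": "unassigned"}
]
"""

def claimed_items(roadmap_json):
    entries = json.loads(roadmap_json)
    return [e["item"] for e in entries if e["owner"] != "unassigned"]

if __name__ == "__main__":
    print("avoid duplicating:", claimed_items(EXAMPLE))
```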
How to test it
All of the other work is for naught if the colonists can't trust the software to eventually run their critical systems. Most major projects have tests that run automatically, and we would want Earth to send the latest of those as well. That way the colonists' modifications get the same test coverage as a change made on Earth.
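On the ship this could be as simple as running each project's own test suite right after the build and refusing to promote failures to the stable release; the test command and directory below are placeholders, since every project invokes its tests differently.

```python
# Sketch: run a project's own test suite after a shipboard build.
import subprocess

def run_tests(src_dir, command=("make", "test")):   # command varies per project
    result = subprocess.run(command, cwd=src_dir)
    return result.returncode == 0

if __name__ == "__main__":
    ok = run_tests("incoming-build")                 # hypothetical build directory
    print("tests passed" if ok else "tests FAILED - do not promote to stable")
```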
Thought Experiment - If done today?
Earth would send out the above information on a regular basis, perhaps with a slight lag (say one month) to catch obvious regressions. This would have the colonists receive software in roughly the same order it is produced on Earth, which should allow them to pick whatever release cadence they want. For instance, they could have a "rolling" release that gets the latest updates and a release every three years that they use on systems that need more stability. Once the software arrives on the ship, it is built for whatever computing architectures they have chosen to use, and quality assurance tests can be run automatically. Software engineers on the ship can choose to backport important fixes to their stable release as those fixes come in. This is not substantially different from what a Linux distribution does today. Ubuntu uses a system called Launchpad (launchpad.net) which takes upstream code and builds it for multiple architectures; it also tracks bugs, hosts code, and has a private build system (PPAs) which allows for experimentation. By syncing source code, supporting multiple computing architectures becomes just a cost of the colonists' time and the energy required to build and test on each of them. Different computing architectures like x86_64 (Intel/AMD), ARM, MIPS, and PowerPC all have different advantages which could be important for different aspects of the starship.
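Extending the rebuild loop sketched earlier, the per-architecture step might look like this; the architecture list stands in for whatever hardware the ship carries, and sbuild stands in for a Launchpad-style builder farm.

```python
# Sketch: build a source package once per shipboard architecture.
import subprocess

ARCHITECTURES = ["amd64", "arm64", "ppc64el"]        # placeholder list

def build_for_all(dsc_file):
    results = {}
    for arch in ARCHITECTURES:
        # sbuild can build a Debian-style source package for a chosen arch,
        # given a build chroot for that architecture.
        proc = subprocess.run(["sbuild", "--arch", arch, dsc_file])
        results[arch] = proc.returncode == 0
    return results

if __name__ == "__main__":
    print(build_for_all("hello_2.10-1.dsc"))          # hypothetical source package
```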
The "Main" Computer
The main computer would run the majority of the computationally complex tasks on the ship, including science, navigation, communication processing, security, and all software development tasks. Three data centers would make up the main computer, dispersed throughout the ship for redundancy. Each data center would run cloud or dynamic scaling software such as OpenStack. This would allow workloads to be scaled up or down depending on need, and would allow workloads to be moved between data centers so maintenance can be performed. One data center could also run a more aggressive release, so new software code gets good test coverage from real workloads. The obvious hardware choice for supercomputers today is Intel, followed by AMD or PowerPC. Co-processors will also be essential for science workloads, which for supercomputers today means Intel or NVIDIA. Main storage would also be integrated with these data centers.
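With OpenStack in each data center, spreading a redundant workload across all three could look roughly like the sketch below, using the OpenStack SDK; the cloud names, image, and flavor are placeholders for whatever the ship actually runs, and each cloud would need an entry in the SDK's clouds.yaml configuration.

```python
# Sketch: boot one copy of a workload in each of the three data centers,
# so losing any single one leaves the service running elsewhere.
import openstack

DATA_CENTERS = ["dc-bow", "dc-midship", "dc-stern"]   # invented cloud names

def launch_everywhere(name, image, flavor):
    servers = []
    for cloud in DATA_CENTERS:
        conn = openstack.connect(cloud=cloud)
        img = conn.compute.find_image(image)
        flv = conn.compute.find_flavor(flavor)
        servers.append(conn.compute.create_server(
            name=f"{name}-{cloud}", image_id=img.id, flavor_id=flv.id))
    return servers

if __name__ == "__main__":
    launch_everywhere("nav-model", "ubuntu-lts", "m1.large")   # placeholder values
```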
Stationary Workstations
Workstations will be the primary way colonists interact with the ship and its computing. This covers everything from the workstations that control things like navigation to systems for personal or group entertainment. Basing all workstations on similar hardware builds in a supply of replacement parts for the more essential workstation systems. The ship should be designed so it can run on workstations alone in emergency situations, and the workstations should also be usable for extra computing when that makes sense. Distributed computing (like BOINC) could be very useful here.
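This is not BOINC itself, just a toy illustration of the spare-cycles idea: a workstation only asks the main computer for work when it is otherwise idle. The load threshold and the work-fetching function are invented for the sketch.

```python
# Toy spare-cycles loop: donate compute only when the workstation is idle.
import os
import time

IDLE_LOAD = 0.5                 # 1-minute load average considered "idle"

def fetch_work_unit():
    return None                 # placeholder: would pull a job from the ship's queue

def spare_cycles_loop():
    while True:
        load1, _, _ = os.getloadavg()
        if load1 < IDLE_LOAD:
            job = fetch_work_unit()
            if job:
                job()           # run the donated computation
        time.sleep(60)          # re-check once a minute

if __name__ == "__main__":
    spare_cycles_loop()
```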
Sensors, Controls, etc.
This category ranges from the actual communication equipment to environmental controls to a simple door control. What they have in common is that they will likely run pretty much continuously and won't have to do a lot of computing. The computing architectures that make sense here are MIPS (commonly used in routers) and ARM (common in cell phones). PowerPC might also be needed for radiation-sensitive systems, as radiation-hardened PowerPC designs have already spent a lot of time in space. The operating systems for these might have to be smaller than those used elsewhere: if Linux is too heavy, FreeRTOS seems like an ideal choice; otherwise, one of the transactional-update systems like Ubuntu Snappy should work.
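For flavor, a controller in this class spends its life in a small loop like the one below; on a FreeRTOS-class device it would be written in C, and the sensor and actuator helpers here are stand-ins for real hardware interfaces.

```python
# Toy always-on control loop for a small shipboard controller.
import time

TARGET_KPA = 101.3              # illustrative cabin pressure setpoint

def read_pressure_kpa():
    return 101.3                # placeholder for a real sensor read

def set_valve(open_fraction):
    pass                        # placeholder for a real actuator command

def control_loop():
    while True:
        error = TARGET_KPA - read_pressure_kpa()
        # Simple proportional response; a real controller would be tuned
        # and would fail safe on sensor faults.
        set_valve(max(0.0, min(1.0, 0.1 * error)))
        time.sleep(1)

if __name__ == "__main__":
    control_loop()
```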
Laptops, Mobile
For a journey of many decades, our current mobile technology simply won't last. Part of this is due to the focus of these industries right now: phones are replaced roughly every year and laptops perhaps every three. Additionally, losing power to Wi-Fi and charging seems wasteful when you are already surrounded by computers. On the other hand, smaller devices with smaller batteries might work fine.
Future
We don't have hardware available today that actually comes with all of the source code (firmware, drivers, and so on) we would need to support it for decades.
Most software development today is measured in KLOCs (thousands of lines of code), which is widely acknowledged to be an arbitrary metric. It would be advantageous for our purposes if that changed to a compressed-diff metric similar to the one I propose for sending software changes to the spacecraft.
There is also an opportunity to study revision control systems to see whether they can provide an extra level of redundancy (or effectively increased bandwidth) for our interstellar communication.
Lastly, but perhaps most obviously, both hardware and software are improving at tremendous rates.
Other developments
One development worth noting is ScalingStack, which used OpenStack to roughly double the performance of Launchpad's build farm: https://insights.ubuntu.com/2014/10/30/scalingstack-2x-performance-in-launchpads-build-farm-with-openstack/