Computer Science: Source: Distributed Systems - Defining A Distributed System

Definition of a Distributed System

A distributed system is one in which independent, self-sufficient, often heterogeneous and autonomous, spatially separated components must use a common interconnect to exchange information in order to coordinate their actions and allow the whole to appear to its users as a single coherent system.

This is explained below.

Paradigmatic Examples

Some paradigmatic examples of Distributed Systems are:

The World Wide Web over the Internet

Mobile Telephony over cellular networks such as O₂

Electronic funds transfer systems (EFTS) over special-purpose networks. These are mainly used for online banking (e.g., Lloyds TSB), for credit/debit card purchases, and for cash machines (ATMs).

Other examples can include:

Email (Hotmail, MSN, Google)

Instant Messaging (MSN, Yahoo, Skype)

Videoconferencing (Skype)

Multiplayer gaming (World of Warcraft, GuildWars)

Functional and Non-Functional Reasons

There are two kinds of reason why someone may want to construct a Distributed System: Functional and Non-Functional.

Functional reasons apply when a Distributed System makes it possible for us to do more things, for example:

By making continuously-evolving, remote resources accessible for sharing.

By opening proprietary processes to external interaction in order to foster cooperation.

Non-Functional reasons apply when a Distributed System makes it possible for us to do the same things in a better way, for example:

By leading to better performance/cost ratios

By scaling effectively and efficiently if demand for resources changes significantly

By scaling through modular, incremental expansion and contraction

By attaining high levels of reliability and availability

These Non-Functional reasons can be summarised as doing the same things:

More efficiently

More flexibly

More incrementally

More reliably

More often

The Benefit of Scale

As we all know, more really is more. This is why interconnecting many distributed systems has increased our ability to tackle problems - and come up with solutions - that even centralized systems in sequential mode cannot solve efficiently.

More users can do more work of a more valuable nature, more efficiently and more effectively, with distributed systems than with centralized ones.

Independent / Self-Sufficient

When we describe the components of a Distributed System as independent or self-sufficient, we mean that each component has its own:

Processor

State (i.e., Memory [RAM or HDD])

Resource Control and Management (e.g., an Operating System (OS) such as macOS or Linux)

Autonomous

When we describe a Distributed System as autonomous, we mean that each component may change, or be changed, of its own accord (i.e., without previous agreement or notification).

Heterogeneous

Describing a Distributed System as heterogeneous indicates that different components may have different capabilities (e.g., performance).

There are many sources of heterogeneity:

Different Hardware

Different Software

Different Software Interfaces

The above in combination, and more.

Because a heterogeneous system has such differences, interacting components can drift further apart in time. For example, electrical signals may take longer to get from one component to another, or components may fall out of sync.

Failures also cause components to have to deal with a gap in their knowledge of the current system state.

Given a system, the more spatially distant the components, the more representative of a distributed system it becomes.

Characteristics of a Distributed System

From the definition of a Distributed System given above, it is clear that in a distributed system:

Computation is concurrent, i.e., two or more instructions or events appear to execute or occur at the same time (in parallel).

There is no shared state, such as memory (RAM or HDD). This implies that there can also be no global clock.

Components may fail independently; this means that one component can fail while the other components in the system keep running.

These three characteristics of distributed systems have stood the test of time as being fundamentally challenging for theoreticians, as well as for software architects, designers and engineers.

A Complex Consequence

If the spatial separation of components is significant, then communication events have a non-negligible duration, meaning that the time it takes for any event to get from component A to component B does matter. This implies that communication time may become more significant than processing time.

Moreover, durations may vary for two instances of a communication event between the same two components. For example:

The time t₁ taken for a communication event to travel from component A to component B will not always equal the time t₂ taken for an identical communication event to travel between the same two components.

This shows that components may exhibit variable rates of information exchange.
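The point about t₁ and t₂ can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name, the base delay and the jitter range are all made up for illustration, and a real network's delay distribution will look different.

```python
import random


def simulate_delivery_times(n_messages, base_delay_ms=20.0, jitter_ms=15.0, seed=42):
    """Return simulated A-to-B delivery times (in ms) for n identical messages.

    base_delay_ms and jitter_ms are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Every delivery takes the base delay plus a random, non-negative jitter.
    return [base_delay_ms + rng.uniform(0.0, jitter_ms) for _ in range(n_messages)]


times = simulate_delivery_times(5)
t1, t2 = times[0], times[1]
print(f"t1 = {t1:.2f} ms, t2 = {t2:.2f} ms")
print(f"spread over 5 messages = {max(times) - min(times):.2f} ms")
```

Even though each message is identical and travels between the same two components, the simulated durations differ from one delivery to the next.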

Another Complex Consequence

Given the absence of a global physical clock, and the presence of heterogeneity, different components may perform even the same task with significantly different durations. This implies that asynchrony is inherent and that synchronization requires special measures. It also implies that components may exhibit variable rates of processing.
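The text does not say which special measures it has in mind. One well-known option is a logical clock in the style of Lamport, which orders events with counters instead of wall-clock time. The sketch below is a minimal illustration of that idea, and the class and method names are assumptions made for this example.

```python
class LogicalClock:
    """A minimal Lamport-style logical clock (illustrative names only)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # A local event advances the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending counts as an event; the new value is the message timestamp.
        return self.tick()

    def receive(self, message_time):
        # Jump past both the local counter and the sender's timestamp.
        self.time = max(self.time, message_time) + 1
        return self.time


a = LogicalClock()
b = LogicalClock()

a.tick()                        # a does some local work
a.tick()                        # a does more local work
stamp = a.send()                # a sends a message carrying its timestamp
b.tick()                        # b has been working at its own, slower rate
received_at = b.receive(stamp)  # b's clock moves ahead of the send timestamp

print(stamp, received_at)
```

Component b ran at a different rate from a, yet after the receive its counter is guaranteed to be larger than the timestamp on the message, so the send is ordered before the receive without any shared clock.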

Paradigmatic Examples Revisited

The Web

The Web was initially conceived as a way for users to share linked collections of documents. This means that the motivation for the Web was Functional.

The Web uses independent hosts to exchange webpages, emails, videos and audio. This is done via Internet protocols.

A few examples of such protocols are:

HTTP = Hypertext Transfer Protocol

TCP = Transmission Control Protocol

FTP = File Transfer Protocol

The Web allows users to have concurrent access to resources. Because there is no shared state, retaining state between requests requires the exchange of data, for example cookies. And because hosts may fail independently, a client cannot distinguish between a host that is slow and a host that is down.
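The difficulty of telling a slow host from a down host can be sketched with a client that waits for a reply with a timeout. This is a simulation under stated assumptions: threads and a queue stand in for real hosts and a real network, and the function names and timings are invented for illustration.

```python
import queue
import threading
import time


def slow_host(replies, delay_s):
    """Simulated host that does answer, but only after delay_s seconds."""
    time.sleep(delay_s)
    replies.put("response")


def request(host_is_down, timeout_s=0.1, slow_delay_s=0.5):
    """Send one simulated request and wait at most timeout_s for the reply."""
    replies = queue.Queue()
    if not host_is_down:
        # A live host eventually replies; a down host never does.
        threading.Thread(
            target=slow_host, args=(replies, slow_delay_s), daemon=True
        ).start()
    try:
        return replies.get(timeout=timeout_s)
    except queue.Empty:
        return "timeout"


observed_slow = request(host_is_down=False)  # host is alive but too slow
observed_down = request(host_is_down=True)   # host has failed
print(observed_slow, observed_down)
```

From the client's side both calls end in the same timeout, so the client has no way of knowing which situation it is in. Waiting longer helps only if the host was merely slow.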

Mobile Telephony

For a long time, Mobile Phone Networks have been able to scale up and down in response to changes in demand. For example, between 2006 and 2007 there were some 3 million net additions, and the same again between 2007 and 2008!

Mobile Telephony is mainly made up of devices - such as phones, tablets and netbooks - that can be used to roam a network. This allows the devices to exchange phone calls, images and video. All of this is done via a cellular network.

Mobile Telephony also allows users to send messages concurrently; this requires a customer-provider interaction to retrieve texts kept centrally.

EFTS

Electronic Funds Transfer Systems have many member organizations and can be quite large, yet are still highly reliable and highly available (e.g., LINK had 51 members with a total of 60,000 ATMs).

Customer and purchase data can be read off plastic cards, for example the magnetic stripes on the back of credit/debit cards. This data can be used to request authorization to transfer money between bank accounts, using a special-purpose network - e.g., the LINK network in the case of the large number of UK ATMs.

Millions of transfers can happen simultaneously, so they require precise interaction rules to avoid transactions with incorrect results.
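To give a feel for what a precise interaction rule might look like, here is a minimal sketch in which every transfer must hold a lock while it checks and updates balances. It is not how LINK or any real EFTS works: the Bank class, the account names and the single in-process lock are all assumptions made for illustration.

```python
import threading


class Bank:
    """Toy in-memory bank; a single lock stands in for real interaction rules."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self._lock = threading.Lock()

    def transfer(self, source, target, amount):
        # Rule: check and update both accounts as one indivisible step.
        with self._lock:
            if self.balances[source] < amount:
                return False  # reject rather than overdraw
            self.balances[source] -= amount
            self.balances[target] += amount
            return True


bank = Bank({"alice": 100, "bob": 100})

# 50 transfers of 1 from alice to bob, 25 transfers of 2 from bob to alice.
threads = [
    threading.Thread(target=bank.transfer, args=("alice", "bob", 1))
    for _ in range(50)
]
threads += [
    threading.Thread(target=bank.transfer, args=("bob", "alice", 2))
    for _ in range(25)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(bank.balances.values())
print(bank.balances, total)
```

Because each transfer is applied as one indivisible step, no money is created or lost however the threads interleave. A real system has to achieve the same effect across many organizations and machines, which is considerably harder than taking one local lock.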

Thursday, 10 September 2009
