Thursday 10 September 2009

Distributed Systems - Defining A Distributed System

Definition of a Distributed System


A distributed system is one in which independent, self-sufficient, often heterogeneous and autonomous, spatially separated components must use a common interconnect to exchange information in order to coordinate their actions and allow the whole to appear to its users as a single coherent system.



This is explained below.


Paradigmatic Examples


Some paradigmatic examples of Distributed Systems are:

  • The World Wide Web over the Internet

  • Mobile Telephony over cellular networks such as O2

  • Electronic funds transfer systems (EFTS) over special-purpose networks. These are mainly used for online banking (e.g., Lloyds TSB), for credit/debit card purchases and for cash machines (ATMs).


Other examples can include:

  • Email (Hotmail, MSN, Google)

  • Instant Messaging (MSN, Yahoo, Skype)

  • Videoconferencing (Skype)

  • Multiplayer gaming (World of Warcraft, GuildWars)


Functional and Non-Functional Reasons


There are two kinds of reason why someone may want to construct a Distributed System: Functional and Non-Functional.

Functional reasons apply when a Distributed System makes it possible for us to do more things. These can include:

  • Making continuously-evolving, remote resources accessible for sharing.

  • Opening proprietary processes to external interaction in order to foster cooperation.


Non-Functional reasons apply when a Distributed System makes it possible for us to do the same things in a better way. These can include:

  • Achieving better performance/cost ratios

  • Scaling effectively and efficiently if demand for resources changes significantly

  • Scaling through modular, incremental expansion and contraction

  • Attaining high levels of reliability and availability


These Non-Functional reasons can be summarized as doing the same things:

  • More efficiently

  • More flexibly

  • More incrementally

  • More reliably

  • More often


The Benefit of Scale




As we all know, more really is more. This is why interconnecting many distributed systems has increased our ability to tackle problems - and come up with solutions - that even centralized systems in sequential mode cannot solve efficiently.

More users can do more work of a more valuable nature, more efficiently and more effectively, with distributed systems than with centralized ones.

Independent / Self-Sufficient


When we describe a Distributed System as independent or self-sufficient, we mean that each component has its own:

  • Processor

  • State (i.e., memory [RAM or HDD])

  • Resource Control and Management (e.g., an Operating System (OS) such as Mac OS or Linux)


Autonomous


When we describe a Distributed System as autonomous, we mean that each component may:

  • Change, or

  • Be changed, of its own accord (i.e., without previous agreement or notification).


Heterogeneous


When we describe a Distributed System as heterogeneous, we mean that different components may have different capabilities (e.g., performance).

There are many sources of heterogeneity:

  • Different Hardware

  • Different Software

  • Different Software Interfaces

  • The above in combination, and more.


Such differences can cause interacting components in a heterogeneous system to drift further apart in time. For example, electrical signals may take longer to get from one component to another, or components may fall out of sync.

Failures also cause components to have to deal with a gap in their knowledge of the current system state.

Given a system, the more spatially distant the components, the more representative of a distributed system it becomes.

Characteristics of a Distributed System


From the definition of a Distributed System given above, it is clear that in a distributed system:

  1. Computation is concurrent: two instructions or events may appear to operate or occur at the same time, in parallel.

  2. There is no shared state, such as memory (RAM or HDD); this implies that there can also be no global clock.

  3. Components may fail independently; this means that one component can fail while the other components in the system carry on running.


These 3 characteristics of distributed systems have stood the test of time as being fundamentally challenging for theoreticians, as well as for software architects, designers and engineers.

A Complex Consequence


If the spatial separation of components is significant, then communication events have a non-negligible duration, meaning that the time it takes for any event to get from component A to component B does matter. This implies that communication time may come to dominate processing time.

Moreover, durations may vary for two instances of a communication event between the same two components. For example:

The time, t1, taken for a communication event to travel from component A to component B will not always be equal to the time, t2, taken for the exact same communication event to travel between the exact same components A and B on another occasion.



Components may therefore show variable rates of information exchange.
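The point about t1 and t2 can be illustrated with a small simulation. This is only a sketch: the function name `simulated_delay_ms`, the base delay and the jitter range are all made-up values for illustration, not measurements of any real network.

```python
import random


def simulated_delay_ms(rng, base_ms=20.0, jitter_ms=15.0):
    """Return one hypothetical one-way delay from component A to component B."""
    # A fixed propagation part plus a random part standing in for queueing,
    # routing changes and other effects that differ from one send to the next.
    return base_ms + rng.uniform(0.0, jitter_ms)


rng = random.Random(42)  # seeded so that the run is repeatable

# The "exact same" communication event, sent twice between the same components.
t1 = simulated_delay_ms(rng)
t2 = simulated_delay_ms(rng)

print(f"t1 = {t1:.2f} ms, t2 = {t2:.2f} ms, equal: {t1 == t2}")
```

Both delays fall in the same range, but they are not equal, which is all the example is meant to show.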

Another Complex Consequence


Given the absence of a global (physical) clock, and the presence of heterogeneity, different components may perform even the same task with significantly variable durations. This implies that asynchrony is inherent and synchronization requires special measures. It also implies that components may exhibit variable rates of processing.
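One well-known example of such a "special measure" is a Lamport-style logical clock, which orders events with counters instead of relying on a shared physical clock. The post itself does not name this technique, so treat the sketch below as one possible illustration; the class name `LogicalClock` and its methods are chosen here for the example.

```python
class LogicalClock:
    """Lamport-style logical clock: a counter, not a wall clock."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # A local event advances the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the timestamp travels with the message.
        return self.tick()

    def receive(self, message_time):
        # Move past the sender's timestamp so that the receive event is
        # always ordered after the corresponding send event.
        self.time = max(self.time, message_time) + 1
        return self.time


a = LogicalClock()
b = LogicalClock()

a.tick()                   # component A does some local work
stamp = a.send()           # A sends a message to B, stamped with A's counter
b_time = b.receive(stamp)  # B's counter jumps ahead of the stamp it received

print(stamp, b_time)
```

Neither component ever reads a global clock; they only agree that the receive happened after the send.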

Paradigmatic Examples Revisited


The Web


The Web was initially conceived as a way for users to share linked collections of documents. This means that the reason for building the Web was Functional.

The Web uses independent hosts to exchange webpages, emails, videos and audio. This is done via Internet protocols.

A few examples of these protocols are:

  • HTTP = Hypertext Transfer Protocol

  • TCP = Transmission Control Protocol

  • FTP = File Transfer Protocol


The Web allows users to have concurrent access to resources. Because hosts share no state, retaining state between requests requires the exchange of data, for example cookies. Hosts can also fail independently, and to a user there is no distinction between a host that is slow and a host that is down.
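The remark about slow and down hosts can be sketched with simulated hosts and a timeout. Everything here is hypothetical: the hosts are plain Python functions running in threads rather than real web servers, and the names `ask`, `slow_host`, `down_host` and `fast_host` exist only for this example.

```python
import queue
import threading
import time


def ask(host, timeout_s):
    """Send a request to a simulated host; return its reply, or None on timeout."""
    replies = queue.Queue()
    # daemon=True so a host that has not replied yet cannot keep the program alive.
    threading.Thread(target=host, args=(replies,), daemon=True).start()
    try:
        return replies.get(timeout=timeout_s)
    except queue.Empty:
        return None


def slow_host(replies):
    time.sleep(0.5)  # alive, but slower than the caller is willing to wait
    replies.put("page")


def down_host(replies):
    pass  # crashed: it never replies at all


def fast_host(replies):
    replies.put("page")


slow_result = ask(slow_host, timeout_s=0.1)
down_result = ask(down_host, timeout_s=0.1)
fast_result = ask(fast_host, timeout_s=0.1)

print(slow_result, down_result, fast_result)
```

From the caller's side, the slow host and the down host produce exactly the same observation: no reply within the timeout.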

Mobile Telephony


For a long time, Mobile Phone Networks have been able to scale up and down in response to changes in demand. For example, between 2006 and 2007 there were some 3 million net additions; the same was true between 2007 and 2008!

Mobile Telephony is mainly made up of devices - such as phones, tablets and netbooks - that can be used to roam a network. This allows the devices to exchange phone calls, images and video. All of this is done via a cellular network.

Mobile Telephony also allows users to send messages concurrently; this requires a customer-provider interaction to retrieve texts kept centrally.

EFTS


Electronic Funds Transfer Systems have a lot of member organizations and can be quite large, yet are still highly reliable and highly available (e.g., LINK had 51 members with a total of 60,000 ATMs).

Customer and purchase data can be read off plastic cards, for example from the magnetic stripes on the back of credit/debit cards. This data can be used to request authorization to transfer money between bank accounts, using a special-purpose network - e.g., the LINK network in the case of a large number of UK ATMs.

Millions of transfers can happen simultaneously, so they require precise interaction rules to avoid transactions with incorrect results.
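As a rough illustration of what a "precise interaction rule" might look like, the sketch below makes each transfer an all-or-nothing step guarded by a lock. It is a toy, in-memory model under stated assumptions: the `Bank` class, the account names and the amounts are invented for the example, and a real EFTS works across many machines rather than inside one process.

```python
import threading


class Bank:
    """Toy in-memory ledger; a real EFTS is far more involved."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self._lock = threading.Lock()

    def transfer(self, src, dst, amount):
        # Hold the lock so that the balance check and both updates happen as
        # one step; no other transfer can see a half-finished state.
        with self._lock:
            if self.balances[src] < amount:
                return False  # rule: never overdraw the source account
            self.balances[src] -= amount
            self.balances[dst] += amount
            return True


bank = Bank({"alice": 100, "bob": 100})


def worker():
    # Each worker moves money back and forth many times.
    for _ in range(1000):
        bank.transfer("alice", "bob", 1)
        bank.transfer("bob", "alice", 1)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(bank.balances.values())
print(bank.balances, total)
```

However the transfers interleave, no money is created or destroyed and no account goes negative, because every transfer follows the same rule.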
