Understanding distributed systems (Part 1 – Overview of distributed systems)

Tram Ho

Today I am starting a series of articles on the basics of distributed systems, a subject I studied at university.

Distributed systems developed for the following reasons:

  • Growing demand for sharing resources and information. As information technology has been applied more widely, computer networks have kept developing, and the hardware and software resources of networked computers have been continually expanded and upgraded. This rapid growth has created a strong need to share resources and information within a unified system, a need that centralized operating systems and pure network operating systems cannot meet.
  • Falling workstation costs. Cheaper workstations have become more common, and their number and quality keep rising, which increases the need for distributed processing.
  • Widespread use of networks. Networking and network operating systems have created a technical infrastructure (hardware, network connections, software) that serves as the basis for developing distributed systems.

To open the series, I would like to give an overview of distributed systems. Let's find out!

Definition of distributed systems

  • A distributed system can be regarded as a computing system whose computational components are spread over different geographical locations.
  • It is a collection of independent computers connected by a communication infrastructure.
  • The computers may have different hardware and software, are linked by network technologies (computer networks), and can coordinate and share resources.
  • Together they perform a common task.
  • The system provides computational resources to users in an agreed form, with an agreed interface and an agreed way of accessing services.
  • Users do not need to care about the internal details of the system.

Characteristics of distributed systems

A distributed system has the following four characteristics:

  • Resource sharing
  • Transparency
  • Openness
  • Elasticity

1. Resource sharing

  • Computer resources are managed by a resource management program.
  • The resource management program receives messages sent by other programs, converts them into accesses to the physical resource, then receives the response from the physical resource and returns it to the requesting program.
  • Benefits of sharing resources:
    • Lower investment costs: fewer peripheral devices need to be bought for each computer, which reduces the investment per user.
    • Allowing users to connect to remote resources on different machines increases the availability of the system.
  • Drawbacks:
    • Programs that have network connections can introduce security holes, which reduces the security level of the system.
    • As information sharing is extended, someone who monitors all the shared information may be able to infer hidden information from it, so privacy-related information may be exposed.
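
The request/response role of a resource management program described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `ResourceManager` class, the `Printer` resource and the dictionary message format are hypothetical names invented for the example, and messages are passed as plain function arguments rather than over a network.

```python
class Printer:
    """Hypothetical physical resource: remembers what was printed."""

    def __init__(self):
        self.pages = []

    def print_page(self, text):
        self.pages.append(text)
        return len(self.pages)  # number of pages printed so far


class ResourceManager:
    """Receives messages, turns them into resource accesses, returns replies."""

    def __init__(self):
        self._resources = {}

    def register(self, name, resource):
        self._resources[name] = resource

    def handle(self, message):
        # A message names the resource, the operation and its arguments.
        resource = self._resources.get(message["resource"])
        if resource is None:
            return {"ok": False, "error": "unknown resource"}
        operation = getattr(resource, message["operation"], None)
        if not callable(operation):
            return {"ok": False, "error": "unknown operation"}
        # Access the physical resource and pass its response back.
        result = operation(*message.get("args", []))
        return {"ok": True, "result": result}


manager = ResourceManager()
manager.register("printer", Printer())
reply = manager.handle(
    {"resource": "printer", "operation": "print_page", "args": ["hello"]}
)
```

The requesting program only ever sees the reply dictionary; it never touches the `Printer` object directly, which is the point of putting a manager in front of the resource.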

2. Transparency

Transparency is the ability to present users with a logical view of the system that is independent of the physical infrastructure. The system always appears to the user as a single system, and the way it is distributed underneath is hidden.

Transparency is considered from many different angles:

  • Access transparency: hides differences in data representation and in how resources are accessed.
  • Location transparency: hides where a resource is located, so users cannot see its location.
  • Migration transparency: hides the fact that a resource may be moved to another location.
  • Relocation transparency: hides the fact that a resource may be moved while it is in use.
  • Replication transparency: hides the fact that data is served from many different copies (replication is used extensively in distributed systems to increase performance and availability).
  • Concurrency transparency: hides the fact that a resource is accessed by multiple users at the same time.
  • Failure transparency: hides the failure and recovery of a resource.
  • Persistence transparency: hides whether a resource or its data is kept in persistent storage (disk) or in volatile memory (RAM).

Providing transparency is part of what makes a system fit the definition of a distributed system. However, achieving absolute transparency leads to very high resource costs, so absolute transparency is not always the goal.

=> We need to consider in which cases transparency is really needed, in order to save costs.
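
To make a few of these forms of transparency concrete, here is a small Python sketch of a store that hides location, replication and failure from its caller. Everything in it is an assumption made for illustration: `Replica`, `TransparentStore` and the host names are hypothetical, and the "machines" are just in-memory objects.

```python
class Replica:
    """Hypothetical copy of the data stored on one machine."""

    def __init__(self, host, data, alive=True):
        self.host = host
        self.data = data
        self.alive = alive

    def read(self, key):
        if not self.alive:
            raise ConnectionError(f"{self.host} is unreachable")
        return self.data[key]


class TransparentStore:
    """Single logical store that hides where and how often data is kept."""

    def __init__(self, replicas):
        self._replicas = list(replicas)

    def read(self, key):
        # Try each copy in turn; the caller never sees hosts or failures
        # unless every replica is down.
        for replica in self._replicas:
            try:
                return replica.read(key)
            except ConnectionError:
                continue
        raise ConnectionError("no replica available")


store = TransparentStore([
    Replica("node-a", {"x": 1}, alive=False),  # failed copy
    Replica("node-b", {"x": 1}),               # healthy copy
])
value = store.read("x")
```

The caller writes `store.read("x")` and gets a value back without knowing that one copy had failed. The cost mentioned above shows up here too: every hidden retry takes time, which is one reason full transparency is not always worth paying for.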

3. Openness

By definition, a distributed system is a collection of independent computers connected by a communication infrastructure that provides services to users as if it were a single computer. Physically, a distributed system consists of many computers interacting with each other. On top of that physical system, different levels of abstraction divide the distributed system into interacting components.

  • An open system allows components made by different manufacturers to be interchanged, and allows new components to be added to the system.
  • Open distributed systems provide services according to specifications of the syntax and semantics of those services, that is, interfaces.
  • Two kinds of components are involved in an interface:
    • The component that implements the interface: responsible for providing services to other components.
    • The component that uses the interface: uses the services provided by the implementing component.

=> These two components can interact and cooperate only if they implement and use the same interface.

  • For an interface to be implemented and used, it must be:
    • Complete: if the interface does not specify everything needed to implement and use it, implementers and users will each fill in the missing parts themselves. They will fill them in differently, and the resulting components cannot communicate.
    • Neutral: independent of any particular technology, platform or infrastructure. It only defines the general interaction between the two components.
  • Interfaces are written in an IDL (Interface Definition Language), which keeps the interface neutral and makes it easier to validate whether the interface is complete.
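
The split between a component that implements an interface and a component that uses it can be sketched as follows. This is an illustration only: a Python abstract base class stands in for an IDL here, so unlike a real IDL it is not language-neutral, and the names `FileService`, `InMemoryFileService` and `backup` are hypothetical.

```python
from abc import ABC, abstractmethod


class FileService(ABC):
    """Hypothetical interface: the only thing both sides agree on."""

    @abstractmethod
    def read(self, name: str) -> bytes: ...

    @abstractmethod
    def write(self, name: str, data: bytes) -> None: ...


class InMemoryFileService(FileService):
    """One vendor's implementation of the interface."""

    def __init__(self):
        self._files = {}

    def read(self, name):
        return self._files[name]

    def write(self, name, data):
        self._files[name] = data


def backup(service: FileService, name: str) -> str:
    """A component that uses the interface, unaware of the implementation."""
    copy_name = name + ".bak"
    service.write(copy_name, service.read(name))
    return copy_name


service = InMemoryFileService()
service.write("notes.txt", b"hello")
copy_name = backup(service, "notes.txt")
```

Because `backup` depends only on `FileService`, any other implementation of the same interface could be swapped in without changing it. An implementation that leaves out one of the abstract methods cannot even be instantiated, which is a small-scale version of checking that an interface has been implemented completely.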

4. Elasticity

  • Elasticity (also called scalability) is the ability of the system to cope with changes in the infrastructure around it.
  • Elasticity is usually considered from three angles:
    • Size: the system stays responsive as the number of computers, the number of users and the number of requests sent between computers increase.
    • Geography: information can be exchanged over a wide area network as well as over a local area network.
    • Organization: the system keeps working when an organization changes or when a computer moves from one organization to another. To support this, the system is organized into domains, so that when an organization changes we only need to change the affected domain and the trust relationships between domains.

Distributed system components

1. Hardware

  • A computing device can be single-processor or multiprocessor. Multiprocessor devices are now widely used.
  • In a multiprocessor device, the processors are usually connected to memory through a shared system bus. In very high performance systems with many CPUs and memory modules, a single bus shared by every CPU and memory module means that while one CPU-memory pair is using the bus, no other pair can use it, which leads to long waiting times. Systems with many CPUs and memory modules therefore use other architectures, such as an architecture with a single switch or one with a high-speed switching network. These, however, are expensive to build.
  • Smaller organizations that need high performance computing but cannot afford a supercomputer often choose a cheaper option: buying many minicomputers or lower-performance, lower-cost machines and combining them so that their total performance approaches that of a supercomputer. Such a group of machines is called a computer cluster.
  • Computer clusters are used for many purposes, such as compute servers and hosting servers.
  • For the computers in a cluster to work together, they normally have to be relatively similar. Such a cluster is called a homogeneous set of computers.
  • When computers used for different purposes are nevertheless connected to each other, the result is called a heterogeneous set of computers.
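
The waiting-time problem of a shared bus can be shown with a rough counting model. The sketch below is an idealization and rests on assumptions that are not in the text above: each transfer takes exactly one time slot, the switch is an ideal non-blocking one, and transfers are scheduled as well as possible. The function names are hypothetical.

```python
from collections import Counter


def slots_on_shared_bus(requests):
    """One transfer per time slot, whichever CPU or memory is involved."""
    return len(requests)


def slots_on_switch(requests):
    """Transfers between different CPU-memory pairs run in parallel;
    transfers that share a CPU or a memory module still queue up."""
    if not requests:
        return 0
    per_cpu = Counter(cpu for cpu, _ in requests)
    per_memory = Counter(memory for _, memory in requests)
    # With ideal scheduling, the busiest CPU or memory module sets the pace.
    return max(max(per_cpu.values()), max(per_memory.values()))


# Each request is (cpu_id, memory_module_id).
requests = [(0, 0), (1, 1), (2, 2), (3, 0)]
bus_slots = slots_on_shared_bus(requests)
switch_slots = slots_on_switch(requests)
```

With these four requests the shared bus needs four time slots, while the idealized switch needs only two, because only memory module 0 is requested twice. Real switches are more constrained than this model, but the gap grows in the same direction as CPUs and memory modules are added.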

2. Software

The software of a distributed system plays a role similar to that of an operating system:

  • Resource management
  • Hiding complexity and heterogeneity

This software is classified into two types:

  • Tightly coupled systems (DOS): distributed operating systems
  • Loosely coupled systems (NOS): network operating systems

Distributed operating system (DOS)

  • A single operating system installed across the hardware of the whole distributed system
  • Provides a unified user interface
  • Users and application developers do not need to care about how the system is distributed
  • Achieves absolute transparency, but sacrifices the independence of the computing devices that make up the system
  • Must be able to adapt to various types of computers and local operating systems
  • No commercial version exists

Network operating system (NOS)

  • Provides basic services that let computers connect to each other and use each other's resources (remote services)
  • Focuses on serving applications at the highest (application) layer
  • Gives programs mechanisms for exchanging information with each other, such as TCP, UDP and sockets
  • Offers less transparency than a distributed operating system, because the only requirement is that each computer's operating system supports networking
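
As an illustration of the socket-level mechanisms a network operating system offers, here is a minimal UDP exchange using Python's standard `socket` module. It is a sketch under simplifying assumptions: both sockets live in one process on the loopback address `127.0.0.1`, standing in for two separate computers, and the "ping" message and uppercase reply are made up for the example.

```python
import socket

# "Server" side: bind to any free port on the loopback interface.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
server.settimeout(2.0)
server_address = server.getsockname()

# "Client" side: a second socket, standing in for another computer.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)

# The client must name the server's address explicitly.
client.sendto(b"ping", server_address)
request, client_address = server.recvfrom(1024)

# The server replies to the address the request came from.
server.sendto(request.upper(), client_address)
reply, _ = client.recvfrom(1024)

client.close()
server.close()
```

Note that the client has to know the server's address and port. That is the lower transparency of a network operating system in practice: the location of the remote service is visible to the program.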

Middleware

  • Combines the advantages of DOS and NOS

  • Middleware services: transparent access, high-level communication facilities, naming services, persistent storage services, …

To summarize this part:

System | Description | Main goal
DOS | Tightly coupled OS for multicomputer hardware (multiprocessors or homogeneous computers) | Transparency
NOS | An OS on each local computer | Provide local services to other computers
Middleware | Implements basic services for running and developing applications | Distribution transparency

Comparison of distributed systems software

Item | Distributed OS | Network OS | Middleware-based OS
Degree of transparency | High (or very high) | Low | High
Same OS on all nodes | Yes | No | No
Number of OS copies | 1 (or N) | N | N
Basis for communication | Shared memory (or message passing) | Files | Depends on the model
Resource management | Global, centralized (or distributed) | Per node | Per node
Elasticity | Moderate | Yes | Varies
Openness | Closed | Open | Open

These are some of the basic concepts in an overview of distributed systems. In the next article, I will introduce the models and architectures of distributed systems.

Thanks for reading

References : Lecture on Executive Clown – DHBKHN


Source : Viblo