Class HostFaultInjection

java.lang.Object
org.cloudbus.cloudsim.core.CloudSimEntity
org.cloudsimplus.faultinjection.HostFaultInjection
All Implemented Interfaces:
Cloneable, Comparable<SimEntity>, Runnable, Identifiable, Nameable, SimEntity

public class HostFaultInjection extends CloudSimEntity
Generates random failures for the PEs of Hosts inside a given Datacenter. A fault injection object usually has to be created after the VMs are created, which makes it easier to define the function used to clone failed VMs.

The events happen in the following order:

  1. a time to inject a Host failure is generated using a given Random Number Generator;
  2. a Host is randomly selected to fail at that time using an internal Uniform Random Number Generator with the same seed of the given generator;
  3. the number of Host PEs to fail is randomly generated using the internal generator;
  4. failed physical PEs are removed from affected VMs; VMs with no remaining PEs are destroyed, and clones of them are submitted to the DatacenterBroker of the failed VMs;
  5. another failure is scheduled for a future time using the given generator;
  6. the process repeats until the end of the simulation.
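The ordering above can be illustrated with a minimal, self-contained sketch. It does not use the CloudSim Plus classes: `FaultScheduleSketch`, `Fault` and `schedule` are hypothetical names, `java.util.Random` stands in for both the given generator and the internal uniform generator, and the exponential gap between faults is an assumption made only for illustration. Step 4 (removing PEs from VMs and cloning) is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical model of the fault schedule; not the HostFaultInjection implementation. */
final class FaultScheduleSketch {
    /** One injected fault: when it happens, which Host fails and how many PEs fail. */
    static final class Fault {
        final double timeHours;
        final int hostIndex;
        final int failedPes;

        Fault(double timeHours, int hostIndex, int failedPes) {
            this.timeHours = timeHours;
            this.hostIndex = hostIndex;
            this.failedPes = failedPes;
        }
    }

    static List<Fault> schedule(long seed, int hosts, int pesPerHost,
                                double meanGapHours, double endTimeHours) {
        Random given = new Random(seed);     // stands in for the given generator
        Random internal = new Random(seed);  // internal uniform generator, same seed
        List<Fault> faults = new ArrayList<>();
        double now = 0;
        while (true) {
            // Steps 1 and 5: draw the time of the next fault (exponential gap is an assumption).
            now += -meanGapHours * Math.log(1.0 - given.nextDouble());
            if (now > endTimeHours) {
                break; // Step 6: stop at the end of the simulation.
            }
            int hostIndex = internal.nextInt(hosts);          // Step 2: pick a Host
            int failedPes = 1 + internal.nextInt(pesPerHost); // Step 3: 1..pesPerHost PEs fail
            faults.add(new Fault(now, hostIndex, failedPes));
        }
        return faults;
    }
}
```

Calling `FaultScheduleSketch.schedule(42L, 5, 8, 1.0, 10.0)` would list the faults for 5 Hosts with 8 PEs each over 10 simulated hours, with a mean gap of 1 hour between faults.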

When a Host's PEs fail, no VM is affected if the Host still has more working PEs than its running VMs require.

Suppose X PEs fail and X is lower than the total number of available PEs. In this case, the X PEs are removed cyclically, one by one, from the running VMs, so some VMs may continue running with fewer PEs than they initially requested. On the other hand, if the number of working Host PEs after the failure is lower than the number required to run all VMs, some VMs will be destroyed.
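The cyclic removal can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the class's code: `CyclicPeRemovalSketch` and `removeFailedPes` are hypothetical names, VMs are reduced to their PE counts, and the number of PEs taken from VMs is assumed to be the shortfall between what the VMs require and what the Host still has working.

```java
import java.util.Arrays;

/** Hypothetical model of removing failed PEs from VMs in round-robin order. */
final class CyclicPeRemovalSketch {
    /**
     * @param vmPes          PEs currently assigned to each running VM
     * @param hostWorkingPes PEs of the Host still working after the failure
     * @return PEs left for each VM; a VM left with 0 PEs would be destroyed
     */
    static int[] removeFailedPes(int[] vmPes, int hostWorkingPes) {
        int[] remaining = Arrays.copyOf(vmPes, vmPes.length);
        int required = Arrays.stream(remaining).sum();
        // Assumption: only the shortfall is taken from VMs; spare PEs absorb the rest.
        int toRemove = Math.min(required, Math.max(0, required - hostWorkingPes));
        int i = 0;
        while (toRemove > 0) {
            if (remaining[i] > 0) {
                remaining[i]--; // take one PE from this VM
                toRemove--;
            }
            i = (i + 1) % remaining.length; // move on to the next VM, wrapping around
        }
        return remaining;
    }
}
```

For VMs with 2, 2 and 1 PEs on a Host left with 2 working PEs, the sketch removes 3 PEs one at a time and leaves the VMs with 1, 1 and 0 PEs, so the third VM would be destroyed.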

If all PEs are removed from a VM, the VM is automatically destroyed and a snapshot (clone) of it is taken and submitted to the broker, so that the clone can start executing on another Host. In this case, all Cloudlets that were still running inside the VM are cloned too and restart executing from the beginning.

If a Cloudlet running inside a VM affected by a PE failure requires Y PEs but the VM no longer has that many PEs, the Cloudlet will continue executing but will take longer to finish. For instance, if a Cloudlet requires 2 PEs but the VM was left with just 1 PE after the failure, the Cloudlet will take twice as long to finish.
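The slowdown in the example can be expressed as a small calculation. The sketch below assumes the finish time scales linearly with the ratio of required to available PEs, which matches the 2-PE example above but is otherwise an assumption; `CloudletSlowdownSketch` and `finishTime` are hypothetical names.

```java
/** Hypothetical model of how a Cloudlet slows down when its VM loses PEs. */
final class CloudletSlowdownSketch {
    /**
     * @param originalTime time the Cloudlet needs with all required PEs
     * @param requiredPes  PEs the Cloudlet requires
     * @param availablePes PEs the VM still has (a VM with 0 PEs is destroyed instead)
     */
    static double finishTime(double originalTime, int requiredPes, int availablePes) {
        if (requiredPes <= 0 || availablePes <= 0) {
            throw new IllegalArgumentException("PE counts must be positive");
        }
        // Extra PEs beyond the requirement do not speed the Cloudlet up.
        int usedPes = Math.min(requiredPes, availablePes);
        return originalTime * requiredPes / usedPes;
    }
}
```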

NOTES:

  • Host PE failures may happen after all of its VMs have finished executing. As a result, the presented simulation results may show that the number of PEs in a Host is lower than the number required by its VMs. In this case, the VMs shown in the results finished executing before some failures happened. Analysing the logs makes it easy to confirm that.
  • Failure inter-arrival times are defined in minutes, since seconds are too small a time unit for such a value, and it doesn't make sense to define the number of failures per second. Therefore, the generator of failure arrival times given to the constructor considers the time in minutes, even though the simulation time unit is seconds. Since Cloudlets commonly take just a few seconds to finish, mainly in simulation examples, failures may happen only after the Cloudlets have finished. One should therefore make sure that Cloudlet lengths are large enough to allow failures to happen before the Cloudlets end.
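The second note boils down to a unit conversion and a comparison. The sketch below is a rough check under simplifying assumptions: the Cloudlet starts at time 0, its length is given in million instructions (MI), and it runs on a PE with a fixed MIPS capacity. `CloudletLengthSketch` and its methods are hypothetical names.

```java
/** Hypothetical check of whether a Cloudlet runs long enough to be hit by a fault. */
final class CloudletLengthSketch {
    static double minutesToSecs(double minutes) {
        return minutes * 60.0;
    }

    /** Execution time in seconds for a Cloudlet of lengthMi on a PE of peMips. */
    static double executionTimeSecs(long lengthMi, double peMips) {
        return lengthMi / peMips;
    }

    /** True if the Cloudlet is still running when a fault arrives at faultTimeMinutes. */
    static boolean stillRunningAtFault(long lengthMi, double peMips, double faultTimeMinutes) {
        return executionTimeSecs(lengthMi, peMips) > minutesToSecs(faultTimeMinutes);
    }
}
```

With a 1000 MIPS PE, a Cloudlet of 10,000 MI ends after 10 seconds and would be missed by a fault at minute 1, while a Cloudlet of 120,000 MI runs for 120 seconds and would still be executing.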

For more details, check Raysa Oliveira's Master Thesis (only in Portuguese).

Since:
CloudSim Plus 1.2.0
Author:
raysaoliveira
See Also:
  • SAP Blog: Availability vs Reliability
TODO: The class has multiple responsibilities. The fault injection mechanism must be separated from the fault recovery. The cloner methods are fault recovery.
  • Constructor Details

    • HostFaultInjection

      public HostFaultInjection(Datacenter datacenter)
      Creates a fault injection mechanism for the Hosts of a given Datacenter. Host failures are randomly injected according to a UniformDistr pseudo-random number generator, which indicates the mean number of failures to be generated per hour (also called the event rate or rate parameter).
      Parameters:
      datacenter - the Datacenter whose Hosts will have failures randomly injected
    • HostFaultInjection

      public HostFaultInjection(Datacenter datacenter, StatisticalDistribution faultArrivalHoursGenerator)
      Creates a fault injection mechanism for the Hosts of a given Datacenter. Host failures are randomly injected according to the given pseudo-random number generator, which indicates the mean number of failures to be generated per hour (also called the event rate or rate parameter).
      Parameters:
      datacenter - the Datacenter whose Hosts will have failures randomly injected
      faultArrivalHoursGenerator - a pseudo-random number generator that generates the times (in hours) at which Host failures will occur. The values returned by the generator are interpreted as hours. A PoissonDistr is frequently used to generate failure arrivals, but any ContinuousDistribution can be used.
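To make the role of the rate parameter concrete, the sketch below draws gaps between failures (in hours) for a given mean number of failures per hour, using exponentially distributed gaps as in a Poisson process. It is written with `java.util.Random` only; `FaultArrivalSketch` and `nextGapHours` are hypothetical names, and this is not necessarily how PoissonDistr or any other CloudSim Plus generator is implemented.

```java
import java.util.Random;

/** Hypothetical generator of gaps between failures for a given hourly failure rate. */
final class FaultArrivalSketch {
    private final double faultsPerHour; // event rate (rate parameter)
    private final Random random;

    FaultArrivalSketch(double faultsPerHour, long seed) {
        if (faultsPerHour <= 0) {
            throw new IllegalArgumentException("faultsPerHour must be positive");
        }
        this.faultsPerHour = faultsPerHour;
        this.random = new Random(seed);
    }

    /** Next gap in hours; the mean gap is 1 / faultsPerHour. */
    double nextGapHours() {
        // Inverse-transform sampling of an exponential distribution.
        return -Math.log(1.0 - random.nextDouble()) / faultsPerHour;
    }
}
```

With a rate of 0.5 failures per hour, the gaps average about 2 hours over a long run.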
  • Method Details

    • startInternal

      protected void startInternal()
      Description copied from class: CloudSimEntity
      Defines the logic to be performed by the entity when the simulation starts.
      Specified by:
      startInternal in class CloudSimEntity
    • processEvent

      public void processEvent(SimEvent evt)
      Description copied from interface: SimEntity
      Processes events or services that are available for the entity. This method is invoked by the CloudSim class whenever there is an event in the deferred queue, which needs to be processed by the entity.
      Parameters:
      evt - information about the event that just happened
    • generateHostFault

      public void generateHostFault(Host host)
      Generates a fault for all PEs of a Host.
      Parameters:
      host - the Host to generate the fault for
    • generateHostFault

      public void generateHostFault(Host host, int pesFailures)
      Generates a fault for a given number of random PEs of a Host.
      Parameters:
      host - the Host to generate the fault for
      pesFailures - number of PEs that must fail
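Choosing a given number of random PEs can be sketched by shuffling the PE indices and taking the first ones. This is only an illustration: `RandomPeFailureSketch` and `pickPesToFail` are hypothetical names, PEs are represented by their indices, and capping the request at the number of PEs in the Host is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical selection of which PEs of a Host fail. */
final class RandomPeFailureSketch {
    /** Returns the indices of the PEs chosen to fail, without repetition. */
    static List<Integer> pickPesToFail(int totalPes, int pesFailures, Random random) {
        List<Integer> peIds = new ArrayList<>();
        for (int i = 0; i < totalPes; i++) {
            peIds.add(i);
        }
        Collections.shuffle(peIds, random); // random order of all PE indices
        // Assumption: a request larger than the Host's PE count fails every PE.
        int count = Math.max(0, Math.min(pesFailures, totalPes));
        return new ArrayList<>(peIds.subList(0, count));
    }
}
```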
    • availability

      public double availability()
      Gets the Datacenter's availability as a percentage value between 0 and 1, based on VMs' downtime (the times VMs took to be repaired).
    • availability

      public double availability(DatacenterBroker broker)
      Gets the availability for a given broker as a percentage value between 0 and 1, based on VMs' downtime (the times VMs took to be repaired).
      Parameters:
      broker - the broker to get the availability of its VMs
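A value between 0 and 1 based on downtime is commonly obtained from the classic relation availability = MTBF / (MTBF + MTTR). The sketch below applies that relation; whether this class computes availability exactly this way is an assumption, as is returning 1 when no failure has been recorded. `AvailabilitySketch` is a hypothetical name.

```java
/** Hypothetical availability computation from MTBF and MTTR, both in minutes. */
final class AvailabilitySketch {
    static double availability(double mtbfMinutes, double mttrMinutes) {
        if (mtbfMinutes <= 0) {
            return 1.0; // assumption: no recorded failures means fully available
        }
        return mtbfMinutes / (mtbfMinutes + mttrMinutes);
    }
}
```

For an MTBF of 90 minutes and an MTTR of 10 minutes, the relation gives 90 / 100, that is, an availability of about 0.9.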
    • getHostFaultsNumber

      public int getHostFaultsNumber()
      Gets the total number of faults that have occurred on existing Hosts. This isn't the total number of failed Hosts, because one Host may fail multiple times.
    • getTotalFaultsNumber

      public long getTotalFaultsNumber()
      Gets the total number of faults which affected all VMs from any broker.
    • getTotalFaultsNumber

      public long getTotalFaultsNumber(DatacenterBroker broker)
      Gets the total number of Host faults which affected all VMs from a given broker or VMs from all existing brokers.
      Parameters:
      broker - the broker to get the number of Host faults affecting its VMs
    • meanTimeBetweenHostFaultsInMinutes

      public double meanTimeBetweenHostFaultsInMinutes()
      Computes the current Mean Time Between Host Failures (MTBF) in minutes. Since Hosts don't actually recover from failures, there are no recovery times to simplify the MTBF computation for Hosts, unlike the MTBF for VMs, which is computed directly from them.
      Returns:
      the current mean time (in minutes) between Host failures (MTBF) or zero if no failures have happened yet
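One way to obtain such a value without recovery times is to average the gaps between consecutive fault times. The sketch below does that for fault times recorded in seconds and returns minutes. It is a guess at the general idea, not the method's actual code: `HostMtbfSketch` is a hypothetical name, and returning zero when fewer than two faults exist is an assumption of the sketch.

```java
import java.util.Arrays;

/** Hypothetical MTBF computation from the times (in seconds) at which faults happened. */
final class HostMtbfSketch {
    static double meanTimeBetweenFaultsInMinutes(double[] faultTimesSecs) {
        if (faultTimesSecs.length < 2) {
            return 0; // no gap can be measured with fewer than two faults
        }
        double[] sorted = faultTimesSecs.clone();
        Arrays.sort(sorted);
        // The sum of consecutive gaps equals the last fault time minus the first.
        double totalGapSecs = sorted[sorted.length - 1] - sorted[0];
        double meanGapSecs = totalGapSecs / (sorted.length - 1);
        return meanGapSecs / 60.0; // seconds to minutes
    }
}
```

For faults at 600, 1800 and 4200 seconds, the gaps are 1200 and 2400 seconds, giving a mean of 1800 seconds, or 30 minutes.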
    • meanTimeBetweenVmFaultsInMinutes

      public double meanTimeBetweenVmFaultsInMinutes()
      Computes the current Mean Time Between Host Failures (MTBF) in minutes, considering the failures that affected VMs from any broker in the entire Datacenter. It uses a straightforward way to compute the MTBF: since the VM recovery times are stored, those values can be used to simplify the computation, unlike for the Hosts MTBF.
      Returns:
      the current Mean Time Between host Failures (MTBF) in minutes or zero if no VM was destroyed due to Host failure
    • meanTimeBetweenVmFaultsInMinutes

      public double meanTimeBetweenVmFaultsInMinutes(DatacenterBroker broker)
      Computes the current Mean Time Between Host Failures (MTBF) in minutes, considering the failures that affected VMs from a given broker. It uses a straightforward way to compute the MTBF: since the VM recovery times are stored, those values can be used to simplify the computation, unlike for the Hosts MTBF.
      Parameters:
      broker - the broker to get the MTBF for
      Returns:
      the current mean time (in minutes) between Host failures (MTBF) or zero if no VM was destroyed due to Host failure
    • meanTimeToRepairVmFaultsInMinutes

      public double meanTimeToRepairVmFaultsInMinutes()
      Computes the current Mean Time To Repair failures of VMs in minutes (MTTR) in the Datacenter, for all existing brokers.
      Returns:
      the MTTR (in minutes) or zero if no VM was destroyed due to Host failure
    • meanTimeToRepairVmFaultsInMinutes

      public double meanTimeToRepairVmFaultsInMinutes(DatacenterBroker broker)
      Computes the current Mean Time To Repair Failures of VMs in minutes (MTTR) belonging to given broker. If a null broker is given, computes the MTTR of all VMs for all existing brokers.
      Parameters:
      broker - the broker to get the MTTR for or null if the MTTR is to be computed for all brokers
      Returns:
      the current MTTR (in minutes) or zero if no VM was destroyed due to Host failure
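As a rough illustration of the MTTR values described above, the sketch below averages a set of VM recovery times given in seconds and converts the result to minutes, returning zero when there is nothing to average. `VmMttrSketch` is a hypothetical name, and how the class actually stores recovery times is not shown here.

```java
import java.util.Arrays;

/** Hypothetical MTTR computation from VM recovery times in seconds. */
final class VmMttrSketch {
    static double meanTimeToRepairInMinutes(double[] recoveryTimesSecs) {
        // average() is empty when no VM was destroyed, so the result falls back to 0.
        double meanSecs = Arrays.stream(recoveryTimesSecs).average().orElse(0);
        return meanSecs / 60.0; // seconds to minutes
    }
}
```

Recovery times of 60, 120 and 180 seconds average 120 seconds, so the sketch returns 2 minutes.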
    • getDatacenter

      public Datacenter getDatacenter()
      Gets the datacenter in which failures will be injected.
    • setDatacenter

      protected final void setDatacenter(Datacenter datacenter)
      Sets the datacenter in which failures will be injected.
      Parameters:
      datacenter - the datacenter to set
    • addVmCloner

      public void addVmCloner(DatacenterBroker broker, VmCloner cloner)
      Adds a VmCloner that creates a clone for the last failed Vm belonging to a given broker, when all VMs of that broker have failed.

      This is optional. If a VmCloner is not set, VMs will not be recovered from failures.

      Parameters:
      broker - the broker to set the VM cloner Function to
      cloner - the VmCloner to set
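The behaviour described for cloners (one optional cloner per broker, applied to the last failed VM only when the broker has no working VM left) can be modelled with a small registry. The sketch below is a stand-in that uses strings for brokers and VMs and a `UnaryOperator` for the cloner; `VmClonerRegistrySketch` and `recover` are hypothetical names and do not reflect the real VmCloner interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.UnaryOperator;

/** Hypothetical per-broker registry of VM cloners. */
final class VmClonerRegistrySketch {
    private final Map<String, UnaryOperator<String>> cloners = new HashMap<>();

    void addVmCloner(String broker, UnaryOperator<String> cloner) {
        cloners.put(broker, cloner);
    }

    /**
     * Returns a clone of the last failed VM, or empty if the broker has no cloner
     * or still has working VMs.
     */
    Optional<String> recover(String broker, String lastFailedVm, int workingVms) {
        UnaryOperator<String> cloner = cloners.get(broker);
        if (cloner == null || workingVms > 0) {
            return Optional.empty(); // no recovery without a cloner or while VMs remain
        }
        return Optional.of(cloner.apply(lastFailedVm));
    }
}
```

A broker registered with `registry.addVmCloner("broker-1", vm -> vm + "-clone")` would get a clone only once its count of working VMs drops to zero.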
    • getLastFailedHost

      public Host getLastFailedHost()
      Gets the last Host for which a failure was injected.
      Returns:
      the last failed Host or Host.NULL if no Host has failed yet.
    • getRandomRecoveryTimeForVmInSecs

      public double getRandomRecoveryTimeForVmInSecs()
      Gets a pseudo-random number used as the recovery time (in seconds) for each failed VM.
    • getMaxTimeToFailInHours

      public double getMaxTimeToFailInHours()
      Gets the maximum time to generate a failure (in hours). After that time, no failure will be generated.
      See Also:
      • getMaxTimeToFailInSecs()
    • setMaxTimeToFailInHours

      public void setMaxTimeToFailInHours(double maxTimeToFailInHours)
      Sets the maximum time to generate a failure (in hours). After that time, no failure will be generated.
      Parameters:
      maxTimeToFailInHours - the maximum time to set (in hours)
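Since the limit is given in hours while simulation time runs in seconds, checking it involves a unit conversion. The sketch below shows one way such a check might look; `MaxTimeToFailSketch` and `mayInjectAt` are hypothetical names, and treating a fault exactly at the limit as still allowed is an assumption.

```java
/** Hypothetical check of whether a fault may still be injected at a given time. */
final class MaxTimeToFailSketch {
    static double hoursToSecs(double hours) {
        return hours * 3600.0;
    }

    /** True if faultTimeSecs does not exceed the maximum time to fail. */
    static boolean mayInjectAt(double faultTimeSecs, double maxTimeToFailInHours) {
        return faultTimeSecs <= hoursToSecs(maxTimeToFailInHours);
    }
}
```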