A Backfill recomputes time partitions in the past that have already been computed.
Unique id for the backfill.
Start instant for the partitions to backfill.
End instant for the partitions to backfill.
Indicates the part of the graph to backfill.
The backfill priority. If less than 0, the backfill has lower priority than the day-to-day workload. If greater than 0, it has higher priority and can delay the day-to-day workload.
Description (for audit logs).
Status of the backfill.
User who created the backfill (for audit logs).
A cuttle project is a workflow to execute with the appropriate scheduler. See the CuttleProject companion object to create projects.
Utility for defining compile-time-safe date literals: compilation fails if the date literal cannot be parsed into a UTC instant.
val start = date"2017-09-01T00:00:00Z"
Configures a Job as a TimeSeries job.
The calendar partitions configuration for this job (for example hourly or daily).
The start instant at which this job must start being executed.
The batching parameters (see com.criteo.cuttle.timeseries.TimeSeriesBatching).
The maximum number of partitions the job can handle at once, and the delay the scheduler will wait for partitions to arrive. If size is set to a value greater than 1, the scheduler waits for delay before triggering an Execution whose scheduling context is extended to cover up to size intervals.
The maximum number of joint intervals that are run within a single execution.
The delay during which the scheduler waits for new executions to arrive for the current batch.
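The grouping behaviour described above can be illustrated with a minimal sketch. It is not the scheduler's implementation: `Partition`, `Batch` and `batches` are hypothetical names, partitions are modelled as plain `(start, end)` pairs of hour indexes, and the waiting delay is left out entirely. The sketch only shows how adjoining partitions might be merged into executions covering at most `size` intervals.

```scala
object BatchingSketch {
  // Hypothetical model: a partition is a (start, end) pair of hour indexes.
  type Partition = (Long, Long)

  // Hypothetical model of one batched execution: the extended period and
  // the number of partitions it covers.
  final case class Batch(low: Long, high: Long, count: Int)

  // Merges adjoining partitions into batches of at most `size` partitions.
  // A new batch starts when the size limit is reached or when the next
  // partition does not start exactly where the current batch ends.
  def batches(pending: Seq[Partition], size: Int): List[Batch] = {
    require(size >= 1, "size must be at least 1")
    pending.sortBy(_._1).foldLeft(List.empty[Batch]) {
      case (Batch(low, high, n) :: rest, (start, end)) if high == start && n < size =>
        Batch(low, end, n + 1) :: rest
      case (acc, (start, end)) =>
        Batch(start, end, 1) :: acc
    }.reverse
  }
}
```

With `size = 2`, the partitions `(0, 1)`, `(1, 2)`, `(2, 3)` and `(5, 6)` yield three batches: one covering `(0, 2)` and two single-partition batches for `(2, 3)` and `(5, 6)`.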
Represents calendar partitions for which a job will be run by the TimeSeriesScheduler. See the companion object for the available calendars.
A TimeSeriesContext is passed to Executions initiated by the TimeSeriesScheduler.
Start instant of the partition to compute.
End instant of the partition to compute.
If this execution is part of a backfill, the Backfill information is provided.
A TimeSeriesDependency qualifies the dependency between two Jobs in a TimeSeries Workflow. It can be configured to offset the dependency.
Suppose job1 depends on job2 with dependency descriptor (offsetLow, offsetHigh). Then, to execute period (low, high) of job1, period (low+offsetLow, high+offsetHigh) of job2 is required.
The offset for the low end of the duration.
The offset for the high end of the duration.
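The offset rule can be written out as a small computation. This is a sketch only: `Offsets` and `requiredPeriod` are hypothetical names standing in for the dependency descriptor, and offsets are assumed to be expressed as `java.time.Duration` values.

```scala
import java.time.{Duration, Instant}

object DependencyOffsetSketch {
  // Hypothetical stand-in for a dependency descriptor; not the library's class.
  final case class Offsets(offsetLow: Duration, offsetHigh: Duration)

  // Period of job2 required to execute period (low, high) of job1:
  // (low + offsetLow, high + offsetHigh).
  def requiredPeriod(low: Instant, high: Instant, offsets: Offsets): (Instant, Instant) =
    (low.plus(offsets.offsetLow), high.plus(offsets.offsetHigh))
}
```

For example, with both offsets set to `Duration.ofDays(-1)`, computing the job1 period from 2017-09-02T00:00:00Z to 2017-09-03T00:00:00Z requires the job2 period from 2017-09-01T00:00:00Z to 2017-09-02T00:00:00Z. With both offsets at zero, job2 must cover exactly the same period as job1.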
A TimeSeriesScheduler executes the Workflow for the time partitions defined in a calendar. Each Job defines how it maps to the calendar (for example Hourly or Daily UTC), and the Scheduler ensures that at least one Execution is created and successfully run for each defined Job/Period.
The scheduler can also Backfill already computed partitions. A Backfill can be recursive or not, and an audit log of backfills is kept.
A TimeSeries workflow.
Create new projects using a timeseries scheduler.
TimeSeries utilities.
Define the available calendars.
Utilities for Workflow.
Defines a daily calendar starting at the specified instant, and using the specified time zone. Days are defined as complete calendar days starting at midnight and lasting 24 hours. If the specified time zone observes daylight saving time, some days may last 23 or 25 hours.
If the start instant does not match a round day (midnight), the calendar actually starts at the beginning of the day immediately following the start instant.
The optional end instant specifies a finite calendar that stops at the end instant if it is a round day, or at the start of that day otherwise.
The time zone for which these _days_ are defined.
The instant this calendar will start.
The optional instant this calendar will end.
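The rounding rule and the effect of daylight saving time can be checked with plain `java.time`. The sketch below is an illustration only, not the calendar's implementation; `firstDayStart` and `dayLength` are hypothetical helper names.

```scala
import java.time.{Duration, Instant, ZoneId, ZonedDateTime}

object DailyCalendarSketch {
  // First day boundary at or after `start` in the given zone: `start` itself
  // if it falls exactly at the start of a day, the next day's start otherwise.
  def firstDayStart(start: Instant, zone: ZoneId): ZonedDateTime = {
    val zoned = start.atZone(zone)
    val midnight = zoned.toLocalDate.atStartOfDay(zone)
    if (midnight.toInstant == start) midnight
    else zoned.toLocalDate.plusDays(1).atStartOfDay(zone)
  }

  // Length of the calendar day beginning at `dayStart`.
  def dayLength(dayStart: ZonedDateTime): Duration = {
    val nextDay = dayStart.toLocalDate.plusDays(1).atStartOfDay(dayStart.getZone)
    Duration.between(dayStart, nextDay)
  }
}
```

In the `Europe/Paris` zone, the day starting on 2017-03-26 lasts 23 hours and the day starting on 2017-10-29 lasts 25 hours, while every day in `UTC` lasts 24 hours.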
Defines an implicit default dependency descriptor for TimeSeries graphs. The default is offsetLow = 0, offsetHigh = 0.
Defines an hourly calendar starting at the specified instant. Hours are defined as complete calendar hours starting at 00 minutes, 00 seconds.
If the start instant does not match a round hour (0 minutes, 0 seconds), the calendar actually starts at the beginning of the hour immediately following the start instant.
The optional end instant specifies a finite calendar that stops at the end instant if it is a round hour, or at the start of that hour otherwise.
The instant this calendar will start.
The optional instant this calendar will end.
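The start and end rounding for an hourly calendar can be sketched with `Instant.truncatedTo`. The helper names `firstHourStart` and `lastHourEnd` are hypothetical and only illustrate the rule stated above.

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit

object HourlyCalendarSketch {
  // Start is kept if it is a round hour, otherwise moved to the next round hour.
  def firstHourStart(start: Instant): Instant = {
    val truncated = start.truncatedTo(ChronoUnit.HOURS)
    if (truncated == start) start else truncated.plus(1L, ChronoUnit.HOURS)
  }

  // End is kept if it is a round hour, otherwise moved back to the start of its hour.
  def lastHourEnd(end: Instant): Instant =
    end.truncatedTo(ChronoUnit.HOURS)
}
```

A start instant of 2017-09-01T10:15:00Z therefore gives a first partition starting at 2017-09-01T11:00:00Z, and an end instant of 2017-09-01T13:45:00Z stops the calendar at 2017-09-01T13:00:00Z.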
Converts a single job to a Workflow containing only that job.
Defines a monthly calendar. Months are defined as complete calendar months starting on the 1st day and lasting 28, 29, 30 or 31 days. The specified time zone is used to define the exact month start instant.
If the start instant does not match a round month (1st at midnight), the calendar actually starts at the beginning of the month immediately following the start instant.
The optional end instant specifies a finite calendar that stops at the end instant if it is a round month, or at the start of that month otherwise.
The time zone for which these months are defined.
The instant this calendar will start.
The optional instant this calendar will end.
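As an illustration of the month rounding rule, the following sketch computes the first month boundary at or after a start instant. `firstMonthStart` is a hypothetical helper written with plain `java.time`, not part of the documented API.

```scala
import java.time.{Instant, ZoneId, ZonedDateTime}

object MonthlyCalendarSketch {
  // First month boundary (1st at start of day, in `zone`) at or after `start`.
  def firstMonthStart(start: Instant, zone: ZoneId): ZonedDateTime = {
    val firstOfMonth = start.atZone(zone).toLocalDate.withDayOfMonth(1)
    val monthStart = firstOfMonth.atStartOfDay(zone)
    if (monthStart.toInstant == start) monthStart
    else firstOfMonth.plusMonths(1).atStartOfDay(zone)
  }
}
```

With the `UTC` zone, a start instant of 2017-09-15T12:00:00Z gives a first partition starting at 2017-10-01T00:00:00Z, while 2017-09-01T00:00:00Z is kept unchanged.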
Defines an N-hourly calendar starting at the specified instant.
This generalizes hourly scheduling to intervals of n hours.
The number of hours in each interval. Can be any positive divisor of 24 except 24 itself.
The instant this calendar will start.
The optional instant this calendar will end.
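The constraint on the interval length can be enumerated directly. This is a sketch with hypothetical names (`validHours`, `isValid`); how the library itself validates the value is not stated here.

```scala
object NHourlySketch {
  // Positive divisors of 24, excluding 24 itself.
  val validHours: Seq[Int] = (1 until 24).filter(24 % _ == 0)

  // True when `n` is an acceptable interval length under the stated rule.
  def isValid(n: Int): Boolean = validHours.contains(n)
}
```

The accepted values are 1, 2, 3, 4, 6, 8 and 12, so that a whole number of intervals always fits into a 24-hour period.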
Defines a weekly calendar. Weeks are defined as complete calendar weeks starting on a specific day of the week at midnight and lasting 7 days. The specified time zone is used to define the exact week start instant.
The start instant determines the first day of the week for this calendar.
If the start instant does not match a round week (midnight), the calendar actually starts at the beginning of the week immediately following the start instant.
The optional end instant specifies a finite calendar that stops at the end instant if it is a round week, or at the start of that week otherwise.
The time zone for which these _weeks_ are defined.
The instant this calendar will start.
The optional instant this calendar will end.
A TimeSeries scheduler executes the Workflow for the time partitions defined in a calendar. Each Job defines how it maps to the calendar (for example Hourly or Daily UTC), and the Scheduler ensures that at least one Execution is created and successfully run for each defined Job/Period.
The scheduler can also Backfill already computed partitions. A Backfill can be recursive or not, and an audit log of backfills is kept.