The default scheduler of Hadoop1 cannot handle real-time jobs, so we add a real-time property to the job conf and change the scheduling strategy so that real-time jobs can be served. Several real-time strategies are applied, and each one is implemented as a separate scheduler here.
The strategy is realized as follows:
- add the real-time property to the job
- modify the JobInProgressListener and the job queue defined in that file
- change the assignTasks() method in Scheduler.java and add a module that kills jobs that miss their deadline
- build the jar and update the Hadoop1 configuration
- restart the Hadoop1 cluster
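
The deadline check added to assignTasks() might look roughly like the sketch below. It is plain Java with no Hadoop dependency: `RtJob`, `deadlineMillis`, `DeadlineGuard`, and `killMissed` are hypothetical names used only for illustration, and storing the deadline as an absolute timestamp in milliseconds is an assumption. In the actual scheduler the kill would act on the job object held in the queue.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a queued job that carries the real-time property.
class RtJob {
    final String id;
    final long deadlineMillis; // absolute deadline (assumed representation)
    boolean killed = false;

    RtJob(String id, long deadlineMillis) {
        this.id = id;
        this.deadlineMillis = deadlineMillis;
    }
}

class DeadlineGuard {
    // Removes every job whose deadline has already passed from the queue,
    // marks it as killed, and returns the removed jobs.
    static List<RtJob> killMissed(List<RtJob> queue, long nowMillis) {
        List<RtJob> missed = new ArrayList<>();
        Iterator<RtJob> it = queue.iterator();
        while (it.hasNext()) {
            RtJob job = it.next();
            if (nowMillis > job.deadlineMillis) {
                job.killed = true;
                it.remove();
                missed.add(job);
            }
        }
        return missed;
    }
}
```

Running such a check at the start of each scheduling pass keeps slots from being assigned to jobs that can no longer finish on time.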
The default scheduler schedules jobs in arrival order: the job that arrives earlier is scheduled first.
The job with the earlier deadline is scheduled first.
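
As a minimal sketch of this ordering, the job queue can be kept as a priority queue keyed on the deadline. The classes `EdfJob` and `EdfQueue` are hypothetical and independent of Hadoop; the real scheduler would order its own job objects inside the listener's queue.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical job record; only the deadline matters for the ordering.
class EdfJob {
    final String id;
    final long deadlineMillis; // absolute deadline (assumed representation)

    EdfJob(String id, long deadlineMillis) {
        this.id = id;
        this.deadlineMillis = deadlineMillis;
    }
}

class EdfQueue {
    // The head of the queue is always the job with the smallest deadline.
    private final PriorityQueue<EdfJob> queue =
        new PriorityQueue<>(Comparator.comparingLong((EdfJob j) -> j.deadlineMillis));

    void add(EdfJob job) {
        queue.add(job);
    }

    // Returns the job with the earliest deadline, or null if the queue is empty.
    EdfJob next() {
        return queue.poll();
    }
}
```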
The job with less spare time is scheduled first. The spare time is the time left until the deadline minus the remaining computing time, so the remaining computing time must be estimated first.
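
The spare-time rule can be sketched as follows. The code assumes the remaining computing time has already been estimated and is passed in as `remainingComputeMillis`; how that estimate is produced is not covered here. `SlackJob` and `LeastSpareTimeFirst` are hypothetical names, not part of the scheduler source.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical job record with a deadline and an estimated remaining run time.
class SlackJob {
    final String id;
    final long deadlineMillis;         // absolute deadline (assumed representation)
    final long remainingComputeMillis; // estimate supplied by the caller

    SlackJob(String id, long deadlineMillis, long remainingComputeMillis) {
        this.id = id;
        this.deadlineMillis = deadlineMillis;
        this.remainingComputeMillis = remainingComputeMillis;
    }

    // Spare time = time left until the deadline minus remaining computing time.
    long spareTime(long nowMillis) {
        return (deadlineMillis - nowMillis) - remainingComputeMillis;
    }
}

class LeastSpareTimeFirst {
    // Returns a copy of the jobs sorted by ascending spare time at nowMillis.
    static List<SlackJob> order(List<SlackJob> jobs, long nowMillis) {
        List<SlackJob> sorted = new ArrayList<>(jobs);
        sorted.sort(Comparator.comparingLong((SlackJob j) -> j.spareTime(nowMillis)));
        return sorted;
    }
}
```

Under this rule a job with a later deadline can still run first when most of its work is still outstanding.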
Jobs in the workflow with the shorter period are scheduled first.
The jobs are divided into a CPU group and an I/O group, and the two groups are scheduled alternately.
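
One way to picture the alternation is two FIFO queues with a turn flag, as in the sketch below. How a job is classified as CPU-bound or I/O-bound is not specified above, so the `kind` field is assumed to be set by the caller. Falling back to the other group when the preferred one is empty is also an assumption. `GroupedJob` and `AlternatingQueue` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical job record tagged with its group.
class GroupedJob {
    enum Kind { CPU, IO }

    final String id;
    final Kind kind; // assumed to be assigned before the job is queued

    GroupedJob(String id, Kind kind) {
        this.id = id;
        this.kind = kind;
    }
}

class AlternatingQueue {
    private final Deque<GroupedJob> cpu = new ArrayDeque<>();
    private final Deque<GroupedJob> io = new ArrayDeque<>();
    private boolean cpuTurn = true; // the CPU group goes first (arbitrary choice)

    void add(GroupedJob job) {
        (job.kind == GroupedJob.Kind.CPU ? cpu : io).add(job);
    }

    // Returns the next job, alternating between the two groups on each call.
    // If the preferred group is empty, the other group is used instead.
    // Returns null when both groups are empty.
    GroupedJob next() {
        Deque<GroupedJob> preferred = cpuTurn ? cpu : io;
        Deque<GroupedJob> other = cpuTurn ? io : cpu;
        cpuTurn = !cpuTurn;
        GroupedJob job = preferred.poll();
        return job != null ? job : other.poll();
    }
}
```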