Module backend.sources.sources_controller_start
Controller to start and manage the Sources Scraper.
This script manages the execution of the scraping processes by spawning
threads to run specified jobs. It identifies the correct working directory
and starts two processes:
- source(): Executes the scraping process.
- reset(): Resets any failed jobs.
The script determines the working directory based on the location of
the job_sources.py script.
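The working-directory lookup itself is not shown in this module's source. A minimal sketch of how it might be derived with `inspect` and `os` (both listed in the dependencies); the helper name `get_working_dir` is hypothetical:

```python
import inspect
import os

def get_working_dir():
    # Hypothetical helper: resolve the directory containing the current
    # script, so that jobs/job_sources.py can be located relative to it.
    script_path = inspect.getfile(inspect.currentframe())
    return os.path.dirname(os.path.abspath(script_path))
```

The same result is often obtained with `os.path.dirname(os.path.abspath(__file__))`; the `inspect`-based form also works when `__file__` is unavailable.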
Dependencies
- threading
- subprocess
- psutil
- time
- os
- sys
- inspect
- Custom libraries: lib_logger, lib_helper, lib_db
Classes
class SourcesController
A controller class for managing the Sources Scraper.
Attributes
args : list
    List of arguments for the `stop` method (not used in `start`).
db : object
    Database object (not used in `start` method).
Methods
__init__(): Initializes the SourcesController object.
__del__(): Destructor for the SourcesController object.
start(workingdir): Starts the scraper by launching two jobs in separate threads.
Initializes the SourcesController object.
Source code
class SourcesController: """ A controller class for managing the Sources Scraper. Attributes: args (list): List of arguments for the `stop` method (not used in `start`). db (object): Database object (not used in `start` method). Methods: __init__(): Initializes the SourcesController object. __del__(): Destructor for the SourcesController object. start(workingdir): Starts the scraper by launching two jobs in separate threads. """ def __init__(self): """ Initializes the SourcesController object. """ # Initialization logic here (if any) pass def __del__(self): """ Destructor for the SourcesController object. Prints a message when the SourcesController object is destroyed. """ print('Sources Controller object destroyed') def start(self, workingdir): """ Starts the Sources Scraper by opening two jobs in separate threads: - `source()`: Calls `job_sources.py` to start the scraping process. - `reset()`: Calls `job_reset_sources.py` to reset failed jobs. Args: workingdir (str): The directory containing the job scripts. """ def source(): """ Executes the job_sources.py script to start the scraping process. """ job = 'python ' + os.path.join(workingdir, "jobs", 'job_sources.py') os.system(job) def reset(): """ Executes the job_reset_sources.py script to reset failed jobs. """ job = 'python ' + os.path.join(workingdir, "jobs", 'job_reset_sources.py') os.system(job) # Start threads for the defined job functions process1 = threading.Thread(target=source) process1.start() process2 = threading.Thread(target=reset) process2.start()Methods
def start(self, workingdir)
Starts the Sources Scraper by opening two jobs in separate threads:
- source(): Calls job_sources.py to start the scraping process.
- reset(): Calls job_reset_sources.py to reset failed jobs.
Args
workingdir : str
    The directory containing the job scripts.