config parser

Felix Hamborg edited this page Oct 5, 2017 · 3 revisions

news-please config parser

This guide explains config.py and how the different configuration files are read.
config.py contains two classes: CrawlerConfig and JsonConfig.

Both are singleton classes with a two-step initialisation: get_instance() followed by setup(). All getters return deep copies of the internal objects, so changing a returned value does not change the stored configuration.
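The singleton and deep-copy behaviour can be illustrated with a minimal sketch. DemoConfig, its dict-based setup() and the section and option names are hypothetical stand-ins, not the actual code in config.py:

```python
import copy


class DemoConfig:
    """Hypothetical stand-in for a singleton config class (not the real config.py)."""

    _instance = None  # the one shared instance

    def __init__(self):
        self._data = {}

    @classmethod
    def get_instance(cls):
        # Create the instance on first use, then always hand back the same one.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def setup(self, data):
        # The real classes read a file here; this sketch takes a dict instead.
        self._data = copy.deepcopy(data)

    def config(self):
        # Return a deep copy so callers cannot alter the stored values.
        return copy.deepcopy(self._data)


first = DemoConfig.get_instance()
first.setup({"Crawler": {"default": "RssCrawler"}})

second = DemoConfig.get_instance()
snapshot = second.config()
snapshot["Crawler"]["default"] = "changed"  # only the copy changes

print(first is second)                       # True
print(first.config()["Crawler"]["default"])  # RssCrawler
```

Because every getter hands out a copy, code in one module cannot accidentally change the configuration that another module sees.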

CrawlerConfig

CrawlerConfig parses a normal cfg-file with "Sections" and "Options".

Usage

Import it as early as possible:

from config import CrawlerConfig

First instantiation: The class must be set up only once, so do this at the very beginning of the program. Repeating this step later is unnecessary and results in a warning.

cfg = CrawlerConfig.get_instance()
cfg.setup(<FILEPATH>)

Further usage (in any file that is called after the first instantiation):

cfg = CrawlerConfig.get_instance()

Methods

  • get_instance():
    Get the instance of the config class. Because this is a singleton class, CrawlerConfig.get_instance() is the right way to obtain the instance.

  • setup(_filepath_):
    The basic setup of the config file: reads the file and parses it into the internal object.

  • config():
    Get a deep copy of the whole config. Returns a 2-dimensional dict.

      config = cfg.config()
      
      # config[<section>][<option>] == <value>
    
  • section(_section_):
    Gets a copy of a section. Returns a 1-dimensional dict.

      section = cfg.section(<section>)
      # section[<option>] == <value>
    
  • set_section(_section_):
    Set the current section from which option() reads its values.

  • option(_option_):
    Get the value of an option from the current section. Requires set_section to have been called first.

      cfg.set_section(<section>)
      option = cfg.option(<option>)  
    
      # option == <value>  
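How "Sections" and "Options" map onto the 2-dimensional dict can be sketched with Python's standard configparser module. This is only an illustration: the section and option names are made up, and whether CrawlerConfig uses configparser internally or converts value types is an assumption not covered here.

```python
import configparser

# Hypothetical cfg content; the section and option names are made up.
SAMPLE_CFG = """
[Crawler]
default = RssCrawler
threads = 5

[Files]
working_path = ./data
"""

parser = configparser.RawConfigParser()
parser.read_string(SAMPLE_CFG)

# Build the 2-dimensional dict: config[section][option] == value
config = {
    section: dict(parser.items(section))
    for section in parser.sections()
}

crawler_section = config["Crawler"]      # 1-dimensional dict for one section
print(crawler_section["default"])        # RssCrawler
print(config["Files"]["working_path"])   # ./data
print(config["Crawler"]["threads"])      # 5 (a string; the stdlib parser does not convert types)
```

In this sketch the outer dict corresponds to what config() returns and one inner dict to what section() returns.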
    

JsonConfig

JsonConfig parses a special JSON file with the following format:

{
  "base_urls" : {
    "url": "http://examp.le"
  }
}

Usage

Import it as early as possible:

from config import JsonConfig

First instantiation: The class must be set up only once, so do this at the very beginning of the program. Repeating this step later is unnecessary and results in a warning.

json = JsonConfig.get_instance()
json.setup(<FILEPATH>)

Further usage (in any file that is called after the first instantiation):

json = JsonConfig.get_instance()

Methods

  • get_instance():
    Get the instance of the JSON config class. Because this is a singleton class, JsonConfig.get_instance() is the right way to obtain the instance.

  • setup(_filepath_):
    The basic setup of the JSON file: reads the file and parses it into the internal object.

  • config():
    Get a deep copy of the whole parsed JSON config file.

      json_config = json.config()
    
  • load_json(_filepath_):
    Load the JSON file located at filepath. This should normally not be needed; use it only to switch between files. It overwrites all previously loaded values.

      json.load_json("../test.json")
    
  • get_url_array():
    Get all URLs listed under "base_urls > url" in the file. Returns them as a list.

      print(json.get_url_array())
    
      # Prints something like [u"http://examp.le", u"http://te.st"]
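The URL collection can be sketched with the standard json module. This is a rough illustration rather than the actual JsonConfig code: the standalone get_url_array function and the sample content are hypothetical, and the sketch assumes "base_urls" may hold a list of objects so that more than one URL can be returned.

```python
import json

# Hypothetical file content. Assumption: "base_urls" holds a list of
# objects so that more than one URL can be listed.
SAMPLE_JSON = """
{
  "base_urls": [
    {"url": "http://examp.le"},
    {"url": "http://te.st"}
  ]
}
"""


def get_url_array(parsed):
    """Collect every "url" value found under "base_urls"."""
    entries = parsed["base_urls"]
    if isinstance(entries, dict):
        # Also accept the single-object form shown in the format above.
        entries = [entries]
    return [entry["url"] for entry in entries]


parsed = json.loads(SAMPLE_JSON)
print(get_url_array(parsed))  # ['http://examp.le', 'http://te.st']
```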