reader module

exception HTTPFormatError

Bases: exceptions.Exception

Raised when the HTTP access log line is not recognized.

class LogReader(log_path, sleeping_time=0.1, parse=True)

Bases: threading.Thread

This thread object reads the given log file and, by default, parses its lines. It then puts them on a queue that is read by the Statistician.

To read the file, the LogReader reads lines until it reaches EOF, then waits for a given time, sleeping_time, before trying again.

Variables:
  • log_path (string) – The path to the log file. The program is terminated if the log file cannot be opened.
  • sleeping_time (float) – The time, in seconds, during which the program sleeps after reaching EOF.
  • parse (bool) – If True, the LogReader will parse the read line with the parse() function before it puts it in the Queue. If False, the Statistician will have to parse it itself.
  • total_nb_of_line_read (int) – Counts the number of lines that have been read since the beginning, including the empty and commented lines.
  • should_run (bool) – If False, the thread will shortly stop its operation. Used to cleanly end the program.
  • output_queue (Queue) – The queue where the read lines will be put.
  • name (string) – The name of the thread: ‘log reader thread’
run()

Opens the input log file at log_path, seeks to EOF, then tries to read new lines. If new lines are detected, sends them to the output_queue for the Statistician (parsed or unparsed, depending on self.parse).

On reaching EOF, waits for sleeping_time and starts again.

Note

There are two printing systems, sys1 and sys2, used to send log messages:

  • sys1 prints WARNING log messages when the LogReader is too slow, that is, when last_EOF is large
  • sys2 prints DEBUG log messages with the number of lines read every second
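The read-until-EOF-then-sleep loop described for run() can be sketched as follows. This is a minimal Python 3 illustration, not the module's actual code: read_new_lines and follow are hypothetical names, output_queue is assumed to be a queue.Queue, and should_run is modelled as a threading.Event rather than the boolean attribute documented above. Parsing and the sys1/sys2 logging are left out.

```python
import os
import time


def read_new_lines(log_file):
    """Return the lines currently available, stopping at EOF."""
    lines = []
    while True:
        line = log_file.readline()
        if not line:  # an empty string means EOF (for now)
            return lines
        lines.append(line.rstrip("\n"))


def follow(log_path, output_queue, should_run, sleeping_time=0.1):
    """Queue lines appended to log_path until should_run is cleared."""
    with open(log_path) as log_file:
        log_file.seek(0, os.SEEK_END)  # ignore what is already in the file
        while should_run.is_set():
            for line in read_new_lines(log_file):
                output_queue.put(line)
            time.sleep(sleeping_time)  # EOF reached: wait, then retry
```

Note that readline() may return an incomplete line if the writer is caught mid-write; a more careful reader would buffer until it sees the trailing newline.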
state()
Returns: A description of the present thread state
Return type: string
get_section(request)

Return the section name from an HTTP request, or None if it is not a proper HTTP request.

Examples

  • GET /test/index/ HTTP => /test
  • GET /te.st/index/ HTTP => /te.st
  • GET /test/index.html HTTP => /test
  • GET /test HTTP => /test
  • GET /test.html HTTP => /
  • GET / HTTP => /
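One rule that is consistent with the examples above is sketched below. It is a guess at the behaviour, not the module's implementation: get_section_sketch is a hypothetical name, a "proper" request is assumed to be three whitespace-separated tokens with a path starting with "/" and a protocol starting with "HTTP", a bare /test is assumed to map to the section /test, and query strings are not handled.

```python
def get_section_sketch(request):
    """Return the first path segment of an HTTP request line, or None."""
    tokens = request.split()
    # Assumption: a proper request looks like "<METHOD> <PATH> HTTP...".
    if len(tokens) != 3:
        return None
    path, protocol = tokens[1], tokens[2]
    if not path.startswith("/") or not protocol.startswith("HTTP"):
        return None

    segments = path.split("/")  # "/test/index/" -> ["", "test", "index", ""]
    first = segments[1]
    if len(segments) > 2:
        # A slash follows the first segment, so it names a directory.
        return "/" + first
    if first == "" or "." in first:
        # Root, or a file served directly from the root such as /test.html.
        return "/"
    return "/" + first
```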
parse_line(line, parse_date=False)

Parse a W3C-formatted HTTP access log line and return a dictionary with the following keys: 'remote_host', 'remote_log_name', 'auth_user', 'date', 'request', 'status', 'bytes'

Note

  • status and bytes are converted to int
  • date can be converted to a datetime object in UTC, but the conversion is disabled by default because it is slow
Raises: HTTPFormatError
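A parser that produces the keys listed above might look like the following sketch. It is written for Python 3 and is not the module's code: parse_line_sketch and LINE_RE are hypothetical names, HTTPFormatError is redefined locally as a stand-in, the line layout is assumed to be `host logname user [date] "request" status bytes`, and treating a "-" byte count as 0 is an assumption.

```python
import re
from datetime import datetime, timezone


class HTTPFormatError(Exception):
    """Local stand-in for the module's exception."""


# Assumed layout: host logname user [date] "request" status bytes
LINE_RE = re.compile(
    r'(?P<remote_host>\S+) (?P<remote_log_name>\S+) (?P<auth_user>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_line_sketch(line, parse_date=False):
    match = LINE_RE.match(line)
    if match is None:
        raise HTTPFormatError(line)
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # Assumption: "-" (no body sent) is reported as 0 bytes.
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    if parse_date:
        # e.g. "10/Oct/2000:13:55:36 -0700", converted to UTC
        local = datetime.strptime(fields["date"], "%d/%b/%Y:%H:%M:%S %z")
        fields["date"] = local.astimezone(timezone.utc)
    return fields
```

Leaving parse_date off keeps the date as the raw string, which avoids the strptime call on every line.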